On the impact of pull request decisions on future contributions

Damien Legay, Alexandre Decan, Tom Mens
Software Engineering Lab, University of Mons, Mons, Belgium
damien.legay@umons.ac.be, alexandre.decan@umons.ac.be, tom.mens@umons.ac.be

Abstract: The pull-based development process has become prevalent on platforms such as GitHub as a form of distributed software development. Potential contributors can create and submit a set of changes to a software project through pull requests. These changes can be accepted, discussed or rejected by the maintainers of the software project, and can influence further contribution proposals. As such, it is important to examine the practices that encourage contributors to a project to submit pull requests. Specifically, we consider the impact of prior pull requests on the acceptance or rejection of subsequent pull requests. We also consider the potential effect of rejecting or ignoring pull requests on further contributions. In this preliminary research, we study three large projects on GitHub, using pull request data obtained through the GitHub API, and we perform empirical analyses to investigate the above questions. Our results show that continued contribution to a project is correlated with higher pull request acceptance rates and that pull request rejections lead to fewer future contributions.

I. INTRODUCTION

The turn of the century saw the rise of version control systems (VCS) to support large-scale software engineering projects. Centralised VCS (e.g., CVS and Subversion) allow developers to share a common repository. Decentralised ones (e.g., Mercurial and git) allow each developer to own a local copy of the repository containing the full change history. This enables collaborative (often geographically distributed) software development on a hitherto unmatched scale. It has given birth to extremely popular online hosting platforms such as GitHub, BitBucket and Mozdev, allowing thousands of people to remotely work together on the same projects. These platforms provide additional features on top of their underlying VCS to further support distributed collaborative development. Examples of such features are issue tracking, code review, integrated discussions, team management, documentation & wiki and integration with external tools.

Today, git has become the most popular distributed VCS by a large margin¹. It will therefore be the focus of our current research. git supports two types of development processes: the shared repository approach, where all contributors are given write access to the central repository and can therefore contribute to the project directly; and the pull request (PR) approach, where only project integrators are allowed to do so. With the PR approach, external contributions are managed indirectly: would-be contributors create a fork of the repository and, once they have addressed an issue or shortcoming in the project, they request that their modifications be “pulled” into the repository by submitting a pull request. The project integrators can decide to approve these PRs, which are then merged into the main project’s codebase. PRs are extremely valuable, as they represent a major part of the project’s continued evolution and expansion.

¹ For anecdotal evidence, based on a 2016 survey with 881 votes, 87% of responders identified git as their VCS of choice: https://rhodecode.com/insights/version-control-systems-2016

It is therefore important to incentivise people to create pull requests, thereby contributing to the project. Previous studies have attempted to identify the factors influencing whether and when a PR will be merged [1]–[3]. We expand upon this work by focusing on determining those patterns of PR-acceptance behaviour that are indicative of continued contribution. Our working hypothesis is that people contributing to a project repository through PRs may get demotivated (and hence stop contributing) if their submitted PRs get rejected too often, or if too many of them are left open without any decision to merge them. Evolutionary insights into such phenomena may help us to understand which of these factors tend to dissuade people from continuing to contribute to a given project. To this end, we quantitatively study the following research questions using techniques based on survival analysis:

RQ1: How are PR acceptance and rejection rates influenced by previous PRs? As contributors accrue familiarity with a project, they become more able to contribute effectively, which we expect to result in a lower PR rejection rate. Similarly, as integrators become more acquainted with a contributor, they may develop a favourable bias towards that contributor’s PRs, further decreasing rejection rates.

RQ2: To what extent does PR acceptance or rejection influence further contributions? When their PRs are rejected, developers could become discouraged and stop, temporarily or permanently, contributing to the project as a result.

RQ3: To what extent do PRs left open influence further contributions? A PR is sometimes left open for a long period, neither rejected nor merged into the core project. We posit this may constitute a form of “soft” rejection, wherein the integrators want to avoid alienating the contributor but do not want to merge the PR.
Seeing a large number of untreated PRs may send an implicit message to potential contributors that the project integrators are unwilling or unable to process the volume of contributions they receive, and, therefore, that their participation in the project would not be valued or useful.

To provide preliminary evidence for these RQs, we carry out an empirical analysis on a large number of PRs in three large, popular and long-lived projects on GitHub. We focus on GitHub because it is undoubtedly one of the largest and most active online hosting services for git projects.

II. RELATED WORK

Several researchers have studied aspects related to the PR-based software development process, either qualitatively or quantitatively. Gousios and Zaidman proposed a PR dataset [4] including 900 projects and 350,000 PRs extracted using GHTorrent. Through a mixed-method analysis of 291 GitHub projects, Gousios et al. [1] established that the PR-based development approach is used as frequently as the shared repository approach on GitHub. They observed that most PRs are short, receive few comments and are processed quickly. They also found that most PR rejections are due to the distributed nature of the pull-based process (e.g., PRs that are already obsolete upon creation).

In follow-up work [3], they interviewed 645 contributors to examine their work practices and identify the challenges they face. They found that while contributors tend to check whether their intended contribution is already covered, they do not communicate their intended contributions. Interviewed contributors outlined that poor responsiveness on the part of integrators could be a barrier to attracting or retaining contributors. Contributors also stated that it is hard to accept rejection of their PRs, as rejected PRs could harm their reputation as developers. Conversely, it is hard for integrators to explain the reasons for rejecting PRs. Rejecting a PR without alienating its contributor was already identified as a challenge of the PR-based model [2]. In that paper, they evaluated PRs from an integrator’s point of view by interviewing 749 project integrators in order to understand which criteria are used to determine the quality of a PR and how integrators prioritise the evaluation of contributions. They found that most integrators decide to merge PRs based on the project’s objectives and on the quality of the PRs, as measured by compliance with the project guidelines, test coverage and passing continuous integration checks.

Yu et al. [5] studied the factors that contribute to latency in PR reviews, defining this latency as the “time interval between pull request creation and closing date”. They found that PR latency is mainly affected by process-related factors such as whether a PR was assigned to a specific reviewer or not. They also found that continuous integration is a dominant factor in PR latency.

Rahman and Roy [6] categorised the technical issues discussed in PR comments and analysed information about projects and developers to obtain insights into PR acceptance or rejection. They discovered that the rate of PR rejection is highly correlated with the programming language used (e.g., Java PRs are more frequently rejected than PRs for the C programming language), the application domain of the project (e.g., the database application domain sees fewer merged PRs than the IDE domain), the maturity of a project (older projects accept fewer PRs) and the number of developers on the project.

Tsay et al. [7] explored both technical and social factors that contribute to acceptance of PRs. They found that, although technical factors like the presence of tests in the PR and a small number of lines changed contribute to a higher probability of acceptance, social factors, such as whether the contributor follows the user that closes the PR, had stronger associations with PR acceptance than technical ones.

Terrell et al. [8] established that PR acceptance is subject to a bias against women when their gender is identifiable. Rastogi et al. [9] built upon the factors identified in [1] and [7], adding information about the geographical location of contributors and integrators. They conclude that the PR acceptance rate is higher when both contributor and integrator are from the same country, with the exception of India, and that contributors from some countries (e.g., Switzerland and Japan) see their contributions more frequently accepted than contributors from other countries (e.g., China and Germany).

III. METHODOLOGY

The main goal of our research is to study the longevity of PR-based contributions to large open source software projects. We focus on software development through GitHub, the largest and most active online hosting service for git projects. As of 2018-09-30, GitHub hosted 96M+ repositories, 31M+ developers and 200M+ PRs, and about one third of these repositories and PRs were created in the preceding 12 months².

² https://octoverse.github.com

For this exploratory research, we selected three case studies of large open source git projects on GitHub. These projects have been obtained by convenience sampling. This method is acceptable for getting preliminary research insights, and will be replaced in a later phase to obtain a bigger corpus that covers a larger set of relevant projects.

The main criterion for our selected sample was that the projects should be representative of a typical PR-based software development process. To meet it, the projects needed to be mature (i.e., have a time span of several years), have an active development history with a large number of commits and contributors, and contain a very large number of PRs, in order to be able to derive statistically significant results from their analysis. In addition, we selected projects written in three different languages to ensure sufficient diversity. The three selected projects are ansible, rails and kubernetes. Some of their characteristics are shown in Table I.

TABLE I: Project repository characteristics on 24/10/2018

repository   language   start year   #contributors   #commits   #PR
ansible      Python     2012         3930            40k        27k
rails        Ruby       2010         3683            70k        22k
kubernetes   Go         2014         1861            71k        42k

Because we have observed problems of missing or inconsistent data when using GHTorrent, we decided to extract the PR data of the selected projects from GitHub repositories through the GitHub API directly. For each PR, the data contains information about the PR creation date, its status (accepted, rejected or open), its closing date (for accepted and rejected PRs), the GitHub ID of its author and its PR number. This PR number corresponds to a chronological ordering of issues opened in the repository, of which PRs are a subset.

IV. PRELIMINARY RESULTS

RQ1: How are PR acceptance and rejection rates influenced by previous PRs?

To answer RQ1, we examined whether repeat contributions impact a contributor’s PR acceptance rate. To that effect, for each repository we analysed the PR acceptance rate as a function of the number of PRs submitted by each contributor.

Figure 1 displays, for each positive integer threshold x between 1 and 250, the PR acceptance rate (blue curve) and rejection rate (orange curve) considering the first x PRs of each contributor only, thereby discarding contributors having fewer than x PRs. The green curve shows the number of contributors having submitted at least x PRs. Thresholds above 250 are excluded due to the low fraction of contributors having that many submissions: 0.33% for ansible, 0.15% for rails and 1.23% for kubernetes.

Fig. 1: Acceptance rate of the first x PRs of each contributor (panels: ansible, rails, kubernetes).

One can observe in all three examined project repositories that, as contributors submit more PRs, their acceptance rates increase significantly. Over the first 50 PRs, we observe a rise from 54.2% to 80.0% for ansible, from 61.3% to 81.4% for rails, and from 49.1% to 74.3% for kubernetes. Beyond the first 50 PRs, all three projects saw a continuous increase in PR acceptance rates as contributors submitted more PRs to them.

These results agree with prior findings by Tsay et al. [7] and Gousios et al. [1], but are to be nuanced, given the rapid decrease in the number of contributors as the threshold x increases. As a consequence, in Figure 2 we looked at the PR acceptance rate of all contributors, excluding the few that made over 250 contributions. We observe that, while contributors with a high number of PRs tend to have a consistently high PR acceptance rate, the behaviour for contributors with few PRs is quite unpredictable: they can have either low or high acceptance rates. Therefore, although the number of previous PRs influences the acceptance rate, this can only be verified starting from a certain threshold of PRs, below which no conclusion can be reached as to whether such an influence exists.

Fig. 2: Acceptance rate of all PRs by contributor (panels: ansible, rails, kubernetes).

RQ2: To what extent does PR acceptance or rejection influence further contributions?

While related work (e.g., [1]) has studied the impact of PR acceptance rate on future PR decision time, RQ2 focuses on the impact of PR acceptance rate on the likelihood of making further PRs. To do so, we compared the probability to contribute again after either a rejected or an accepted PR. The results are presented in Table II. In all three considered projects, contributors are more likely to make subsequent PRs if their prior PRs were accepted.

TABLE II: Likelihood to contribute again

repository   after acceptance   after rejection
ansible      85.6%              73.0%
rails        83.9%              69.2%
kubernetes   95.5%              88.2%

We then used the statistical technique of survival analysis (a.k.a. event history analysis) [10]. Given a specific “event of interest” (in our case: acceptance or rejection of a PR), survival analysis models the “time to event” data during a given observation period. Survival functions model the survival rate, i.e., the probability that the event of interest has not yet occurred after a given duration. The models take into account the “censoring” of some observed subjects, either because they enter or leave the study during the observation period, or because the event of interest was not observed for them during the observation period. A common non-parametric statistic used to estimate survival functions is the Kaplan-Meier estimator [11].

We performed an analysis of the survival probability to submit a new PR as a function of the time elapsed since the latest submission (at that time) of a PR by the same contributor. In order to assess whether the PR acceptance rate influences the delay for a contributor to submit new PRs, we considered three acceptance rate classes: [0, 33%[, [33%, 67%[ and [67%, 100%]. The survival curves are shown in Figure 3. We observe that the survival probability is higher for classes of higher acceptance rate, regardless of the considered project. For instance, after ten days, the probability to submit a new PR is 72.2% in ansible if the acceptance rate is over 67%, while this probability drops to 52.9% if the acceptance rate is between 33% and 67%, and even to 31.5% if the acceptance rate is below 33%. Similar patterns can be observed for the two other projects. We carried out pairwise log-rank tests to determine whether statistically significant differences could be found between the survival curves. The differences were statistically confirmed at α = 0.01 (after a Bonferroni correction [12]), i.e., the null hypotheses, assuming that the survival curves for different acceptance rate classes are the same, were rejected.

Fig. 3: Survival curves for the probability to submit a next PR, grouped by acceptance rate classes (panels: ansible, rails, kubernetes).

We performed a proportional hazards regression based on Cox regression to determine to what extent the acceptance rate impacts the probability of further contribution [13]. The Cox regression is a method for investigating the effect of several variables upon a specified event’s hazard rate. For this analysis, we included the following factors: the acceptance rate of all prior PRs by the same contributor, the number of prior PRs made by this contributor, and the time elapsed since the contributor’s first PR (the contributor’s age).

TABLE III: Influence of acceptance rate, number of PRs, and contributor age on the time required to submit a new PR. The acceptance rate, #prior PRs and contributor age columns report regression coefficients.

repository   acceptance rate   #prior PRs   contributor age   concordance
ansible      0.5481            0.0038       -0.0009           0.689
rails        0.4630            0.0047       -0.0015           0.728
kubernetes   0.7455            0.0031       -0.0016           0.637

Table III summarizes the results we obtained. The concordance (fourth column) provides the goodness of fit of the model. It lies between 0 (perfect anti-concordance) and 1 (perfect concordance). The table also reports the regression coefficients for the three considered factors. These coefficients measure the magnitude of the impact of the aforementioned factors on the probability to submit a new PR. All these coefficients are statistically significant (p < 0.01 after Bonferroni correction). Their values signify that an increase of one increment (10% in acceptance rate, 1 prior PR or 1 day since the contributor’s first PR) multiplies the probability to submit a PR by a factor $e^{\text{coefficient}}$, where, judging from the figures below, the acceptance rate coefficient is scaled by 0.1 for a 10% increment ($e^{0.07455} \approx 1.0774$). For instance, in the case of kubernetes: an increase of 10% in PR acceptance rate modifies the probability to submit a new PR by a factor 1.0774, each prior PR by 1.0032 and each day since the contributor’s first PR by 0.9984.

RQ3: To what extent do PRs left open influence further contributions?

To provide insight into RQ3, we looked at the proportion of PRs that were ultimately accepted or rejected given the time it took to decide (the PR’s age). We excluded PRs that were left open, since no decision has been reached for those. This is plotted in Figure 4. We notice that the longer a PR remains open, the higher the probability that it will be rejected. After a threshold x, PRs have a higher probability to be rejected than accepted. The threshold is 28 days for ansible, 5 days for rails and 25 days for kubernetes. Presuming that contributors are aware of this phenomenon, we expect that they implicitly consider PRs left open for too long as being tacitly rejected, producing effects similar to those identified in the previous RQ. This preliminary result needs to be confirmed with further analyses.

Fig. 4: Proportion of PRs that were ultimately accepted as a function of their age (panels: ansible, rails, kubernetes).

V. THREATS TO VALIDITY

A threat to the validity of this paper is the fact that we only selected three projects in this exploratory phase, so the preliminary findings might not generalise to bigger sets of projects. Choosing only large, popular and mature projects is also a source of bias, as Rahman and Roy [6] found that such factors affect PR acceptance rates.

Another threat is that the PR status returned by the GitHub API does not necessarily correspond to the actual fate of the PR in some repositories. One such case is homebrew-core, where the policy of the repository is to close most PRs without merging them, but to integrate those the integrators deem appropriate through another mechanism, such as the integrators committing the changes themselves. If our analyses were to be applied to this repository, the rate of acceptance would be artificially low due to that specific PR handling policy. Another example is that of angular, wherein PRs marked with specific tags (“PR action: merge” and “PR target: *”, where * represents one or more branches to merge the PR into) will have their relevant code automatically integrated into the repository through commits. Those PRs will appear to be rejected on GitHub, even though they are not. It would be possible to recover the actual PR status based on those tags, which is not the case for homebrew-core.

Yet another threat is tied to the way we have identified contributors. It has been reported that a single individual may use multiple identities in different capacities or at different times on software repositories [14]–[16]. More specifically, it may be the case that the same author owns multiple GitHub accounts, or that multiple authors contribute under the same GitHub account. In that case, we may have erroneously attributed the PRs to an incorrect identity. This could have affected our findings. Therefore, as future work, we aim to empirically study the impact of such incorrect identifications.

VI. CONCLUSION

The collaborative development of open-source software through a pull-based contribution process involves subtle social interactions that can influence the frequency and likelihood of contribution to a repository, or even its ability to retain contributors. Recent qualitative results have highlighted that contributors do not appreciate the rejection of their PRs, and that they find poor responsiveness from integrators frustrating. Integrators, on the other hand, are wary of alienating contributors in their handling of PRs.

In this paper, we provide preliminary quantitative results showing that a contributor’s PRs are more likely to be accepted when that contributor has submitted more PRs previously. We also reveal the impact of PR decisions on the willingness of contributors to contribute anew. Indeed, fewer contributors submit a new PR after the previous one was rejected than when the previous one was accepted. This highlights how important it is for project integrators to avoid alienating contributors, lest they lose their contributions.

ACKNOWLEDGEMENTS

This research was supported by the FRQ-FNRS collaborative research project R.60.04.18.F SECOHealth, the Excellence of Science project 30446992 SECO-ASSIST financed by FWO-Vlaanderen and F.R.S.-FNRS, and F.R.S.-FNRS Grant T.0017.18.

REFERENCES

[1] Georgios Gousios, Martin Pinzger, and Arie van Deursen. An exploratory study of the pull-based software development model. In International Conference on Software Engineering, pages 345–355. ACM, 2014.
[2] Georgios Gousios, Andy Zaidman, Margaret-Anne Storey, and Arie van Deursen. Work practices and challenges in pull-based development: The integrator’s perspective. In International Conference on Software Engineering, pages 358–368. IEEE Press, 2015.
[3] Georgios Gousios, Margaret-Anne Storey, and Alberto Bacchelli. Work practices and challenges in pull-based development: The contributor’s perspective. In International Conference on Software Engineering, pages 285–296. ACM, 2016.
[4] Georgios Gousios and Andy Zaidman. A dataset for pull-based development research. In Working Conference on Mining Software Repositories, pages 368–371. ACM, 2014.
[5] Y. Yu, H. Wang, V. Filkov, P. Devanbu, and B. Vasilescu. Wait for it: Determinants of pull request evaluation latency on GitHub. In Working Conference on Mining Software Repositories, pages 367–371, May 2015.
[6] Mohammad Masudur Rahman and Chanchal K. Roy. An insight into the pull requests of GitHub. In Working Conference on Mining Software Repositories, pages 364–367. ACM, 2014.
[7] Jason Tsay, Laura Dabbish, and James Herbsleb. Influence of social and technical factors for evaluating contribution in GitHub. In International Conference on Software Engineering, pages 356–366. ACM, 2014.
[8] Josh Terrell, Andrew Kofink, Justin Middleton, Clarissa Rainear, Emerson Murphy-Hill, Chris Parnin, and Jon Stallings. Gender differences and bias in open source: pull request acceptance of women versus men. PeerJ Computer Science, 3:e111, May 2017.
[9] Ayushi Rastogi, Nachiappan Nagappan, Georgios Gousios, and André van der Hoek. Relationship between geographical location and evaluation of developer contributions in GitHub. In International Symposium on Empirical Software Engineering and Measurement. ACM, 2018.
[10] O. Aalen, O. Borgan, and H. Gjessing. Survival and Event History Analysis: A Process Point of View. Springer, 2008.
[11] E. L. Kaplan and P. Meier. Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53(282):457–481, 1958.
[12] Winston Haynes. Bonferroni Correction, pages 154–154. Springer, New York, 2013.
[13] D. R. Cox. Regression models and life-tables. Journal of the Royal Statistical Society. Series B (Methodological), 34(2):187–220, 1972.
[14] Mathieu Goeminne and Tom Mens. A comparison of identity merge algorithms for software repositories. Science of Computer Programming, 78(8):971–986, August 2013.
[15] Erik Kouters, Bogdan Vasilescu, Alexander Serebrenik, and Mark G. J. van den Brand. Who’s who in Gnome: using LSA to merge software repository identities. In International Conference on Software Maintenance, pages 592–595. IEEE, 2012.
[16] I. S. Wiese, J. T. d. Silva, I. Steinmacher, C. Treude, and M. A. Gerosa. Who is who in the mailing list? Comparing six disambiguation heuristics to identify multiple addresses of a participant. In International Conference on Software Maintenance and Evolution, pages 345–355, Oct 2016.
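
As a supplementary illustration of how the Cox regression coefficients of Table III (Section IV, RQ2) translate into multiplicative factors, the following minimal Python sketch exponentiates each coefficient. It is not the authors' analysis code. The names `TABLE_III`, `INCREMENTS`, `multiplicative_effect` and `effects_for` are hypothetical, and the scaling of the acceptance rate coefficient by 0.1 for a 10% increment is an assumption inferred from the kubernetes figures quoted in the text.

```python
import math

# Regression coefficients copied from Table III of the paper.
TABLE_III = {
    "ansible": {"acceptance_rate": 0.5481, "prior_prs": 0.0038, "contributor_age": -0.0009},
    "rails": {"acceptance_rate": 0.4630, "prior_prs": 0.0047, "contributor_age": -0.0015},
    "kubernetes": {"acceptance_rate": 0.7455, "prior_prs": 0.0031, "contributor_age": -0.0016},
}

# ASSUMPTION: the acceptance rate is expressed on a 0..1 scale, so a 10%
# increment is 0.1 unit; the other two factors use an increment of 1
# (one prior PR, one day of contributor age).
INCREMENTS = {"acceptance_rate": 0.1, "prior_prs": 1.0, "contributor_age": 1.0}


def multiplicative_effect(coefficient: float, increment: float = 1.0) -> float:
    """Return exp(coefficient * increment), the factor applied to the hazard."""
    return math.exp(coefficient * increment)


def effects_for(repository: str) -> dict:
    """Compute the multiplicative factor of each Table III factor for one repository."""
    coefficients = TABLE_III[repository]
    return {
        factor: multiplicative_effect(beta, INCREMENTS[factor])
        for factor, beta in coefficients.items()
    }


if __name__ == "__main__":
    for repo in TABLE_III:
        factors = effects_for(repo)
        summary = ", ".join(f"{name}: {value:.4f}" for name, value in factors.items())
        print(f"{repo}: {summary}")
```

Under these assumptions, the kubernetes factors for acceptance rate and contributor age round to the 1.0774 and 0.9984 quoted in the text. The factor for prior PRs comes out marginally below the quoted 1.0032, which is presumably due to the rounding of the published coefficient.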