=Paper= {{Paper |id=Vol-2767/paper03 |storemode=property |title=On the Evolutionary Properties of Fix Inducing Changes |pdfUrl=https://ceur-ws.org/Vol-2767/02-QuASoQ-2020.pdf |volume=Vol-2767 |authors=Syed Fatiul Huq,Md. Aquib Azmain,Nadia Nahar,Md. Nurul Ahad Tawhid |dblpUrl=https://dblp.org/rec/conf/apsec/HuqANT20 }} ==On the Evolutionary Properties of Fix Inducing Changes== https://ceur-ws.org/Vol-2767/02-QuASoQ-2020.pdf
                                            8th International Workshop on Quantitative Approaches to Software Quality (QuASoQ 2020)




On the Evolutionary Properties of Fix Inducing Changes
Syed Fatiul Huq, Md. Aquib Azmain, Nadia Nahar and Md. Nurul Ahad Tawhid
Institute of Information Technology, University of Dhaka, Dhaka, Bangladesh


                                          Abstract
                                          A major aspect of maintaining the quality of software systems is the management of bugs. Bugs are commonly fixed in
                                          a corrective manner; detected after the code is tested or reported in production. Analyzing Fix-Inducing Changes (FIC) —
                                          developer code that introduces bugs — provides the opportunity to estimate these bugs proactively. This study analyzes
                                          the evolution of FICs to visualize patterns associated with the introduction of bugs throughout and within project releases.
                                          Furthermore, the association between FICs and complexity metrics, an important element of software evolution, is extracted
                                          to quantify the characteristics of buggy code. The findings indicate that FICs become less frequent as the software evolves
                                          and more commonly appear in the early stages of individual releases. It is also observed that FICs are correlated with longer
                                          Commit intervals. Lastly, FICs are found to be more prevalent in code with fewer lines and lower cyclomatic complexity, which
                                          is consistent with the law of growing complexity in software evolution.

                                          Keywords
                                          software evolution, fix-inducing changes, data mining



QuASoQ 2020: International Workshop on Quantitative Approaches to Software Quality, 1st December 2020, Singapore
email: bsse0732@iit.du.ac.bd (S. F. Huq); bsse0718@iit.du.ac.bd (Md. A. Azmain); nadia@iit.du.ac.bd (N. Nahar);
tawhid@iit.du.ac.bd (Md. N. A. Tawhid)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0
International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

Software projects evolve over time [1] to introduce new features while fixing bugs [2] that appear in parallel.
The conventional way of handling bugs is to detect faulty code with test cases [3] based on user reports and
to write patches [4] that eliminate the fault. In this way, a bug is fixed only after it is written. Another way
of managing bugs is proactively understanding how bugs occur in systems. In this preventive process,
Fix-Inducing Changes (FICs), code that introduces bugs which induce a later fix [5], are analyzed. FICs can be
tracked from a project’s change history by looking for instances of bug fixes and the code changed in these
fixes. An FIC provides information about the code changes, the developer writing the bug, and the state of the
development process at the time of introducing the bug. These can unveil important characteristics of the
project, processes and developers that potentially cause bugs.
   Studies analyzing FICs have observed how these are related to or affected by properties of the software
development lifecycle. For instance, Sliwerski et al. [5], apart from coining the term, related FICs with two
developer activities: the day of coding and the amount of code in a single Commit. Yin et al. [6] observed how
bug fixes themselves can introduce new bugs. Other studies include relations with code smells [7], code
coupling [8], developer sentiment [9, 10] and more.
   Since FICs are a component of the software’s history, these hold information about the software’s evolution.
As software evolves, so do its management, personnel, design and code. This changing environment can affect
the introduction of bugs and vice versa. The existence and properties of such correlations can be established by
analyzing the evolution of FICs. Observing the evolution of FICs can also help uncover its relation with other
evolutionary factors, for instance, the system’s complexity [11]. With this aim, this study answers the following
Research Questions (RQs):
   RQ1: How do FICs evolve with the software? This RQ observes how FICs change in frequency and ratio as the
software system evolves. The evolution of the software is measured with its releases.
   RQ2: How do FICs exist within releases? A single release depicts the software team’s complete flow of
activities. The flow starts with the team taking in new requirements to update the features of the software and
runs through to their finalization, testing and deployment. This RQ observes how FICs appear and change in
this flow.
   RQ3: How do FICs relate to Commit interval? The interval between Commits signifies the amount of tasks
assigned to developers, along with gaps between activities. This RQ answers whether FICs behave differently
than regular Commits in terms of these intervals.
   RQ4: How do FICs relate to system complexity? According to Lehman’s law of evolution, system complexity is
a vital part of a software’s evolution [11]. The law dictates that complexity increases as the software evolves.
Since FICs are instances where bugs are introduced, and bugs can be affected by system complexity, this RQ
observes the relation between the two entities. Specifically, in this RQ, FIC is correlated to Lines of Code
(LoC) and Cyclomatic Complexity (CC) as commonly used metrics to quantify complexity [12, 13].








   For this study, eight Java repositories from GitHub with a total of 142,555 Commits have been analyzed.
From the Commits, FICs are detected, release information is extracted and relevant metrics are calculated. The
findings show that FICs decrease as the software evolves, while remaining more prevalent early in release
cycles. Statistically analyzing the data shows that FICs have longer intervals and occur in code with lower
LoC and CC than regular Commits.

2. Related Work

Sliwerski et al. [5] introduced the term Fix-Inducing Changes (FIC), providing a process that detects FICs in
version history from Concurrent Versions System (CVS) with bug reports from Bugzilla. Moreover, they showed a
relation between FICs and number of files changed. Antoniol et al. [14] showed that FICs create adverse effects
and produce unexpected results. They presented a robust approach to detect groups of co-changing files in CVS
repositories.
   Yin et al. [6] identified and analyzed incorrect bug fixes which introduce new bugs instead. They analyzed
the code of operating systems, namely FreeBSD, Linux and OpenSolaris. This approach also combined version
control systems and bug repositories to categorize changes. They showed that a Fix Inducing Fix (FIF) can cause
crashes, hangs, data corruption or security problems.
   Bavota et al. [7] showed that 15% of refactoring tasks induce bugs, analyzing 52 kinds of refactoring on 3
Java projects. They detected inheritance-related refactoring as the most error-prone refactoring.
   In order to analyze FICs, various works focused on different properties of change that would induce the
bugs. Levin et al. [15] and Menzies et al. [16] focused on source code changes of affected files. Fukushima et
al. [17] introduced developer experience, time of day, time interval of commit and some other properties of
change that would induce bugs. Sadiq et al. [8] related FICs with change couplings to find that recent change
couples provide better insight on new errors. Huq et al. [9] showed that developer sentiment is related with
FICs, where positive comments and reviews in Pull Requests can lead to buggy Commits.
   Weicheng et al. [18] explored the relation between developer Commit patterns in GitHub and software
evolution. They used four metrics to measure the Commit activity of developers and code evolution: changes,
interval, author and source code dependency. Moreover, this paper showed techniques to visualize these metrics
for a given project. They developed a tool named Commits Analysis Tool (CAT), which finds that changes in
previous versions can affect the files that depend on them in the next version.
   Osman et al. [19] extracted bug-fix patterns by mining the change history of 717 open source projects. They
manually inspected the patterns to retrieve the context and reasons that cause those bugs.
   So far, FICs have been analyzed to derive relationships with different project metrics. While the evolution
of Commits has been observed, the evolutionary properties of FICs have yet to be studied.

3. Methodology

This study observes how Fix-Inducing Changes (FIC) evolve throughout the lifetime of software projects and how
these relate to complexity metrics. The methodology of the study is divided into three parts, described as
follows.

3.1. Fix-Inducing Change (FIC) Detection

FICs are changes to code that cause problems in the software system. FICs are the introduction of bugs or
errors to the software, inducing fixes in the future. Hence, these can be detected from the changes that fix
bugs and errors.
   This study utilizes Commits, the documented changes in software projects that are managed through version
control. Commits contain the exact lines and files where changes are made, along with information about and a
message from the developer who posted them. The detection of FICs through Commits is conducted in the
following steps, influenced by the process of [7]:

    1. All Commits are fetched from GitHub repositories.
    2. Commit messages are extracted to detect terms such as “bug”, “fix” or “patch”. These terms signify that
       the aim of the Commit is related to the management of bugs.
    3. Now the changes in these Commits are analyzed. Since the study deals with Java projects, it is first
       checked whether the changes occur in “.java” files. Commits with no changes to such files indicate that
       the Commits dealt with non-code entities of the software (configuration files, documentation etc.).
       Furthermore, the changes made in “.java” files are analyzed to see whether the changes were code
       comments, which also signifies the absence of code entities.
    4. Then, the type of the edit made by the Commit is checked. There are three types of edit: Insert, Delete
       and Replace. An Insert edit means that a patch code is added onto the existing code base. However, it
       does not help to track which part of the previous code was buggy. There is no way of tracking back to a
       Commit that introduced a








       bug. Hence, Commits with Insert edit types are discarded from further consideration.
    5. After the filtering process, the remaining Commits are labeled as “Fixing Commits” or “Fixes”. These are
       the Commits that removed buggy code.
    6. Next, the origin of each legitimate change in the Fix is tracked using the blame function, which returns
       the Commit where a specified changed line was last added or modified. These Commits are labeled as FICs.

Figure 1: The methodology for identifying FICs from Commits

3.2. Evolution Analysis

To understand the evolution of FICs, the projects’ release tags are analyzed. Release tags define iterative
final versions of the software in the project’s lifetime. Since Commits are assigned these tags, it is possible
to categorize Commits based on releases.
   To analyze releases, first non-release tags are filtered out based on naming structures. Usually the release
tags in most projects abide by the pattern: “v #.#.#”. The rest are tags depicting other information like
branches or experiments. However, the structure of naming tags can vary with projects. For instance, patterns
in projects like ElasticSearch or Commons-lang are “Elasticsearch #.#.#” and “commons-lang-#.#.#” respectively.
Hence, tags are manually analyzed for each repository. Additionally, versions that are release candidates are
discarded since these do not depict final releases.
   Next, Commits are labeled based on their assigned tags. However, since Commits are automatically assigned
all future tags, the labeling was conducted in two parts. First, Commits are extracted from all tags. Then, for
every tag, only those Commits that were posted after the previous release tag are assigned to the current one.
   As Commits are segmented into releases, analysis for Research Question (RQ) 1 is conducted. For each
release, the number of FICs and non-FICs is calculated based on section 3.1. This segmentation is further
elaborated for RQ2, by dividing each release into three equal parts. The three divisions are extracted to
better understand the early, middle and late stages of a single release.

Figure 2: The methodology for extracting complexity metrics from Commits

3.3. Metrics Extraction

To analyze the relation between FICs and complexity metrics, Lines of Code (LoC) and Cyclomatic Complexity
(CC), first the changes to code are extracted. This includes the content of the changed files and numbers of
lines which are modified or deleted. In the case of file contents, along with that of the current Commit,
contents of its parent Commit are also extracted. The parent Commit refers to the Commit directly prior to the
current Commit. The contents of the parent Commit provide information on the state of code before the current
Commit’s changes. For FICs, their parents retain the properties of the code where the bug was introduced. With
the contents of the current and parent Commits, the two metrics are calculated in the following manner:

    1. LoC: To calculate the lines of code, without considering comments, first the Abstract Syntax Tree (AST)
       [20] of a program is generated from the








       changed files using JavaParser (https://javaparser.org/). It is verified whether the changes have been
       conducted on executable code or not. Therefore, blank spaces are eliminated. Next, the modified lines in
       the current content are checked to identify whether these are comments based on the AST. If none of the
       changed lines are executable code, then the file is not further considered. Otherwise, the LoC of the
       parent content is calculated.
    2. CC: To calculate cyclomatic complexity, the method in which change was introduced is first identified.
       This is done by taking each changed line of the current content and tracing its state back in the parent
       content. The generated AST of the parent content is traversed for each line. Each changed line of
       executable code is individually assessed and provided an associated method. This provides a list of
       changed methods for a single file. Each of their cyclomatic complexities is calculated, aggregating all
       possible paths (if-else, loops, switch statements).

4. Experimentation and Findings

The dataset and the results observed for the 4 Research Questions (RQs) are described as follows.

4.1. Dataset

To conduct this research, eight well known Java projects are chosen from GitHub’s repositories. These projects
are open source and use GitHub as their primary medium of code storage and version control, enabling the
extraction of all necessary Commits. Details of the repositories are displayed in Table 1. The projects comprise
a total of 142555 Commits to analyze. All eight repositories are used to analyze Research Questions (RQ) 1, 2
and 3. RQ4, which requires the source code, utilizes the first five repositories.

Table 1
Repository description of the eight projects
Project Name        Commits    Lifetime (Years)    Contributors
Apache Tomcat       19360      8.5                 21
Google Guava        4798       5                   187
Mockito             5019       6.7                 155
Commons-lang        5396       10                  115
Apache Hadoop       21435      4.8                 191
Selenium            23550      9.3                 435
Elastic Search      44975      8.5                 1216
Spring Framework    18022      6.4                 369
Total               142555     59.1                2689

4.2. RQ1: Evolution of FICs

The 1st RQ aims to understand how FICs evolve, in terms of frequency and ratio, throughout the lifetime of
software projects. The graphs in Figure 3 showcase the evolution of FICs in the eight software repositories
analyzed. The different repositories show different types of patterns. In the majority of patterns, as seen in
Figures 3(b, c, d, g, h) for projects Guava, Mockito, Commons-lang, Elasticsearch and Spring-framework
respectively, FICs appear in the early stages of the projects’ lifetime and decrease in newer versions. This
indicates that earlier changes to the system tend to contain more instances of bug introduction. This can be
due to rapidly changing and volatile initial requirements, formative and incomplete development processes, lack
of collaborative experience among the developers, or an insufficiency of reviewing and testing resources. But
as the software evolves, the FICs get reduced, as an indication of bolstered testing and quality assurance
processes, and project maturity.
   On the other hand, projects Tomcat and Hadoop in Figures 3(a, e) show the opposite trend, where FICs are
more predominant in later versions. This could happen due to a decreased level of scrutiny in reviewing
efforts, an overhaul of new requirements, or other project and personnel related events. Only Figure 3(f)
showcases a slightly more uniform pattern of FICs for the project Selenium. Although there are spikes of FICs
occurring in specific versions, there is no apparent progression in the appearance of FICs.
   Such visualizations of the evolution of FICs help in observing the history of the project in terms of buggy
changes. This can be related to other aspects of projects that coincide with the decrease and increase of FICs
to understand what affects the introduction of bugs from a high level perspective.

4.3. RQ2: FICs in Releases

In RQ2, the pattern of FICs within individual releases is observed. In Figure 4, the appearance of FICs within
releases is displayed as black circles, where the size of the circle is determined by the proportion of FICs in
the total number of Commits in that stage. The releases are divided into three stages: early, middle and late,
and for some projects, versions are merged for visibility.
   It can be seen that for almost all the projects, FICs are more predominant in early and middle stages of
releases compared to late ones. The exceptions are Tomcat and Spring framework, where FICs are similarly or
more prevalent in the late stages. The high level of appearance of FICs in early and middle stages of a release
can be con-








Figure 3: Evolution of FICs: FIC frequency, Non-FIC frequency and FIC ratio. Panels: (a) tomcat, (b) guava, (c) mockito, (d) commons-lang, (e) hadoop, (f) selenium, (g) elasticsearch, (h) spring-framework.
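
The three quantities plotted in Figure 3 can be derived from labeled Commits and release tags. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: timestamps are plain integers, a Commit belongs to the first release tagged at or after it (following the rule in section 3.2), and the FIC ratio is taken to be FICs divided by all Commits of the release, which the paper does not define explicitly. The names `assign_release` and `fic_stats_per_release` are hypothetical.

```python
from bisect import bisect_left
from collections import Counter


def assign_release(commit_time, tag_times, tag_names):
    """Return the first release tagged at or after commit_time, or None."""
    i = bisect_left(tag_times, commit_time)
    return tag_names[i] if i < len(tag_names) else None


def fic_stats_per_release(commits, releases):
    """commits: (timestamp, is_fic) pairs; releases: (tag_time, name) pairs.

    Returns {name: (fic_count, non_fic_count, fic_ratio)}.
    The ratio definition (FICs / all Commits in the release) is an assumption.
    """
    releases = sorted(releases)
    tag_times = [t for t, _ in releases]
    tag_names = [n for _, n in releases]
    fic, total = Counter(), Counter()
    for ts, is_fic in commits:
        name = assign_release(ts, tag_times, tag_names)
        if name is None:
            continue  # posted after the last release tag, so not yet released
        total[name] += 1
        fic[name] += int(is_fic)
    return {n: (fic[n], total[n] - fic[n], fic[n] / total[n]) for n in total}


# Hypothetical data: two releases and nine Commits.
releases = [(100, "v1.0.0"), (200, "v1.1.0")]
commits = [(10, True), (50, True), (90, False), (100, False),
           (120, True), (150, False), (180, False), (199, False), (250, True)]
stats = fic_stats_per_release(commits, releases)
```

In this toy input the earlier release has the higher FIC ratio, which is the shape reported for most of the projects in RQ1.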








Figure 4: FICs in releases: early, middle and late stages
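
The stage proportions behind Figure 4 can be sketched in a similar way. This Python example assumes that "three equal parts" means three equally sized groups of chronologically ordered Commits; the paper does not say whether the split is by Commit count or by elapsed time, so this is one possible reading. The function name `stage_fic_proportions` is hypothetical.

```python
def stage_fic_proportions(is_fic_flags):
    """Split one release's Commits into thirds and return the FIC share of each.

    is_fic_flags: booleans in chronological order, True where the Commit is a FIC.
    Splitting by Commit count (not by elapsed time) is an assumption.
    """
    n = len(is_fic_flags)
    bounds = [0, n // 3, 2 * n // 3, n]
    proportions = []
    for start, end in zip(bounds, bounds[1:]):
        stage = is_fic_flags[start:end]
        # An empty stage (very small release) is reported as 0.0.
        proportions.append(sum(stage) / len(stage) if stage else 0.0)
    return dict(zip(("early", "middle", "late"), proportions))


# Hypothetical release with nine Commits, FICs concentrated early.
flags = [True, True, False, True, False, False, False, False, False]
result = stage_fic_proportions(flags)
```

Each value corresponds to the size of one circle in the figure: the proportion of FICs among all Commits of that stage.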








tributed to a higher emphasis on adding and updating features in those stages. The late stages are more focused
on debugging and deployment efforts.

Table 2
Results for RQ3 and RQ4
Metric      Commit Type    Average    Standard Deviation    P-value
Interval    FIC            285.54     2027.59               < 2.2e-16
Interval    Regular        177.13     1113.37
LoC         FIC            508.63     507.94                < 2.2e-16
LoC         Regular        636.6      760.61
CC          FIC            3.49       3.64                  6.48e-10
CC          Regular        4.04       4.42

4.4. RQ3: FIC Interval

The 3rd RQ deals with the relation between Commit intervals and FICs. Table 2 shows that, on average, the
interval (in minutes) for FICs is longer than that for regular Commits. The p-value of < 2.2e-16 indicates that
this difference is significant. This suggests that either a large
amount of work relates to FICs (as shown by [5]), or that
the developer introduces bugs when they are away from                    when maintenance starts to outrank development,
the code for a long time.                                                and the software stabilizes. With this intuition
   The finding can help in preemptively detecting buggy                  proven through data, the finding can be applied
code. Commits posted after a longer period than average                  to change the way software is developed. A more
can be given extra emphasis in reviews. Additionally,                    test-driven approach can be adopted in software
developers should be suggested not to disassociate from                  projects from the beginning to mitigate the large
the code for a long time.                                                influx of bugs.
                                                                      2. Comparative history: By graphically extract-
                                                                         ing the evolution of FICs in software projects, the
4.5. RQ4: FIC and Complexity Metrics
                                                                         appearance of bugs can be historically analyzed.
The last RQ observes whether FICs are related to the                     This history can unearth valuable insight, for ex-
complexity metrics of software evolution: Line of Code                   ample periods of time or certain releases where
(LoC) and Cyclomatic Complexity (CC). It can be seen in                  FICs peaked in number. These exceptions can
Table 2 that the average LoC of code where FICs occur                    be comparatively analyzed with other metrics
is lower than that of regular Commits, with a p-value of                 related to the project. The metrics can range
< 2.2e-16, making this difference significant. The result                from code properties like components developed,
says that a lower LoC relates to buggy code. Hence the                   design patterns used etc., or project aspects like
smaller, less busy components need to be given more                      type of assignment, assigned developer, developer
emphasis in coding correctly and reviewing for bugs.                     turnover etc. The proponents may vary from
   Next, as seen in Table 2, FICs occur in methods with                  project to project, hence the historical data of
significantly lower CC than regular Commits, based on the                FICs can be used as a constant reference to such
6.48e-10 p-value. A lower CC means that the tasks in                     differing metrics.
methods are logically simpler. And yet, bugs, as statis-              3. Intervals and bugs: RQ3 provides insight into
tically shown, tend to be introduced in such methods.                    the correlation between Commit interval and FICs,
Similar to LoC, this result prompts for a higher level of                showing the tendency of larger intervals causing
scrutiny when dealing with smaller and simpler methods.                  bugs. This finding can be applied in project man-
   Both of these findings support the evolution of FICs                  agement, by monitoring the absences of develop-
compared to complexity metrics. As described by Lehman’s                 ers. Developers who have been absent from the
law of evolution[11], complexity rises as software evolves,              development process for longer periods should
hence increasing LoC and CC. Similarly, based on RQ1,                    be assigned to tasks that are less sensitive and
FICs decrease in ratio in most cases, which is solidified                their work be reviewed more intensely. Further-
by its inverse relation with the complexity metrics.                     more, as also observed by Sliwerski et al. [5],
                                                                         large amount of changes in a single Commits that
                                                                         cause higher time for completion should be regu-
5. Result Discussion                                                     lated for FICs.
From the findings generated in this study, the following              4. Software evolution and complexity: The last
interpretations and applications can be estimated:                       finding demonstrates how FICs are correlated
                                                                         with line of code (LoC) and cyclomatic complex-
    1. Early bugs: From both Research Questions (RQs)                    ity (CC). These metrices, referred to as complex-
       1 and 2, it can be seen that bugs appear mostly                   ity metrices in the domain of software evolution,
       in the early stages of versions and release cy-                   are important in understanding the evolution of
       cles. This finding solidifies the intuition that early            FICs. As graphically shown in RQ1, FICs tend to
       code tends to cause more bugs than later ones                     decrease as the software evolves. On the other




       hand, complexity increases with the software’s age [11]. This indicates
       an inverse relationship between the two metrics, which is supported by
       RQ4, where FICs are found to be related to a lower LoC and CC in the
       code.

6. Conclusion

This study analyzes GitHub repositories to extract Fix-Inducing Changes (FICs),
i.e., changes that introduce buggy code, and observes their evolution and
characteristics. It is seen that FICs tend to occur in earlier versions and
earlier stages of releases. FICs are also posted after significantly longer
intervals than regular Commits. Lastly, when related to complexity metrics,
FICs show up in code with fewer LoC and lower CC than regular Commits. This
corresponds with FICs decreasing and complexity increasing as software evolves.

References

 [1] M. M. Lehman, L. A. Belady, Program evolution: processes of software change, Academic Press Professional, Inc., 1985.
 [2] M. Monperrus, Automatic software repair: a bibliography, ACM Computing Surveys (CSUR) 51 (2018) 1–24.
 [3] N. Chauhan, Software Testing: Principles and Practices, Oxford University, 2010.
 [4] T. Ackling, B. Alexander, I. Grunert, Evolving patches for software repair, in: Proceedings of the 13th annual conference on Genetic and evolutionary computation, 2011, pp. 1427–1434.
 [5] J. Śliwerski, T. Zimmermann, A. Zeller, When do changes induce fixes?, in: ACM sigsoft software engineering notes, volume 30, ACM, 2005, pp. 1–5.
 [6] Z. Yin, D. Yuan, Y. Zhou, S. Pasupathy, L. Bairavasundaram, How do fixes become bugs?, in: Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering, ACM, 2011, pp. 26–36.
 [7] G. Bavota, B. De Carluccio, A. De Lucia, M. Di Penta, R. Oliveto, O. Strollo, When does a refactoring induce bugs? an empirical study, in: 2012 IEEE 12th International Working Conference on Source Code Analysis and Manipulation, IEEE, 2012, pp. 104–113.
 [8] A. Z. Sadiq, M. J. I. Mostafa, K. Sakib, On the evolutionary relationship between change coupling and fix-inducing changes (2019).
 [9] S. F. Huq, A. Z. Sadiq, K. Sakib, Understanding the effect of developer sentiment on fix-inducing changes: An exploratory study on github pull requests, in: 2019 26th Asia-Pacific Software Engineering Conference (APSEC), IEEE, 2019, pp. 514–521.
[10] S. F. Huq, A. Z. Sadiq, K. Sakib, Is developer sentiment related to software bugs: An exploratory study on github commits, in: 2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER), IEEE, 2020, pp. 527–531.
[11] M. M. Lehman, Laws of software evolution revisited, in: European Workshop on Software Process Technology, Springer, 1996, pp. 108–124.
[12] C. F. Kemerer, S. Slaughter, An empirical approach to studying software evolution, IEEE transactions on software engineering 25 (1999) 493–509.
[13] G. Xie, J. Chen, I. Neamtiu, Towards a better understanding of software evolution: An empirical study on open source software, in: 2009 IEEE International Conference on Software Maintenance, IEEE, 2009, pp. 51–60.
[14] G. Antoniol, V. F. Rollo, G. Venturi, Detecting groups of co-changing files in cvs repositories, in: Eighth International Workshop on Principles of Software Evolution (IWPSE’05), IEEE, 2005, pp. 23–32.
[15] S. Levin, A. Yehudai, Boosting automatic commit classification into maintenance activities by utilizing source code changes, in: Proceedings of the 13th International Conference on Predictive Models and Data Analytics in Software Engineering, ACM, 2017, pp. 97–106.
[16] T. Menzies, J. Greenwald, A. Frank, Data mining static code attributes to learn defect predictors, IEEE transactions on software engineering 33 (2006) 2–13.
[17] T. Fukushima, Y. Kamei, S. McIntosh, K. Yamashita, N. Ubayashi, An empirical study of just-in-time defect prediction using cross-project models, in: Proceedings of the 11th Working Conference on Mining Software Repositories, ACM, 2014, pp. 172–181.
[18] Y. Weicheng, S. Beijun, X. Ben, Mining github: Why commit stops–exploring the relationship between developer’s commit pattern and file version evolution, in: 2013 20th Asia-Pacific Software Engineering Conference (APSEC), volume 2, IEEE, 2013, pp. 165–169.
[19] H. Osman, M. Lungu, O. Nierstrasz, Mining frequent bug-fix code changes, in: 2014 Software Evolution Week-IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE), IEEE, 2014, pp. 343–347.
[20] I. Neamtiu, J. S. Foster, M. Hicks, Understanding source code evolution using abstract syntax tree matching, in: Proceedings of the 2005 international workshop on Mining software repositories, 2005, pp. 1–5.
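
As a rough illustration of the two complexity metrics discussed in Section 4.5, the sketch below computes LoC and an approximate CC per method. It is not the tooling used in this study. It assumes Python source purely for illustration, and it approximates CC in the common McCabe style as one plus the number of decision points. The names `cyclomatic_complexity`, `lines_of_code`, `method_metrics` and `SAMPLE` are hypothetical.

```python
import ast

# Node types that each add one independent path through a function.
# Assumption: a common McCabe-style approximation, not the study's own tool.
_BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While,
                 ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Approximate CC as 1 + number of decision points inside `func`."""
    complexity = 1
    for node in ast.walk(func):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two short-circuit branches.
            complexity += len(node.values) - 1
        elif isinstance(node, ast.comprehension):
            # One for the comprehension loop, one per `if` filter.
            complexity += 1 + len(node.ifs)
    return complexity


def lines_of_code(source: str) -> int:
    """Count lines that are neither blank nor comment-only."""
    return sum(1 for line in source.splitlines()
               if line.strip() and not line.strip().startswith("#"))


def method_metrics(source: str) -> dict:
    """Map each function name in `source` to its (LoC, CC) pair."""
    tree = ast.parse(source)
    metrics = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            body = ast.get_source_segment(source, node)
            metrics[node.name] = (lines_of_code(body),
                                  cyclomatic_complexity(node))
    return metrics


# Hypothetical input: one trivial method and one with several branches.
SAMPLE = '''
def simple(x):
    return x + 1

def branchy(x, y):
    if x > 0 and y > 0:
        for i in range(x):
            if i % 2:
                y += i
    return y
'''

print(method_metrics(SAMPLE))
```

Under these assumptions, `simple` has 2 LoC and a CC of 1, while `branchy` has 6 LoC and a CC of 5. Methods resembling `simple` are the kind that Section 4.5 suggests should receive closer review.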



