<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Proceedings of the SQAMIA</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Relationship Between Design and Defects for Software in Evolution</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>MATIJA MILETIĆ</string-name>
          <email>mmiletic@riteh.hr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>MONIKA VUKUŠIĆ</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>GORAN MAUŠA</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>TIHANA GALINAC GRBAC</string-name>
          <email>tgalinac@riteh.hr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Rijeka, Faculty of Engineering</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <volume>6</volume>
      <fpage>11</fpage>
      <lpage>13</lpage>
      <abstract>
        <p>Successful prediction of defects at an early stage is one of the main goals of software quality assurance. Having an indicator of the severity of defect occurrence may bring further benefit to the allocation of testing resources. Code churn in terms of modified lines of code was found to be a good indicator of bugs. However, that metric does not reveal the structural changes of the code design. The idea of this project is to analyze the relationship between the evolution of object-oriented software metrics and the appearance of defects. To achieve this, the absolute and relative differences between the initial version of a class and its current version were calculated for each metric as indicators of design change. Then, the correlation between these differences and the number of defects was analyzed. Our case study showed that certain metrics have no influence on defect occurrence, while several of them exhibit a moderate level of correlation. In addition, we concluded that the relative differences were an inappropriate indicator for determining the relationship.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Categories and Subject Descriptors: D.2.5 [SOFTWARE ENGINEERING]: Testing and Debugging—Tracing; D.2.9
[SOFTWARE ENGINEERING]: Management—Software quality assurance (SQA); H.3.3 [INFORMATION STORAGE AND
RETRIEVAL]: Information Search and Retrieval
Additional Key Words and Phrases: Design change, defect occurrence, evolving software, correlation</p>
    </sec>
    <sec id="sec-2">
      <title>1. INTRODUCTION</title>
      <p>In software development, one of the main problems is predicting defects during the software product
life cycle. In our analysis, we focused on that problem and investigated whether it can be tackled as
early as in the design stage of evolving software. In the design stage of the software life cycle, most of the
effort is invested in the emergence of a module. Then, with each upgrade of this module, it is possible to
break concepts of good design. Numerous changes are introduced, and these changes can also cause
defects. From this standpoint, we would like to find the design metrics that are critical to the
appearance of defects. Product metrics have been successfully used to build prediction models [Basili et al.
1996], so a great number of such metrics is investigated in this study.</p>
      <p>When developing a software product, it is essential to plan the necessary actions needed to upgrade
a certain functionality. Often, through this process, developers need to modify, delete or add new lines
of program code. The amount of program code that goes through these actions is called code churn.
Developers should strive to minimize the quantity of program lines that will be modified, thus
achieving a smaller code churn ratio [Munson and Elbaum 1998]. After the development phase, the software
product also requires maintenance. Many changes are often introduced during maintenance, which are
caused by various factors. Some of them include changes in the environment, unused code, rarely
updated code, etc. With the aging of the software product and the consequent links of the above-mentioned
factors, the software becomes increasingly difficult to maintain, leading to software design
deterioration [Land 2002]. Code churn was used to predict defects in many studies but a finer investigation of
source code changes yields stronger correlation [Giger et al. 2011].</p>
      <p>Projects in evolution consist of a number of versions of program code. In the design stage of
each version, new requirements are introduced into the existing system architecture. The effort put
into this phase affects the effort required to fix defects [Nadir et al. 2016]. Different design metrics
of the object-oriented program code can be measured then. It would be highly beneficial to consider
the effect of the program design change on the severity of defect occurrence at this early stage. Their
relationship will be analyzed as a first step to achieve this goal. The main contribution of this paper is
the methodology for investigating the effect of the program design change on the severity of defect
occurrence. The design change is represented by the relative and absolute differences of software metrics
between consecutive releases of an evolving project. The number of defects is regarded as the severity
of occurrence of defects, with higher values corresponding to greater severity.</p>
      <p>The study analyzed the aforementioned relationship between calculated differences and defect
occurrence according to the methodology presented in section 2. This project performed a case study
based on a sample of data from two Eclipse open source projects according to the description in section
3. The results of this analysis are represented in the form of scatter plot diagrams and 3D histograms
in section 4. Furthermore, the results were grouped for easier comprehension, and their interpretation is
discussed in section 5. Based on the calculated data, it is possible to determine whether there are some metrics
that developers should pay special attention to in the design phase. With this information, it should
be possible to reduce the appearance of defects. A future step would be to build a
software defect prediction model based on the output data of our program. For example, Genetic
Programming algorithms have recently been used to generate high-quality predictors that perform this
task successfully [Mauša and Grbac 2017].</p>
    </sec>
    <sec id="sec-3">
      <title>2. METHODOLOGY</title>
      <p>Figure 1 displays data flow from our project and encapsulates all the steps in data processing. Input
data includes several consecutive releases of open-source projects and contains the values of software
metrics and the number of defect occurrences. The output data consists of csv files that contain the
program design change and graphical representations of the correlation analysis.</p>
      <p>To measure the program design change, software metrics based on static code attributes are used. The amount
of change is expressed by the relative and absolute difference between the current and a previous
release of a class. Relative code churn was found to have a stronger correlation with defect density than
absolute churn for lines of code (LOC) [Nagappan and Ball 2005], so this study will analyze both. The
values of metrics and calculated differences for each metric are written in output file so that we can use
them in the later analysis. The program also adds a binary indicator, named ”Class found”, that marks
whether the class is found in a previous version of the program code. It is worth noting that a great number of software
modules in a complex system is expected not to contain defects or code change [Fenton and Ohlsson
2000] [Runeson et al. 2013]. Since this project is interested in the severity of defect occurrence, the
non-defective software modules are not considered. Furthermore, because of interest in the quantity
of design change, the software modules without a previous version are also not considered.</p>
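<p>The per-class difference computation described above can be sketched as follows. This is a hypothetical minimal version (function and column names are illustrative; the actual scripts are not reproduced in the paper):</p>

```python
def diff_metrics(current, previous, metrics):
    """Absolute and relative per-class differences between two releases.

    current, previous: dicts mapping class path -> dict of metric values.
    Classes absent from the previous release only get the binary
    "Class found" indicator set to 0, mirroring the flag described above.
    """
    rows = []
    for cls, cur in current.items():
        row = {"class": cls, "Class found": int(cls in previous)}
        if cls in previous:
            prev = previous[cls]
            for m in metrics:
                abs_diff = cur[m] - prev[m]
                # The relative difference is undefined for a zero baseline.
                rel_diff = abs_diff / prev[m] if prev[m] != 0 else float("nan")
                row[m + "_abs_diff"] = abs_diff
                row[m + "_rel_diff"] = rel_diff
        rows.append(row)
    return rows
```

<p>Rows with ”Class found” equal to 0 would then be filtered out before the correlation analysis, together with the non-defective modules.</p>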
      <p>Within this case study, two algorithms are prepared: ”compareFiles.py” and ”compareOldestFiles.py”.
These algorithms calculate the amount of change for all the metrics of software modules in a selected
(current) release. The ”compareFiles.py” script computes the differences between the current release
and all the previous releases selected by the user, while the ”compareOldestFiles.py” script computes
the differences between the current release of a software module and its first (oldest) appearance in
the evolution of the project. The computed amount of change is added to input data and algorithms
generate the ”writeResults.csv” or ”writeOldestResults.csv” output files.</p>
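<p>The oldest-appearance comparison performed by ”compareOldestFiles.py” can be sketched like this (a hypothetical minimal version; names and I/O are illustrative, not taken from the actual script):</p>

```python
def oldest_metrics(releases, cls):
    """Metrics of cls at its first (oldest) appearance.

    releases: list of dicts mapping class path -> metric dict,
    ordered from oldest to newest. Returns None if the class
    has no previous appearance at all.
    """
    for release in releases:
        if cls in release:
            return release[cls]
    return None

def diff_vs_oldest(releases, current, metric):
    """Absolute difference of one metric between the current release and
    each class's oldest appearance; classes with no history are skipped."""
    diffs = {}
    for cls, cur in current.items():
        old = oldest_metrics(releases, cls)
        if old is not None:
            diffs[cls] = cur[metric] - old[metric]
    return diffs
```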
      <p>These output files are afterwards used as input files for ”correlationCount.py” and
”generateHistogram.py” algorithms. The ”correlationCount.py” algorithm generates a scatter plot diagram between
the amount of change of each metric and the number of defects, with corresponding correlation
coefficients. The correlation analysis is a standard procedure in analyzing the relationship between a
metric and defect count and it may even reveal good candidate predictors for defect prediction models
[Zhang et al. 2017]. The ”generateHistogram.py” algorithm generates a histogram with a 3D
projection. To analyze the results, they are grouped according to Spearman’s correlation coefficient. The aim
of this analysis is to detect which design metrics are critical to the severity of defect occurrence.</p>
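<p>The core of this correlation step can be sketched as follows, assuming scipy and matplotlib are available (the function name and plot styling are illustrative, not taken from the actual ”correlationCount.py”):</p>

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
from scipy.stats import spearmanr

def scatter_with_spearman(change, defects, metric_name, out_png):
    """Scatter plot of metric change vs. defect count, with the
    Spearman coefficient shown in the title; returns the coefficient."""
    rho, _p = spearmanr(change, defects)
    fig, ax = plt.subplots()
    ax.scatter(change, defects, s=10)
    ax.set_xlabel(metric_name + " change")
    ax.set_ylabel("number of defects")
    ax.set_title("Spearman rho = %.2f" % rho)
    fig.savefig(out_png)
    plt.close(fig)
    return rho
```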
    </sec>
    <sec id="sec-4">
      <title>3. CASE STUDY</title>
      <p>The idea of this project was to analyze data from different datasets and try to find a relationship between
input data and defect occurrence in program code. To accomplish this, the following functionalities were
implemented iteratively:
(1) Algorithm that computes absolute and relative difference between metrics for every class inside
current project version and one or more selected files for comparing.
(2) Algorithm that computes relative and absolute differences for every class from current file with
the oldest version of project that contains certain class.
(3) Graphical representation of data collected from previous algorithms:
(a) Histogram with a 3D projection showing the relationship between relative or absolute differences,
defect count and frequency of defect occurrences inside a certain range.
(b) Scatter diagrams showing the relationship between relative or absolute differences and defect
count, accompanied by the calculated Pearson's and Spearman's coefficients.</p>
    </sec>
    <sec id="sec-5">
      <title>3.1 Datasets</title>
      <p>The input datasets used in our research included 5 consecutive versions of the Eclipse JDT (Java
development tools) and PDE (Plug-in Development Environment) open source projects. The data were
collected using a systematically defined data collection procedure [Mauša et al. 2016] implemented in
Bug-Code Analyzer tool. The JDT project provides the tool plug-ins that implement a Java IDE
supporting the development of any Java application, including Eclipse plug-ins1. The Plug-in Development
Environment (PDE) provides tools to create, develop, test, debug, build and deploy Eclipse plug-ins,
fragments, features, update sites, and RCP products2. The included releases are named: 2.0, 2.1, 3.0,
3.1 and 3.2.</p>
      <p>The datasets are given in csv format with a comma as the delimiter and a point as the decimal mark. The first
row contains the description of metrics: column 1 is the file path, columns 2–49 are independent variables
(software metrics) described with their abbreviations, and column 50 is the dependent variable (number
of defects). The datasets were cleared of files that contained .example or .tests in their path. Full list
of metrics and the explanation of their abbreviations can be found in [Mauša and Grbac 2017] and
on-line3.</p>
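<p>Loading and cleaning a release file of this shape might look as follows (a sketch using only the standard library; the column layout follows the description above, but the helper itself is hypothetical):</p>

```python
import csv

def load_dataset(path):
    """Read one release csv: column 1 is the file path, the middle
    columns are software metrics, and the last column is the number
    of defects. Rows whose path contains '.example' or '.tests' are
    dropped, as in the cleaning step described above."""
    rows = []
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)  # first row: metric abbreviations
        for rec in reader:
            fpath = rec[0]
            if ".example" in fpath or ".tests" in fpath:
                continue
            metrics = [float(v) for v in rec[1:-1]]
            defects = int(float(rec[-1]))
            rows.append((fpath, metrics, defects))
    return header, rows
```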
    </sec>
    <sec id="sec-6">
      <title>3.2 Evaluation</title>
      <p>After the D’Agostino and Pearson’s normality test was performed, it was found that the input data are not
normally distributed. This analysis is necessary to select an appropriate method for the calculation
of correlation coefficients. Unlike Pearson’s correlation, the Spearman’s correlation analysis does not
require the data to be normally distributed. Hence, this non-parametric statistical method had to be
used for the purpose of correlation expression [D’Agostino and Pearson 1973].</p>
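<p>With scipy, the normality check that motivates this choice can be reproduced in a few lines (illustrative data, not the actual datasets):</p>

```python
import numpy as np
from scipy.stats import normaltest  # D'Agostino and Pearson's test

rng = np.random.default_rng(0)
# Skewed, defect-count-like data: clearly not normally distributed.
skewed = rng.exponential(scale=2.0, size=500)
stat, p = normaltest(skewed)
# A small p-value rejects normality, so Spearman's rank correlation
# is used instead of Pearson's.
use_spearman = p < 0.05
```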
      <p>Spearman’s correlation coefficient is a statistical measure of the power of a monotonic relationship
between paired data. It is calculated according to the formula presented in equation 1 [Myers and Well
1991].</p>
      <p>The interpretation is similar to that of Pearson’s correlation: values closer to ±1 indicate a stronger monotonic relationship.
The relationships between variables may be determined by their correlative dependence as follows:
(1) Positive correlation - the small value of one variable corresponds to the small value of the second
variable, and the high value of one variable corresponds to the high value of the other variable.
(2) Negative correlation - the small value of one variable corresponds to the high value of the second
variable and vice versa.
(3) No correlation - the value of one variable cannot be inferred from the value of the other
variable. The points in such a graph are scattered [Evans 1996].</p>
      <p>Correlation is a measure of effect size, so its strength can be described verbally using
Table I [Evans 1996].
1http://www.eclipse.org/jdt/
2http://www.eclipse.org/pde/
3http://www.seiplab.riteh.uniri.hr/wp-content/uploads/2016/12/Table-of-Metrics.pdf
rs = 1 − (6 · Σi=1..n di²) / (n³ − n)    (1)</p>
      <p>−1 ≤ rs ≤ 1    (2)
where rs denotes its value in a sample s, di is the difference between the ranks of the i-th pair of observations, and n is the number of pairs. The range of possible values of rs is given in equation 2 [Lehman 2005].</p>
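<p>Equation 1 can be verified with a short pure-Python implementation (a sketch; the closed-form formula is exact only when there are no tied values):</p>

```python
def spearman_rs(x, y):
    """Spearman's rs via equation 1: rs = 1 - 6*sum(d_i^2)/(n^3 - n),
    where d_i is the rank difference of the i-th pair (no ties assumed)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, idx in enumerate(order, start=1):
            r[idx] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n ** 3 - n)
```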
    </sec>
    <sec id="sec-7">
      <title>4. RESULTS</title>
      <p>The first analysis counted the number of files that had their earliest version in each of the previous
releases. Table II shows this number as follows: the first column represents the number of files that had
no previous release, the second column represents the number of files that had their earliest version in
the previous release, and so on until the last column shows the number of files that had their earliest
version in the oldest release.</p>
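<p>The counting behind Table II can be sketched as follows (a hypothetical helper; releases are represented simply as sets of file paths):</p>

```python
def earliest_release_histogram(releases):
    """releases: list of sets of file paths, ordered oldest -> newest.
    For the newest release, count how many of its files first appeared
    in each release. The returned list is ordered newest -> oldest, so
    entry 0 counts files with no previous release (the Table II layout
    described above)."""
    newest = releases[-1]
    counts = [0] * len(releases)
    for path in newest:
        first = next(i for i, rel in enumerate(releases) if path in rel)
        counts[len(releases) - 1 - first] += 1
    return counts
```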
      <p>In Table II we can see that in ”PDE” project the number of newly released files is approximately
equally distributed in every release. For the ”JDT” project we can see that most of the files were
created in the first release (2.0). Every following release is mainly built upon the previous release and
the number of files depends on the newly added features. With the exception of Release 2.1, most of
the other versions have similarly distributed newly released files.</p>
      <p>Tables III and IV show the obtained Spearman correlation coefficients for the following pairs:
—Correlation between the metric’s real value and number of defects (column ”value”)
—Correlation between the absolute difference of the metric and number of defects (column ”abs diff ”)
—Correlation between the relative difference of the metric and number of defects (column ”rel diff ”)</p>
      <p>Under the column ”single previous rls” the correlation values are shown for the comparison between
the fifth version of the open source project release and its previous version. Under the column
”oldest rls” the same values are shown, but for the comparison between the fifth version and the oldest
appearance of every class in it.</p>
      <p>Using the previously mentioned tables we can see which of the metrics achieve the greatest
correlation coefficients. These metrics have the most significant impact on the occurrence of defects. For that
reason, in Tables III and IV we showed only the metrics with a correlation coefficient value greater
than ”0.20”. Metrics that fall into the category of ”Very poor correlation” in both datasets include the
following: HCLOC, MI, DIT, MINC, S R, FOUT, NSUP, NSUB, INTR, MOD. Metrics that achieve the
”Very poor correlation” only in ”JDT” project include HCWORD, LCOM, COH, EXT, MPC. Similarly,
metrics that fall into that category only in ”PDE” project include CLOC, CWORD, AVCC, CBO, CCML,
F IN, HIER, SIX, NCO.</p>
      <p>The degree of correlation between the absolute and relative differences and the number of defects
are shown in a scatter plot diagram representing a two-dimensional graph. The y-axis represents the
number of defects, while the x-axis represents calculated absolute or relative difference of the certain
metric. Additionally, Spearman’s correlation coefficient is calculated and displayed above the diagram. In
order to make the results more intuitive and easier to visualize, they are also shown graphically on a histogram
with a 3D projection. From it, it is easier to draw conclusions about how the data are divided between
the given ranges. Every histogram has an x, y, and z axis. The x-axis represents the computed relative or absolute
difference of a metric. The y-axis represents the defect count from the given input file. The z-axis
represents the frequency of y-axis values within a range on the x-axis. Due to limited space, these
diagrams are shown only for the SLOC L metric, which had the greatest value of the correlation coefficient in
both projects, in Figures 2 and 3. For absolute difference values higher than 500, the number of defect
occurrences is negligible. The biggest concentration of defect occurrences, with values between 0 and
5, is in the range of absolute difference from 0 to 500, with a frequency value on the z axis of around 500.
Similarly, on the right side of the figure, the second histogram represents the number of defect occurrences
inside a certain range of relative difference on the x axis. From Figure 3 we can see that the biggest
concentration of defect occurrences is in the range of 0 to 1 on the x axis. After the relative
difference reaches the value of 1, there is no significant defect occurrence.</p>
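<p>The data behind such a 3D histogram is simply a two-dimensional binning of (difference, defect count) pairs, which can be computed with numpy before plotting (a sketch with illustrative bin counts):</p>

```python
import numpy as np

def hist3d_data(diffs, defects, xbins=10, ybins=5):
    """Bin (difference, defect count) pairs: the bar height (z axis)
    is the frequency of pairs falling in each (x, y) cell, as in the
    histograms described above."""
    H, xedges, yedges = np.histogram2d(diffs, defects, bins=[xbins, ybins])
    return H, xedges, yedges
```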
    </sec>
    <sec id="sec-8">
      <title>5. DISCUSSION</title>
      <p>In this chapter we will give a more detailed overview of the obtained results and show them on the
example of specific metrics. Furthermore, we will discuss the results obtained from two different datasets,
which refer to the ”JDT” and ”PDE” open source project releases.</p>
      <p>Analysis of the obtained results has shown that certain metrics from different datasets fall into
the same correlation range. Analyzing both datasets, we noticed that the greatest correlation
coefficients are achieved between the value of the metric itself and the number of defects.
This confirms the general appropriateness of metrics for software defect prediction and indicates their
capability to distinguish the severity of defect occurrence. However, this study was motivated to
investigate whether a similar effect can be discovered for the quantity of change. When sorting the data by
the ”value” column we can observe that most of the metrics fall into the category of moderate
correlation. The lowest values of correlation coefficients are achieved between the relative difference and the
number of defects. Hence, we used the metrics from the ”abs diff ” column for this discussion.</p>
      <p>When analyzing the data from Tables III and IV we can see that the highest correlation
coefficients are most often achieved by the same metrics. For example, we can see from Table IV that
metrics SLOC L and SLOC P achieve similar correlation coefficients in single and oldest releases. In
contrast, metric R R falls into ”Moderate correlation” in the single version, but in the oldest version it
falls into negative ”Very poor correlation”. Hence, we can conclude that both of the proposed approaches
may contribute to the research performed in this project.</p>
      <p>Tables III and IV do not contain the same set of metrics because the ones with low correlation
coefficients were omitted. However, there are some metrics with higher correlation coefficients. Metrics LOC
(Total Lines of code), SLOC P (Physical executable source lines of code) and SLOC L (Logical source
lines of code) achieve the greatest positive correlation coefficients in both datasets. The correlation of
these metrics is intuitive because the greater amount of code is introduced, the greater is the
possibility for a defect to occur. It is also important to mention the cyclomatic complexity metric which
is used to indicate the complexity of a program [McCabe 1976]. The correlation coefficients of metrics
MVG (McCabe VG Complexity), AVCC (Average Cyclomatic Complexity of all the methods in the class)
and MAXCC (Maximum Cyclomatic Complexity of any method in the class) fall into the category of
positive ”moderate correlation”. This shows that more independent paths through an algorithm, i.e.
higher cyclomatic complexity, increase the possibility of defect occurrence. Furthermore, MVG exhibits
the highest correlation coefficient in JDT project when a file is compared to its oldest version.</p>
      <p>For a better representation of the data, Figures 2 and 3 display the computed results for the SLOC L
metric. The figures clearly show a higher Spearman’s correlation coefficient for the relationship between the
absolute difference value and the number of defect occurrences, with a coefficient value around 0.48, than
for the relative difference value. Spearman’s correlation coefficient between the relative difference and
defect occurrence is around 0.20, which is very low. This is a strong indicator that absolute differences
are more relevant for measuring the aforementioned relationship with defect occurrence.</p>
      <p>From the ”JDT” dataset the following metrics achieved the greatest correlation coefficients in the
single previous version: HCLOC, MVG, LOC, SLOC P, SLOC L. Similarly in the oldest version,
metrics MVG, LOC, SLOC P, SLOC L and C SLOC achieved the greatest correlation coefficients. Metrics
with the greatest correlation coefficient in the ”PDE” dataset from the single previous version include
UWCS, LMC, BLOC, MVG and No Methods metrics. From the oldest version they include BLOC, RFC,
LMC, No Methods and UWCS metrics.</p>
    </sec>
    <sec id="sec-9">
      <title>5.1 Threats to Validity</title>
      <p>The validity of this small scale case study is affected by the choice of data. This is a first preliminary
study, so it is based only on a sample of data from five consecutive releases of two open source projects
from Eclipse community, namely, JDT and PDE. Thus, external validity is the main threat to this study
because it cannot discover general conclusions. Nevertheless, the obtained results are a motivation to
put more effort in this research direction. Projects from different backgrounds, such as different
communities or development methodologies, or written in different programming languages, need to be included to
obtain more general conclusions. As industrial data are difficult to obtain, the construct validity is also
threatened. That is why the chosen projects are complex and long lasting ones that may approximate
industrial projects. The statistical analyses are threats to internal validity, but they are all based on
well known and widely used tests. The conclusions about the level of correlation and the importance
of software metrics for defect prediction lack a precise explanation, hence threatening the conclusion
validity. The causality of the conclusions is unknown and it remains open to speculations.</p>
    </sec>
    <sec id="sec-10">
      <title>CONCLUSION</title>
      <p>When developing a project it is very important to pay attention to the appearance of defects in
program code. Since the design is the phase in which the requirements are introduced into the existing
system architecture, it is particularly important to pay attention to the occurrence of malfunctions that
may arise due to necessary changes. The goal of this project was to establish the correlation between
changes in design metrics and the appearance of defects.</p>
      <p>This paper provides a methodology for estimating the relationship between the quantity of change
and the severity of defect occurrence. The methodology was used in a small scale case study which
used 5 subsequent releases of ”JDT” and ”PDE” Eclipse open source community projects as datasets.</p>
      <p>The algorithms that implement the proposed methodology have been applied to all software metrics
in the datasets. The generated output data are used as the input to algorithms that compute metric
correlations and graphically display the obtained relationships. The results were afterwards grouped
according to Spearman’s correlation coefficient, as described in section 3.1. The analysis showed a very
weak or weak correlation between changes in metric value and defect occurrence in the design stage for
most of the metrics. Furthermore, there was no strong correlation found between any of the design
metrics and the defect occurrence. However, these results were somewhat expected. The aim of this
project was to find design metrics that could bring additional information in the prediction of defects.
For example, SLOC P, SLOC L and HEFF metrics exhibit a moderate correlation between the changes
in the value of a particular metric and defect occurrence. The moderate level of correlation is an
indication enough that the proposed metric calculation may improve the categorization of severity of defect
occurrence and possibly the software defect prediction. The fact that the values of metrics themselves,
which are traditionally already used for software defect prediction, have similar values of correlation,
encourages us to continue this research. When these metrics are known it is possible to pay more
attention to them when introducing new requirements, or re-designing the existing software modules,
and thus reduce the possibility of defect occurrence.</p>
      <p>To summarize, this research came to the following conclusions:
(1) Metrics like HCLOC, MI, DIT, MINC, S R, FOUT, NSUP, NSUB, INTR and MOD exhibit very poor
correlation in both projects and may have lower impact on the level of defect occurrence.
(2) Metrics SLOC L and SLOC P achieve the highest correlation coefficients in both input datasets.</p>
      <p>Hence, we can conclude that they have the greatest impact on defect occurrence.
(3) Spearman’s correlation coefficients between the relative difference and defect occurrence are very
low. Thus, we can conclude that relative differences are not good indicators of design change.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Victor R. Basili</surname>
          </string-name>
          ,
          <string-name>
            <surname>Lionel C. Briand</surname>
            , and Walcélio
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Melo</surname>
          </string-name>
          .
          <year>1996</year>
          .
          <article-title>A Validation of Object-Oriented Design Metrics As Quality Indicators</article-title>
          .
          <source>IEEE Trans. Softw. Eng</source>
          .
          <volume>22</volume>
          ,
          <issue>10</issue>
          (Oct.
          <year>1996</year>
          ),
          <fpage>751</fpage>
          -
          <lpage>761</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Ralph D'Agostino</surname>
            and
            <given-names>Egon S.</given-names>
          </string-name>
          <string-name>
            <surname>Pearson</surname>
          </string-name>
          .
          <year>1973</year>
          .
          <article-title>Tests for Departure from Normality. Empirical Results for the Distributions of b2 and √b1</article-title>
          .
          <source>Biometrika 60</source>
          ,
          <issue>3</issue>
          (
          <year>1973</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>James D.</given-names>
            <surname>Evans</surname>
          </string-name>
          .
          <year>1996</year>
          .
          <article-title>Straightforward Statistics for the Behavioral Sciences</article-title>
          . Brooks/Cole Publishing Company.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Norman E.</given-names>
            <surname>Fenton</surname>
          </string-name>
          and
          <string-name>
            <given-names>Niclas</given-names>
            <surname>Ohlsson</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Quantitative Analysis of Faults and Failures in a Complex Software System</article-title>
          .
          <source>IEEE Trans. Softw. Eng</source>
          .
          <volume>26</volume>
          ,
          <issue>8</issue>
          (Aug.
          <year>2000</year>
          ),
          <fpage>797</fpage>
          -
          <lpage>814</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Emanuel</given-names>
            <surname>Giger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Pinzger</surname>
          </string-name>
          , and
          <string-name>
            <surname>Harald</surname>
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Gall</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Comparing Fine-grained Source Code Changes and Code Churn for Bug Prediction</article-title>
          .
          <source>In Proceedings of the 8th Working Conference on Mining Software Repositories (MSR '11)</source>
          . ACM, New York, NY, USA,
          <fpage>83</fpage>
          -
          <lpage>92</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Rikard</given-names>
            <surname>Land</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Software Deterioration And Maintainability A Model Proposal</article-title>
          . In
          <source>In Proceedings of Second Conference on Software Engineering Research and Practise in Sweden (SERPS)</source>
          ,
          <source>Blekinge Institute of Technology Research Report</source>
          <year>2002</year>
          :
          <fpage>10</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Ann</given-names>
            <surname>Lehman</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>JMP for Basic Univariate and Multivariate Statistics: A Step-by-step Guide</article-title>
          . SAS Institute.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Goran</given-names>
            <surname>Mauša</surname>
          </string-name>
          and
          <string-name>
            <given-names>Tihana</given-names>
            <surname>Galinac Grbac</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Co-evolutionary multi-population genetic programming for classification in software defect prediction: An empirical case study</article-title>
          .
          <source>Appl. Soft Comput</source>
          .
          <volume>55</volume>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Goran</given-names>
            <surname>Mauša</surname>
          </string-name>
          , Tihana Galinac Grbac, and Bojana Dalbelo Bašić.
          <year>2016</year>
          .
          <article-title>A systematic data collection procedure for software defect prediction</article-title>
          .
          <source>Comput. Sci. Inf. Syst.</source>
          <volume>13</volume>
          ,
          <issue>1</issue>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Thomas J.</given-names>
            <surname>McCabe</surname>
          </string-name>
          .
          <year>1976</year>
          .
          <article-title>A Complexity Measure</article-title>
          .
          <source>IEEE Transactions on Software Engineering</source>
          <volume>SE-2</volume>
          ,
          <issue>4</issue>
          (Dec.
          <year>1976</year>
          ),
          <fpage>308</fpage>
          -
          <lpage>320</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>John C.</given-names>
            <surname>Munson</surname>
          </string-name>
          and
          <string-name>
            <given-names>Sebastian G.</given-names>
            <surname>Elbaum</surname>
          </string-name>
          .
          <year>1998</year>
          .
          <article-title>Code Churn: A Measure for Estimating the Impact of Code Change</article-title>
          .
          <source>In Proceedings of the International Conference on Software Maintenance (ICSM '98)</source>
          . IEEE Computer Society, Washington, DC, USA.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Jerome L.</given-names>
            <surname>Myers</surname>
          </string-name>
          and
          <string-name>
            <given-names>Arnold D.</given-names>
            <surname>Well</surname>
          </string-name>
          .
          <year>1991</year>
          .
          <article-title>Research Design and Statistical Analysis</article-title>
          .
          <source>HarperCollins.</source>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Shahab</given-names>
            <surname>Nadir</surname>
          </string-name>
          , Detlef Streitferdt, and
          <string-name>
            <given-names>Christina</given-names>
            <surname>Burggraf</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Industrial Software Developments Effort Estimation Model</article-title>
          .
          <source>In 2016 International Conference on Computational Science and Computational Intelligence (CSCI)</source>
          .
          <fpage>1248</fpage>
          -
          <lpage>1252</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Nachiappan</given-names>
            <surname>Nagappan</surname>
          </string-name>
          and
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Ball</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Use of Relative Code Churn Measures to Predict System Defect Density</article-title>
          .
          <source>In Proceedings of the 27th International Conference on Software Engineering (ICSE '05)</source>
          . ACM, New York, NY, USA,
          <fpage>284</fpage>
          -
          <lpage>292</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Per</given-names>
            <surname>Runeson</surname>
          </string-name>
          , Tihana Galinac Grbac, and Darko Huljenić.
          <year>2013</year>
          .
          <article-title>A Second Replicated Quantitative Analysis of Fault Distributions in Complex Software Systems</article-title>
          .
          <source>IEEE Transactions on Software Engineering</source>
          <volume>39</volume>
          (
          <year>2013</year>
          ),
          <fpage>462</fpage>
          -
          <lpage>476</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Feng</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Ahmed E. Hassan,
          <string-name>
            <given-names>Shane</given-names>
            <surname>McIntosh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ying</given-names>
            <surname>Zou</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>The Use of Summation to Aggregate Software Metrics Hinders the Performance of Defect Prediction Models</article-title>
          .
          <source>IEEE Trans. Softw. Eng</source>
          .
          <volume>43</volume>
          ,
          <issue>5</issue>
          (May
          <year>2017</year>
          ),
          <fpage>476</fpage>
          -
          <lpage>491</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>