<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the Non-Generalizability in Bug Prediction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Haidar Osman</string-name>
          <email>osman@inf.unibe.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Software Composition Group University of Bern</institution>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Bug prediction is a technique used to estimate the most bug-prone entities in software systems. Bug prediction approaches vary in many design options, such as dependent variables, independent variables, and machine learning models. Choosing the right combination of design options to build an effective bug predictor is hard. Previous studies do not consider this complexity and draw conclusions based on fewer-than-necessary experiments. We argue that each software project is unique from the perspective of its development process. Consequently, metrics and machine learning models perform differently on different projects, in the context of bug prediction. We confirm our hypothesis empirically by running different bug predictors on different systems. We show there are no universal bug prediction configurations that work on all projects.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>A bug predictor is an intelligent system (model) trained on data derived from software (metrics)
to make a prediction (number of bugs, bug proneness, etc.) about software entities (packages,
classes, files, methods, etc.).</p>
      <p>
        Bug prediction helps developers focus their quality assurance efforts on the parts of the system
that are more likely to contain bugs. Bug prediction takes advantage of the fact that bugs are not
evenly distributed across the system, but rather tend to cluster [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. The distribution of bugs
follows the Pareto principle, i.e., 80% of the bugs are located in 20% of the files [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. An effective
bug predictor locates the highest number of bugs in the least amount of code.
      </p>
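      <p>The Pareto skew is easy to illustrate. The following sketch (with hypothetical bug counts, not data from this study) sorts files by bug count and reports the fraction of files needed to cover 80% of the bugs:</p>

```python
# Illustrative sketch with hypothetical bug counts: what fraction of files
# holds 80% of all bugs when files are ranked by bug count?

def files_for_bug_share(bug_counts, share=0.8):
    """Return the fraction of files that together contain `share` of all bugs."""
    ranked = sorted(bug_counts, reverse=True)  # most bug-prone files first
    target = share * sum(ranked)
    found, used = 0, 0
    for bugs in ranked:
        if found >= target:
            break
        found += bugs
        used += 1
    return used / len(ranked)

# A skewed, Pareto-like distribution: a few files carry most of the bugs.
counts = [40, 25, 15, 5, 3, 3, 2, 2, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
print(files_for_bug_share(counts))  # 0.15: 3 of the 20 files hold 80% of the bugs
```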
      <p>
        Over the last two decades, bug prediction has been a hot research topic in software
engineering, and many approaches have been devised to build effective bug predictors. Researchers have
been probing the problem/solution space, trying to find universal solutions regarding the software
metrics to use [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ][
        <xref ref-type="bibr" rid="ref10">10</xref>
        ][
        <xref ref-type="bibr" rid="ref15">15</xref>
        ][
        <xref ref-type="bibr" rid="ref20">20</xref>
        ][
        <xref ref-type="bibr" rid="ref1">1</xref>
        ][
        <xref ref-type="bibr" rid="ref9">9</xref>
        ][
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and machine learning models to employ [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ][
        <xref ref-type="bibr" rid="ref12">12</xref>
        ][
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][
        <xref ref-type="bibr" rid="ref14">14</xref>
        ][
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] to
predict bugs in software entities.
      </p>
      <p>However, these studies do not consider the complexity of building a bug predictor, a process
that has many design options to choose from:
1. The machine learning model used to make the prediction.
2. The independent variables (i.e., the metrics used to train the model, like source code metrics,
change metrics, etc.).
3. The dependent variable, or the model output (e.g., bug proneness, number of bugs, bug
density).
4. The granularity of prediction (e.g., package, class, binary).
5. The evaluation method (e.g., accuracy measures, percentage of bugs in percentage of software
entities).</p>
      <p>Copyright © by the paper’s authors. Copying permitted for private and academic purposes.
Proceedings of the Seminar Series on Advanced Techniques and Tools for Software Evolution SATToSE 2016
(sattose.org), Bergen, Norway, 11-13 July 2016, published at http://ceur-ws.org</p>
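      <p>The size of this design space is easy to underestimate. A small sketch, with hypothetical choices for each of the five options above, enumerates the cross-product of candidate bug predictor configurations:</p>

```python
from itertools import product

# Hypothetical choices per design option, mirroring the list above.
design_space = {
    "model": ["RF", "KNN", "SVM", "NN", "LR"],
    "metrics": ["source code", "change"],
    "dependent variable": ["classification", "number of bugs"],
    "granularity": ["package", "class", "method"],
    "evaluation": ["confusion matrix", "cost-effectiveness"],
}

# Every combination of choices is a distinct bug predictor configuration.
configurations = list(product(*design_space.values()))
print(len(configurations))   # 5 * 2 * 2 * 3 * 2 = 120 candidate configurations
print(configurations[0])
```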
      <p>Most previous approaches vary only the design option under study and fix all the
others. This affects the generalizability of the findings because every option affects the others and,
consequently, the overall outcome, as shown in Figure 1.</p>
      <p>[Figure 1: The design options of a bug predictor and how each has an impact on the others.
The level of prediction (binary, package/module, class/file, method, line of code), the independent
variables (source code metrics, version history metrics, organizational metrics), the prediction model
(regression, probability, binary, cache), the evaluation method (confusion matrix, statistical
correlation, percentage of bugs, cost-aware evaluation), and the dependent variable (buggy/non-buggy
class, bug proneness, number of bugs, bug density) all interact.]</p>
      <p>
        It has been shown previously that a model trained on data from a specific project does not
perform well on another project, and so-called cross-project defect prediction rarely works
[
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. We take this idea even further and hypothesise that bug prediction findings are inherently
non-generalizable. A bug prediction configuration that works with one system may not work
with another because software systems have different teams, development methods, frameworks,
and architectures. All these factors affect the correlation between different metrics and software
defects.
      </p>
      <p>To confirm our hypothesis, we run an extended empirical study where we try different bug
prediction configurations on different systems. We show that no single configuration generalizes
to all our subject systems and every system has its own “best” bug prediction configuration.</p>
    </sec>
    <sec id="sec-2">
      <title>Experimental Setup</title>
      <sec id="sec-2-1">
        <title>Dataset</title>
        <p>
          We run the experiments on the “bug prediction data set” provided by D’Ambros et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] to
serve as a benchmark for bug prediction studies. This data set contains software metrics at the
class level for five software systems (Eclipse JDT Core, Eclipse PDE UI, Equinox Framework,
Lucene, and Mylyn). Using this data set constrains the level of prediction to the class level.
We compare source code metrics and version history metrics (change metrics) as the independent
variables.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Dependent Variable</title>
        <p>All bug-prediction approaches predict one of the following: (1) the classification of the software
entity (buggy or bug-free), (2) the number of bugs in the software entity, (3) the probability that
a software entity contains bugs (bug proneness), (4) the bug density of a software entity (bugs
per LOC), or (5) the set of software entities that will contain bugs in the near future (e.g., within
a month). In this study, we consider two dependent variables: number of bugs and classification.</p>
      </sec>
      <sec id="sec-2-3">
        <title>Evaluation Method</title>
        <p>
          An effective bug predictor should locate the highest number of bugs in the least amount of code.
Recently, researchers have drawn attention to this principle and proposed evaluation schemes
to measure the cost or effort of using a bug prediction model [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ][
          <xref ref-type="bibr" rid="ref1">1</xref>
          ][
          <xref ref-type="bibr" rid="ref9">9</xref>
          ][
          <xref ref-type="bibr" rid="ref11">11</xref>
          ][
          <xref ref-type="bibr" rid="ref8">8</xref>
          ][
          <xref ref-type="bibr" rid="ref18">18</xref>
          ][
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Cost-aware
evaluation schemes rely on the fact that a bug predictor should produce an ordered list of software
entities, and measure the maximum percentage of predicted faults in the top k% of lines of code
of a system. These schemes take the number of lines of code (LOC) as a proxy for the effort of
unit testing and code reviewing.
        </p>
        <p>
          In this study, we use an evaluation scheme called cost-effectiveness (CE), proposed by Arisholm
et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. CE ranges between −1 and +1. The closer CE gets to +1, the more cost-effective the
bug predictor is. A value of CE around zero indicates that there is no gain in using the bug
predictor. Once CE goes below zero, it means that using the bug predictor costs more than not
using it.
        </p>
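        <p>For intuition, the following is a simplified, area-based sketch of a cost-aware score: it rewards rankings whose cumulative bug-detection curve stays above the random baseline. It is an illustration only, not the exact CE formula of Arisholm et al., which additionally normalizes by the optimal ranking:</p>

```python
# Simplified sketch of a cost-aware score: the area between the model's
# cumulative bug-detection curve and the random baseline (inspecting code at
# random finds bugs in proportion to the LOC inspected). Positive means the
# ranking beats random; around zero means no gain; negative means the ranking
# is worse than random. Not the exact CE of Arisholm et al.

def cost_aware_score(ranked_entities):
    """ranked_entities: list of (loc, bugs) pairs, in the model's ranking order."""
    total_loc = sum(loc for loc, _ in ranked_entities)
    total_bugs = sum(bugs for _, bugs in ranked_entities)
    area = 0.0
    loc_seen = bugs_seen = 0
    for loc, bugs in ranked_entities:
        prev_bugs_frac = bugs_seen / total_bugs
        loc_seen += loc
        bugs_seen += bugs
        # trapezoid between the model curve and the diagonal over this LOC step
        step = loc / total_loc
        model_avg = (prev_bugs_frac + bugs_seen / total_bugs) / 2
        random_avg = (loc_seen - loc / 2) / total_loc
        area += step * (model_avg - random_avg)
    return area

# A good ranking (small, buggy files first) vs. the same ranking reversed.
good = [(100, 8), (200, 2), (700, 0)]
print(cost_aware_score(good))        # positive: better than random
print(cost_aware_score(good[::-1]))  # negative: worse than random
```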
      </sec>
      <sec id="sec-2-4">
        <title>Machine Learning Model</title>
        <p>
          For classification, we use Random Forest (RF), K-Nearest Neighbour (KNN), Support Vector
Machine (SVM), and Neural Networks (NN). To predict the number of bugs (regression), we use
linear regression (LR), SVM, KNN, and NN. We use the Weka data mining tool [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] to build
these prediction models.
        </p>
      </sec>
      <sec id="sec-2-5">
        <title>Procedure</title>
        <p>For every configuration, we randomly split the data set into a training set (70%) and a test set
(30%) in a way that retains the ratio between buggy and non-buggy entities. Then we train the
prediction model on the training set, run it on the test set, and calculate the CE of the bug
predictor. For each configuration, we repeat this process 30 times and take the mean CE.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>First, we compare the different machine learning models. To see whether machine learning models
perform differently, we apply the analysis of variance (ANOVA) and the post-hoc analysis, Tukey’s
HSD (honest significant difference), to the different models in classification and to the different
models in regression.</p>
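      <p>The repeated stratified split described above can be sketched in plain Python (the study itself uses Weka; the scoring function below is a hypothetical placeholder standing in for training a model and computing its CE):</p>

```python
import random
from statistics import mean

def stratified_split(entities, train_frac=0.7, seed=None):
    """70/30 split that retains the buggy/non-buggy ratio.
    entities: list of (features, is_buggy) pairs."""
    rng = random.Random(seed)
    train, test = [], []
    for label in (True, False):
        group = [e for e in entities if e[1] is label]
        rng.shuffle(group)
        cut = round(train_frac * len(group))
        train += group[:cut]
        test += group[cut:]
    return train, test

def run_experiment(entities, train_and_score, repetitions=30):
    """Repeat the split `repetitions` times and average the score (CE in the paper)."""
    scores = []
    for i in range(repetitions):
        train, test = stratified_split(entities, seed=i)
        scores.append(train_and_score(train, test))
    return mean(scores)

# Hypothetical data; the lambda is a placeholder scorer, not a real CE.
data = [((j,), j % 4 == 0) for j in range(100)]
ce = run_experiment(data, lambda train, test: len(test) / len(data))
print(round(ce, 2))  # 0.3: each test set holds 30% of the entities
```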
      <p>The tests were carried out at the 0.95 confidence level. Only when the ANOVA test is statistically
significant do we carry out the post-hoc test. Otherwise, we only report the best-performing
model. Statistically significant results are reported in boldface in Tables 1, 2, and 3. As
can be seen from the results in Table 1, different machine learning models actually
perform differently. Also, there is no dominant model that stands out as the best throughout
the experiments.</p>
      <p>Second, to compare the two different types of metrics, we compare the best-performing model
using source code metrics with the best-performing model using change metrics, using Student’s
t-test at the 95% confidence level. We compare using both classification and regression. Table 2
shows the results of the test, where bold text indicates statistically significant results. It can be
deduced from the results that source code metrics are better than change metrics in some projects
and worse in others. No type of metrics is consistently the best for all projects in the dataset.
(We use Weka’s default configuration values for the models.)</p>
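      <p>The metric comparison relies on Student’s t-test, whose two-sample statistic is straightforward to compute directly. The CE samples below are hypothetical, and 2.101 is the two-sided 5% critical value of the t-distribution with 18 degrees of freedom:</p>

```python
from statistics import mean, stdev
from math import sqrt

def t_statistic(a, b):
    """Two-sample Student's t-statistic, assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))

# Hypothetical CE samples for the best model per metric type (not paper data).
ce_source = [0.31, 0.28, 0.35, 0.30, 0.33, 0.29, 0.32, 0.34, 0.27, 0.31]
ce_change = [0.22, 0.25, 0.20, 0.24, 0.23, 0.21, 0.26, 0.19, 0.24, 0.22]

t = t_statistic(ce_source, ce_change)
# With 18 degrees of freedom, abs(t) above 2.101 is significant at the
# 95% confidence level (two-sided).
print(abs(t) > 2.101)  # True
```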
      <p>Third, we compare the two types of dependent variables (classification vs. regression) by
comparing the best-performing model from each, also using Student’s t-test at the 95% confidence
level. Table 3 shows that the comparisons are in favour of regression for all projects in our dataset,
but with statistical significance only in the case of Equinox. This suggests that treating bug prediction
as a regression problem is more cost-effective than treating it as a classification problem.</p>
      <p>Finally, we compare the configurations with the highest CE for the five projects in the data set. In
Table 4, we report the highest mean CE and the configuration of the bug predictor behind it. The
results show that there is no global configuration of settings that suits all projects.</p>
      <p>To summarize the results of the experiments, we make the following observations:
1. Different machine learning models actually perform differently in predicting bugs, and there
is no dominant model that stands out as the best for all projects.
2. There is no general rule about which metrics are better at predicting bugs.
3. The configuration of the most cost-effective bug predictor varies from one project to another.
4. The cost-effectiveness of bug prediction is different from one system to another.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>Building a software bug predictor is a complex process with many interleaving design choices. In
the bug prediction literature, researchers have overlooked this complexity, suggesting
generalizability where none is warranted. Our results suggest that a universal set of bug prediction
configurations is unlikely to exist. Among our five subject systems, no type of metrics
stands out as the best and no machine learning algorithm prevails for building a
cost-effective bug predictor. This indicates a need for more research revisiting literature findings
while taking bug prediction complexity into account. In the future, we plan to explore different
ways to automatically find the most effective bug prediction configuration for a specific project.
This would enable a bug predictor to adapt to the different characteristics of different software
projects without manual intervention.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E.</given-names>
            <surname>Arisholm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Briand</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. B.</given-names>
            <surname>Johannessen</surname>
          </string-name>
          .
          <article-title>A systematic and comprehensive investigation of methods to build and evaluate fault prediction models</article-title>
          .
          <source>J. Syst. Softw.</source>
          ,
          <volume>83</volume>
          (
          <issue>1</issue>
          ):
          <fpage>2</fpage>
          -
          <lpage>17</lpage>
          , Jan.
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Canfora</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>De Lucia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Di Penta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Oliveto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panichella</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Panichella</surname>
          </string-name>
          .
          <article-title>Multiobjective cross-project defect prediction</article-title>
          .
          <source>In Software Testing, Verification and Validation (ICST)</source>
          ,
          <year>2013</year>
          IEEE Sixth International Conference on, pages
          <fpage>252</fpage>
          -
          <lpage>261</lpage>
          , Mar.
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>D'Ambros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lanza</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Robbes</surname>
          </string-name>
          .
          <article-title>An extensive comparison of bug prediction approaches</article-title>
          .
          <source>In Proceedings of MSR 2010 (7th IEEE Working Conference on Mining Software Repositories)</source>
          , pages
          <fpage>31</fpage>
          -
          <lpage>40</lpage>
          . IEEE CS Press,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K. O.</given-names>
            <surname>Elish</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. O.</given-names>
            <surname>Elish</surname>
          </string-name>
          .
          <article-title>Predicting defect-prone software modules using support vector machines</article-title>
          .
          <source>Journal of Systems and Software</source>
          ,
          <volume>81</volume>
          (
          <issue>5</issue>
          ):
          <fpage>649</fpage>
          -
          <lpage>660</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Giger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>D'Ambros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pinzger</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H. C.</given-names>
            <surname>Gall</surname>
          </string-name>
          .
          <article-title>Method-level bug prediction</article-title>
          .
          <source>In Proceedings of the ACM-IEEE international symposium on Empirical software engineering and measurement</source>
          , pages
          <fpage>171</fpage>
          -
          <lpage>180</lpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cukic</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Singh</surname>
          </string-name>
          .
          <article-title>Robust prediction of fault-proneness by random forests</article-title>
          .
          <source>In Software Reliability Engineering</source>
          ,
          <year>2004</year>
          .
          <source>ISSRE</source>
          <year>2004</year>
          . 15th International Symposium on, pages
          <fpage>417</fpage>
          -
          <lpage>428</lpage>
          . IEEE,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Frank</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Holmes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Pfahringer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Reutemann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I. H.</given-names>
            <surname>Witten</surname>
          </string-name>
          .
          <article-title>The weka data mining software: an update</article-title>
          .
          <source>ACM SIGKDD explorations newsletter</source>
          ,
          <volume>11</volume>
          (
          <issue>1</issue>
          ):
          <fpage>10</fpage>
          -
          <lpage>18</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>H.</given-names>
            <surname>Hata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Mizuno</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Kikuno</surname>
          </string-name>
          .
          <article-title>Bug prediction based on fine-grained module histories</article-title>
          .
          <source>In Proceedings of the 34th International Conference on Software Engineering</source>
          , ICSE '12
          , pages
          <fpage>200</fpage>
          -
          <lpage>210</lpage>
          , Piscataway, NJ, USA,
          <year>2012</year>
          . IEEE Press.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kamei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Matsumoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Monden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-i.</given-names>
            <surname>Matsumoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Adams</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Hassan</surname>
          </string-name>
          .
          <article-title>Revisiting common bug prediction findings using effort-aware models</article-title>
          .
          <source>In Software Maintenance (ICSM)</source>
          ,
          <year>2010</year>
          IEEE International Conference on, pages
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          , Sept.
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zimmermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. J. W.</given-names>
            <surname>Jr.</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Zeller</surname>
          </string-name>
          .
          <article-title>Predicting faults from cached history</article-title>
          .
          <source>In ICSE '07: Proceedings of the 29th international conference on Software Engineering</source>
          , pages
          <fpage>489</fpage>
          -
          <lpage>498</lpage>
          , Washington, DC, USA,
          <year>2007</year>
          . IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>K.</given-names>
            <surname>Kobayashi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Matsuo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Inoue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hayase</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kamimura</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Yoshino</surname>
          </string-name>
          . ImpactScale:
          <article-title>Quantifying change impact to predict faults in large software systems</article-title>
          .
          <source>In Proceedings of the 2011 27th IEEE International Conference on Software Maintenance, ICSM '11</source>
          , pages
          <fpage>43</fpage>
          -
          <lpage>52</lpage>
          , Washington, DC, USA,
          <year>2011</year>
          . IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lessmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Baesens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mues</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Pietsch</surname>
          </string-name>
          .
          <article-title>Benchmarking classification models for software defect prediction: A proposed framework and novel findings</article-title>
          .
          <source>IEEE Trans. Softw. Eng.</source>
          ,
          <volume>34</volume>
          (
          <issue>4</issue>
          ):
          <fpage>485</fpage>
          -
          <lpage>496</lpage>
          ,
          <year>July 2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mende</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Koschke</surname>
          </string-name>
          .
          <article-title>Revisiting the evaluation of defect prediction models</article-title>
          .
          <source>In Proceedings of the 5th International Conference on Predictor Models in Software Engineering, PROMISE '09</source>
          , pages
          <fpage>7:1</fpage>
          -
          <lpage>7:10</lpage>
          , New York, NY, USA,
          <year>2009</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Menzies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Milton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Turhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cukic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jiang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Bener</surname>
          </string-name>
          .
          <article-title>Defect prediction from static code features: Current results, limitations, new approaches</article-title>
          .
          <source>Automated Software Engg.</source>
          ,
          <volume>17</volume>
          (
          <issue>4</issue>
          ):
          <fpage>375</fpage>
          -
          <lpage>407</lpage>
          , Dec.
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R.</given-names>
            <surname>Moser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Pedrycz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Succi</surname>
          </string-name>
          .
          <article-title>A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction</article-title>
          .
          <source>In Proceedings of the 30th International Conference on Software Engineering</source>
          , ICSE '08
          , pages
          <fpage>181</fpage>
          -
          <lpage>190</lpage>
          , New York, NY, USA,
          <year>2008</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ostrand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Weyuker</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Bell</surname>
          </string-name>
          .
          <article-title>Predicting the location and number of faults in large software systems</article-title>
          .
          <source>Software Engineering</source>
          , IEEE Transactions on,
          <volume>31</volume>
          (
          <issue>4</issue>
          ):
          <fpage>340</fpage>
          -
          <lpage>355</lpage>
          , Apr.
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T. J.</given-names>
            <surname>Ostrand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Weyuker</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Bell</surname>
          </string-name>
          .
          <article-title>Where the bugs are</article-title>
          .
          <source>In ACM SIGSOFT Software Engineering Notes</source>
          , volume
          <volume>29</volume>
          , pages
          <fpage>86</fpage>
          -
          <lpage>96</lpage>
          . ACM,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>F.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Posnett</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Devanbu</surname>
          </string-name>
          .
          <article-title>Recalling the “imprecision” of cross-project defect prediction</article-title>
          .
          <source>In the 20th ACM SIGSOFT FSE. ACM</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Sherer</surname>
          </string-name>
          .
          <article-title>Software fault prediction</article-title>
          .
          <source>Journal of Systems and Software</source>
          ,
          <volume>29</volume>
          (
          <issue>2</issue>
          ):
          <fpage>97</fpage>
          -
          <lpage>105</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>E. J.</given-names>
            <surname>Weyuker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. J.</given-names>
            <surname>Ostrand</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Bell</surname>
          </string-name>
          .
          <article-title>Do too many cooks spoil the broth? using the number of developers to enhance defect prediction models</article-title>
          .
          <source>Empirical Softw. Engg.</source>
          ,
          <volume>13</volume>
          (
          <issue>5</issue>
          ):
          <fpage>539</fpage>
          -
          <lpage>559</lpage>
          , Oct.
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>T.</given-names>
            <surname>Zimmermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nagappan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Giger</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Murphy</surname>
          </string-name>
          .
          <article-title>Cross-project defect prediction: A large scale experiment on data vs. domain vs. process</article-title>
          .
          <source>In Proceedings of the the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering</source>
          , ESEC/FSE '09, pages
          <fpage>91</fpage>
          -
          <lpage>100</lpage>
          , New York, NY, USA,
          <year>2009</year>
          . ACM.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>