<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Way Ahead for Bug-fix Time Prediction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Meera Sharma</string-name>
          <email>meerakaushik@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Madhu Kumari</string-name>
          <email>mesra.madhu@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>V.B.Singh</string-name>
          <email>vbsinghdcacdu@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Delhi College of Arts &amp; Commerce, University of Delhi</institution>
          ,
          <addr-line>Delhi</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, University of Delhi</institution>
          ,
          <addr-line>Delhi</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <fpage>31</fpage>
      <lpage>38</lpage>
      <abstract>
        <p>The bug-fix time, i.e. the time taken to fix a bug after it was introduced, is an important factor in bug-related analysis, such as measuring software quality or coordinating development effort during bug triaging. Previous work has proposed many bug-fix time prediction models that use various bug attributes (number of developers who participated in fixing the bug, bug severity, bug-opener's reputation, number of patches) to predict the fix time of a newly reported bug. In this paper, we investigate the associations between bug attributes and the bug-fix time. We propose two approaches to applying association rule mining. In the first approach, we use the Apriori algorithm to predict the fix time of a newly reported bug from the bug's severity, priority, summary terms and assignee. In the second approach, we use the k-means clustering method to obtain groups of correlated variables and then mine association rules inside each cluster. We collected 1,695 bug reports for three products of the Mozilla open source project, namely AddOnSDK, Thunderbird and Bugzilla, to mine association rules. The results show that, for a given set of bug attributes, we can predict the bug-fix time of newly reported bugs, which will help in software quality improvement. A large number of association rules with high confidence and support that have higher severity and priority as antecedents and a short bug-fix time as consequent show that more important bugs are fixed without delay. This information is useful in determining software quality. We also observe that our approach to bug-fix time prediction will be helpful in bug triaging, by assigning a bug to the most capable and experienced assignee, who will fix the bug in the minimum time. This will again help in software quality improvement. In a nutshell, association rule mining based bug-fix time prediction can help managers to improve the software development process.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>
        Association rule mining can discover interesting correlations, frequent
patterns, associations or causal structures among the attributes
of a database. Let C be a database of transactions, where each
transaction T is a set of items. An association rule is an
expression A⇒D, where A is called the antecedent and D the
consequent. The rule A⇒D states that whenever a transaction T
contains A, T also contains D with a specified confidence and
support. The confidence of a rule is the fraction (or percentage)
of the transactions containing A that also contain A∪D. It is a
measure of the rule’s strength or certainty [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The support of a rule is the fraction (or percentage) of all
transactions in the database that contain A∪D. It corresponds to
the statistical significance or usefulness of the rule. The minimum
support count is the number of transactions an item set must
appear in to satisfy the minimum support. Association rule
mining generates all association rules in the database whose support
is greater than the minimum support min.Supp(A⇒D), i.e., the
rules are frequent. The rules must also have confidence greater
than the minimum confidence min.Conf(A⇒D), i.e., the rules
are strong.
      </p>
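      <p>The definitions of support and confidence given above can be illustrated with a minimal Python sketch. The toy database C and the item labels (such as severity=major) are hypothetical and serve only to show how the two measures are computed for a rule A⇒D.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def support(transactions, itemset):
    """Fraction of all transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Fraction of the transactions containing A that also contain A union D."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    n_a = sum(1 for t in transactions if a <= t)
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / n_a if n_a else 0.0


# Hypothetical toy database C with four transactions (illustrative labels only).
C = [
    frozenset({"severity=major", "priority=P1", "fix=0-19"}),
    frozenset({"severity=major", "priority=P1", "fix=0-19"}),
    frozenset({"severity=major", "priority=P3", "fix=20-64"}),
    frozenset({"severity=minor", "priority=P1", "fix=0-19"}),
]
A = {"severity=major", "priority=P1"}  # antecedent
D = {"fix=0-19"}                       # consequent

print(support(C, A | D))    # 0.5: two of the four transactions contain A and D
print(confidence(C, A, D))  # 1.0: both transactions containing A also contain D
```
]]></preformat>
      <p>In this toy example the rule A⇒D would be reported as frequent and strong for any minimum support up to 50% and any minimum confidence up to 100%.</p>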
      <p>Association rule mining has been applied successfully in a wide
range of science and business areas. Several performance
studies have reported better accuracy for associative
classification than for state-of-the-art classification methods
[9-18].</p>
    </sec>
    <sec id="sec-2">
      <p>Clustering is a partitioning method in which a group of
data points is partitioned into a small number of clusters. The
k-means clustering algorithm partitions the
data into k mutually exclusive clusters and returns the index
of the cluster to which it has assigned each observation.</p>
    </sec>
    <sec id="sec-3">
      <p>Unlike hierarchical clustering, k-means clustering operates on
actual observations (rather than the larger set of dissimilarity
measures) and creates a single level of clusters. These
distinctions mean that k-means clustering is often more
suitable than hierarchical clustering for large amounts of data.</p>
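      <p>As a rough illustration of the k-means procedure described above, the following Python sketch implements the standard Lloyd iteration and returns the cluster index of each observation. Seeding the centroids with the first k points and the six sample points are assumptions made here to keep the example deterministic; they are not taken from the study.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids).

    Assumption: the first k points are used as initial centroids.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical two-dimensional observations forming two obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
labels, centroids = k_means(points, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```
]]></preformat>
      <p>Each observation receives exactly one cluster index, which is what makes the clusters mutually exclusive and single-level.</p>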
    </sec>
    <sec id="sec-4">
      <p>The successful use of association rule mining in various fields motivates us to apply it to open source software bug data sets [9-18].</p>
    </sec>
    <sec id="sec-5">
      <p>The rest of the paper is organized as follows. Section 2 describes the data sets and their preprocessing. Section 3 describes the model building. Section 4 presents the results. Section 5 discusses related work. Section 6 covers the threats to validity, and Section 7 concludes the paper with future research directions.</p>
    </sec>
    <sec id="sec-8">
      <title>II. DATA SETS DESCRIPTION AND DATA PREPROCESSING</title>
      <p>We collected bug reports from the Bugzilla bug tracking
system with status “verified”, “resolved” or “closed” and
resolution “fixed”, because only these types of bug reports
contain consistent information for the experiment. We
compared and validated the collected bug reports against
general change data (i.e. CVS or SVN records). The number of
bug reports collected in the observed period is given in Table I.</p>
      <p>In order to apply association rule mining, we
quantified the bug attributes severity, priority,
summary, assignee and fix time.</p>
    </sec>
    <sec id="sec-9">
      <p>
        We preprocessed the bug summary attribute to extract terms in the RapidMiner tool [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] using the following steps:
      </p>
      <p>Tokenization: the process of breaking a stream of text into
words, phrases, symbols, or other meaningful elements called
tokens. We have considered a word or
a term as a token.</p>
      <p>Stop Word Removal: words that are commonly used in
the text but do not carry useful meaning, like prepositions,
conjunctions, articles, verbs, nouns, pronouns, adverbs and
adjectives, are called stop words. We have removed all the stop
words from the bug summary.</p>
      <p>
        Stemming to base stem: the process of converting derived
words to their base word (stem) is known as stemming.
The standard Porter stemming algorithm can be used for
stemming [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
      <p>Feature Reduction: tokens with a minimum of 3 and a maximum of
40 occurrences have been considered, because most data
mining algorithms may not be able to handle large feature sets.</p>
      <p>Weight by Information Gain or InfoGain: this helps in
determining the importance or relevance of a term, and so
in selecting the top few terms in the data set.</p>
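      <p>The information gain weighting mentioned in the last step can be sketched as follows. This is a generic textbook computation, not RapidMiner's implementation; using the bug-fix time range as the class label, and the four sample reports, are assumptions made for illustration.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting reports on presence of `term`.

    docs:   list of sets of terms, one set per bug report
    labels: class label of each report (assumed here: its fix-time range)
    """
    with_term = [lab for d, lab in zip(docs, labels) if term in d]
    without_term = [lab for d, lab in zip(docs, labels) if term not in d]
    total = len(labels)
    remainder = sum(
        len(part) / total * entropy(part)
        for part in (with_term, without_term)
        if part
    )
    return entropy(labels) - remainder


# Hypothetical stemmed summaries and fix-time classes.
docs = [{"test", "fail"}, {"test", "crash"}, {"doc", "updat"}, {"doc", "page"}]
labels = ["0-19", "0-19", "20-64", "20-64"]

vocabulary = sorted({t for d in docs for t in d})
ranked = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
print(ranked[:2])  # the two terms that separate the classes perfectly
```
]]></preformat>
      <p>Terms such as test and doc, whose presence determines the class completely in this toy data, receive the maximum gain of one bit, while a term that occurs in a single report receives less.</p>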
    </sec>
    <sec id="sec-10">
      <p>We built a workflow in RapidMiner to extract a set
of terms from the bug summary attribute. We set the tokenize
mode to non-letters, and in the filter tokens parameter we set
the min chars value to 3 and the max chars value to 50. We used
an English dictionary to filter the stop words.</p>
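      <p>The preprocessing workflow can be approximated outside RapidMiner with a short Python sketch. The stop-word list, the suffix-stripping function (a crude stand-in for the Porter stemmer) and the sample summaries are all hypothetical, and counting occurrences over the whole corpus is an assumption about how the occurrence limits are applied.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real run would use a full English dictionary.
STOP_WORDS = {"the", "and", "when", "with", "for", "are", "was"}


def tokenize(summary, min_chars=3, max_chars=50):
    """Split on non-letters and keep tokens within the length limits."""
    tokens = re.split(r"[^A-Za-z]+", summary.lower())
    return [t for t in tokens if min_chars <= len(t) <= max_chars]


def crude_stem(token):
    """Very rough suffix stripping; NOT the Porter algorithm, only a stand-in."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def extract_terms(summaries, min_occurrences=3, max_occurrences=40):
    """Tokenize, drop stop words, stem, then keep terms within the occurrence limits."""
    per_report = [
        [crude_stem(t) for t in tokenize(s) if t not in STOP_WORDS]
        for s in summaries
    ]
    counts = Counter(t for terms in per_report for t in terms)
    keep = {t for t, n in counts.items() if min_occurrences <= n <= max_occurrences}
    return [[t for t in terms if t in keep] for terms in per_report]


# Hypothetical bug summaries.
summaries = [
    "Tests failing on the toolbar",
    "Toolbar test fails when editing",
    "Test failed in toolbar window",
]
print(extract_terms(summaries))
```
]]></preformat>
      <p>With these three summaries only the stems test, fail and toolbar reach three occurrences, so the rarer stems edit and window are filtered out.</p>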
    </sec>
    <sec id="sec-11">
      <title>III. MODEL BUILDING</title>
    </sec>
    <sec id="sec-12">
      <p>Our study consists of the following steps:</p>
      <sec id="sec-12-1">
        <title>1. Data Extraction</title>
        <p>a. From the CVS repository https://bugzilla.mozilla.org/, we downloaded bug reports for 3 products of the Mozilla open source project.</p>
        <p>b. We stored the downloaded bug reports in an Excel file for further processing.</p>
      </sec>
      <sec id="sec-12-2">
        <title>2. Data Pre-processing</title>
        <p>a. In RapidMiner, we developed a workflow to extract individual terms of the bug summary.</p>
      </sec>
      <sec id="sec-12-3">
        <title>3. Data Preparation</title>
        <p>a. For the different severity and priority levels, we took numeric values from 1 to 7 and from 8 to 12, respectively.</p>
        <p>b. We assigned a numeric value from 13 to 43 to the top 30 terms based on InfoGain.</p>
        <p>c. For each assignee, we took a unique numeric value.</p>
        <p>d. We filtered the bug-fix time to 0 to 99 days, as the maximum number of bugs have a fix time in this range. We defined three bug-fix time ranges (0 to 19 days, 20 to 64 days and 65 to 99 days) and assigned a numeric value from 1 to 3 to these three ranges.</p>
      </sec>
      <sec id="sec-12-4">
        <title>4. Association Rule Mining and Clustering</title>
        <p>
          a. ARMADA (Association Rule Miner And Deduction Analysis) is a data mining tool for MATLAB that extracts association rules from numerical data files using a variety of selectable techniques and criteria [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. We applied the Apriori algorithm using the ARMADA tool. As a result, we get association rules for bug-fix time prediction with severity, priority, summary terms and assignee as antecedents.
        </p>
        <p>b. We applied the k-means clustering algorithm in SPSS (Statistical Package for Social Sciences), followed by the Apriori algorithm on each resulting cluster using MATLAB, with minimum confidence 20% and minimum support 7%.</p>
      </sec>
      <sec id="sec-12-5">
        <title>5. Testing and Validation</title>
        <p>We assessed the resulting association rules in terms of two performance measures, support and confidence.</p>
      </sec>
    </sec>
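    <sec>
      <p>The data preparation step can be sketched in Python as below. The code ranges follow the description above (1 to 7 for severity, 8 to 12 for priority, 13 upward for terms, and 1 to 3 for the fix-time ranges), but the order of levels inside each range, the starting code 44 for assignees and the sample bug are assumptions made for illustration.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Assumed level order; the study states only the numeric ranges 1-7 and 8-12.
SEVERITIES = ["blocker", "critical", "major", "normal", "minor", "trivial", "enhancement"]
PRIORITIES = ["P1", "P2", "P3", "P4", "P5"]
SEVERITY_CODE = {name: i for i, name in enumerate(SEVERITIES, start=1)}  # 1..7
PRIORITY_CODE = {name: i for i, name in enumerate(PRIORITIES, start=8)}  # 8..12


def fix_time_class(days):
    """Map a fix time in days to class 1, 2 or 3; None if outside 0-99 days."""
    if 0 <= days <= 19:
        return 1
    if 20 <= days <= 64:
        return 2
    if 65 <= days <= 99:
        return 3
    return None


def build_codebook(values, start):
    """Assign consecutive integer codes to distinct values in first-seen order."""
    codes = {}
    for value in values:
        if value not in codes:
            codes[value] = start + len(codes)
    return codes


def encode_bug(bug, term_code, assignee_code):
    """Return (numeric antecedent items, fix-time class), or None if filtered out."""
    cls = fix_time_class(bug["fix_days"])
    if cls is None:
        return None
    items = [SEVERITY_CODE[bug["severity"]], PRIORITY_CODE[bug["priority"]]]
    items += sorted(term_code[t] for t in bug["terms"] if t in term_code)
    items.append(assignee_code[bug["assignee"]])
    return items, cls


# Hypothetical top terms (already ranked by InfoGain) and assignees.
term_code = build_codebook(["test", "fail", "doc"], start=13)
assignee_code = build_codebook(["Alexandre Poirot", "Will Bamberg"], start=44)
bug = {"severity": "major", "priority": "P1", "terms": ["fail", "test", "crash"],
       "assignee": "Alexandre Poirot", "fix_days": 12}
print(encode_bug(bug, term_code, assignee_code))  # ([3, 8, 13, 14, 44], 1)
```
]]></preformat>
      <p>The fix-time class is kept separate from the antecedent items in this sketch so that its codes 1 to 3 cannot be confused with the severity codes.</p>
    </sec>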
    <sec id="sec-21">
      <title>IV. RESULTS AND DISCUSSION</title>
      <p>In this paper, we propose two approaches to applying
association rule mining. In the first approach, we mined
association rules for bug-fix time prediction with bug severity,
priority, summary terms and assignee as antecedents by
applying the Apriori algorithm of the ARMADA tool in MATLAB.
We considered association rules with minimum
confidence 20% and minimum support 7% for the AddOnSDK
and Bugzilla products. The Thunderbird product has far
fewer bug reports, so for it we considered
association rules with minimum confidence 20% and minimum support
3%. All three datasets yield more than 100 rules. For this
reason we do not list them all, but instead present the top five
rules by confidence. Table II presents the top five
association rules of the AddOnSDK product for the
three defined ranges.</p>
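      <p>The mining step of the first approach can be approximated by the following compact Apriori sketch in Python. It is a stand-in for illustration and does not reproduce the ARMADA tool or its interface; the item labels, the five sample bugs and the thresholds in the example call are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise search; returns {frozenset: support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def fix_time_rules(transactions, min_support, min_confidence, label_prefix="fix="):
    """Rules whose consequent is a single fix-time item, sorted by confidence."""
    frequent = apriori(transactions, min_support)
    rules = []
    for itemset, supp in frequent.items():
        labels = [i for i in itemset if i.startswith(label_prefix)]
        if len(labels) != 1 or len(itemset) < 2:
            continue
        antecedent = itemset - set(labels)
        conf = supp / frequent[antecedent]  # subsets of frequent sets are frequent
        if conf >= min_confidence:
            rules.append((sorted(antecedent), labels[0], supp, conf))
    return sorted(rules, key=lambda r: (-r[3], -r[2]))


# Hypothetical encoded bug reports.
bugs = [
    {"sev=major", "pri=P1", "term=test", "fix=0-19"},
    {"sev=major", "pri=P1", "term=test", "fix=0-19"},
    {"sev=major", "pri=P1", "term=doc", "fix=20-64"},
    {"sev=major", "pri=P3", "term=doc", "fix=20-64"},
    {"sev=minor", "pri=P3", "term=doc", "fix=65-99"},
]
rules = fix_time_rules(bugs, min_support=0.2, min_confidence=0.5)
for antecedent, consequent, supp, conf in rules[:5]:
    print(antecedent, "=>", consequent, f"@ ({supp:.0%}, {conf:.0%})")
```
]]></preformat>
      <p>Sorting by confidence and keeping the first five entries mirrors how the top rules are selected for the tables in this section.</p>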
      <p>TABLE II. TOP FIVE ASSOCIATION RULES FOR ADDONSDK</p>
      <p>Term {text}
⇒ Bug-fix time {65-99 days} @ (9%, 29%)</p>
      <p>Severity {Major} ᴧ Term {con} ᴧ Term {text}
⇒ Bug-fix time {65-99 days} @ (7%, 27%)</p>
      <p>Term {con} ᴧ Term {text}
⇒ Bug-fix time {65-99 days} @ (7%, 25%)</p>
      <p>Priority {P1} ᴧ Term {tab}
⇒ Bug-fix time {65-99 days} @ (8%, 23%)</p>
      <p>The first association rule is a six-antecedent rule, which
reveals that a bug with priority P1, assignee Alexandre Poirot
and summary containing terms con, test, content and fail can
have a fix time of 0 to 19 days with a significance (support) of 10
percent and a certainty (confidence) of 100 percent. The second association rule
means that a bug with severity Major, priority P1, and
summary containing terms con, test, content and fail can have
a fix time of 0 to 19 days with a significance of 8 percent and
a certainty of 100 percent. Third rule shows that a bug with
severity Major, priority P1 and summary containing terms
con, content and fail can have a fix time of 0 to 19 days with a
significance of 7 percent and a certainty of 100 percent. Rule
four reveals that 11 percent of the bugs in the bug data set
have priority P1, assignee Alexandre Poirot, summary
containing terms con, content, fail and bug-fix time of 0 to 19
days. 100 percent of the bugs in the bug data set that have
priority P1, assignee Alexandre Poirot, summary containing
terms con, content, fail also have bug-fix time of 0-19 days.</p>
    </sec>
    <sec id="sec-22">
      <p>The fifth rule shows that a bug with severity Major,
priority P1 and summary containing terms con, content and
fail can have a bug-fix time of 0 to 19 days with a significance
of 9 percent and a certainty of 100 percent. We interpret the
association rules for the other bug-fix time ranges similarly.</p>
    </sec>
    <sec id="sec-23">
      <p>The top five association rules for predicting bug-fix time for the Thunderbird product are shown in Table III.</p>
      <p>Assignee {David} ᴧ Term {messag}
⇒ Bug-fix time {20-64 days} @ (3%, 75%)</p>
      <p>The first association rule is a four antecedent rule, which
reveals that a bug with severity Major, and summary
containing terms add, icon and address can have a fix time of
0 to 19 days with a significance of 3 percent and a certainty of
100 percent. Second association rule means that a bug with
severity Major, priority P3, and summary containing terms
text and box can have a fix time of 0 to 19 days with a
significance of 3 percent and a certainty of 100 percent. Third
rule shows that a bug with severity Major, priority P3,
summary containing the term window and assignee Andreas
Nilsson can have a fix time of 0 to 19 days with a
significance of 3 percent and a certainty of 100 percent. Rule
four reveals that 3 percent of the bugs in the bug data set have
summary containing terms tool, toolbar, assignee Blake
Winton and bug-fix time of 0 to 19 days. 100 percent of the
bugs in the bug data set that have summary containing terms
tool, toolbar and assignee Blake Winton also have bug-fix
time of 0-19 days. The fifth rule shows that the bug with
summary containing terms config, auto and assignee Blake
Winton can have bug-fix time of 0 to 19 days with a
significance of 3 percent and a certainty of 100 percent.</p>
    </sec>
    <sec id="sec-24">
      <p>We interpret the association rules for the other bug-fix time ranges similarly.</p>
    </sec>
    <sec id="sec-25">
      <p>The top five association rules for predicting bug-fix time for the Bugzilla product are shown in Table IV.</p>
      <p>Priority {P3} ᴧ Term {edit}
⇒ Bug-fix time {20-64 days} @ (10%, 67%)
Severity {Major} ᴧ Term {temp} ᴧ Term {templat}
⇒ Bug-fix time {20-64 days} @ (8%, 62%)
Priority {P3} ᴧ Term {user}
⇒ Bug-fix time {20-64 days} @ (8%, 57%)
Severity {Major} ᴧ Term {temp}
⇒ Bug-fix time {20-64 days} @ (8%, 57%)</p>
      <p>Bug-fix time 65-99 days
Assignee {Gervase Markham} ᴧ Term{temp} ᴧ Term{templat}
⇒ Bug-fix time {65-99 days} @ (7%, 39%)
Assignee {Gervase Markham} ᴧ Term{cgi}
⇒ Bug-fix time {65-99 days} @ (7%, 39%)
Assignee {Matthew Barnson}
⇒ Bug-fix time {65-99 days} @ (10%, 38%)
Assignee {Max Kanat-Alexander} ᴧ Term{ing}
⇒ Bug-fix time {65-99 days} @ (9%, 31%)
Assignee {Dawn Endico}
⇒ Bug-fix time {65-99 days} @ (7%, 30%)</p>
      <p>The first association rule is a six-antecedent rule, which
reveals that a bug with severity Major, priority P1 and
summary containing terms check, set, setup and checksetup
can have a fix time of 0 to 19 days with a significance of 11
percent and a certainty of 100 percent. Second association rule
means that a bug with priority P1, and summary containing
terms check, set, setup and checksetup can have a fix time of 0
to 19 days with a significance of 7 percent and a certainty of
100 percent. Third rule shows that a bug with assignee Daniel
Buchner and summary containing terms bug, hang and chang
can have a fix time of 0 to 19 days with a significance of 7
percent and a certainty of 100 percent. Rule four reveals that
7 percent of the bugs in the bug data set have priority P3,
summary containing terms bug, ing, bugzilla and bug-fix time
of 0 to 19 days. 100 percent of the bugs in the bug data set
that have priority P3 and summary containing terms bug, ing
and Bugzilla also have bug-fix time of 0-19 days. The fifth
rule shows that a bug with priority P3, assignee Daniel
Buchner and summary containing terms hang and chang can
have bug-fix time of 0 to 19 days with a significance of 7
percent and a certainty of 100 percent. Similarly we have
interpreted association rules of other bug-fix time ranges.</p>
    </sec>
    <sec id="sec-26">
      <p>To analyze the rule length (number of antecedents) of the association rules, we plot the distribution of association rules across all the datasets (Fig. 1 to 3).</p>
      <p>Figures 1 to 3 show that rules with two antecedents
(length 2) are the most numerous across all the datasets.</p>
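      <p>A rule-length distribution of this kind can be computed directly from rules written in the notation used in this section. The parser below is a sketch that assumes every rule has the form "antecedents ⇒ Bug-fix time {range} @ (support%, confidence%)"; the four sample rules are the AddOnSDK rules listed for the 65-99 days range.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re
from collections import Counter

RULE_PATTERN = re.compile(
    r"^(?P<ante>.+?)\s*⇒\s*Bug-fix time \{(?P<cons>[^}]+)\}"
    r"\s*@\s*\((?P<supp>\d+)%,\s*(?P<conf>\d+)%\)$"
)


def parse_rule(text):
    """Split a rule string into (antecedents, fix-time range, support %, confidence %)."""
    match = RULE_PATTERN.match(" ".join(text.split()))  # normalise whitespace first
    if match is None:
        raise ValueError(f"unrecognised rule: {text!r}")
    antecedents = [a.strip() for a in match.group("ante").split("ᴧ")]
    return antecedents, match.group("cons"), int(match.group("supp")), int(match.group("conf"))


def length_distribution(rule_texts):
    """Count rules by their number of antecedents."""
    return Counter(len(parse_rule(r)[0]) for r in rule_texts)


rules = [
    "Term {text} ⇒ Bug-fix time {65-99 days} @ (9%, 29%)",
    "Severity {Major} ᴧ Term {con} ᴧ Term {text} ⇒ Bug-fix time {65-99 days} @ (7%, 27%)",
    "Term {con} ᴧ Term {text} ⇒ Bug-fix time {65-99 days} @ (7%, 25%)",
    "Priority {P1} ᴧ Term {tab} ⇒ Bug-fix time {65-99 days} @ (8%, 23%)",
]
for length, count in sorted(length_distribution(rules).items()):
    print(length, count)
```
]]></preformat>
      <p>For these four rules the distribution is one rule of length 1, two of length 2 and one of length 3.</p>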
      <p>We observe that in all products there are some rules with the
same antecedents and consequent except for the assignee. These rules
reveal that, for different assignees, we have the same bug-fix time
for the same values of the other attributes. In this case we prefer
the assignee whose rule has the higher confidence value,
as that assignee is likely to be more capable and experienced in
fixing this type of bug. In this way the proposed approach
will help in bug triaging, which will help in software quality
improvement.</p>
    </sec>
    <sec id="sec-27">
      <p>We observed the following rules for the AddOnSDK product.</p>
      <p>1. Severity {Major} ᴧ Term {test} ᴧ Assignee {Alexandre
Poirot}
⇒ Bug-fix time {0-19 days} @ (16%, 89%)
2. Severity {Major} ᴧ Term {test} ᴧ Assignee {Dave
Townsend}
⇒ Bug-fix time {0-19 days} @ (12%, 71%)
3. Severity {Major} ᴧ Term {test} ᴧ Assignee {Erik Vold}
⇒ Bug-fix time {0-19 days} @ (8%, 50%)
4. Severity {Major} ᴧ Priority {P1} ᴧ Term {con} ᴧ
Assignee {Will Bamberg}
⇒ Bug-fix time {20-64 days} @ (11%, 65%)
5. Severity {Major} ᴧ Priority {P1} ᴧ Term {con} ᴧ
Assignee {Alexandre Poirot}
⇒ Bug-fix time {20-64 days} @ (9%, 35%)</p>
    </sec>
    <sec id="sec-28">
      <p>The first three rules reveal that bugs with severity Major and a summary containing the term test have three choices of assignee,
i.e. Alexandre Poirot, Dave Townsend or Erik Vold, to get
fixed in 0 to 19 days, with a certainty of 89, 71 and 50 percent
respectively.</p>
    </sec>
    <sec id="sec-29">
      <p>We observe that the bug should be assigned to Alexandre Poirot, as the rule with this assignee gives the highest
certainty. Similarly, we can infer from the last two rules that we
should assign the bug to Will Bamberg, as the rule with this
assignee gives the higher certainty. Similar inferences can be drawn
for the other two datasets.</p>
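      <p>The selection of an assignee by confidence can be written as a small grouping step. The sketch below uses the five AddOnSDK rules listed above, represented as tuples; the tuple layout and the function name are hypothetical and are not part of the original study.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict

# (antecedents other than the assignee, assignee, fix-time range, support %, confidence %)
addon_rules = [
    (("Severity {Major}", "Term {test}"), "Alexandre Poirot", "0-19 days", 16, 89),
    (("Severity {Major}", "Term {test}"), "Dave Townsend", "0-19 days", 12, 71),
    (("Severity {Major}", "Term {test}"), "Erik Vold", "0-19 days", 8, 50),
    (("Severity {Major}", "Priority {P1}", "Term {con}"), "Will Bamberg", "20-64 days", 11, 65),
    (("Severity {Major}", "Priority {P1}", "Term {con}"), "Alexandre Poirot", "20-64 days", 9, 35),
]


def best_assignee_by_confidence(rules):
    """For rules that differ only in assignee, pick the assignee with highest confidence."""
    groups = defaultdict(list)
    for attrs, assignee, fix_range, _supp, conf in rules:
        groups[(attrs, fix_range)].append((conf, assignee))
    return {key: max(candidates)[1] for key, candidates in groups.items()}


for (attrs, fix_range), assignee in best_assignee_by_confidence(addon_rules).items():
    print(" ᴧ ".join(attrs), "in", fix_range, "->", assignee)
```
]]></preformat>
      <p>For these rules the sketch selects Alexandre Poirot for the first group and Will Bamberg for the second, matching the inference drawn in the text.</p>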
      <p>We also observe that in all products there are some rules with the
same antecedents except for the assignee but with different consequents. These rules reveal that
different assignees fix bugs with the same attributes in
different bug-fix times. In this case, we prefer the assignee
with the lower fix time for this type of bug. In this way the
proposed approach will help in choosing the assignee who can
fix the bug in the shortest time.</p>
      <p>We observed the following rules for the Bugzilla product.
1. Severity {Major} ᴧ Assignee {Terry Weissman}
⇒ Bug-fix time {0-19 days} @ (67%, 80%)
2. Severity {Major} ᴧ Assignee {Bradley Baetz}
⇒ Bug-fix time {20-64 days} @ (7%, 44%)
3. Severity {Major} ᴧ Assignee {Max Kanat-Alexander}
⇒ Bug-fix time {65-99 days} @ (8%, 22%)
4. Priority{P1} ᴧ Assignee {Dave Miller}
⇒ Bug-fix time {0-19 days} @ (7%, 78%)
5. Priority{P1} ᴧ Assignee {Max Kanat-Alexander}
⇒ Bug-fix time {20-64 days} @ (11%, 42%)</p>
    </sec>
    <sec id="sec-30">
      <p>The first three rules reveal that bugs with severity Major can be assigned to three different assignees: Terry Weissman,
Bradley Baetz and Max Kanat-Alexander. The three
assignees fix bugs with severity Major in
different fix time ranges. We would preferably assign the bug to
the assignee who fixes it in the minimum time, i.e. Terry Weissman.</p>
    </sec>
    <sec id="sec-31">
      <p>Similarly, we can infer from the last two rules that we should assign the bug to Dave Miller, as he will fix the bug earliest. Similar inferences can be drawn for the other two datasets.</p>
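      <p>When rules with the same antecedents point to different fix-time ranges, the choice of assignee can be sketched as below, using the five Bugzilla rules listed above. Breaking ties between equal ranges by the higher confidence is an assumption added here, and the tuple layout is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Shorter ranges rank first.
FIX_RANGE_ORDER = {"0-19 days": 0, "20-64 days": 1, "65-99 days": 2}

# (antecedents other than the assignee, assignee, fix-time range, support %, confidence %)
bugzilla_rules = [
    (("Severity {Major}",), "Terry Weissman", "0-19 days", 67, 80),
    (("Severity {Major}",), "Bradley Baetz", "20-64 days", 7, 44),
    (("Severity {Major}",), "Max Kanat-Alexander", "65-99 days", 8, 22),
    (("Priority {P1}",), "Dave Miller", "0-19 days", 7, 78),
    (("Priority {P1}",), "Max Kanat-Alexander", "20-64 days", 11, 42),
]


def fastest_assignee(rules):
    """For each antecedent set, pick the assignee whose rule has the shortest fix-time range.

    Assumption: ties on the range are broken by the higher confidence.
    """
    best = {}
    for attrs, assignee, fix_range, _supp, conf in rules:
        key = (FIX_RANGE_ORDER[fix_range], -conf)
        if attrs not in best or key < best[attrs][0]:
            best[attrs] = (key, assignee)
    return {attrs: assignee for attrs, (_key, assignee) in best.items()}


print(fastest_assignee(bugzilla_rules))
```
]]></preformat>
      <p>The sketch returns Terry Weissman for bugs with severity Major and Dave Miller for bugs with priority P1, in line with the discussion above.</p>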
    </sec>
    <sec id="sec-32">
      <p>In the second approach, we present clustering-based
association rule mining for bug-fix time prediction. We
partitioned the AddOnSDK dataset into 5 clusters using the
k-means clustering method. Cluster 1 contains only one data point,
cluster 2 contains 93, cluster 3 contains 379, cluster 4 contains 115
and cluster 5 contains 28.</p>
    </sec>
    <sec id="sec-34">
      <p>After partitioning, we applied the Apriori algorithm to each cluster
with minimum confidence 20% and minimum support 2%.</p>
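      <p>The second approach amounts to splitting the transactions by cluster index and evaluating rules inside each cluster. The following Python sketch shows that split and the minimum support count implied by a 2% threshold for the cluster sizes reported above. Rounding the count up to the next whole transaction is an assumption, and the toy transactions and labels are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import defaultdict


def split_by_cluster(transactions, labels):
    """Group transactions by the cluster index assigned to each of them."""
    clusters = defaultdict(list)
    for transaction, label in zip(transactions, labels):
        clusters[label].append(frozenset(transaction))
    return dict(clusters)


def rule_stats(transactions, antecedent, consequent):
    """Support and confidence of antecedent => consequent within one set of transactions."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    n_a = sum(1 for t in transactions if a <= t)
    n_both = sum(1 for t in transactions if both <= t)
    support = n_both / len(transactions)
    confidence = n_both / n_a if n_a else 0.0
    return support, confidence


def min_support_count(cluster_size, min_support):
    """Smallest number of transactions that satisfies the minimum support (rounded up)."""
    return math.ceil(cluster_size * min_support)


# Hypothetical transactions with cluster indices from a previous k-means run.
transactions = [
    {"pri=P1", "term=test", "fix=0-19"},
    {"pri=P1", "term=test", "fix=0-19"},
    {"term=doc", "fix=20-64"},
    {"term=doc", "fix=20-64"},
]
labels = [0, 0, 1, 1]
clusters = split_by_cluster(transactions, labels)

print(rule_stats([frozenset(t) for t in transactions], {"term=test"}, {"fix=0-19"}))
print(rule_stats(clusters[0], {"term=test"}, {"fix=0-19"}))

# Cluster sizes 93, 379, 115 and 28 as reported for AddOnSDK, at 2% minimum support.
print([min_support_count(n, 0.02) for n in (93, 379, 115, 28)])  # [2, 8, 3, 1]
```
]]></preformat>
      <p>Under the rounding assumption, a 2% threshold corresponds to 2, 8, 3 and 1 bug reports in clusters 2 to 5 respectively, so rules mined in the smaller clusters rest on very few reports.</p>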
      <p>Table V presents the top five association rules from the five
clusters formed by k-means clustering for the AddOnSDK
product.</p>
      <p>Association Rules (minimum support=2%, minimum
confidence=20%)
Bug-fix time 0-19 days</p>
      <p>Cluster 2
Term {con} ᴧ Term {test} ᴧ Term{fail}
⇒ Bug-fix time {0-19 days} @ (5%, 100%)
Priority {P1} ᴧ Term {con} ᴧ Term {test}
⇒ Bug-fix time {0-19 days} @ (5%, 100%)</p>
      <p>Assignee {Alexandre Poirot} ᴧ Term {test} ᴧ Term{fail}
⇒ Bug-fix time {0-19 days} @ (5%, 100%)
Priority{P1} ᴧ Assignee { Alexandre Poirot } ᴧ Term {test}
⇒ Bug-fix time {0-19 days} @ (5%, 100%)
Priority {P1} ᴧ Term{con} ᴧ Term {test} ᴧ Term {fail}
⇒ Bug-fix time {0-19 days} @ (5%, 100%)</p>
      <p>Cluster 3
Priority {P1} ᴧ Term {fire} ᴧ Term {test} ᴧ Term{firefox}
⇒ Bug-fix time {0-19 days} @ (7%, 100%)
Priority {P1} ᴧ Assignee {Alexandre Poirot} ᴧ Term {fail} ᴧ
Term {test}
⇒ Bug-fix time {0-19 days} @ (7%, 100%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {test} ᴧ
Term{firefox}
⇒ Bug-fix time {0-19 days} @ (7%, 100%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {test} ᴧ Term {fire}
⇒ Bug-fix time {0-19 days} @ (7%, 100%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {test} ᴧ Term {fire}
ᴧ Term{firefox}
⇒ Bug-fix time {0-19 days} @ (7%, 100%)</p>
      <p>Cluster 4
Severity {Major} ᴧ Priority{P2} ᴧ Term {cfx}
⇒ Bug-fix time {0-19 days} @ (2%, 100%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {get}
⇒ Bug-fix time {0-19 days} @ (2%, 100%)
Severity {Major} ᴧ Priority{P2} ᴧ Term {get}
⇒ Bug-fix time {0-19 days} @ (2%, 100%)
Severity {Major} ᴧ Priority{P2} ᴧ Assignee {Alexandre
Poirot} ᴧ Term {get}
⇒ Bug-fix time {0-19 days} @ (2%, 100%)
Severity {Major} ᴧ Priority{P3} ᴧ Term {fail}
⇒ Bug-fix time {0-19 days} @ (2%, 100%)</p>
      <p>Cluster 5
Severity {Major} ᴧ Assignee {Alexandre Poirot} ᴧ Term
{con} ᴧ Term {content}
⇒ Bug-fix time {0-19 days} @ (5%, 83%)
Severity {Major} ᴧ Term {con} ᴧ Term {content}
⇒ Bug-fix time {0-19 days} @ (5%, 71%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {fail}
⇒ Bug-fix time {0-19 days} @ (6%, 67%)
Priority{P1} ᴧ Term {fail} ᴧ Term {win} ᴧ Term {window}
⇒ Bug-fix time {0-19 days} @ (5%, 63%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {fail} ᴧ Term {test}
⇒ Bug-fix time {0-19 days} @ (5%, 63%)</p>
      <p>Bug-fix time 20-64 days</p>
      <p>Cluster 2
Severity {Major} ᴧ Priority {P4} ᴧ Assignee {Will Bamberg}
ᴧ Term {con} ᴧ Term {doc}
⇒ Bug-fix time {20-64 days} @ (5%, 100%)
Severity {Major} ᴧ Priority {P3} ᴧ Assignee {Will Bamberg}
ᴧ Term {updat} ᴧ Term {doc}
⇒ Bug-fix time {20-64 days} @ (5%, 100%)
Severity {Major} ᴧ Priority {P1} ᴧ Assignee {Will Bamberg}
ᴧ Term {document} ᴧ Term {doc}
⇒ Bug-fix time {20-64 days} @ (6%, 100%)
Severity {Major} ᴧ Assignee {Will Bamberg} ᴧ Term {con}
ᴧ Term {doc}
⇒ Bug-fix time {20-64 days} @ (5%, 100%)
Priority {P3} ᴧ Assignee {Will Bamberg} ᴧ Term {con} ᴧ
Term {doc}
⇒ Bug-fix time {20-64 days} @ (5%, 100%)</p>
      <p>Cluster 3
Severity {Major} ᴧ Priority {P1} ᴧ Assignee {Will Bamberg}
ᴧ Term {doc} ᴧ Term {document}
⇒ Bug-fix time {20-64 days} @ (8%, 62%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {page}
⇒ Bug-fix time {20-64 days} @ (9%, 60%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {tab}
⇒ Bug-fix time {20-64 days} @ (10%, 59%)
Severity {Major} ᴧ Priority {P2} ᴧ Term {mod}
⇒ Bug-fix time {20-64 days} @ (7%, 54%)
Assignee {Will Bamberg} ᴧ Term {document}
⇒ Bug-fix time {20-64 days} @ (16%, 53%)</p>
      <p>Cluster 4
Severity {Major} ᴧ Priority{P1} ᴧ Assignee {Will Bamberg}
ᴧ Term {doc}
⇒ Bug-fix time {20-64 days} @ (2%, 100%)
Severity {Major} ᴧ Assignee {Will Bamberg} ᴧ Term {doc}
⇒ Bug-fix time {20-64 days} @ (3%, 100%)
Severity {Major} ᴧ Priority {P1} ᴧ Term {doc}
⇒ Bug-fix time {20-64 days} @ (3%, 100%)
Priority {P1} ᴧ Assignee {Will Bamberg} ᴧ Term {doc}
⇒ Bug-fix time {20-64 days} @ (2%, 100%)
Severity {Major} ᴧ Assignee {Will Bamberg} ᴧ Term {updat}
⇒ Bug-fix time {20-64 days} @ (2%, 100%)</p>
      <p>Cluster 5
Severity {Major} ᴧ Term {win} ᴧ Term {window} ᴧ Term
{updat} ᴧ Term {private}
⇒ Bug-fix time {20-64 days} @ (5%, 100%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {window} ᴧ Term
{updat} ᴧ Term {private}
⇒ Bug-fix time {20-64 days} @ (5%, 100%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {win} ᴧ Term
{updat} ᴧ Term {private}
⇒ Bug-fix time {20-64 days} @ (5%, 100%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {mod} ᴧ Term
{modul} ᴧ Term {private}
⇒ Bug-fix time {20-64 days} @ (5%, 100%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {win} ᴧ Term
{window} ᴧ Term {updat} ᴧ Term {private}
⇒ Bug-fix time {20-64 days} @ (5%, 100%)</p>
      <p>Bug-fix time 65-99 days</p>
      <p>Cluster 2
Severity {Major} ᴧ Term {tab }
⇒ Bug-fix time {65-99 days} @ (6%, 35%)
Term {tab}
⇒ Bug-fix time {65-99 days} @ (6%, 33%)
Severity {Major} ᴧ Term {window}
⇒ Bug-fix time {65-99 days} @ (5%, 25%)
Severity {Major} ᴧ Term {win} ᴧ Term {window}
⇒ Bug-fix time {65-99 days} @ (5%, 25%)
Term {window}
⇒ Bug-fix time {65-99 days} @ (5%, 24%)</p>
      <p>Cluster 3
Priority{P1} ᴧ Term {modul}
⇒ Bug-fix time {65-99 days} @ (7%, 25%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {modul}
⇒ Bug-fix time {65-99 days} @ (7%, 27%)
Priority{P1} ᴧ Term {mod} ᴧ Term {modul}
⇒ Bug-fix time {65-99 days} @ (7%, 25%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {mod}
⇒ Bug-fix time {65-99 days} @ (7%, 21%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {mod} ᴧ Term
{modul}
⇒ Bug-fix time {65-99 days} @ (7%, 27%)</p>
      <p>Cluster 4
Severity {Enhancement} ᴧ Priority{P3}
⇒ Bug-fix time {65-99 days} @ (2%, 67%)
Severity {Major} ᴧ Term {text}
⇒ Bug-fix time {65-99 days} @ (2%, 67%)
Severity {Major} ᴧ Term {sdk}
⇒ Bug-fix time {65-99 days} @ (2%, 40%)
Severity {Major} ᴧ Priority{P1} ᴧ Term {text}
⇒ Bug-fix time {65-99 days} @ (2%, 67%)
Priority{P1} ᴧ Term {text}
⇒ Bug-fix time {65-99 days} @ (2%, 67%)</p>
      <p>Cluster 5
Severity {Major} ᴧ Priority{P1} ᴧ Assignee {Dave Townsend} ᴧ Term {con} ᴧ Term {add} ᴧ Term {text}
⇒ Bug-fix time {65-99 days} @ (2%, 100%)
Severity {Major} ᴧ Priority{P1} ᴧ Assignee { Dave Townsend
} ᴧ Term {con} ᴧ Term {test} ᴧ Term {text}
⇒ Bug-fix time {65-99 days} @ (2%, 100%)
Severity {Major} ᴧ Priority{P1} ᴧ Assignee { Dave Townsend
} ᴧ Term {con} ᴧ Term {test} ᴧ Term {add}
⇒ Bug-fix time {65-99 days} @ (2%, 100%)
Priority{P1} ᴧ Term {test} ᴧ Term {add} ᴧ Term {fail} ᴧ
Term {error} ᴧ Term {addon}
⇒ Bug-fix time {65-99 days} @ (2%, 67%)
Severity {Major} ᴧ Priority{P1} ᴧ Assignee { Dave Townsend
} ᴧ Term {con} ᴧ Term {test} ᴧ Term {add} ᴧ Term {text}
⇒ Bug-fix time {65-99 days} @ (2%, 100%)</p>
    </sec>
    <sec id="sec-35">
      <p>We observe that applying association rule mining after
clustering yields different association rules. Because we
partition the datasets into clusters, we get association rules
at a lower minimum support, i.e. 2%. The results also show that
the confidence lies in the range of 21% to 100%.</p>
    </sec>
    <sec id="sec-36">
      <p>We get similar results for the other datasets.</p>
    </sec>
    <sec id="sec-37">
      <title>V. RELATED WORK</title>
    </sec>
    <sec id="sec-38">
      <p>
        In the last few years, a number of valuable studies have been
conducted to address the problem of bug-fix time prediction.
A study of 72,482 bug reports from nine versions of Ubuntu,
a Linux distribution, was conducted in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The results
show that groups of 1 to 8 people fixed 95% of the bug reports.
      </p>
    </sec>
    <sec id="sec-39">
      <p>
        The study found a 92% linear relationship between the number of people participating
in fixing a bug report and the bug-fix time. The applied linear
regression model resulted in R<sup>2</sup> up to 0.98. An attempt has
been made on 512,474 bug reports of five open source projects
(Eclipse, Chrome and three products of the Mozilla project:
Thunderbird, Firefox and Seamonkey) to test the prediction
performance of existing models by using multivariate and
univariate regression [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. It was found that existing
models have predictive power between 30% and 49% and that
more independent attributes can be included. No correlation
was found between bug-fix likelihood, bug-opener’s
reputation and the time it takes to fix a bug.
      </p>
    </sec>
    <sec id="sec-41">
      <p>
        A model has been proposed for six projects (Eclipse JDT, Eclipse Platform,
Mozilla Core, Mozilla Firefox, Gnome GStreamer and Gnome
Evolution) to predict how promptly a new bug report will
receive attention [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Results show an improvement in bug-fix
time prediction accuracy if number of developers and number
of comments are included. An attempt has been made to show
the bug-fix time trends in Mozilla and Apache projects [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
It was found that on average resolution time for bugs of
priority levels 4 and 5 exceeds 100 days, bugs of the priority
level 2 are resolved in 80 days or less and bugs of the priority
level 1 or 3 are resolved in 30 days or less. An attempt has
been made to focus on the delays incurred by developers
during bug fixing [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. A study has been conducted to filter
out the data sets by identifying the potential outliers in the
distribution of the fix-time attribute. Results showed that
filtering these outliers can improve the accuracy of the
prediction models [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ].
      </p>
      <p>
        An attempt has been made to present an application of
association rule mining to predict software defect associations
and defect correction effort with SEL defect data [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]. The
results show that for the defect association prediction, the
minimum accuracy is 95.38 percent, and the false negative
rate is just 2.84 percent; and for the defect correction effort
prediction, the accuracy is 93.80 percent for defect isolation
effort prediction and 94.69 percent for defect correction effort
prediction. Recently, a study discussed the application of
association mining in bug triaging. Authors have used Apriori
algorithm to predict the right developer to work on the bug by
taking bug’s severity, priority and summary terms as the
antecedents [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]
      </p>
    </sec>
    <sec id="sec-42">
      <p>To the best of our knowledge, no approach has so far been proposed
to mine association rules among different bug
attributes to predict bug-fix time. Managers can use
association rules to improve the development process by
predicting the bug-fix time for a given set of bug attributes.</p>
    </sec>
    <sec id="sec-43">
      <p>
        Several performance studies have reported better accuracy
for associative classification than for state-of-the-art classification
methods [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref14 ref15 ref16 ref17 ref18 ref9">9-18</xref>
        ]. Our work has been motivated by the
successful application of association rule mining in various
fields.
      </p>
    </sec>
    <sec id="sec-44">
      <title>VI. THREATS TO VALIDITY</title>
      <p>The factors that can affect the validity of our study are as follows:</p>
      <p>Construct Validity: We have not empirically validated the independent attributes used in our study.</p>
      <p>Internal Validity: Besides the four attributes used in our study, namely severity,
priority, summary terms and assignee, the
developer’s reputation could also be considered, as it is an
important attribute that can contribute to bug-fix time
prediction.</p>
      <p>External Validity: We have considered only open source
Mozilla products. The study can be extended to other open source and closed source software.</p>
      <p>Reliability: RapidMiner, SPSS and MATLAB have
been used in this paper for model building and testing. The
increasing use of these tools supports the reliability of the
experiments. Errors in the performance measures, such as accuracy,
reported by these tools have not been considered or handled.</p>
    </sec>
    <sec id="sec-51">
      <title>VII. CONCLUSION</title>
      <p>Bug-fix time is the time taken to fix a bug after the bug was introduced.
It is an important factor for bug related analysis,
such as measuring software quality or coordinating
development effort during bug triaging. Prior work has
proposed many bug-fix time prediction models based on
various bug attributes (number of developers who participated
in fixing the bug, bug severity, bug-opener’s reputation,
number of patches) for predicting the fix time of a newly
reported bug. Several studies have been conducted by using
classification and regression models. We have proposed an
approach for bug-fix time prediction based on other bug
attributes, namely summary terms, priority, severity and
assignee, in two ways: by applying the Apriori algorithm directly, and by
first using k-means clustering to obtain groups of correlated variables
and then mining association rules inside each cluster. We
have validated our results on 1,695 bug reports of the
AddOnSDK, Thunderbird and Bugzilla products of the Mozilla
open source project. We have presented the top five association
rules for 20% minimum confidence and 3% and 7% minimum
support. We observe that applying association mining after
clustering yields different association rules. Because the
datasets are partitioned into clusters, the resulting association rules
have a lower support, i.e. 2%. The results show that the
confidence of the rules lies in the range of 21% to 100%.</p>
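      <p>As an illustration of the support and confidence thresholds discussed in this section, the following minimal Python sketch computes both measures for rules whose consequent is a fix-time class. It is a sketch under stated assumptions, not the implementation used in this study: the toy records, the item labels and the function names are hypothetical, and candidate rules are enumerated by brute force rather than with the level-wise pruning of the Apriori algorithm.</p>
      <preformat>
```python
from itertools import combinations

# Hypothetical toy bug records (severity, priority, fix-time class).
# These are illustrative only and are not taken from the Mozilla datasets.
BUGS = [
    {"sev=critical", "pri=P1", "fix=short"},
    {"sev=critical", "pri=P1", "fix=short"},
    {"sev=critical", "pri=P2", "fix=short"},
    {"sev=normal", "pri=P3", "fix=long"},
    {"sev=normal", "pri=P3", "fix=long"},
    {"sev=normal", "pri=P1", "fix=short"},
    {"sev=minor", "pri=P3", "fix=long"},
    {"sev=minor", "pri=P3", "fix=medium"},
]


def support(itemset, records):
    """Fraction of records that contain every item of the itemset."""
    return sum(1 for r in records if itemset.issubset(r)) / len(records)


def mine_fix_time_rules(records, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples.

    The consequent is always a single fix-time item. Antecedents of one or
    two attributes are enumerated exhaustively (no Apriori pruning).
    """
    items = sorted(set().union(*records))
    attrs = [i for i in items if not i.startswith("fix=")]
    targets = [i for i in items if i.startswith("fix=")]
    rules = []
    for size in (1, 2):
        for ante in combinations(attrs, size):
            ante_set = frozenset(ante)
            ante_sup = support(ante_set, records)
            if ante_sup == 0:
                continue  # antecedent never occurs, confidence undefined
            for target in targets:
                rule_sup = support(ante_set | {target}, records)
                conf = rule_sup / ante_sup
                if rule_sup >= min_support and conf >= min_confidence:
                    rules.append((ante_set, target, rule_sup, conf))
    # Highest confidence first, then highest support.
    return sorted(rules, key=lambda r: (-r[3], -r[2], sorted(r[0])))


if __name__ == "__main__":
    # Thresholds mirror the 7% support and 20% confidence quoted in the text.
    for ante, target, sup, conf in mine_fix_time_rules(BUGS, 0.07, 0.20)[:5]:
        print(sorted(ante), "=>", target,
              f"support={sup:.2f} confidence={conf:.2f}")
```
      </preformat>
      <p>On datasets of realistic size, an Apriori implementation would discard itemsets below the minimum support before any confidence is computed, which is what keeps the search tractable.</p>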
      <p>By using these rules we can predict the bug-fix time for a
newly reported bug. We also observe that our approach for
bug-fix time prediction will be helpful in bug triaging by
assigning a bug to the most suitable and experienced assignee,
who will solve the bug in the minimum time. Prediction of
bug-fix time will help managers in measuring software
quality and in managing the software development process. From the results, we
can observe a number of association rules having high
confidence and support with higher severity and priority as
antecedents and a short bug-fix time as consequent. A large
number of such rules shows that more important bugs are
fixed without delay. This information is useful in
determining software quality during the software evolution
process. Further, for bugs with a long predicted fix time, we
need to pay more attention to the related source files to make
sure that the files remain stable during the fixing process. This
will again help in determining software quality. We will
extend our work with other association mining algorithms to
empirically validate the results.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Whitehead</surname>
          </string-name>
          , “
          <article-title>How long did it take to fix bugs?</article-title>
          ,”
          <source>Int. Workshop Mining Software Repositories</source>
          . New York, NY, USA, ACM, pp.
          <fpage>173</fpage>
          -
          <lpage>174</lpage>
          ,
          <year>2006</year>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Hooimeijer</surname>
          </string-name>
          and
          <string-name>
            <given-names>W.</given-names>
            <surname>Weimer</surname>
          </string-name>
          , “
          <article-title>Modeling bug report quality</article-title>
          ,”
          <source>ASE</source>
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Anbalagan</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Vouk</surname>
          </string-name>
          , “
          <article-title>On predicting the time taken to correct bug reports in open source projects</article-title>
          ,”
          <source>Int. Conf. Software Maintenance</source>
          (Edmonton, AB). IEEE, pp.
          <fpage>523</fpage>
          -
          <lpage>526</lpage>
          , September 20-26,
          <year>2009</year>
          , DOI= http://ieeexplore.ieee.org/10.1109/ICSM.2009.5306337.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Neamtiu</surname>
          </string-name>
          , “
          <article-title>Bug-fix Time Prediction Models: Can We Do Better?</article-title>
          ,”
          <source>8th Working Conf. Mining Software Repositories</source>
          (New York, NY, USA). ACM, pp.
          <fpage>207</fpage>
          -
          <lpage>210</lpage>
          ,
          <year>2012</year>
          , DOI= http://dl.acm.org/10.1145/1985441.1985472.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Giger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pinzger</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Gall</surname>
          </string-name>
          , “
          <article-title>Predicting the fix time of bugs</article-title>
          ,”
          <source>Int. Workshop Recommendation Systems on Software Engineering</source>
          (New York, NY, USA), ACM, pp.
          <fpage>52</fpage>
          -
          <lpage>56</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kumari</surname>
          </string-name>
          and
          <string-name>
            <given-names>V.B.</given-names>
            <surname>Singh</surname>
          </string-name>
          , “
          <article-title>Understanding the Meaning of Bug Attributes and Prediction Models</article-title>
          ,”
          <source>5th IBM Collaborative Academia Research Exchange Workshop, I-CARE</source>
          , Article No. 15, ACM,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Agrawal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Imielinski</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Swami</surname>
          </string-name>
          , “
          <article-title>Mining Association Rules between Sets of Items in Large Databases</article-title>
          ,”
          <source>SIGMOD Conf. Management of Data</source>
          , ACM, May
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shepperd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cartwright</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Mair</surname>
          </string-name>
          , “
          <article-title>Software defect association mining and defect correction effort prediction</article-title>
          ,”
          <source>IEEE Transactions on Software Engineering</source>
          , Vol.
          <volume>32</volume>
          (
          <issue>2</issue>
          ) pp.
          <fpage>69</fpage>
          -
          <lpage>82</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Ali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Manganaris</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Srikant</surname>
          </string-name>
          , “
          <article-title>Partial Classification Using Association Rules</article-title>
          ,”
          <source>Int. Conf. Knowledge Discovery and Data Mining</source>
          , pp.
          <fpage>115</fpage>
          -
          <lpage>118</lpage>
          ,
          <year>1997</year>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>G.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wong</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          , “
          <article-title>CAEP: Classification by Aggregating Emerging Patterns</article-title>
          ,”
          <source>Int. Conf. Discovery Science</source>
          , pp.
          <fpage>30</fpage>
          -
          <lpage>42</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Hsu</surname>
          </string-name>
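          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ma</surname>
          </string-name>
          , “
          <article-title>Integrating Classification and Association Rule Mining</article-title>
          ,”
          <source>Int. Conf. Knowledge Discovery and Data Mining</source>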
          , pp.
          <fpage>80</fpage>
          -
          <lpage>86</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>R.</given-names>
            <surname>She</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.L.</given-names>
            <surname>Gardy</surname>
          </string-name>
          and
          <string-name>
            <given-names>F.L.</given-names>
            <surname>Brinkman</surname>
          </string-name>
          , “
          <article-title>Frequent-Subsequence-Based Prediction of Outer Membrane Proteins</article-title>
          ,”
          <source>ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>K.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.Q.</given-names>
            <surname>Zhou</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.C.</given-names>
            <surname>Liew</surname>
          </string-name>
          , “
          <article-title>Building Hierarchical Classifiers Using Class Proximity</article-title>
          ,”
          <source>Int. Conf. Very Large Data Bases</source>
          , pp.
          <fpage>363</fpage>
          -
          <lpage>374</lpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>K.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhou</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          , “
          <article-title>Growing Decision Tree on Support-Less Association Rules</article-title>
          ,”
          <source>Int. Conf. Knowledge Discovery and Data Mining</source>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Li</surname>
          </string-name>
          , “
          <article-title>Mining Web Logs for Prediction Models in WWW Caching and Prefetching</article-title>
          ,”
          <source>ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>X.</given-names>
            <surname>Yin</surname>
          </string-name>
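          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Han</surname>
          </string-name>
          , “
          <article-title>CPAR: Classification Based on Predictive Association Rules</article-title>
          ,”
          <source>SIAM Int. Conf. Data Mining</source>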
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.T.T.</given-names>
            <surname>Ying</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.G.</given-names>
            <surname>Murphy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ng</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.C.</given-names>
            <surname>Chu-Carroll</surname>
          </string-name>
          , “
          <article-title>Predicting Source Code Changes by Mining Revision History</article-title>
          ,”
          <source>Int. Workshop Mining Software Repositories</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>T.</given-names>
            <surname>Zimmermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Weigerber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Diehl</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Zeller</surname>
          </string-name>
          , “
          <article-title>Mining Version Histories to Guide Software Changes</article-title>
          ,”
          <source>Int. Conf. Software Engineering</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>I.</given-names>
            <surname>Mierswa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wurst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Klinkenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Scholz</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Euler</surname>
          </string-name>
          , “
          <article-title>YALE: Rapid Prototyping for Complex Data Mining Tasks</article-title>
          ,”
          <source>ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining (KDD-06)</source>
          ,
          <year>2006</year>
          (http://www.rapid-i.com).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>M.</given-names>
            <surname>Porter</surname>
          </string-name>
          , “
          <article-title>An algorithm for suffix stripping</article-title>
          ,”
          <source>Program</source>
          , Vol.
          <volume>14</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>130</fpage>
          -
          <lpage>137</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21] “http://in.mathworks.com/.../3016-armada-data-mining-tool-version-1-4”,
          <year>2015</year>
          , URL: http://in.mathworks.com/ [accessed: 2015-07-24].
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mockus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. T.</given-names>
            <surname>Fielding</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Herbsleb</surname>
          </string-name>
          , “
          <article-title>Two case studies of open source software development: Apache and Mozilla</article-title>
          ,”
          <source>ACM Trans. on Software Eng.</source>
          , Vol.
          <volume>11</volume>
          (
          <issue>3</issue>
          )
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Plasse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Niang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Saporta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Villeminot</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Leblond</surname>
          </string-name>
          , “
          <article-title>Combined use of association rules mining and clustering methods to find relevant links between binary rare attributes in a large data set</article-title>
          ,”
          <source>Computational Statistics &amp; Data Analysis</source>
          , Elsevier
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kumari</surname>
          </string-name>
          and
          <string-name>
            <given-names>V.B.</given-names>
            <surname>Singh</surname>
          </string-name>
          , “
          <article-title>Bug Assignee Prediction Using Association Rule Mining</article-title>
          ,”
          <source>ICCSA 2015, Part IV, LNCS 9158</source>
          , pp.
          <fpage>444</fpage>
          -
          <lpage>457</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>F.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Khomh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zou</surname>
          </string-name>
          and
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Hassan</surname>
          </string-name>
          , “
          <article-title>An Empirical Study on Factors Impacting Bug Fixing Time</article-title>
          ,”
          <source>19th Working Conference on Reverse Engineering (WCRE)</source>
          , pp.
          <fpage>225</fpage>
          -
          <lpage>234</lpage>
          , 15-18 Oct
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>W.</given-names>
            <surname>AbdelMoez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kholief</surname>
          </string-name>
          and
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Elsalmy</surname>
          </string-name>
          , “
          <article-title>Improving bug fixtime prediction model by filtering out outliers</article-title>
          ,”
          <source>International Conference on Technological Advances in Electrical, Electronics and Computer Engineering (TAEECE)</source>
          , pp.
          <fpage>359</fpage>
          -
          <lpage>364</lpage>
          , 9-11 May
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>