<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Rework Effort Estimation of Self-admitted Technical Debt</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Solomon Mensah</string-name>
          <email>smensah2-c@my.cityu.edu.hk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jacky Keung</string-name>
          <email>Jacky.Keung@cityu.edu.hk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Franklin Bosu</string-name>
          <email>michael.bosu@wintec.ac.nz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kwabena Ebo Bennin</string-name>
          <email>kebennin2-c@my.cityu.edu.hk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre for Business, Information Technology and Enterprise Wintec</institution>
          ,
          <addr-line>Hamilton</addr-line>
          ,
          <country country="NZ">New Zealand</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, City University of Hong Kong</institution>
          ,
          <addr-line>Hong Kong</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <fpage>72</fpage>
      <lpage>75</lpage>
      <abstract>
<p>Programmers sometimes leave incomplete code, temporary workarounds and buggy code that require rework. This phenomenon in software development is referred to as Self-admitted Technical Debt (SATD). The challenge for software engineering researchers and practitioners is therefore to resolve the SATD problem in order to improve software quality. We performed an exploratory study using a text mining approach to extract SATD from developers' source code comments and implemented an effort metric to compute the rework effort that might be needed to resolve the SATD problem. The results of this study confirm a prior finding that design debt is the most predominant class of SATD. The results also indicate that a significant amount of rework effort, between 13 and 32 commented LOC on average per SATD-prone source file, is required to resolve SATD across the four projects considered. The text mining approach incorporated into the rework effort metric speeds up the extraction and analysis of SATD generated during software projects. It can also aid managerial decisions on whether to handle SATD as part of on-going project development or defer it to the maintenance phase.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>I. INTRODUCTION</title>
      <p>Keywords—Self-admitted Technical Debt; Rework Effort; Text Mining; Source code comments; Source code analysis</p>
      <p>
The increasing pressure to deliver software products to
customers quickly sometimes forces project managers to impose
unrealistic deadlines on their developers. As a result, these
developers intentionally commit incomplete code, buggy code
and temporary fixes in order to meet customers' expectations.
This practice can introduce errors that require rework, and
these intentional errors are acknowledged, or self-admitted,
by the software development team. Potdar
and Shihab [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] describe this phenomenon, in which a weak software
development process results in a series of long-term overheads
in the maintenance phase, as Self-admitted Technical Debt
(SATD). The debt metaphor is gradually becoming a research
focus [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ][
        <xref ref-type="bibr" rid="ref3">3</xref>
        ][
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] with studies aimed at finding solutions for
combating or minimizing developers' coding errors and
shortcuts that produce lower-quality applications [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        Harrington's concept of “cost of poor quality” [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], in relation
to technical debt, refers to the cost involved in resolving
defective products. According to Chatzigeorgiou et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], the
concept of "cost of poor quality" covers not only the cost
of rectifying the gap between optimum and actual products but
also the effort required to resolve defects in delivered
products.
      </p>
      <p>
The challenging question that arises among project
managers prior to the release of a software product is: "Should
we meet our short-term business objective and release the
product as soon as possible, or should we take our time and fix
the code before release?" From either point of view, a loss or
debt in relation to software quality is incurred. It is worth
noting that not all SATD can realistically be repaid. In this
study, the effort involved in resolving these debts is described
as Rework Effort. According to Bhardwaj and Rana [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], rework effort plays a significant role in software
testing and leads to additional cost in software development.
For a released product to be robust and effective in the long
term, the amount of rework effort needed to fix all identified
SATD in the software project must be considered.
      </p>
      <p>
To study this debt metaphor, we extracted
source code comments from four large open-source software
projects and performed an exploratory analysis on the
corpus of code comments with the intention of estimating the
rework effort necessary to fix SATD tasks. Based on a
vocabulary of SATD indicators manually identified by Potdar
and Shihab [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], we developed an automated text mining
approach to assist in the extraction and estimation of the
rework effort for SATD tasks. We classify the SATD tasks into
five classes based on the classification scheme by Maldonado
and Shihab [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] using the algorithm in Section C. The
contribution of this work is twofold: to the best of our
knowledge, this is the first study to use text mining to
identify SATD from source code comments, and the first to
estimate the rework effort of SATD.
      </p>
      <p>The remaining sections of the paper are organized as follows.
Section II highlights the methodological procedure employed.
Section III addresses the results from the empirical analysis of
the study. Finally, Section IV presents the threats to validity
and Section V concludes the study and outlines future
directions.</p>
    </sec>
    <sec id="sec-2">
      <title>II. METHODOLOGY</title>
      <p>The exploratory analysis for this study was performed using
the MATLAB toolkit (version R2014b) and the R Software
(version 3.2.2). These toolkits enabled in the setting up of the
text mining algorithm by constructing regular expressions for
the source code analysis and searching for patterns for SATD
from the open-source projects.</p>
      <sec id="sec-2-1">
        <title>A. Datasets</title>
        <p>
For the purpose of this study, we chose four
well-commented open-source projects made available at
openhub.net. These datasets were first extracted by Potdar and
Shihab [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] for a manual exploratory study of SATD. The four
projects are ArgoUML, Chromium OS, Apache HTTP Server
and the Eclipse Platform. A description of the open-source
projects is presented in Table 1. In each project, the
following metrics were extracted: the total number of Lines of
Code (LOC), the lines of source code comments, the
contributors or developers, and the dates of software release.
        </p>
        <p>
Preprocessing is an important phase in text mining and text
classification. For efficient regular expression matching, we
preprocessed the extracted open-source code comments through
data cleaning, stopword filtering, and term weighting. In the
data cleaning process, we used the text mining approach to
remove punctuation marks such as ~!@,.-#$%^*][|\ from
the corpus of code comments. We also filtered out noise in
the form of blank lines and white spaces within strings from
each project. Frequently occurring stopwords (such as and, this,
the, or, of, am, it, on, at) were removed because they
contribute little to the text mining and classification process.
These words were searched for and removed following an
approach by Fabrizio [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. We assigned term weights to the
SATD code comments across all the project datasets to
determine the frequency at which the SATD indicators
occurred in the source code comments. Term weights were
assigned based on term frequency-inverse document
frequency (tfidf) [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] which is a well-known ranking function in
text mining and information retrieval. The tfidf function is
composed of the product of the term frequency (tf) and the
inverse document frequency (idf). We define these two terms in
(1) and (2) with respect to each project dataset.
        </p>
        <p>tf(t, d) = f_t,d / m_d,  d ∈ D   (1)</p>
        <p>idf(t, D) = log_e(D / N_t)   (2)</p>
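<p>As a minimal sketch of equations (1) and (2), the following Python fragment computes tf, idf and their product over a toy corpus; the comments and stopword list below are invented for illustration and are not the vocabulary of [1].</p>

```python
import math

# Toy corpus of SATD comments (invented for illustration).
comments = [
    "todo this is not quite right but ok for now",
    "hack leave it for next release",
    "fixme strictly speaking this is a design error",
]
stopwords = {"this", "is", "it", "for", "but", "a", "the"}

def tf(term, comment):
    # Equation (1): tf(t, d) = f_t,d / m_d
    tokens = [w for w in comment.split() if w not in stopwords]
    return tokens.count(term) / len(tokens)

def idf(term, docs):
    # Equation (2): idf(t, D) = log_e(D / N_t)
    n_t = sum(1 for d in docs if term in d.split())
    return math.log(len(docs) / n_t) if n_t else 0.0

def tfidf(term, comment, docs):
    # Equation (3): product of tf and idf
    return tf(term, comment) * idf(term, docs)

print(tfidf("todo", comments[0], comments))
```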
      </sec>
      <sec id="sec-2-2">
        <title>C. Proposed Text Mining Technique</title>
        <p>We proposed a text mining technique (Algorithm 1) for
mining SATD tasks using source code comments. This
technique plays a significant role in transforming source code
comments into numeric counts based on the assignment of term
weights for easy modeling and rework effort estimation. The
text mining technique for commented source code is divided
into 5 phases as follows:
Phase I: Preprocessing phase of the project datasets
Phase II: Extraction of code comments containing SATD
Phase III: Categorization of SATD classes
Phase IV: Computation of term weights for SATD tasks
Phase V: Computation of Rework Effort for SATD tasks</p>
        <p>Provision of some notations of the various variable names
used is made available. The algorithm constructed with regular
expressions is supplied with the contributor/developer details
and their respective comments made. Prior to Phase I, we
employed the textscan function to read the separated strings in
each of the code comments into separate vectors for each
system studied. This function also contributed in reading
commented strings with whitespaces.</p>
        <p>In Phase I, punctuation and special characters such as {“
”:\;!/.@[]-?#%^()’ ’} were eliminated from each of the source
code comment and contributor using the punct[ ] function and
result assigned to the variable P (line 1). Stop words such as is,
are, of, the, that, with, a, so, to, by, but, if, it, and, in, what, how
and other related words were removed in line 2 and the
remaining result assigned to SW variable.</p>
        <p>
In Phase II, SATD comments devoid of stop words were
extracted using an implemented extract_satd function
containing the array of SATD indicators [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] in the first for loop
from lines 3 to 5.
        </p>
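<p>The extraction step in Phase II can be sketched with a regular expression over an indicator vocabulary. The extract_satd helper and the indicator subset below are illustrative stand-ins, not the actual implementation or the full vocabulary of [1].</p>

```python
import re

# Illustrative subset of SATD indicators; a hypothetical analogue of
# the extract_satd function used in lines 3-5 of Algorithm 1.
SATD_INDICATORS = ["hack", "fixme", "todo", "workaround", "temporary fix"]

def extract_satd(comments, indicators=SATD_INDICATORS):
    # Match any indicator as a whole word, case-insensitively.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(i) for i in indicators) + r")\b",
        re.IGNORECASE,
    )
    return [c for c in comments if pattern.search(c)]

comments = [
    "TODO: revisit this bound after the next release",
    "computes the checksum of the payload",
    "temporary fix until the scheduler is rewritten",
]
print(extract_satd(comments))  # keeps the first and third comments
```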
        <p>
In Phase III, we made use of a dictionary of indicators,
StD_type, for the various types of SATD tasks as studied by
Maldonado and Shihab [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Thus, with the help of this
dictionary, we can search and extract the various types of SATD
tasks in line 7.
        </p>
        <p>
Using the tfidf values for each case, statistical analysis
was performed on the transformed dataset to draw statistical
inferences. In Phase IV, we made use of tfidf [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] in the second for loop
from lines 9 to 13 to compute the term weights for the
SATD list. In line 10, the total number of terms per
comment within each corpus was computed, and the term
frequency was computed in line 11 as the ratio of the count of
the searched term t to the result of line 10. The inverse
document frequency was computed in line 12, ignoring the
case of terms in the grepl1 function; grepl returns a logical
vector indicating which SATD comments match. The tfidf
values were computed in line 13 for each SATD code comment.
In Phase V, the rework effort (RW) is computed in step 16 and
further explained in equation (4).
tfidf(t, d, D) = tf(t, d) × idf(t, D)   (3)
where
f_t,d = frequency of term (t) in an SATD comment (d)
m_d = number of terms in a given SATD comment
D = total number of SATD comments per source file
N_t = number of SATD comments with a given term (t)
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>1 grepl is a function in the CRAN library of R which returns a particular string when found in the search space.</title>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Algorithm 1 Source Code Comment Text Mining</title>
      <p>Notations:</p>
      <p>P: remove punctuations’ function
SW: remove stop words’ function
Q: total number of commented tasks per project
D: total number of SATD commented tasks per project
SATD: List of SATD comments
class: Class of SATD indicators
tfidf: term frequency inverse document frequency
Input:</p>
      <p>DCS: Dataset of contributors and source code comments
StW[ ]: array of stopwords
punct[ ]: array of punctuation characters
StD: array of SATD indicators
StD_type: array of types of SATD indicators</p>
      <sec id="sec-3-1">
        <title>RsF: rank source files</title>
        <p>Output:</p>
        <p>RW: Rework Effort for SATD tasks
Procedure</p>
        <p>// Remove Punctuation Characters
1: P ← remove_punct("punct[]", DCS)</p>
        <p>// Remove Stop Words
2: SW ← remove_stop.words(P, StW[])</p>
        <p>//Extract SATD comments from corpus
3: for i, i=1,...,Q do
4: SATD[i] ← extract_satd(P[i], StD)
5: end for</p>
        <p>// Categorization of SATD Tasks
6: for l, l=1,…,D do
7: class[l] ← categorize(StD_type[l])
8: end for</p>
        <p>// Compute term weights for SATD list using tfidf
9: for j, j=1,…,D do</p>
        <p>//Computing number of terms(t) per each SATD comments
10: tf_tot[j] ← compute(SATD[j], length)
11: tf[j] ← count(t terms) / tf_tot[j]
12: idf[j]← log(D / sum(grepl(SATD[j], ignore.case)))
13: tfidf[j] ← tf[j] * idf[j]
14: k ← cos(RsF, StD)
15: Sk ← count(StD, file[k])</p>
        <p>//Computation of Rework Effort
16: RW ← compute(LOC[j]/Sk)
17: end for
18: Output RW</p>
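<p>Phases I-IV of Algorithm 1 can be condensed into a short, illustrative Python pipeline. The indicator list, class dictionary and comments below are invented; the actual study used the MATLAB/R implementation described above.</p>

```python
import math
import re

# Condensed, illustrative re-implementation of Phases I-IV of
# Algorithm 1; variable names mirror the pseudocode, the data are made up.
StW = {"is", "the", "of", "and", "it", "for", "this", "a", "to"}
StD = ["hack", "todo", "fixme"]                      # SATD indicators
StD_type = {"hack": "design", "todo": "requirement",
            "fixme": "defect"}                       # class dictionary

def preprocess(comment):                             # Phase I
    comment = re.sub(r"[^\w\s]", " ", comment.lower())
    return [w for w in comment.split() if w not in StW]

def mine(dcs):
    satd = [t for t in map(preprocess, dcs)          # Phase II
            if any(ind in t for ind in StD)]
    classes = [{StD_type[i] for i in StD if i in t}  # Phase III
               for t in satd]
    D = len(satd)
    weights = []                                     # Phase IV: tfidf
    for tokens in satd:
        for ind in StD:
            if ind in tokens:
                tf = tokens.count(ind) / len(tokens)
                n_t = sum(1 for t in satd if ind in t)
                weights.append((ind, tf * math.log(D / n_t)))
    return satd, classes, weights

dcs = ["TODO: handle null input!", "adds two numbers",
       "HACK, bypass the cache for now"]
satd, classes, weights = mine(dcs)
print(len(satd), classes)
```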
      </sec>
      <sec id="sec-3-2">
        <title>D. Rework Effort Estimation Metric for SATD</title>
        <p>
To investigate the extent of rework effort required to
resolve commented LOC prone to SATD, we formulated a
rework effort metric based on a study by Zhao et al. [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. The rework effort (RW) metric is defined as follows:
RW = ( Σ_{j=1}^{n} Σ_{i=1}^{k} LOC(F_ij) ) / S_k   (4)
        </p>
        <p>
          where LOC(Fij) denotes the commented LOC of the ith source
file in the ranked list for the jth SATD indicator. Sk is the number
of SATD indicators contained in the k ranked source files (step
15). n is the total number of SATD indicators. Thus, given any
software project containing n commented LOC in a number of
source files, we first compute the term weights of the source
files, followed by a ranking process [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] and use the cosine
similarity to obtain the k ranked source files. The cosine
similarity finds the close relation between the source files and
SATD indicators [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] with the intention of obtaining k files
prone to SATD (step 14). The k SATD prone files were
obtained based on a cosine similarity threshold of at least 0.7.
In relation to each kth file, we extract the commented LOC that
contains SATD. This is done repeatedly until all the commented
LOC tasks are obtained from the n source files as the numerator
in (4). RW is computed as the ratio of the numerator (LOC(Fij))
and denominator (Sk). We present a sample of the code
comments prone to SATD below.
        </p>
        <p>Examples of SATD comments
* Don’t wait around; just abandon it *
* Leave it for next release *
* Do nothing and bail out *
* Strictly speaking, this is a design error *
* DESIGN ERROR: a mix of repositories *
* TODO: this isn’t quite right but is ok for now *</p>
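<p>Steps 14-16 and equation (4) can be sketched as follows; the file vectors, LOC counts and indicator counts are invented for illustration, and the 0.7 cosine threshold is the one stated above.</p>

```python
import math

# Rank source files by cosine similarity between their term-weight
# vector and the SATD indicator vector, keep files at or above the 0.7
# threshold, then divide the summed SATD-commented LOC by Sk.
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

indicator_vec = [1.0, 1.0, 1.0]   # one axis per SATD indicator
files = {                         # file -> (tfidf vector, SATD LOC, indicators found)
    "A.java": ([0.9, 0.8, 0.7], 24, 2),
    "B.java": ([0.0, 0.1, 0.0], 3, 1),
    "C.java": ([0.6, 0.9, 0.8], 30, 2),
}

ranked = [f for f, (v, _, _) in files.items()
          if cosine(v, indicator_vec) >= 0.7]   # step 14: k SATD-prone files
total_loc = sum(files[f][1] for f in ranked)    # numerator of (4)
Sk = sum(files[f][2] for f in ranked)           # step 15
RW = total_loc / Sk                             # step 16, equation (4)
print(sorted(ranked), RW)
```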
        <p>
          This list of SATD indicators [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] formed the vocabulary of
words used in the proposed text mining approach for
the rework effort estimation. Following a previous study [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ],
the SATD commented tasks were categorized into five classes:
requirement debt, design debt, testing debt, defect debt and
documentation debt. Explanations and examples of the
classes of SATD are elaborated in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
        <p>We evaluated the classification performance of the proposed
text mining approach by averaging the precision and recall
values across the 4 open-source projects.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>III. RESULTS</title>
      <sec id="sec-4-1">
        <title>A. RQ1: What is the dominant class of self-admitted technical debt?</title>
        <p>
          Question RQ1 is similar to the one posed in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. Because we
used different datasets from those used in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], we decided to test
the postulation that design debt is the predominant class of
SATD in each of the open-source projects. The distribution of
this class of debt was independent of project size. For
example, the Apache project, with 452 SATD comments, had
62.1% design debt, and Eclipse, with 167 SATD comments, had
56.5%. Similarly, the design debt for ArgoUML (512
SATD comments) and Chromium (975 SATD comments) was
56.5% and 67.5% respectively. Clearly, design debt accounts for
more than 50% of SATD comments in each project. This result
confirms a similar result by Maldonado and Shihab [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] that
found that design debt contributes between 42% and 84% of all
identified SATD in different systems.
        </p>
        <p>Precision (P) and Recall (R) values of confusion matrices
created from the text mining approach for the classification
were as follows: requirement debt (P=0.84, R=0.77), design
debt (P=0.85, R=0.84), testing debt (P=0.87, R=0.92), defect
debt (P=0.76, R=0.82) and documentation debt (P=0.81,
R=0.79).</p>
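<p>The reported per-class values follow the usual confusion-matrix definitions of precision and recall. The counts below are hypothetical, chosen only to illustrate the arithmetic (they happen to reproduce the design-debt values).</p>

```python
# Standard confusion-matrix definitions used to report per-class
# precision and recall; the counts are hypothetical.
def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp)   # P = TP / (TP + FP)
    recall = tp / (tp + fn)      # R = TP / (TP + FN)
    return precision, recall

# A hypothetical design-debt class: 84 true positives, 15 false
# positives, 16 false negatives.
p, r = precision_recall(tp=84, fp=15, fn=16)
print(round(p, 2), round(r, 2))  # → 0.85 0.84
```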
      </sec>
      <sec id="sec-4-2">
        <title>B. RQ2: What is the extent of rework effort required to resolve SATD in open-source projects?</title>
        <p>
Table 2 indicates the estimated rework effort (measured in
average commented LOC per SATD-prone source file of each
system) required by the maintenance team to resolve SATD
within the source files of the respective systems studied. Note
that Req’t and Docu in Table 2 denote Requirement and
Documentation debts respectively. Considering all five classes
of debt, design debt required substantial rework effort, as
elaborated in Table 2: the rework effort for resolving design
debt is 7.9 commented LOC on average per SATD-prone source
file in ArgoUML, 17.1 in Chromium, 11.8 in Eclipse and 12.6
in Apache. Apart from design debt, test and defect debts were
also of key interest in this study. Even though the development
team knew that these two debts would lead to long-term bugs
upon release, they were left unfixed. We believe this is due to
the time-to-market constraint mentioned by Fernández-Sánchez
et al. [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
        <p>Based on results from Table 2, there is no unique pattern in
relation to the SATD rework effort and the size of the
opensource projects. A typical example is seen in Eclipse and
Apache. Even though Eclipse has 437,640 commented LOC
much larger than that of Apache with 54,295 (Table 1), the
amount of SATD rework effort for Eclipse is 11.8 as compared
to 12.6 in Apache (Table 2). It can be seen that the rework effort
estimation of about 13-32 commented LOC on average per
SATD prone source file across the selected projects could affect
the quality of the software product.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>IV. THREATS TO VALIDITY</title>
      <p>The first threat to validity in this study is the use of
wellcommented open-source project datasets. This constraint might
not be a representative sample of the total population of
opensource projects since not all projects are well-commented.
Thus, the findings of this study cannot confidently be
generalized. The selected projects used are popular and large in
size. Therefore, the examination of all the developers’
comments from the projects with the intention of resolving the
self-admitted technical debt (SATD) problem can form a good
foundation for researchers to conduct more in-depth studies in
this field. Secondly, the list of SATD indicators used from
previous study might not be a generalized representation of all
SATD in the software development and maintenance
environment. Since this study focused on source code comment
analysis, we were constraint of gathering more information
especially from industry to validate the results obtained.</p>
    </sec>
    <sec id="sec-6">
      <title>V. CONCLUSION</title>
      <p>In this study, we performed an exploratory analysis with a
proposed text mining approach on source code comments of
four open-source projects. With the help of transforming the
source code comments into term weights, we were able to
estimate the rework effort for fixing these debts. This study
addressed two main research questions:
RQ1: What is the dominant class of self-admitted technical debt?</p>
      <p>Results from the study indicate that, of the five classes
of SATD, design debt (56.5% to 67.5%) is the predominant
class for all four systems.</p>
      <sec id="sec-6-1">
        <title>RQ2: What is the extent of rework effort required to resolve</title>
      </sec>
      <sec id="sec-6-2">
        <title>SATD in open-source projects?</title>
        <p>The results of this study indicate that a rework effort of
between 13 and 32 commented LOC on average per SATD-prone
source file will have to be expended in order to fix the
SATD. To improve the long-term quality of the software, it is
essential that developers are encouraged to avoid SATD.</p>
        <p>The proposed approach is a novel technique that can
assist in estimating the rework effort needed to fix SATD
tasks.</p>
        <p>Going forward, we intend to validate our approach with
industrial case studies and different versions of open-source
datasets to facilitate generalization of the results.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Potdar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Shihab</surname>
          </string-name>
          .
          <article-title>"An exploratory study on self-admitted technical debt</article-title>
          .
          <article-title>" 2014 IEEE International Conference on Software Maintenance and Evolution (ICSME)</article-title>
          . IEEE,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Xu</surname>
          </string-name>
          .
          <article-title>"Is Learning-toRank Cost-Effective in Recommending Relevant Files for Bug Localization?." Software Quality, Reliability and Security (QRS</article-title>
          ),
          <source>2015 IEEE International Conference on. IEEE</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E. S.</given-names>
            <surname>Maldonado</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Shihab</surname>
          </string-name>
          .
          <article-title>"Detecting and quantifying different types of self-admitted technical Debt."</article-title>
          <source>Managing Technical Debt (MTD)</source>
          ,
          <source>2015 IEEE 7th International Workshop on. IEEE</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Raghavan</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Schütze</surname>
          </string-name>
          .
          <article-title>Introduction to information retrieval</article-title>
          . Vol.
          <volume>1</volume>
          . No. 1. Cambridge: Cambridge university press,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>W.</given-names>
            <surname>Sultan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Shihab</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Guerrouj</surname>
          </string-name>
          .
          <article-title>"</article-title>
          <source>Examining the Impact of Selfadmitted Technical Debt on Software Quality." 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering. IEEE</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Padioleau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yuanyuan</surname>
          </string-name>
          .
          <article-title>"Listening to programmersTaxonomies and characteristics of comments in operating system code</article-title>
          .
          <source>" Software Engineering</source>
          ,
          <year>2009</year>
          .
          <source>ICSE 2009. IEEE 31st International Conference on Software Engineering. IEEE</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>H. J.</given-names>
            <surname>Harrington</surname>
          </string-name>
          ,
          <article-title>"Poor-Quality Cost: Implementing, Understanding, and Using the Cost of Poor Quality (Quality and Reliability)</article-title>
          .
          <source>"</source>
          (
          <year>1987</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chatzigeorgiou</surname>
          </string-name>
          , et al.
          <article-title>"Estimating the breaking point for technical debt</article-title>
          .
          <source>" Managing Technical Debt (MTD)</source>
          ,
          <source>2015 IEEE 7th International Workshop on. IEEE</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Fernández-Sánchez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Garbajosa</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Yagüe</surname>
          </string-name>
          .
          <article-title>"A framework to aid in decision making for technical debt management."</article-title>
          <source>Managing Technical Debt (MTD)</source>
          ,
          <source>2015 IEEE 7th International Workshop on. IEEE</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>S. Fabrizio.</surname>
          </string-name>
          <article-title>"Machine learning in</article-title>
          <source>automated text categorization."ACM computing surveys (CSUR) 34.1</source>
          (
          <year>2002</year>
          ):
          <fpage>1</fpage>
          -
          <lpage>47</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bhardwaj</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Rana</surname>
          </string-name>
          .
          <article-title>"Impact of Size and Productivity on Testing and Rework Efforts for Web-based Development Projects."</article-title>
          <source>ACM SIGSOFT Software Engineering Notes 40.2</source>
          (
          <year>2015</year>
          ):
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>