<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Analytics to Quantify the Interest of Self-Admitted Technical Debt</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Principles of Software Languages Group (POSL), Kyushu University</institution>
          ,
          <addr-line>Fukuoka</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Yasutaka Kamei</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <fpage>68</fpage>
      <lpage>71</lpage>
      <abstract>
        <p>Technical debt refers to the phenomenon of taking a shortcut to achieve short-term development gain at the cost of increased maintenance effort in the future. The concept of debt, in particular the cost of debt, has not been widely studied. Therefore, the goal of this paper is to determine ways to measure the 'interest' on the debt and use these measures to see how much of the technical debt incurs positive interest, i.e., debt that indeed costs more to pay off in the future. To measure interest, we use the LOC and Fan-In measures. We perform a case study on the Apache JMeter project and find that approximately 42–44% of the technical debt incurs positive interest.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>
        The term technical debt was coined by Cunningham in 1992 to
refer to the phenomenon of taking a shortcut to achieve short-term
development gain at the cost of increased maintenance
effort in the future [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The technical debt community,
organized through the Managing Technical Debt workshop [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], has
studied many aspects of technical debt, including its detection
[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], impact [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and the appearance of technical debt in
the form of code smells [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Most recently, we developed
an approach to identify technical debt from code comments,
referred to as self-admitted technical debt (SATD). SATD
refers to the situation where developers know that the current
implementation is not optimal and write comments alerting
readers to the inadequacy of the solution.
      </p>
      <p>
        In the last few years, an increasing amount of work has
focused on SATD. In particular, our prior work focused on the
detection of SATD [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and the classification of different types
of SATD and the development of datasets to enable future
studies on SATD [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Other work by Bavota and Russo [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
performed an empirical study of SATD on a large number
of Apache projects and showed that SATD is prevalent in open
source projects, is long lived, and is increasing over time. A
study by Wehaibi et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] examined the impact of SATD
on quality and found that SATD does not necessarily relate
to more defects; however, it does make the software system
more complex.
      </p>
      <p>Although the metaphor of technical debt has been well
studied, to the best of our knowledge, the cost of the debt, i.e., its
interest, has not been extensively studied. Measuring the interest on
technical debt is one of the challenges in the field, since
it requires the detection of the technical debt, the tracking
of the debt over time, and the development of measures to
accurately quantify this debt. Given that SATD allows us to
know the exact method the technical debt exists in, we are able
to perform fine-grained analysis of the code, which enables us
to quantify the interest on the debt. In this paper, interest refers
to the additional difficulty in repaying the debt.</p>
      <p>We first propose the use of code metrics, in particular the
well-known Lines of Code (LOC) and Fan-In, to measure
interest. We use LOC since it highly correlates with most code
complexity metrics and Fan-In<sup>1</sup> since it allows us to measure
how much a piece of code is depended on by other code.
Then, we use the developed measures to determine how much
of the SATD incurs positive interest. In a case study on the
Apache JMeter project, we find that 44.2% of the SATD in JMeter
incurs positive interest when using LOC and 42.2% when using
Fan-In.</p>
      <p>The rest of the paper is organized as follows. Section II
introduces our approach to quantify the interest of SATD. Section
III describes a preliminary study using the developed measures.
Finally, Section IV draws conclusions and outlines our future work.</p>
    </sec>
    <sec id="sec-2">
      <title>II. APPROACH</title>
      <p>
        To perform our study, we need to determine the SATD in the
codebase, locate when the SATD was introduced in the project
and when it was later removed. Then, we use our measures
of interest, i.e., LOC and Fan-In, to compare the size of the
code and the amount of dependence other code had on the TD
code in order to quantify interest (Figure 1).
1. SATD Extraction. In order to measure interest of the
TD, our first step is to identify where it exists. Since we
focus particularly on SATD, we use code comments found
in the source code. We extract and parse the source code
of JMeter version 2.10. To perform the parsing, we use the
JDEODORANT tool [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], which allows us to extract a comment
and map it to its corresponding method. Then, we apply a
series of filters to remove irrelevant comments, e.g.,
copyright-related comments. Finally, the 2nd author<sup>2</sup> manually classified
all comments to determine if they are SATD comments or not
and mapped these comments to their respective methods. In
this study, we assume that SATD exists in the method where
the comment is identified. Details regarding the dataset and
the filtering applied can be found in our earlier work [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p><sup>1</sup>UNDERSTAND calculates Fan-In as the number of inputs a function uses
plus the number of unique subprograms calling the function [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p><sup>2</sup>The 2nd author, who made the classification, has more than 8 years of
experience working in the industry as a software engineer; during this time he
designed, implemented and maintained several programs using, in particular,
the Java programming language.</p>
      <p>
        [Figure 1: Source Code Repository; SATD Extraction (Details in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]); Identifying SATD Introduction and Removal; Determining Metrics that Measure Interest; Calculating Interest]
      </p>
      <p>[Table: File: JsseSSLManager.java, MonitorGraph.java, ProxyControl.java, SmtpPanel.java]</p>
      <p>
2. Identifying SATD Introduction and Removal. Since we
are interested in measuring the interest, we need to determine
the ‘change’ over time in these SATD methods. For each
of the SATD comments identified by us, we use several
git commands (e.g., git log -- &lt;PATH_TO_FILE&gt; and
git cat-file &lt;SHA1&gt;:&lt;PATH_TO_FILE&gt;), to trace a
comment back to the commit where it was introduced. We
perform this task by replaying the history commit-by-commit.
Using the same technique, we are also able to detect the
removal of SATD. We detect the removal of SATD when we
find that the comment is removed or changed.
3. Determining Metrics that Measure Interest. Once we
are able to determine the SATD comments and their associated
methods, we would like to calculate the interest that is incurred
over time (i.e., from the introduction of the technical debt to
its removal). To do so, we extracted 16 code metrics using
the UNDERSTAND TOOL [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. In particular, we selected all
method-level complexity and size metrics that Understand is
able to provide.
      </p>
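      <p>The commit-by-commit replay in step 2 can be sketched as follows. This is a minimal illustration, not the tooling used in the study: the function name trace_comment and the snapshots input are hypothetical, and the sketch assumes that the per-commit file contents have already been retrieved (for example, with the git commands mentioned above) and ordered from oldest to newest.</p>
      <preformat>
```python
from typing import List, Optional, Tuple


def trace_comment(
    snapshots: List[Tuple[str, str]], comment: str
) -> Tuple[Optional[str], Optional[str]]:
    """Replay history oldest-first and return (introduced_sha, removed_sha).

    snapshots: hypothetical list of (commit SHA, file content) pairs,
    ordered from the oldest commit to the newest.
    removed_sha is None when the comment is still present at the end.
    """
    introduced = None
    removed = None
    for sha, content in snapshots:
        present = comment in content
        if present and introduced is None:
            introduced = sha  # first commit whose file contains the comment
        elif introduced is not None and not present:
            removed = sha  # first later commit where it is removed or changed
            break
    return introduced, removed


# Made-up history of one file, for illustration only.
history = [
    ("a1", "void run() {}"),
    ("b2", "// TODO: hack, fix later\nvoid run() {}"),
    ("c3", "// TODO: hack, fix later\nvoid run() { go(); }"),
    ("d4", "void run() { go(); }"),
]
print(trace_comment(history, "// TODO: hack, fix later"))
```
      </preformat>
      <p>Matching on the exact comment text means that an edited comment counts as removed, which mirrors the removal rule stated in step 2.</p>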
      <p>The reasons that we focused on complexity and size metrics
are: 1) our intuition tells us that if a piece of code is introduced
and then becomes more complex, then that is a good proxy
for it being more difficult to deal with in the future, i.e., it
incurred interest; and 2) prior work has shown that size metrics
are typically highly correlated with complexity metrics, hence,
we figured using size metrics (if they are highly correlated with
complexity metrics in our case) would be an easier alternative
to using complexity metrics.</p>
      <p>We measured the Spearman correlation between the
complexity and size metrics and found that indeed all metrics
except Fan-In are highly correlated with LOC. Therefore, we
decided to use the LOC metric as a measure of interest. In
addition, since Fan-In is an indicator of how much a method
is depended on, we decided to also include the Fan-In metric
when calculating interest. The intuition is that if a method
is depended on lightly when the SATD is introduced and then
has many more dependencies in the future, then dealing with
this SATD is much more difficult (since many dependencies
may be affected). In the end, we settled on using the two
metrics, LOC and Fan-In, as measures of interest.
4. Calculating Interest. Using our metrics, we consider the
relative LOC and Fan-In values between the introduced and
removed versions as interest. We calculate the interest per
SATD instance. For example, if arbitrary metric values in
the introduced and removed versions are 10 and 20 in the
method where the SATD exists, the relative size is 100 (i.e.,
100 × (20 − 10) / 10). In cases where the SATD is not yet removed,
we use the numbers from the latest version of JMeter. Our
assumption here is that if the SATD incurs positive interest,
then it will be more difficult to remove in the future, e.g., if
the code becomes more complex compared to when the debt
was taken, then it will be more difficult to deal with.</p>
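      <p>The interest calculation in step 4 can be written as a small function. This is a sketch under stated assumptions: the name relative_interest is hypothetical, and the handling of a zero starting value (which can occur for Fan-In) is not described in the text, so the sketch simply rejects it.</p>
      <preformat>
```python
def relative_interest(introduced: float, removed: float) -> float:
    """Relative change, in percent, of a metric between the version that
    introduced the SATD and the version that removed it (or the latest
    version when the SATD has not been removed yet)."""
    if introduced == 0:
        # Assumption: the paper does not say how a zero baseline is handled.
        raise ValueError("relative change is undefined for a zero baseline")
    return 100.0 * (removed - introduced) / introduced


print(relative_interest(10, 20))  # 100.0, the worked example from the text
print(relative_interest(20, 10))  # -50.0, i.e., negative interest
```
      </preformat>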
      <p>While this paper opens a new research direction (i.e.,
quantifying the interest of SATD), our current approach has
weaknesses. We elaborate on these weaknesses in Section IV.</p>
    </sec>
    <sec id="sec-3">
      <title>III. INITIAL CASE STUDY</title>
      <p>
        Motivation. There exist several previous studies that
focused on understanding SATD (e.g., the detection of technical
debt [
        <xref ref-type="bibr" rid="ref12 ref6">6, 12</xref>
        ] and the impact of SATD on software quality [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]).
However, to the best of our knowledge, there are no studies
that help in the quantification of SATD interest. Therefore, we
would like to know how we can measure interest and if SATD
actually incurs positive interest.
      </p>
      <p>
        Datasets. To conduct our initial case study, we use data from
the Apache JMeter open source project. We use JMeter since we have
used this dataset in the past [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ], and know that it contains
instances of SATD and uses Git as the version control system,
which many of our tools are designed to work on. In particular,
we use release v2.10 of JMeter, which contains 81,307 SLOC
in 1,181 classes, contains 20,084 comments, and has 33 unique
contributors.
      </p>
      <p>Approach. To calculate the interest of SATD, we follow the
approach explained in Section II. We show the number of
SATD instances, the percentage of the technical debt that has positive
interest, and the distribution of interest for technical debt that
incurs a positive interest rate.</p>
      <p>Results. We find that there is a high correlation between
LOC and the other product metrics, except Fan-In. From the
highly correlated metrics, we selected LOC as the metric
to calculate interest, since intuitively it is easier to measure
and comprehend. Therefore, we settled on using two product
metrics (i.e., LOC and Fan-In) to measure interest.</p>
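      <p>For readers unfamiliar with the correlation check, the following is a minimal, dependency-free sketch of Spearman's rank correlation (the Pearson correlation of the rank vectors, with tied values given their average rank). The function names and the sample values are made up for illustration and are not the study's data.</p>
      <preformat>
```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho; undefined (division by zero) if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


loc = [12, 40, 7, 88, 25]      # made-up method sizes
complexity = [3, 9, 2, 20, 6]  # made-up complexity values
print(spearman(loc, complexity))  # 1.0, since both lists have the same ordering
```
      </preformat>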
      <p>Table I shows the number of SATD instances and the percentage of
all technical debt that has positive interest. The table shows that
44.2% of technical debt incurs a
positive interest rate in terms of LOC and 42.2% of the SATD
does so in terms of Fan-In. We can see that in some cases, there
can be negative interest (13.8% using LOC and 8.1% using
Fan-In), where the SATD method gets smaller or has a lower
Fan-In after the introduction of the SATD. There are also cases
where nothing changes in terms of LOC and Fan-In between
the SATD introduction and removal. Lastly, it is important to
note that the shares of positive and no-change interest rates
do not differ greatly between LOC and Fan-In.</p>
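      <p>The breakdown into positive, negative, and no-change interest can be computed per SATD instance as sketched below. The function name interest_shares and the input pairs are hypothetical; the numbers are made up and do not reproduce Table I.</p>
      <preformat>
```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def interest_shares(pairs: Iterable[Tuple[float, float]]) -> Dict[str, float]:
    """Percentage of SATD instances with positive, negative, or no interest.

    pairs: hypothetical (introduced, removed_or_latest) metric values,
    one pair per SATD instance. The input must not be empty.
    """
    counts = Counter()
    for introduced, removed in pairs:
        if removed > introduced:
            counts["positive"] += 1  # metric grew after the SATD was introduced
        elif removed == introduced:
            counts["no change"] += 1
        else:
            counts["negative"] += 1  # metric shrank
    total = sum(counts.values())
    return {k: 100.0 * counts[k] / total for k in ("positive", "negative", "no change")}


# Made-up LOC values for four SATD methods.
print(interest_shares([(10, 20), (10, 10), (10, 5), (4, 8)]))
```
      </preformat>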
      <p>Next, we would like to know how high the positive
interest rate is. This analysis provides us with more insight about
the SATD that incurs a positive interest rate. Table II and
Figure 2 show the distribution of interest for the SATD
that incurs a positive rate. We note that we limit the x-axis of
Figure 2 to 100% for readability. We see from Figure 2 that
the distributions are left-skewed, indicating that the majority
of the SATD ranges between 6.9-12.4 and 50.0 in terms of
LOC and Fan-In. Our findings indicate that there is
SATD that incurs a positive interest rate and that different types
of SATD have different values of interest, which suggests that
we should prioritize SATD based on its interest, i.e., not all
SATD is equal.</p>
      <p>44.2% of technical debt incurs a positive rate in terms
of LOC and 42.2% of technical debt incurs it in terms of
Fan-In.</p>
    </sec>
    <sec id="sec-4">
      <title>IV. CONCLUSION</title>
      <p>In this paper, we introduced an approach to quantify the interest of SATD. Our proposed approach uses software product metrics to measure the interest incurred in software projects.</p>
      <p>The results of our initial case study using the Apache JMeter
project show that 44.2% of technical debt has a positive rate
in terms of LOC and 42.2% of technical debt has it in terms
of Fan-In.</p>
      <p>Future work. This paper presents only an early idea for quantifying
the interest of SATD. Therefore, there remain many challenges
to address in the future.</p>
      <list list-type="bullet">
        <list-item>
          <p>To calculate interest, we use the relative size of metric
values between two versions, SATD introduction and
removal. However, the time period between them is not considered
when calculating the interest. Therefore, in the future we would
like to take the period into account when calculating the
interest.</p>
        </list-item>
        <list-item>
          <p>
            There are several types of SATD, such as defect and design
SATD. The previous study [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ] shows that the percentage
of SATD varies depending on the type of technical debt
and the studied systems. For example, projects that
have limited time to develop features are likely to leave
comments about features that need to be implemented in the
future. To better understand the interest, in the future we
would like to analyze the interest per type of SATD.
          </p>
        </list-item>
        <list-item>
          <p>The interest varies among technical debt. If we can
understand why some SATD has larger interest,
we can make use of such insights for future development.
Therefore, we would like to manually investigate why
some of the SATD has larger interest.</p>
        </list-item>
        <list-item>
          <p>Generally speaking, software systems are always evolving
over time to implement new functionality and fix
defects. Therefore, even if the size of the SATD method
increases, it is not clear how best to evaluate the effects
on the interest of SATD. We would like to compare
the impact of software evolution on methods in the
two groups, SATD vs. non-SATD, to draw a relative
comparison that controls for general evolution.</p>
        </list-item>
      </list>
    </sec>
    <sec id="sec-5">
      <title>ACKNOWLEDGMENT</title>
      <p>This research was partially supported by JSPS KAKENHI Grant Number 15H05306.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <source>International Workshop on Managing Technical Debt (MTD)</source>
          . https://www.sei.cmu.edu/community/td2016/. Accessed: 2016-10-16.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Bavota</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Russo</surname>
          </string-name>
          .
          <article-title>A large-scale empirical study on self-admitted technical debt</article-title>
          .
          <source>In Proc. Int'l Conf. on Mining Software Repositories (MSR)</source>
          , pages
          <fpage>315</fpage>
          -
          <lpage>326</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>W.</given-names>
            <surname>Cunningham</surname>
          </string-name>
          .
          <article-title>The WyCash portfolio management system</article-title>
          .
          <source>In Addendum to the Proc. on Object-oriented Programming Systems, Languages, and Applications</source>
          , pages
          <fpage>29</fpage>
          -
          <lpage>30</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Fontana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ferme</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Spinelli</surname>
          </string-name>
          .
          <article-title>Investigating the impact of code smells debt on quality code evaluation</article-title>
          .
          <source>In Proc. of the Int'l Workshop on Managing Technical Debt (MTD)</source>
          , pages
          <fpage>15</fpage>
          -
          <lpage>22</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E. D. S.</given-names>
            <surname>Maldonado</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Shihab</surname>
          </string-name>
          .
          <article-title>Detecting and quantifying different types of self-admitted technical debt</article-title>
          .
          <source>In Proc. of the Int'l Workshop on Managing Technical Debt (MTD)</source>
          , pages
          <fpage>9</fpage>
          -
          <lpage>15</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Potdar</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Shihab</surname>
          </string-name>
          .
          <article-title>An exploratory study on self-admitted technical debt</article-title>
          .
          <source>In Proc. of the Int'l Conf. on Software Maintenance and Evolution (ICSME)</source>
          , pages
          <fpage>91</fpage>
          -
          <lpage>100</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] Scientific Toolworks, Inc.
          <source>FANIN - Understand 2.6</source>
          . https://scitools.com/support/metrics_list/?metricGroup=count.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] Scientific Toolworks, Inc.
          <source>Understand 2.6</source>
          . http://www.scitools.com/.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>N.</given-names>
            <surname>Tsantalis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chaikalis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Chatzigeorgiou</surname>
          </string-name>
          .
          <article-title>JDeodorant: Identification and removal of type-checking bad smells</article-title>
          .
          <source>In Proc. European Conf. on Software Maintenance and Reengineering (CSMR)</source>
          , pages
          <fpage>329</fpage>
          -
          <lpage>331</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wehaibi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Shihab</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Guerrouj</surname>
          </string-name>
          .
          <article-title>Examining the impact of self-admitted technical debt on software quality</article-title>
          .
          <source>In Proc. of the Int'l Conference on Software Analysis, Evolution, and Reengineering (SANER)</source>
          , pages
          <fpage>179</fpage>
          -
          <lpage>188</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>N.</given-names>
            <surname>Zazworka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Shaw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Shull</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Seaman</surname>
          </string-name>
          .
          <article-title>Investigating the impact of design debt on software quality</article-title>
          .
          <source>In Proc. of the Int'l Workshop on Managing Technical Debt (MTD)</source>
          , pages
          <fpage>17</fpage>
          -
          <lpage>23</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>N.</given-names>
            <surname>Zazworka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. O.</given-names>
            <surname>Spínola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Vetrò</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Shull</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Seaman</surname>
          </string-name>
          .
          <article-title>A case study on effectively identifying technical debt</article-title>
          .
          <source>In Proc. of the Int'l Conf. on Evaluation and Assessment in Software Engineering (EASE)</source>
          , pages
          <fpage>42</fpage>
          -
          <lpage>47</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>