<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Evaluation of Java Code Coverage Testing Tools</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Elinda Kajo-Mece Faculty of Information Technology Polytechnic University of Tirana</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Faculty of Information Technology Polytechnic University of Tirana</institution>
        </aff>
      </contrib-group>
      <fpage>72</fpage>
      <lpage>75</lpage>
      <abstract>
        <p>Code coverage metric is considered as the most important metric used in analysis of software projects for testing. Code coverage analysis also helps in the testing process by finding areas of a program not exercised by a set of test cases, creating additional test cases to increase coverage, and determine the quantitative measure of the code, which is an indirect measure of quality. There are a large number of automated tools to find the coverage of test cases in Java. Choosing an appropriate tool for the application to be tested may be a complicated process. To make it ease we propose an approach for measuring characteristics of these testing tools in order to evaluate them systematically and to select the appropriate one.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Code coverage metrics</kwd>
        <kwd>testing tools</kwd>
        <kwd>test case</kwd>
        <kwd>test suite</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>To conclude with, according to some criteria, that we will take
into consideration for the evaluation of this code coverage tools,
we will judge for the most efficient tool to be used by the software
testing team. These criteria are: Human-Interface Design (HID),
Ease of Use (EU), Reporting Features (RF), Response Time (RT).</p>
      <p>
        In Section 2 we will mention the coverage metrics [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] used in
our experiments; we will shortly explain the tools [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] we have
selected to perform the code coverage analysis for our tests;
describe briefly how JUnit framework is implemented in each of
these tools [10] [11], since JUnit is our experimental
environment, where we program unit tests for our software and
the last part of this section consists of selecting some criteria
based on which we will then judge which of the tools is more
effective to use in the testing process. In Section 3 we will
summarize the results of our experiments for each tool and
analyze them to bring us in the conclusion which of the tools is
more effective. In Section 4 we give the conclusions of our work.
BCI’12, September 16–20, 2012, Novi Sad, Serbia.
      </p>
      <p>Copyright © 2012 by the paper’s authors. Copying permitted only for private and
academic purposes. This volume is published and copyrighted by its editors.
Local Proceedings also appeared in ISBN 978-86-7031-200-5, Faculty of Sciences,
University of Novi Sad.</p>
    </sec>
    <sec id="sec-2">
      <title>2. SELECTED TOOLS AND EVALUATION</title>
    </sec>
    <sec id="sec-3">
      <title>CRITERIA</title>
      <p>
        Among various automated testing tools [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], we have selected
two tools to perform the Code Coverage Analysis [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ][
        <xref ref-type="bibr" rid="ref2">2</xref>
        ][
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], as a
manner to evaluate the efficiency of our tests we created in the
JUnit framework [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In this paragraph we will summarize
briefly the main features of these to: EMMA and CodeCover
coverage tools. The main reasons for which we choose them are:
      </p>
      <sec id="sec-3-1">
        <title>1. These tools are 100 % open-source.</title>
        <p>2. These tools have a large market share compared with the other
open source coverage tools.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3. These have multiple report type format. 4. These tools are for both open-source and commercial development projects.</title>
        <sec id="sec-3-2-1">
          <title>EMMA Tool</title>
          <p>We used EclEmma 2.1.0, a plug-in for Eclipse, which is our Java
development environment. Emma distinguishes itself from other
tools by going after a unique feature combination: development
while keeping individual developer's work fast and iterative. Such
a tool is essential for detecting dead code and verifying which parts
of an application are actually exercised by the test suite and
interactive use. The main features of Emma, which represent its
advantages are: Emma can instrument classes for coverage either
offline ( before they are loaded) or on the fly (using an
instrumenting application class loader); Supported coverage types:
class, method, line, basic block; Emma can detect
when a single source code line is covered only partially; Output
report types: plain text, HTML, XML.</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>CodeCover Tool</title>
          <p>CodeCover is an extensible open source code coverage tool. It
provides several ways to increase test quality. It shows the quality
of test suite and helps to develop new test cases and rearrange test
cases to save some of them. So we get a higher quality and a better
test productivity. The main features of CodeCover are: Supports
statement coverage, branch coverage, loop coverage and strict
condition coverage; Performs source instrumentation for the most
accurate coverage measurement; CLKI interface, for easy use from
the command line; Ant interface, for easy integration into an
existing build process; Correlation Matrix to find redundant test
cases and optimize your test suite; The source code is highlighted
according to the measured date.</p>
          <p>The testing environment we used to project the set of tests for
our input programs was JUnit 3.</p>
          <p>We choose as input programs six sorting algorithms: Bubble
Sort, Selection Sort, Insertion Sort, Heap Sort, Merge Sort, Quick
Sort. The main reason why we choose these algorithms is the
facility we face on computing the Cyclomatic Complexity (CC),
which is crucial on defining the number of test cases needed to
achieve a good coverage percentage of the program code. To
proceed in the testing process for each of this sorting algorithm, we
first build Java programs for each of them.</p>
          <p>To achieve our goal we chose some criteria, based on which we
will evaluate which testing tool is the most efficient. So we chose
Human Interface Design (HID) as an indicator of the level of
difficulty to learn the tool's procedures on purchase and the
likelihood of errors, in using the tool over a long period of time;
Ease of Use (EU) to judge if the tool is easy to use to ensure
timely, adequate, and continual integration into the software
development process; Reporting Features (RF) to show the degree
of variety regarding the formats that tools use to report their
coverage results; Response Time (RT) used to evaluate the tool's
performance with regards to response time. In addition to these
criteria, we will also evaluate the number and quality of test cases
to judge for the most appropriate tool for the software testing
process.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. EXPERIMENTS AND ANALYSIS</title>
      <p>
        In this section we will summarize the experiments we have
performed on the selected algorithms. Initially, we built the Java
programs for each of our sorting algorithms. Then we designed
the set of testing units by using the JUnit testing framework [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] in
Java. Finally we performed the Analysis of Code Coverage, to
evaluate these tests through the selected code coverage tools. This
analysis calculates the coverage percentage, that serves as an
indirect measure of the quality of tests. Based on these
measurements, we can then create additional test cases [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]to
increase code coverage.
      </p>
      <p>In table 1 we summarized the quantitative information regarding
our experiments. In the last column we show the number of final
test cases we built for each of the Java programs of the sorting
algorithms. We used the term "final test cases" because we
continuously improved our coverage results by increasing the
number of test cases, until the addition of another test case does
not anymore affect the coverage result, that means we have
achieved a high level of code coverage.
Based on these coverage results and also the computed criteria
chosen for evaluation, we performed the analysis process to define
the best tool.</p>
      <p>In the figures below we see the coverage reports produced after
the execution of Emma and CodeCover for two cases: 1) When
we projected a small set of tests; 2) When we projected a larger
set of tests in order to improve quality of the testing process. To
show briefly the experimental procedure we followed to achieve
our objective, we will take as an example the experimental results
for Quick Sort algorithm. For Quick Sort we initially projected
only 3 test cases (Fig.1). The CodeCover tool produced low BC
(Branch Coverage) and LC (Loop Coverage) coverage metrics of
66.7 %. This result contradicts the result taken after the execution
of Emma tool on the same set of test cases, which is relatively
high with an average of 87 % (Fig.1). This contradict, led us to
increase the number of test cases for a higher quality of tests. For
Quick Sort we built 4 more test cases (Fig.2), which produced a
maximum result of 100 % code coverage with both tools.</p>
      <p>During our experiments, we noticed that this contradict, that
relates to the fact that for the same set of test cases the execution
of Emma gives us a higher coverage tool than the result reported
from CodeCover, we concluded that CodeCover gives a more
accurate information regarding the code coverage.</p>
      <p>In Section 2, we mentioned the Correlation Matrix as a way to
find redundant test cases, which does not increase the coverage
percentage. It shows a kind of dependency relationship between
test cases of the same input program. In JUnit3 testing framework,
dependency between tests is not supported, that is why we should
always try to avoid dependency between test cases. In the figure
below is shown the Correlation Matrix for Quick Sort.</p>
      <p>From the figure above, we see that blue squares (meaning that
there is 100 % dependency between test cases), exist only in the
case where the same number of test case intersect. So we can say
that we have proceeded according to the main rule of JUnit,that is
to avoid dependency between test cases.</p>
      <p>Below we will show by figures the results of the Code Coverage
Analysis performed by Emma and CodeCover tools for the other
five input sorting programs.</p>
      <p>For Bubble, Selection and Insertion Sort we initially projected 7
test cases, then in order to achieve a relatively high coverage we
projected 11 test cases. The coverage result report produced by
CodeCover for BubbleSort is shown below for both cases.</p>
      <p>From the figure above, we see a low percentage of 53.3 % for the
LC (Loop Coverage) metric. That is why we finally projected 11
test cases to increase this low percentage as shown in the figure
below, where the new LC metric is 86.7 %, which is considered a
high coverage percentage. By improving our experimental work
on the testing process repeatedly we came into the conclusion that
to achieve a high coverage percentage the secret is to project one
test case for each functional unit of the program, and to avoid
programming long test cases that try to cover a considerable part
of the program.
We haven't showed Emma coverage report, because it is
relatively high since the first case, where we projected only 7
tests.</p>
      <p>The results gained for SelectionSort are 46.7% for LC metric in
the case of 7 tests and 80 % in the final case of 11 test cases; for
Insertion are 60% for LC metric in the first case and 86.7% for
the final case. So far, we see that in general the most
"problematic" coverage metric is the Loop Coverage metric.
This happens mainly because of the for loop, that requires more
test cases to be covered. This is shown in fig.10, where yellow
signifies the partial coverage of the for loop.</p>
      <p>For MergeSort we initially projected 4 test cases, which according
to CodeCover produced a low LC indicator of 60 %,. Then we
extended this set of test cases to 7test cases, gaining a new
percentage of LC of 86.7 % ( the reason why it is not 100 % is
because there are many loops in the program, not only the for
loops, but also while).</p>
      <p>For Heap Sort we initially projected 8 test cases, giving a LC
metric of 33.3 % and a CC metric (Condition Coverage) of 80
%.Then we improved this set of tests by extending it to 16 test
cases, that improved considerably both the LC and CC metric to
respectively : 88.9 % and 100 %.</p>
      <p>Through the graph below we show the improvements we achieved
in our experiments until we gained a high code by showing the
initial result we gained when we projected a small set of test cases
and the final result after we increased the number of test cases for
a higher coverage.</p>
      <p>In table 2, we have summarized the results produced by Emma
and CodeCover tools after performing the Code Coverage
Analysis on each of the input programs (the sorting algorithms).</p>
      <sec id="sec-4-1">
        <title>SC-Statement Coverage, BLC-Block Coverage, BC-Branch</title>
      </sec>
      <sec id="sec-4-2">
        <title>Coverage, LC-Loop Coverage, MC-Method Coverage, CC</title>
      </sec>
      <sec id="sec-4-3">
        <title>Condition Coverage, FC-File Coverage, CLC-Class Coverage</title>
        <p>
          After analyzing the code coverage results produced after the
execution of Emma and CodeCover on the various sorting
programs, we concluded that CodeCover gives a more accurate
coverage information than Emma. To complete the process of
evaluating the effectiveness of these testing tools, we will show
in table 3 the computed criteria [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] selected to evaluate these
tools.
Based on these values (which we partially gained in their official
websites, as they are open-source tools), we judged that the best
and more effective tool to be used during the software testing
process is CodeCover.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. CONCLUSIONS</title>
      <p>Based on the results summarized in table 2, that shows achieved
code coverage metric reported from each tool, we conclude that
CodeCover tool reports a more accurate coverage information
than Emma, which does not supply us with sufficient information,
based on which we can judge over the quality of tests, that is why
we suggest the use of the CodeCover tool. CodeCover is more
efficient to perform the Code Coverage Analysis, because
through the detailed coverage analysis for each program method,
it allows us to define the unnecessary test cases, that does not
increase coverage of the program, affecting so negatively the
execution time of the test suite by decreasing it. We argued this
conclusion by taking as an example QuickSort, where for an
initial set of 3 test cases while Emma reported an average
coverage of 87%, CodeCover reported a low Loop Coverage of
66.7 %.The same fact was present in all our set of input sorting
programs. So in order to project a successful testing process for
our input programs, we should base on CodeCover coverage
reports, to decide whether it is necessary to increase the number
of test cases or not. During our experimental work, where we
continuously improved the testing process, we came into the
conclusion that the most problematic coverage metric is Loop
Coverage. This happens mainly because of the for loop, that
requires extra tests to be fully covered. So our coverage results
for all our input programs reached a Loop Coverage metric in the
range 46.7 % to 66.7%, which is considered very low. But not
only the Loop Coverage metric was responsible for low coverage
percentages in the beginning of our work, but also the manner in
which we projected our tests affects coverage result. So to
achieve a high code coverage, we have to avoid programming
long test cases that try to cover a considerable part of the
program, but instead we must project one test case for each
functional unit of the program. We arrive in the same conclusion
if we see table 3, that shows the computed criteria chosen to
completely evaluate the testing tools. From this table we infer that
the CodeCover tool is easy to use, has a very good response time
for every command given, has very good reporting features
compared with Emma tool.
[10] EMMA: a free Java code coverage tool http://Emma.sourceforge.net
[11] CodeCover Tutorial
http://www.codecoveragetools.com/code_coverage_java.html</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Lawrance</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clarke</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burnett</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>and G.</given-names>
            <surname>Rothermel</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>How Well Do Professional Developers Test with Code Coverage Visualizations? An Empirical Study</article-title>
          .
          <source>In Proceedings of the IEEE Symposium on Visual Languages</source>
          and
          <string-name>
            <surname>Human-Centric Computing</surname>
          </string-name>
          (
          <year>September 2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Tikir</surname>
            ,
            <given-names>M. M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Hollingsworth</surname>
            ,
            <given-names>J. K.</given-names>
          </string-name>
          <year>2002</year>
          .
          <article-title>Efficient instrumentation for code coverage testing</article-title>
          .
          <source>In Proceedings of the ACM SIGSOFT 2002 International Symposium on Software Testing and Analysis (Rome, Italy, July 22-24</source>
          ,
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Cornett</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>1996</year>
          -
          <fpage>2011</fpage>
          .
          <article-title>Code Coverage Analysis</article-title>
          .
          <source>Bullseye Testing Technology.</source>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Beust</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Suleiman</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>Next Generation Java Testing: TestNg and Advanced Concepts</article-title>
          .
          <source>Addison Wesley</source>
          ,
          <fpage>1</fpage>
          -
          <lpage>21</lpage>
          ,
          <fpage>132</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Ammann</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Offutt</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2008</year>
          . Introduction to Software Testing, Cembridge University Press,
          <fpage>268</fpage>
          -
          <lpage>277</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Sommerville</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          <year>2007</year>
          .
          <article-title>Software Engineering ( 8th edition)</article-title>
          .
          <source>Harlow: Addison Wesley</source>
          ,
          <fpage>537</fpage>
          -
          <lpage>565</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>JUnit</given-names>
            <surname>Best</surname>
          </string-name>
          Practices-Java World, http://www.javaworld.com/javaworld/jw-12-2000/jw-1221-junit.html
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Prasad</surname>
            ,
            <given-names>K.V.K.K.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>Software testing tools</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Durrani</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          <year>2005</year>
          .
          <article-title>Role of Software Metrics in Software Engineering and Requirements Analysis</article-title>
          .
          <source>In Proceeding of IEEE ICICT First International Conference of Information and Communication Technologies. (August 27-28).</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>