A Condition Coverage-Based Black Hole Inspired Meta-Heuristic
for Test Data Generation
Derya Yeliz Ulutaş 1,2, Ayşe Tosun 2
1
    ASELSAN Inc., Ankara, Turkey
2
    Istanbul University, Faculty of Computer and Informatics Engineering, Istanbul, Turkey

                  Abstract
                  Combinatorial Testing (CT) strategy is one of the well-known methods to achieve high code
                  coverage rates during testing for safety critical systems. While generating test data in CT, we
                  often encounter the problem of test case explosion, especially for the systems with multiple
                  parameters and values. To overcome this challenge, search-based CT (CSST) strategies are
                  introduced. In this study, we propose a new algorithm that inspires from a binary variant of
                  Black Hole Algorithm (BBH) in CSST and adopt BBH according to the CT challenges in an
                  industrial context. The proposed BBH version, BH-AllStar, aims the following: (1) obtaining
                  higher condition coverage, (2) avoiding being stuck in local minima and (3) handling discrete
                  input values. We finalize the solution space of BH-AllStar by reassessing the previously
                  removed stars and incorporating the useful ones into it. We evaluate our approach on a real-life
                  software project in the safety-critical domain with respect to condition coverage, number of test
                  cases and execution time. Compared to BBH, BH-AllStar generates more test cases which
                  achieve up to 43% increase in condition coverage.

                  Keywords 1
                  Combinatorial Search-based Software Testing (CSST), Test Data Generation, Meta-heuristic

1. Introduction                                                                                   software systems in which there exist many
                                                                                                  parameters and conditions that take a combination
                                                                                                  of multiple input values [7, 8]. CT method such as
    The complexity and the criticality of the safety
                                                                                                  t-way testing executes the software using test
critical systems in the defense industry are
                                                                                                  cases with the interaction of the possible input
growing [1, 2], hence, these systems require more
                                                                                                  values [9]. The radar software in our industrial
thorough and efficient testing technologies, such
                                                                                                  context, for instance, takes 12 input parameters
as test automation and regression test
                                                                                                  each takes two to six values. If we would like to
selection/prioritization [3] and continuous
                                                                                                  test this software with full input coverage, we
integration. The software testing field in the
                                                                                                  would end up executing 63.34.24.5=1,399,680
defense industry needs to develop more efficient
                                                                                                  test cases. This is an impossible goal to reach, and
and qualified techniques to meet security-critical
                                                                                                  instead in practice, we follow t-way testing with
requirements. [1]. For instance, various test
                                                                                                  the IPOG strategy [10]: select the t value as high
techniques such as Boundry-Value Analysis and
                                                                                                  as possible with the cost of increasing the number
Equivalence Partitioning are used to meet the DO-
                                                                                                  of test cases. The current limitation of the
178 Standard to ensure safety-critical
                                                                                                  employed t-way testing in our industrial setting is
requirements, which has been imposed by the U.S.
                                                                                                  the lack of relationship between the selected test
Federal Aviation Administration (FAA). [4, 5].
                                                                                                  cases and their code coverage, because the
To assess the effectiveness of these methods,
                                                                                                  strategy of selecting the test inputs is not
different coverage criteria are employed [6].
                                                                                                  dependent on the covered branches/conditions of
During functional testing, Combinatorial Testing
                                                                                                  the software system.
(CT) methods are widely utilized especially for

QuASoQ’21: 9th International Workshop on Quantitative
Approaches to Software Quality, December 06, 2021, Taipei,
Taiwan
EMAIL: dyulutas@aselsan.com.tr (D.Y. Ulutas);
tosunay@itu.edu.tr (A. Tosun)
              © 2021 Copyright for this paper by its authors. Use permitted under Creative
              Commons License Attribution 4.0 International (CC BY 4.0).
              CEUR Workshop Proceedings (CEUR-WS.org)

                                                                                             70
    To address this challenge, combinatorial                      The remainder of this paper is organized as
search-based software testing (CSST) approaches               follows: Section 2 presents the related work. We
are suitable [11]. One of the studies by Al-                  report the details of our approach BH-AllStar and
Sammarraie and Jawawi [9] inspired our work.                  research methodology in Section 3. Section 4
The authors propose three new approaches                      shows the results and discussion; Section 5
namely Binary Black Hole (BBH), Multiple Black                explains the threats to validity. Finally, Section 6
Hole (MBH) and Binary Multiple Black Hole                     concludes this study with future directions.
(BMBH) based on the Black Hole Algorithm
(BHA), which is first introduced in [12] as an                2. Related work
alternative to Particle Swarm Optimization
(PSO). Their aim was generating test inputs with
respect to the number of test cases and code                      Combinatorial Search-based Software Testing
coverage. The authors also aim to avoid being                 (CSST) is a testing method based on
stuck in local minima and they propose a novel                Combinatorial Testing (CT). CT aims at
energy mechanism for MBH that helps to choose                 generating input vectors combining possible input
useful black hole populations only. The results               values of all input parameters of the Software
show that MBH outperforms PSO and BHA in                      Under Test (SUT) [10]. The SUT can have n input
terms of the number of test cases, consistency and            parameters: ci (i = 1,2,3…n). Every parameter can
high coverage rates.                                          take a set of values Vi where i can be from 1 to n.
    We share the same goal with [9] of generating             For example, V1 is the value set of c1 and contains
test data using a search-based meta-heuristic,                m different values. (v1,v2,…,vm ). One test case is
which avoids local minima and achieves higher                 composed of a set of values from all Vi of all ci
coverage than BBH. The approach in [9] could be               (e.g. vx ∈ V1, vy ∈ V2, vz ∈ V3) [10]. There are
well fitted in our industrial problem because of              different methods to build all combinations of
two main motivations: (1) their problem domain                input parameters for CT such as t-way testing or
and data set are similar to our project, i.e., radar          randomization-based methods in the literature
software, (2) they innovatively apply BHA, which              [13]. CSST is one of the search-based approaches
is originally a clustering algorithm, to generate             to reduce the large solution space [9]. To find the
test data for functional testing. However,                    optimum solution of a problem, especially for the
considering the scale of our radar software under             problems, which contain very large sets of
test, BBH approach does not fully address our                 solutions and have computational constraints,
challenge. More specifically, BBH concentrates                meta-heuristic search techniques are commonly
on generating minimum number of test cases                    implemented [14]. These meta-heuristic search
rather than increasing the coverage criteria. Due             techniques are based on initially generated
to this, it lacks assessing the generated stars               random solution space and narrowing down the
except the black hole in terms of their coverage.             search space based on the evaluation criteria.
The term star here corresponds to the test case                   A recent systematic mapping study in [14]
(e.g., [3, 5, 7] is an input set, also a test case and        highlights 260 relevant studies in CSST, and
named as ‘star’), as depicted in Table 2. This in             narrows down to 42 primary studies to investigate
turn eliminates some of the useful stars during the           the proposed approaches and their test case
operations of BBH. In this study, we propose a                generation accuracy. The authors summarize
new algorithm called BH-AllStar by modifying                  well-known meta-heuristic search techniques as
several strategies in BBH, such as elimination                follows: Genetic Algorithm (GA), Particle Swarm
mechanism based on distance between stars,                    Optimization Algorithm (PSO), Simulated
moving stars in discrete space, and selecting the             Annealing Algorithm (SA), and Ant Colony
best black hole population from the history. Our              Optimization (ACO). Their mapping strategies
new approach generates relatively more test cases             show that GA is the most popularly applied
with higher condition coverage than BBH,                      technique, whereas some combine multiple
although we cover only 0.14% of 1,399,680 test                algorithms such as GA and SA. They conclude
cases in our industrial context. We assess the                that the primary studies provide benefits in terms
applicability of BBH in [9] and its new variant               of code coverage and execution time.
BH-AllStar in an industrial context with respect to               In Table 1, we list a sample of studies obtained
the number of test cases, coverage rate and                   from [14] that we examine in terms of the
execution time.                                               approach, dataset on which the approaches are
                                                              evaluated, and the fitness criteria. Among the
                                                         71
studies, we selected eight studies targeting a                     this name is that any object moving around this
similar objective to our industry problem, and                     special star cannot escape and is absorbed into the
varying in terms of fitness functions and datasets.                black hole if the objects cross the Schwarzschild
The studies [15-19] utilize nature inspired                        radius [21]. This special star is described with the
optimization methods for CSST, whereas [9, 12,                     adjective ‘black’ because light cannot be reflected
20] use a heuristic based on black hole                            and make this star invisible [21, 22]. With the
phenomenon. All studies in Table 1 show that the                   inspiration of the nature of black holes, BHA is
improved versions of nature inspired optimization                  proposed for data clustering [12]. Fig.1 presents
methods outperform original methods with                           the pseudocode of BHA adopted from [9, 12].
respect to code coverage or execution time. All
the prior studies validate the success of their
proposed approach on simple functions like the
source code that calculates the greatest common
divisor or checking the validity of the date, etc.
Unfortunately     these     projects    are     not
representative of real programs, and achieving
high coverage is relatively easier with few test
cases. Except one study [15], all studies focus on                 Figure 1: BHA pseudocode adopted from [9, 12]
branch coverage as one of their evaluation criteria,
although in practice wıth more complex                             According to the terminology used in [9], a
algorithms, multiple conditions need to be taken                   population with random stars is initially
into account with a condition converage criterion.                 generated, and a fitness value is calculated for
Below, we give more description on our baseline                    every star in this population. The best star with the
algorithm BHA [12] and its application on test                     best fitness value is selected as the black hole.
case generation [9].                                               Later, all stars in the population are moved
Table 1                                                            towards the black hole as described in Eq.1. The
Summary of studies in the literature                               new location of the current star x(t+1) is
 Study   Baseline   Dataset/Project      Fitness/Evaluation        calculated by multiplying a random number with
         Method                          Criteria                  the difference between the BH Star and the current
 [15]    ACO        triangleType, gcd,   Branch coverage
                    calday,                                        star (BH-currentStar), and adding the result to the
                    isValidDate, cal                               previous location x(t). The current star is moved
 [16]    SA         unknown              Condition coverage        closer to the BH Star as a result of this operation.
 [19]    PSO        triangleType, gcd,   Branch coverage
                    calday,                                        Movement of the stars towards the black hole
                    isValidDate, cal                               provides a grouping process and force all the other
 [17]    GA         triangle             Branch distance           stars to converge to the best fitness value. If a star
                    classification and
                    nextDate                                       crosses the event horizon (R) of the black hole
 [18]    SA         triangle             Branch Coverage,          defined in Eq.2, it is destroyed, and a new random
                    classification       # of Detected             star is generated and added into the population.
                                         Mutants/Defects
 [12]    BHA        Six-benchmark        Distance between          The R value is calculated as the ratio of the fitness
                    dataset from ML      blackhole and the         value of BH Star and sum of the fitness function
                    databases            star                      outcomes of all stars in the population. This step
 [20]    BHA        Well-known
                    mathematical                                   leads to adding various stars to the population and
                    functions                                      increases the probability of convergence to the
 [9]     BHA        pizza ordering,      k number of test          desired solution.
                    smart mobile,        cases
                    heart disease
                                                                                x(t+1) = x(t) + rand x (BH-currentStar)            (1)

2.1.     Black hole algorithm                                                     R = 𝑓𝑓(𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵)/ ∑𝑛𝑛𝑖𝑖=1 𝑓𝑓(𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆(𝑖𝑖))   (2)


   BHA has been first introduced in the study                          The searching process stops if the stopping
[12] as a new heuristic algorithm inspired by the                  criterion is met. This criterion can be the
black hole phenomenon. Based on Newton’s law,                      maximum number of iterations or a sufficiently
scientists invented a concept about the stars with                 good fitness criterion [9, 12].
high power and strong gravitational field [21].                        Hatamlou conducted an experiment on the six-
They name this star as “black hole”. The reason of                 benchmark dataset [12] and highlights two
                                                              72
advantages of BHA: (1) simple and easy to                                   being stuck in local minima since there is no
implement, (2) applicable to other problem                                  mutation operation. The authors point that the
domains, and (3) outperforming other clustering                             population      initialization   and    moveStars
methods. The performance of BHA is later tested                             operations could be improved.
in [21] on mathematical functions to reach their                                Our work complements the previous study [9]
minimum and maximum values. The authors in                                  in several ways: (1)We propose a condition
[21] compare BHA with GA and PSO, and report                                coverage-based elimination mechanism instead of
that BHA outperforms others and can be applied                              a distance-based one to keep the test cases that
to other problem domains.                                                   likely contribute to total coverage. (2) We
                                                                            prioritize achieving higher coverage rates than
2.2.        The baseline study                                              restricting the number of test cases. (3) We
                                                                            propose two new approaches within BHA to
                                                                            assess all the generated stars and black holes and
   Based on the BHA proposed in [12], Al-                                   to extend the selected population without being
Sammarraie and Jawawi [9] propose three                                     stuck in local minima. (4) We apply the BHA
variations of BHA for test case generation: BBH,                            method to a real-life software system for test case
MBH and BMBH. Terms and function names                                      generation.
used in the study [9] are described in Table 2. For
the inputs that has discrete values like binary
values (0,1), moveStars operation (Eq. 1) does not                          3. Our methodology
work because it causes a shift in the continuous
domain due to the multiplication of the distance                                The software under test in this study is part of
with a random value. To handle this issue, a                                a large-scale radar software developed for the
binary variant of black hole is proposed. This                              defense industry. Due to its confidentiality
approach generates a random value rd between 0                              constraints, we are not able to give the details of
and 1. This random value is compared against the                            the algorithm or its pseudocode. We can briefly
constant value, pr (Eq. 3). If it is greater than rd,                       describe the complexity of our SUT: it takes 12
the new location of the star is going to be equal to                        input parameters with different possible values,
that of black hole. Otherwise, new location of is                           and outputs an integer value. It contains two
equal to its own old value [9].                                             functions with 17 if and nested if statements with
                                                                            up to three levels, two for loops, two switch case
Table 2                                                                     statements and 72 branches in total. We built a
Terminology of BBH used in [9]                                              technological setup, in order to conduct our
    Term               Meaning                                              analysis on BBH and its variants including BH-
Parameter         x: can take values x1, x2, x3                             AllStar. Our analysis starts by selecting the
                  y: can take values y1, y2, y3. …
Star              [x1 y2 z3 t4]
                                                                            baseline algorithm as shown in Fig 2, namely
Population        [star1, star2…starn]                                      BBH, to generate test cases, i.e., a black hole (BH)
updateFitness()   Calculate the coverage of every star in the               surrounded with stars, and our software is
                  population and determine the best star as the             executed against this test case. Then the system
                  black hole.
moveStars()       Change location of stars towards the blackhole.           collects the information of missed conditions
updateRadius()    Update the radius value of black hole.                    using Jacoco testing tool. Iteratively, the
replace()         Remove the current star and generate a new one.           condition coverage information is stored in a file,
R                 Radius of the black hole (eq.2)                           and used in the elimination decision mechanism.
                                                                            Based on the decisions, the algorithm generates
                                   𝐵𝐵𝐵𝐵(𝑡𝑡) 𝑖𝑖𝑖𝑖 𝑟𝑟𝑑𝑑 < 𝑝𝑝𝑝𝑝
                   𝑥𝑥𝑥𝑥(𝑡𝑡 + 1) = �
                                    𝑥𝑥𝑖𝑖(𝑡𝑡) 𝑜𝑜𝑜𝑜ℎ𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒
                                                                 (3)        new test cases (stars) and the cycle continues. The
                                                                            algorithm terminates after K (10-500) number of
   The authors reach consistent and high                                    iterations.
coverage by extending solution space with the                                   We modify several operations in [9] and
help of multiple swarm approach called MBH [9].                             propose our approach: BH-AllStar. We use
The main idea in MBH is to generate more than                               initializePopulation()      and      updateFitness()
one population and in turn, more than one black                             methods as is, but modify moveStars(), and
hole in a single iteration to avoid being stuck in                          created a new star elimination criterion. We also
local minima. At the end of the iterations, final list                      propose two new methods to select the final
of populations become the solution space. The                               population both of which will be described below.
findings showed that their approach suffers from                            Our motivation is also determining the best BH in
                                                                       73
reaching as high coverage rates as possible, and           to destroy the star or keep it alive, with our
shaping the other stars around it without being            condition coverage based mechanism, as shown in
stuck in local minima. During our analysis, we             Fig.3 described as isBestAtLeastOnOneBranch()
noticed that the distance criteria defined in [9]          and calculateBranchRatio() methods. With this
removes some stars although they contribute to             condition coverage-based mechanism for all
the coverage ratio. It also prioritizes to keep the        branches of our SUT, we give chance to the stars
number of test cases as few as possible. Increasing        with lower coverage ratios then the BH Star to
the number of test cases is preferable for our             cover uncovered branches. We have two steps in
industrial setting than having low coverage rates.         our decision mechanism:
Fig. 3 shows the pseudocode of our approach. Our
                                                               isBestAtLeastOnOneBranch We control if the
contributions are presented in the below
subsections.                                               current star is the best among all stars in the
                                                           population in terms of covering at least one more
                                                           branch. This operation would keep an efficient
                                                           star, even if it has a smaller coverage ratio
                                                           compared to other stars.
                                                               calculateBranchRatio We compare current star
                                                           and black hole in terms of covering all branches
                                                           one by one and calculate the ratio of the number
Figure 2: Our experiment setup                             of branches that current star covers better than
                                                           black hole, and the number of branches that
                                                           current star covers worse than black hole. We use
                                                           here a coefficient=3 to multiply the number of
                                                           better covered branches in order to give a star
                                                           more chance (see Eq. 4). We have made several
                                                           tests with different coefficients and chosen the
                                                           value 3 as the optimum value for deciding the
                                                           stars to be accepted or not.
                                                                             #𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜ℎ𝑒𝑒𝑒𝑒∗3
                                                                                                                                (4)
                                                                              #𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑟𝑟𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒ℎ𝑒𝑒𝑒𝑒


                                                               Fig. 4 illustrates a sample of stars: An
                                                           eliminated star (purple), a previously eliminated
Figure 3: Pseudocode of our proposed approach              but later revived star (yellow), and an always alive
                                                           star (orange) with their coverage values in the
                                                           timeline, when our proposed mechanism is
3.1. Condition      coverage-based                         applied. The total coverage significantly increases
elimination mechanism                                      at the end of the iterations as we let a previously
                                                           killed star (yellow) to be included into the final
   We calculate condition coverage for each                population. Eliminating a star (purple) also
branch existing in our testing code. For example,          increase total coverage during 18th iteration.
an “if” statement with one condition, e.g. x<5,
contains two branches, true or false, and the
condition coverage for this branch can be 0, 50%
or 100%. If an “if” statement has two conditions,
e.g. x<5 && y>10, then it has 4 conditions (TT,
TF, FT, FF). With the input set (x=3, y=11; x=3,
y=9; x=6, y=9), the condition coverage for this
branch is calculated as 75%. Only traditional
branch coverage is not enough for our study since
we need to cover all the conditions of each branch
in our testing code. Therefore, we replace the
distance based mechanism, which calculates to
what extent the distance between current star and          Figure 4: The x-axis is the number of iterations
black hole exceeds the R value to decide whether           and the y-axis is condition coverage rate.

                                                      74
3.2.    Moving Discretely                                    coverage rates can be achieved with fewer test
                                                             cases but higher execution time for different
                                                             number of iterations. The maximum code
   In moveStars() operation, to shift the stars
                                                             coverage rate is approximately 76% in iterations
towards the black hole, we propose an index-
                                                             150 and 500 with BH-AllStar. Solution space
based method for discrete-valued parameters,
                                                             consists of 190 test cases for 150 iterations and
similar to ‘Binary Variants’ in [9]. We store our
                                                             executes in 327 secs. However, for 500 iterations,
input parameters in an array and we generate a
                                                             only 19 test cases are used but execution time is
random index value between the input value index
                                                             815 secs.
of the current star and the BH. Then we move
current star to this index and assign the value to               In addition, the baseline algorithm, BBH has a
the current star as the updated input parameter.             better coverage (72.2%) than our condition
                                                             coverage based BBH approach (65.3%) for 300
                                                             iterations. However, our additions help to increase
3.3. Choosing the best population                            the coverage from 65.3% to 75% in BH-AllStar.
over history                                                 BBH* always reaches better coverage rates than
                                                             BBH. When the coverage criterion is set as
    In keepPopulationWithMaxTotalCoverage(),                 condition (BBHCC), we do not always see a better
we keep the population generated in all iterations           convergence than BBH. But choosing the best
of BBH including coverage rates and alive/dead               population and adding all the “good” stars
states of each star. At the end of the iterations, we        eventually converge to much higher coverage
select the population that has the highest coverage          ratios. Instead of defining multiple black hole as
ratio, not the last population according to BH.              in [9], we stick to the single black hole
                                                             phenomenon but we give a certain flexibility to
3.4.    All Star operation                                   the star selection and achieve a similar
                                                             improvement like in [9]. We can confirm the prior
                                                             finding that a black hole inspired meta-heuristic
   In allStar() method, we compare all the stars             could be successful at generating test data for
between t0 and tend against the BH at tn, and add            software systems with too many parameters.
the best stars in terms of condition coverage into
our final population. Although this causes an                Table 3
increase in the number of test cases, it also                Performance of BBH Variants and BH-Allstar
increases the coverage of the final population.               Iteration                                           BBH    BBH           BH-
                                                                                        BBH          BBH*
   The variants of BBH are as follows: (1) Best                                                                    CC    CC*          AllStar
BBH (BBH*): We incorporate “Choosing Best                                 Stars (#)     10           10           10     10           12
                                                              10


Population from History” approach into BBH. (2)                           Cov. (%)      51.4         70.8         68.1   72.2         73.6
BBH with Condition Coverage (BBHCC): We                                   Time          19           19           22     22           22
use “Condition Coverage-Based Elimination                                 Stars (#)     10           10           10     10           190
                                                              150


Mechanism”. (3) Best BBH with Condition                                   Cov. (%)      68.1         73.5         68.7   75           76.8
Coverage (BBHCC*): We add “Choosing Best                                  Time          320          320          372    372          372
Population over History” onto BBHCC. (4) BH-                              Stars (#)     10           10           10     10           14
                                                              300


AllStar: We add “AllStar Operation” onto                                  Cov. (%)      72.2         75.0         65.3   73.6         75
BBHCC*. We assess all according to condition                              Time          460          460          476    476          476
coverage of the population, number of test cases,                         Stars (#)     10           10           10     10           19
                                                              500


and execution time.                                                       Cov. (%)      69.4         75           70.8   73.6         76.4
                                                                          Time          782          782          815    815          815

4. Results and discussion
                                                             Table 4
                                                             Coverage for different number of iterations
   The results over different iterations and with                                                   Alive Stars                BH-AllStar
                                                                                 BH
variants of BBH, reported in Table 3, show that                         #iter    Star        Min        Max        Median
                                                             Coverage


higher code coverage rates are obtained compared                        10       51.4        25         51.4       37.5        73.6
to BBH. For instance, with 500 iterations,                              150      51.4        23.6       51.4       43          76.8
                                                                        300      51.4        23.6       51.4       37.5        75
coverage values are 69.4% in BBH, whereas 75%                           500      51.4        25         51.4       38.8        76.4
in BBH*, 70.8% in BBHCC, 73.6% in BBHCC*
                                                                Considering the number of test cases, we
and 76.8% in BH-AllStar. We observe that same
                                                             observe that BH-AllStar may take higher values
                                                        75
with respect to the positions of the stars, condition                                                                                                   smaller at the end. On the other hand, in our
coverage ratios of each star and the black hole                                                                                                         approach, a peak can be seen just after all the
star. In 150 iterations, we generate 190 test cases,                                                                                                    iterations are executed, at t=151. The population
which is a higher value compared to the other                                                                                                           with the best coverage has been kept and the good
experiment results. Since we pick the best stars in                                                                                                     stars over the history are added to that population.
terms of coverage during All-Star operation, there
may be many “best” stars in the population
compared to the selected black hole. We did not
perform any filtering on these best stars like
pruning, but it might be possible to reduce those
that cover similar conditions as a final step.
    Table 4 presents the coverage rate of the BH
Star, Alive Stars and BH-AllStar in the final
solution space. In both BBH and our proposed                                                                                                            Figure 6 Coverage in the timeline of BBH
approach, the black hole has the same coverage
rate (51.4%). Alive stars in the solution space
                                                                                                                                                           The results show that our Choosing Best
have a minimum of 23.6%, a maximum of %51.4
                                                                                                                                                        Population from History approach always
(BH itself) and a median of 43% coverage ratios.
                                                                                                                                                        provides better coverage rates. Since the nature of
A star with less coverage ratio than the BH Star is
                                                                                                                                                        the movement to the black hole and random
acceptable in our study because it might have a
                                                                                                                                                        generation of stars might cause a decrease in total
higher coverage on any branch compared to the
                                                                                                                                                        coverage in BBH, we handle this problem by
BH star.
                                                                                                                                                        keeping the population with maximum coverage.
    Fig. 5 presents the coverage rates of all the                                                                                                       The main reason that BH-AllStar outperforms
branches according to 190 stars in the solution                                                                                                         BBH is that the latter eliminates stars because it is
space for 150 iterations. As depicted in Fig.5,                                                                                                         too close to the black hole star. However, it does
there are still some branches (e.g., 19th to 27th)                                                                                                      not mean that these stars do not contribute to total
which are not sufficiently covered although we                                                                                                          coverage.
increase the total coverage ratio. This is partly due
to the fact that BH algorithm essentially revolves
around the black star that is picked. So, some of                                                                                                       5. Threats to validity
the areas in the search space might be missed
during the iterations.                                                                                                                                      We highlight several issues that might
                                                                                                                                                        jeopardize the validity of our findings, and discuss
                                                                                                                                                        these in this section. The first issue is related to
           Coverage Density Graph based on StarID and BranchID (Contour Plot)
         190
                                                                                                                                                        the construct defined as the fitness criteria of our
         180
         170
                                                                                                                                       Coverage
                                                                                                                                           < 0,0
                                                                                                                                       0,0 – 0,2
                                                                                                                                                        proposed algorithm. We aim to improve the final
         160
         150
                                                                                                                                       0,2 – 0,4
                                                                                                                                       0,4 – 0,6
                                                                                                                                                        population of the algorithm by assessing the
                                                                                                                                                        condition coverage ratio, which is calculated
         140
                                                                                                                                       0,6 – 0,8
         130                                                                                                                           0,8 – 1,0
         120

                                                                                                                                                        through Jacoco tool for each branch of the tested
                                                                                                                                           > 1,0
         110
StarID


         100
         90
         80                                                                                                                                             algorithm. Jacoco gives a condition coverage ratio
                                                                                                                                                        for each branch, but it does not specifically say
         70
         60
         50
         40
         30
                                                                                                                                                        which conditions are satisfied in that branch.
         20
          10                                                                                                                                            When two stars having 50% condition coverage
               1   2   3   4   5   6   7   8   9 10 1 1 12 1 3 1 4 1 5 1 6 1 7 1 8 1 9 20 21 2 2 23 2 4 25 2 6 27 2 8 29 3 0 3 1 3 2
                                                                                                                                                        for the same branch are evaluated, we consider
                                                                   BranchID
                                                                                                                                                        those the same. But we do not know whether both
Figure 5 Coverage Density Graph based on Star ID                                                                                                        cover different conditions and complement each
and Branch ID                                                                                                                                           other. This might have caused to falsely eliminate
                                                                                                                                                        some stars from the population. We plan to work
                                                                                                                                                        on this issue in our future studies by checking
    Fig. 6 illustrates the change in the total
                                                                                                                                                        other unit testing tools.
coverage ratio of BBH and BH-AllStar for 150
                                                                                                                                                            The second issue is related to the binary
iterations. It is seen that one or more populations
                                                                                                                                                        strategy applied to turn BHA to BBH algorithm.
were found with BBH (around 60th and 100th
                                                                                                                                                        In Section III, we describe how the movement of
iterations) that reach a higher coverage, but there
                                                                                                                                                        the stars happens on a discrete scale.
were later lost. The final coverage ratio was also
                                                                                                                                                   76
Unfortunately, the description in [9] was not                  Conditon/Decision Coverage). We think that our
detailed enough and we had to make some                        proposed approach can be applied to other
assumptions. This might cause a deviation                      problems, such as testing algorithms with
between BBH in [9] and BBH we developed here.                  different level of complexity, nested conditions
It does not affect our comparisons between BBH,                and line of codes. Initialization of the population,
its variants and BH-AllStar on the software under              which is the first step of BBH can also be
test in this paper.                                            optimized to start with a better population rather
    The third issue is related to the internal validity        than a random start as in its current setting.
of the experimental design. We are aware that
choosing a different coefficient value in Eq.4 in              7. References
BH-AllStar would impact our results, in fact, we
tried multiple values and observed the
convergence of the algorithm in terms of                       [1]     J. Park, H. Ryu, H.-J. Choi, and D.-K.
condition coverage to pick the final value as 3. We                    Ryu, A survey on software test maturity in
also think the randomness in the initial                               Korean defense industry. 2008, pp. 149-
population, and the selection of best stars from all                   150.
the iterations could affect the final results. We              [2]     R. Kenett, F. Ruggeri, and F. Faltin,
plan to work on these as future works as both                          "Analytic Methods in Systems and
require empirical analysis of different strategies.                    Software Testing," 07/01 2018, doi:
Finally, our results are limited to a single                           10.1002/9781119357056.
industrial context because we chose to solve the               [3]     V. Garousi, R. Ozkan, and A. Betin-Can,
problem of our industrial partner through an                           "Multi-objective regression test selection
improved CSST method. The prior works often                            in practice: An empirical study in the
assess their approaches on simple programming                          defense software industry," Information
exercises, which suffer from generalization of the                     and Software Technology, 06/01 2018,
results on real life projects. We believe our study                    doi: 10.1016/j.infsof.2018.06.007.
validates the real application of Black Hole                   [4]     S. Nidhra, "Black Box and White Box
inspired approaches adopted to condition                               Testing Techniques - A Literature
coverage criterion.                                                    Review," International Journal of
                                                                       Embedded Systems and Applications, vol.
                                                                       2, pp. 29-50, 06/30 2012, doi:
6. Conclusion                                                          10.5121/ijesa.2012.2204.
                                                               [5]     W. Youn, S. Hong, K. Oh, and O. Ahn,
    In this study, we present a new approach to                        "Software Certification of Safety-Critical
generate test data by implementing a previously                        Avionic Systems: DO-178C and Its
proposed meta-heuristic searching technique                            Impacts," Aerospace and Electronic
BBH on a real-life engineering problem that                            Systems Magazine, IEEE, vol. 30, pp. 4-
we face for our safety critical systems in the                         13,         04/01        2015,        doi:
industry. Our BBH variants and new algorithm                           10.1109/MAES.2014.140109.
BH-AllStar perform better than BBH in terms of                 [6]     M.-H. Chen, M. Lyu, and W. Wong,
condition coverage up to 43%. Although BH-                             "Effect of code coverage on software
AllStar generates more test cases compared to                          reliability measurement," Reliability,
BBH, we achieve extending search space and                             IEEE Transactions on, vol. 50, pp. 165-
obtaining higher condition coverage rates in the                       170,         07/01        2001,       doi:
same execution time. As a future work, we plan                         10.1109/24.963124.
new experiments to compare Multiple Black Hole                 [7]     L. Hu, W. Wong, D. Kuhn, and R.
Approach reported in [9] and our BH-AllStar. We                        Kacker, "How does combinatorial testing
also plan to decrease the number of test cases in                      perform in the real world: an empirical
our current approach, the selection of best stars for                  study," Empirical Software Engineering,
each branch in BH-AllStar can be limited by                            vol. 25, 07/01 2020, doi: 10.1007/s10664-
selection only one among the best stars. We also                       019-09799-2.
intend to conduct further research on every                    [8]     D. R. Kuhn, R. N. Kacker, and Y. Lei,
condition in a decision to be covered of the SUT,                      Introduction to Combinatorial Testing
rather than having higher condition coverage,                          (Chapman & Hall/CRC Innovations in
which is known as MC/DC (Modified                                      Software Engineering and Software
                                                          77
       Development Series). Taylor & Francis,            [20]   M. Farahmandian and A. Hatamlou,
       2013.                                                    "Solving optimization problem using
[9]    H. Al-Sammarraie and D. Jawawi,                          black hole algorithm," Journal of
       "Multiple Black Hole Inspired Meta-                      Advanced     Computer      Science  &
       Heuristic Searching Optimization for                     Technology, vol. 4, 12/25 2015, doi:
       Combinatorial Testing," IEEE Access,                     10.14419/jacst.v4i1.4094.
       vol. PP, pp. 1-1, 02/13 2020, doi:                [21]   E. Curiel, "The many definitions of a
       10.1109/ACCESS.2020.2973696.                             black hole," Nature Astronomy, vol. 3,
[10]   y. Lei, R. Kacker, D. Kuhn, V. Okun, and                 01/08 2019, doi: 10.1038/s41550-018-
       J. Lawrence, IPOG: A general strategy                    0602-1.
       for T-way software testing. 2007, pp. 549-        [22]   E.-G. Talbi, Metaheuristics: From
       556.                                                     Design to Implementation. 2009.
[11]   C. Nie and H. Leung, "A Survey of
       Combinatorial Testing," ACM Comput.
       Surv., vol. 43, p. 11, 01/01 2011, doi:
       10.1145/1883612.1883618.
[12]   A. Hatamlou, "Black hole: A new
       heuristic optimization approach for data
       clustering," Information Sciences, vol.
       222, pp. 175–184, 02/01 2013, doi:
       10.1016/j.ins.2012.08.023.
[13]   K. Zamli, "T-Way Strategies and Its
       Applications for Combinatorial Testing,"
       International Journal of New Computer
       Architectures and their Applications
       (IJNCAA), vol. 2, pp. 459-473, 01/01
       2011.
[14]   D. Y. Ulutas, "The Generation of
       Optimized Test Data: Preliminary
       Analysis of a Systematic Mapping
       Study," 2020.
[15]   C. Mao, X. Yu, J. Chen, and J. Chen,
       Generating Test Data for Structural
       Testing Based on Ant Colony
       Optimization. 2012, pp. 98-101.
[16]   N. Tracey, J. Clark, K. Mander, and J.
       McDermid, "An automated framework
       for structural test-data generation," in
       Proceedings 13th IEEE International
       Conference on Automated Software
       Engineering (Cat. No.98EX239), 13-16
       Oct. 1998 1998, pp. 285-288, doi:
       10.1109/ASE.1998.732680.
[17]   A. El-Serafy, G. El Sayed, C. Salama, and
       A. Wahba, Enhanced Genetic Algorithm
       for MC/DC test data generation. 2015,
       pp. 1-8.
[18]   K. Wang, Y. Wang, and L. Zhang,
       Software testing method based on
       improved simulated annealing algorithm.
       2014, pp. 418-421.
[19]   C. Mao, X. Yu, and J. Chen, Swarm
       Intelligence-Based Test Data Generation
       for Structural Testing. 2012.

                                                    78