A Condition Coverage-Based Black Hole Inspired Meta-Heuristic for Test Data Generation Derya Yeliz Ulutaş 1,2, Ayşe Tosun 2 1 ASELSAN Inc., Ankara, Turkey 2 Istanbul University, Faculty of Computer and Informatics Engineering, Istanbul, Turkey Abstract Combinatorial Testing (CT) strategy is one of the well-known methods to achieve high code coverage rates during testing for safety critical systems. While generating test data in CT, we often encounter the problem of test case explosion, especially for the systems with multiple parameters and values. To overcome this challenge, search-based CT (CSST) strategies are introduced. In this study, we propose a new algorithm that inspires from a binary variant of Black Hole Algorithm (BBH) in CSST and adopt BBH according to the CT challenges in an industrial context. The proposed BBH version, BH-AllStar, aims the following: (1) obtaining higher condition coverage, (2) avoiding being stuck in local minima and (3) handling discrete input values. We finalize the solution space of BH-AllStar by reassessing the previously removed stars and incorporating the useful ones into it. We evaluate our approach on a real-life software project in the safety-critical domain with respect to condition coverage, number of test cases and execution time. Compared to BBH, BH-AllStar generates more test cases which achieve up to 43% increase in condition coverage. Keywords 1 Combinatorial Search-based Software Testing (CSST), Test Data Generation, Meta-heuristic 1. Introduction software systems in which there exist many parameters and conditions that take a combination of multiple input values [7, 8]. CT method such as The complexity and the criticality of the safety t-way testing executes the software using test critical systems in the defense industry are cases with the interaction of the possible input growing [1, 2], hence, these systems require more values [9]. The radar software in our industrial thorough and efficient testing technologies, such context, for instance, takes 12 input parameters as test automation and regression test each takes two to six values. If we would like to selection/prioritization [3] and continuous test this software with full input coverage, we integration. The software testing field in the would end up executing 63.34.24.5=1,399,680 defense industry needs to develop more efficient test cases. This is an impossible goal to reach, and and qualified techniques to meet security-critical instead in practice, we follow t-way testing with requirements. [1]. For instance, various test the IPOG strategy [10]: select the t value as high techniques such as Boundry-Value Analysis and as possible with the cost of increasing the number Equivalence Partitioning are used to meet the DO- of test cases. The current limitation of the 178 Standard to ensure safety-critical employed t-way testing in our industrial setting is requirements, which has been imposed by the U.S. the lack of relationship between the selected test Federal Aviation Administration (FAA). [4, 5]. cases and their code coverage, because the To assess the effectiveness of these methods, strategy of selecting the test inputs is not different coverage criteria are employed [6]. dependent on the covered branches/conditions of During functional testing, Combinatorial Testing the software system. (CT) methods are widely utilized especially for QuASoQ’21: 9th International Workshop on Quantitative Approaches to Software Quality, December 06, 2021, Taipei, Taiwan EMAIL: dyulutas@aselsan.com.tr (D.Y. Ulutas); tosunay@itu.edu.tr (A. Tosun) © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org) 70 To address this challenge, combinatorial The remainder of this paper is organized as search-based software testing (CSST) approaches follows: Section 2 presents the related work. We are suitable [11]. One of the studies by Al- report the details of our approach BH-AllStar and Sammarraie and Jawawi [9] inspired our work. research methodology in Section 3. Section 4 The authors propose three new approaches shows the results and discussion; Section 5 namely Binary Black Hole (BBH), Multiple Black explains the threats to validity. Finally, Section 6 Hole (MBH) and Binary Multiple Black Hole concludes this study with future directions. (BMBH) based on the Black Hole Algorithm (BHA), which is first introduced in [12] as an 2. Related work alternative to Particle Swarm Optimization (PSO). Their aim was generating test inputs with respect to the number of test cases and code Combinatorial Search-based Software Testing coverage. The authors also aim to avoid being (CSST) is a testing method based on stuck in local minima and they propose a novel Combinatorial Testing (CT). CT aims at energy mechanism for MBH that helps to choose generating input vectors combining possible input useful black hole populations only. The results values of all input parameters of the Software show that MBH outperforms PSO and BHA in Under Test (SUT) [10]. The SUT can have n input terms of the number of test cases, consistency and parameters: ci (i = 1,2,3…n). Every parameter can high coverage rates. take a set of values Vi where i can be from 1 to n. We share the same goal with [9] of generating For example, V1 is the value set of c1 and contains test data using a search-based meta-heuristic, m different values. (v1,v2,…,vm ). One test case is which avoids local minima and achieves higher composed of a set of values from all Vi of all ci coverage than BBH. The approach in [9] could be (e.g. vx ∈ V1, vy ∈ V2, vz ∈ V3) [10]. There are well fitted in our industrial problem because of different methods to build all combinations of two main motivations: (1) their problem domain input parameters for CT such as t-way testing or and data set are similar to our project, i.e., radar randomization-based methods in the literature software, (2) they innovatively apply BHA, which [13]. CSST is one of the search-based approaches is originally a clustering algorithm, to generate to reduce the large solution space [9]. To find the test data for functional testing. However, optimum solution of a problem, especially for the considering the scale of our radar software under problems, which contain very large sets of test, BBH approach does not fully address our solutions and have computational constraints, challenge. More specifically, BBH concentrates meta-heuristic search techniques are commonly on generating minimum number of test cases implemented [14]. These meta-heuristic search rather than increasing the coverage criteria. Due techniques are based on initially generated to this, it lacks assessing the generated stars random solution space and narrowing down the except the black hole in terms of their coverage. search space based on the evaluation criteria. The term star here corresponds to the test case A recent systematic mapping study in [14] (e.g., [3, 5, 7] is an input set, also a test case and highlights 260 relevant studies in CSST, and named as ‘star’), as depicted in Table 2. This in narrows down to 42 primary studies to investigate turn eliminates some of the useful stars during the the proposed approaches and their test case operations of BBH. In this study, we propose a generation accuracy. The authors summarize new algorithm called BH-AllStar by modifying well-known meta-heuristic search techniques as several strategies in BBH, such as elimination follows: Genetic Algorithm (GA), Particle Swarm mechanism based on distance between stars, Optimization Algorithm (PSO), Simulated moving stars in discrete space, and selecting the Annealing Algorithm (SA), and Ant Colony best black hole population from the history. Our Optimization (ACO). Their mapping strategies new approach generates relatively more test cases show that GA is the most popularly applied with higher condition coverage than BBH, technique, whereas some combine multiple although we cover only 0.14% of 1,399,680 test algorithms such as GA and SA. They conclude cases in our industrial context. We assess the that the primary studies provide benefits in terms applicability of BBH in [9] and its new variant of code coverage and execution time. BH-AllStar in an industrial context with respect to In Table 1, we list a sample of studies obtained the number of test cases, coverage rate and from [14] that we examine in terms of the execution time. approach, dataset on which the approaches are evaluated, and the fitness criteria. Among the 71 studies, we selected eight studies targeting a this name is that any object moving around this similar objective to our industry problem, and special star cannot escape and is absorbed into the varying in terms of fitness functions and datasets. black hole if the objects cross the Schwarzschild The studies [15-19] utilize nature inspired radius [21]. This special star is described with the optimization methods for CSST, whereas [9, 12, adjective ‘black’ because light cannot be reflected 20] use a heuristic based on black hole and make this star invisible [21, 22]. With the phenomenon. All studies in Table 1 show that the inspiration of the nature of black holes, BHA is improved versions of nature inspired optimization proposed for data clustering [12]. Fig.1 presents methods outperform original methods with the pseudocode of BHA adopted from [9, 12]. respect to code coverage or execution time. All the prior studies validate the success of their proposed approach on simple functions like the source code that calculates the greatest common divisor or checking the validity of the date, etc. Unfortunately these projects are not representative of real programs, and achieving high coverage is relatively easier with few test cases. Except one study [15], all studies focus on Figure 1: BHA pseudocode adopted from [9, 12] branch coverage as one of their evaluation criteria, although in practice wıth more complex According to the terminology used in [9], a algorithms, multiple conditions need to be taken population with random stars is initially into account with a condition converage criterion. generated, and a fitness value is calculated for Below, we give more description on our baseline every star in this population. The best star with the algorithm BHA [12] and its application on test best fitness value is selected as the black hole. case generation [9]. Later, all stars in the population are moved Table 1 towards the black hole as described in Eq.1. The Summary of studies in the literature new location of the current star x(t+1) is Study Baseline Dataset/Project Fitness/Evaluation calculated by multiplying a random number with Method Criteria the difference between the BH Star and the current [15] ACO triangleType, gcd, Branch coverage calday, star (BH-currentStar), and adding the result to the isValidDate, cal previous location x(t). The current star is moved [16] SA unknown Condition coverage closer to the BH Star as a result of this operation. [19] PSO triangleType, gcd, Branch coverage calday, Movement of the stars towards the black hole isValidDate, cal provides a grouping process and force all the other [17] GA triangle Branch distance stars to converge to the best fitness value. If a star classification and nextDate crosses the event horizon (R) of the black hole [18] SA triangle Branch Coverage, defined in Eq.2, it is destroyed, and a new random classification # of Detected star is generated and added into the population. Mutants/Defects [12] BHA Six-benchmark Distance between The R value is calculated as the ratio of the fitness dataset from ML blackhole and the value of BH Star and sum of the fitness function databases star outcomes of all stars in the population. This step [20] BHA Well-known mathematical leads to adding various stars to the population and functions increases the probability of convergence to the [9] BHA pizza ordering, k number of test desired solution. smart mobile, cases heart disease x(t+1) = x(t) + rand x (BH-currentStar) (1) 2.1. Black hole algorithm R = 𝑓𝑓(𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵)/ ∑𝑛𝑛𝑖𝑖=1 𝑓𝑓(𝑆𝑆𝑆𝑆𝑆𝑆𝑆𝑆(𝑖𝑖)) (2) BHA has been first introduced in the study The searching process stops if the stopping [12] as a new heuristic algorithm inspired by the criterion is met. This criterion can be the black hole phenomenon. Based on Newton’s law, maximum number of iterations or a sufficiently scientists invented a concept about the stars with good fitness criterion [9, 12]. high power and strong gravitational field [21]. Hatamlou conducted an experiment on the six- They name this star as “black hole”. The reason of benchmark dataset [12] and highlights two 72 advantages of BHA: (1) simple and easy to being stuck in local minima since there is no implement, (2) applicable to other problem mutation operation. The authors point that the domains, and (3) outperforming other clustering population initialization and moveStars methods. The performance of BHA is later tested operations could be improved. in [21] on mathematical functions to reach their Our work complements the previous study [9] minimum and maximum values. The authors in in several ways: (1)We propose a condition [21] compare BHA with GA and PSO, and report coverage-based elimination mechanism instead of that BHA outperforms others and can be applied a distance-based one to keep the test cases that to other problem domains. likely contribute to total coverage. (2) We prioritize achieving higher coverage rates than 2.2. The baseline study restricting the number of test cases. (3) We propose two new approaches within BHA to assess all the generated stars and black holes and Based on the BHA proposed in [12], Al- to extend the selected population without being Sammarraie and Jawawi [9] propose three stuck in local minima. (4) We apply the BHA variations of BHA for test case generation: BBH, method to a real-life software system for test case MBH and BMBH. Terms and function names generation. used in the study [9] are described in Table 2. For the inputs that has discrete values like binary values (0,1), moveStars operation (Eq. 1) does not 3. Our methodology work because it causes a shift in the continuous domain due to the multiplication of the distance The software under test in this study is part of with a random value. To handle this issue, a a large-scale radar software developed for the binary variant of black hole is proposed. This defense industry. Due to its confidentiality approach generates a random value rd between 0 constraints, we are not able to give the details of and 1. This random value is compared against the the algorithm or its pseudocode. We can briefly constant value, pr (Eq. 3). If it is greater than rd, describe the complexity of our SUT: it takes 12 the new location of the star is going to be equal to input parameters with different possible values, that of black hole. Otherwise, new location of is and outputs an integer value. It contains two equal to its own old value [9]. functions with 17 if and nested if statements with up to three levels, two for loops, two switch case Table 2 statements and 72 branches in total. We built a Terminology of BBH used in [9] technological setup, in order to conduct our Term Meaning analysis on BBH and its variants including BH- Parameter x: can take values x1, x2, x3 AllStar. Our analysis starts by selecting the y: can take values y1, y2, y3. … Star [x1 y2 z3 t4] baseline algorithm as shown in Fig 2, namely Population [star1, star2…starn] BBH, to generate test cases, i.e., a black hole (BH) updateFitness() Calculate the coverage of every star in the surrounded with stars, and our software is population and determine the best star as the executed against this test case. Then the system black hole. moveStars() Change location of stars towards the blackhole. collects the information of missed conditions updateRadius() Update the radius value of black hole. using Jacoco testing tool. Iteratively, the replace() Remove the current star and generate a new one. condition coverage information is stored in a file, R Radius of the black hole (eq.2) and used in the elimination decision mechanism. Based on the decisions, the algorithm generates 𝐵𝐵𝐵𝐵(𝑡𝑡) 𝑖𝑖𝑖𝑖 𝑟𝑟𝑑𝑑 < 𝑝𝑝𝑝𝑝 𝑥𝑥𝑥𝑥(𝑡𝑡 + 1) = � 𝑥𝑥𝑖𝑖(𝑡𝑡) 𝑜𝑜𝑜𝑜ℎ𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒 (3) new test cases (stars) and the cycle continues. The algorithm terminates after K (10-500) number of The authors reach consistent and high iterations. coverage by extending solution space with the We modify several operations in [9] and help of multiple swarm approach called MBH [9]. propose our approach: BH-AllStar. We use The main idea in MBH is to generate more than initializePopulation() and updateFitness() one population and in turn, more than one black methods as is, but modify moveStars(), and hole in a single iteration to avoid being stuck in created a new star elimination criterion. We also local minima. At the end of the iterations, final list propose two new methods to select the final of populations become the solution space. The population both of which will be described below. findings showed that their approach suffers from Our motivation is also determining the best BH in 73 reaching as high coverage rates as possible, and to destroy the star or keep it alive, with our shaping the other stars around it without being condition coverage based mechanism, as shown in stuck in local minima. During our analysis, we Fig.3 described as isBestAtLeastOnOneBranch() noticed that the distance criteria defined in [9] and calculateBranchRatio() methods. With this removes some stars although they contribute to condition coverage-based mechanism for all the coverage ratio. It also prioritizes to keep the branches of our SUT, we give chance to the stars number of test cases as few as possible. Increasing with lower coverage ratios then the BH Star to the number of test cases is preferable for our cover uncovered branches. We have two steps in industrial setting than having low coverage rates. our decision mechanism: Fig. 3 shows the pseudocode of our approach. Our isBestAtLeastOnOneBranch We control if the contributions are presented in the below subsections. current star is the best among all stars in the population in terms of covering at least one more branch. This operation would keep an efficient star, even if it has a smaller coverage ratio compared to other stars. calculateBranchRatio We compare current star and black hole in terms of covering all branches one by one and calculate the ratio of the number Figure 2: Our experiment setup of branches that current star covers better than black hole, and the number of branches that current star covers worse than black hole. We use here a coefficient=3 to multiply the number of better covered branches in order to give a star more chance (see Eq. 4). We have made several tests with different coefficients and chosen the value 3 as the optimum value for deciding the stars to be accepted or not. #𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜ℎ𝑒𝑒𝑒𝑒∗3 (4) #𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑟𝑟𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒𝑒ℎ𝑒𝑒𝑒𝑒 Fig. 4 illustrates a sample of stars: An eliminated star (purple), a previously eliminated Figure 3: Pseudocode of our proposed approach but later revived star (yellow), and an always alive star (orange) with their coverage values in the timeline, when our proposed mechanism is 3.1. Condition coverage-based applied. The total coverage significantly increases elimination mechanism at the end of the iterations as we let a previously killed star (yellow) to be included into the final We calculate condition coverage for each population. Eliminating a star (purple) also branch existing in our testing code. For example, increase total coverage during 18th iteration. an “if” statement with one condition, e.g. x<5, contains two branches, true or false, and the condition coverage for this branch can be 0, 50% or 100%. If an “if” statement has two conditions, e.g. x<5 && y>10, then it has 4 conditions (TT, TF, FT, FF). With the input set (x=3, y=11; x=3, y=9; x=6, y=9), the condition coverage for this branch is calculated as 75%. Only traditional branch coverage is not enough for our study since we need to cover all the conditions of each branch in our testing code. Therefore, we replace the distance based mechanism, which calculates to what extent the distance between current star and Figure 4: The x-axis is the number of iterations black hole exceeds the R value to decide whether and the y-axis is condition coverage rate. 74 3.2. Moving Discretely coverage rates can be achieved with fewer test cases but higher execution time for different number of iterations. The maximum code In moveStars() operation, to shift the stars coverage rate is approximately 76% in iterations towards the black hole, we propose an index- 150 and 500 with BH-AllStar. Solution space based method for discrete-valued parameters, consists of 190 test cases for 150 iterations and similar to ‘Binary Variants’ in [9]. We store our executes in 327 secs. However, for 500 iterations, input parameters in an array and we generate a only 19 test cases are used but execution time is random index value between the input value index 815 secs. of the current star and the BH. Then we move current star to this index and assign the value to In addition, the baseline algorithm, BBH has a the current star as the updated input parameter. better coverage (72.2%) than our condition coverage based BBH approach (65.3%) for 300 iterations. However, our additions help to increase 3.3. Choosing the best population the coverage from 65.3% to 75% in BH-AllStar. over history BBH* always reaches better coverage rates than BBH. When the coverage criterion is set as In keepPopulationWithMaxTotalCoverage(), condition (BBHCC), we do not always see a better we keep the population generated in all iterations convergence than BBH. But choosing the best of BBH including coverage rates and alive/dead population and adding all the “good” stars states of each star. At the end of the iterations, we eventually converge to much higher coverage select the population that has the highest coverage ratios. Instead of defining multiple black hole as ratio, not the last population according to BH. in [9], we stick to the single black hole phenomenon but we give a certain flexibility to 3.4. All Star operation the star selection and achieve a similar improvement like in [9]. We can confirm the prior finding that a black hole inspired meta-heuristic In allStar() method, we compare all the stars could be successful at generating test data for between t0 and tend against the BH at tn, and add software systems with too many parameters. the best stars in terms of condition coverage into our final population. Although this causes an Table 3 increase in the number of test cases, it also Performance of BBH Variants and BH-Allstar increases the coverage of the final population. Iteration BBH BBH BH- BBH BBH* The variants of BBH are as follows: (1) Best CC CC* AllStar BBH (BBH*): We incorporate “Choosing Best Stars (#) 10 10 10 10 12 10 Population from History” approach into BBH. (2) Cov. (%) 51.4 70.8 68.1 72.2 73.6 BBH with Condition Coverage (BBHCC): We Time 19 19 22 22 22 use “Condition Coverage-Based Elimination Stars (#) 10 10 10 10 190 150 Mechanism”. (3) Best BBH with Condition Cov. (%) 68.1 73.5 68.7 75 76.8 Coverage (BBHCC*): We add “Choosing Best Time 320 320 372 372 372 Population over History” onto BBHCC. (4) BH- Stars (#) 10 10 10 10 14 300 AllStar: We add “AllStar Operation” onto Cov. (%) 72.2 75.0 65.3 73.6 75 BBHCC*. We assess all according to condition Time 460 460 476 476 476 coverage of the population, number of test cases, Stars (#) 10 10 10 10 19 500 and execution time. Cov. (%) 69.4 75 70.8 73.6 76.4 Time 782 782 815 815 815 4. Results and discussion Table 4 Coverage for different number of iterations The results over different iterations and with Alive Stars BH-AllStar BH variants of BBH, reported in Table 3, show that #iter Star Min Max Median Coverage higher code coverage rates are obtained compared 10 51.4 25 51.4 37.5 73.6 to BBH. For instance, with 500 iterations, 150 51.4 23.6 51.4 43 76.8 300 51.4 23.6 51.4 37.5 75 coverage values are 69.4% in BBH, whereas 75% 500 51.4 25 51.4 38.8 76.4 in BBH*, 70.8% in BBHCC, 73.6% in BBHCC* Considering the number of test cases, we and 76.8% in BH-AllStar. We observe that same observe that BH-AllStar may take higher values 75 with respect to the positions of the stars, condition smaller at the end. On the other hand, in our coverage ratios of each star and the black hole approach, a peak can be seen just after all the star. In 150 iterations, we generate 190 test cases, iterations are executed, at t=151. The population which is a higher value compared to the other with the best coverage has been kept and the good experiment results. Since we pick the best stars in stars over the history are added to that population. terms of coverage during All-Star operation, there may be many “best” stars in the population compared to the selected black hole. We did not perform any filtering on these best stars like pruning, but it might be possible to reduce those that cover similar conditions as a final step. Table 4 presents the coverage rate of the BH Star, Alive Stars and BH-AllStar in the final solution space. In both BBH and our proposed Figure 6 Coverage in the timeline of BBH approach, the black hole has the same coverage rate (51.4%). Alive stars in the solution space The results show that our Choosing Best have a minimum of 23.6%, a maximum of %51.4 Population from History approach always (BH itself) and a median of 43% coverage ratios. provides better coverage rates. Since the nature of A star with less coverage ratio than the BH Star is the movement to the black hole and random acceptable in our study because it might have a generation of stars might cause a decrease in total higher coverage on any branch compared to the coverage in BBH, we handle this problem by BH star. keeping the population with maximum coverage. Fig. 5 presents the coverage rates of all the The main reason that BH-AllStar outperforms branches according to 190 stars in the solution BBH is that the latter eliminates stars because it is space for 150 iterations. As depicted in Fig.5, too close to the black hole star. However, it does there are still some branches (e.g., 19th to 27th) not mean that these stars do not contribute to total which are not sufficiently covered although we coverage. increase the total coverage ratio. This is partly due to the fact that BH algorithm essentially revolves around the black star that is picked. So, some of 5. Threats to validity the areas in the search space might be missed during the iterations. We highlight several issues that might jeopardize the validity of our findings, and discuss these in this section. The first issue is related to Coverage Density Graph based on StarID and BranchID (Contour Plot) 190 the construct defined as the fitness criteria of our 180 170 Coverage < 0,0 0,0 – 0,2 proposed algorithm. We aim to improve the final 160 150 0,2 – 0,4 0,4 – 0,6 population of the algorithm by assessing the condition coverage ratio, which is calculated 140 0,6 – 0,8 130 0,8 – 1,0 120 through Jacoco tool for each branch of the tested > 1,0 110 StarID 100 90 80 algorithm. Jacoco gives a condition coverage ratio for each branch, but it does not specifically say 70 60 50 40 30 which conditions are satisfied in that branch. 20 10 When two stars having 50% condition coverage 1 2 3 4 5 6 7 8 9 10 1 1 12 1 3 1 4 1 5 1 6 1 7 1 8 1 9 20 21 2 2 23 2 4 25 2 6 27 2 8 29 3 0 3 1 3 2 for the same branch are evaluated, we consider BranchID those the same. But we do not know whether both Figure 5 Coverage Density Graph based on Star ID cover different conditions and complement each and Branch ID other. This might have caused to falsely eliminate some stars from the population. We plan to work on this issue in our future studies by checking Fig. 6 illustrates the change in the total other unit testing tools. coverage ratio of BBH and BH-AllStar for 150 The second issue is related to the binary iterations. It is seen that one or more populations strategy applied to turn BHA to BBH algorithm. were found with BBH (around 60th and 100th In Section III, we describe how the movement of iterations) that reach a higher coverage, but there the stars happens on a discrete scale. were later lost. The final coverage ratio was also 76 Unfortunately, the description in [9] was not Conditon/Decision Coverage). We think that our detailed enough and we had to make some proposed approach can be applied to other assumptions. This might cause a deviation problems, such as testing algorithms with between BBH in [9] and BBH we developed here. different level of complexity, nested conditions It does not affect our comparisons between BBH, and line of codes. Initialization of the population, its variants and BH-AllStar on the software under which is the first step of BBH can also be test in this paper. optimized to start with a better population rather The third issue is related to the internal validity than a random start as in its current setting. of the experimental design. We are aware that choosing a different coefficient value in Eq.4 in 7. References BH-AllStar would impact our results, in fact, we tried multiple values and observed the convergence of the algorithm in terms of [1] J. Park, H. Ryu, H.-J. Choi, and D.-K. condition coverage to pick the final value as 3. We Ryu, A survey on software test maturity in also think the randomness in the initial Korean defense industry. 2008, pp. 149- population, and the selection of best stars from all 150. the iterations could affect the final results. We [2] R. Kenett, F. Ruggeri, and F. Faltin, plan to work on these as future works as both "Analytic Methods in Systems and require empirical analysis of different strategies. Software Testing," 07/01 2018, doi: Finally, our results are limited to a single 10.1002/9781119357056. industrial context because we chose to solve the [3] V. Garousi, R. Ozkan, and A. Betin-Can, problem of our industrial partner through an "Multi-objective regression test selection improved CSST method. The prior works often in practice: An empirical study in the assess their approaches on simple programming defense software industry," Information exercises, which suffer from generalization of the and Software Technology, 06/01 2018, results on real life projects. We believe our study doi: 10.1016/j.infsof.2018.06.007. validates the real application of Black Hole [4] S. Nidhra, "Black Box and White Box inspired approaches adopted to condition Testing Techniques - A Literature coverage criterion. Review," International Journal of Embedded Systems and Applications, vol. 2, pp. 29-50, 06/30 2012, doi: 6. Conclusion 10.5121/ijesa.2012.2204. [5] W. Youn, S. Hong, K. Oh, and O. Ahn, In this study, we present a new approach to "Software Certification of Safety-Critical generate test data by implementing a previously Avionic Systems: DO-178C and Its proposed meta-heuristic searching technique Impacts," Aerospace and Electronic BBH on a real-life engineering problem that Systems Magazine, IEEE, vol. 30, pp. 4- we face for our safety critical systems in the 13, 04/01 2015, doi: industry. Our BBH variants and new algorithm 10.1109/MAES.2014.140109. BH-AllStar perform better than BBH in terms of [6] M.-H. Chen, M. Lyu, and W. Wong, condition coverage up to 43%. Although BH- "Effect of code coverage on software AllStar generates more test cases compared to reliability measurement," Reliability, BBH, we achieve extending search space and IEEE Transactions on, vol. 50, pp. 165- obtaining higher condition coverage rates in the 170, 07/01 2001, doi: same execution time. As a future work, we plan 10.1109/24.963124. new experiments to compare Multiple Black Hole [7] L. Hu, W. Wong, D. Kuhn, and R. Approach reported in [9] and our BH-AllStar. We Kacker, "How does combinatorial testing also plan to decrease the number of test cases in perform in the real world: an empirical our current approach, the selection of best stars for study," Empirical Software Engineering, each branch in BH-AllStar can be limited by vol. 25, 07/01 2020, doi: 10.1007/s10664- selection only one among the best stars. We also 019-09799-2. intend to conduct further research on every [8] D. R. Kuhn, R. N. Kacker, and Y. Lei, condition in a decision to be covered of the SUT, Introduction to Combinatorial Testing rather than having higher condition coverage, (Chapman & Hall/CRC Innovations in which is known as MC/DC (Modified Software Engineering and Software 77 Development Series). Taylor & Francis, [20] M. Farahmandian and A. Hatamlou, 2013. "Solving optimization problem using [9] H. Al-Sammarraie and D. Jawawi, black hole algorithm," Journal of "Multiple Black Hole Inspired Meta- Advanced Computer Science & Heuristic Searching Optimization for Technology, vol. 4, 12/25 2015, doi: Combinatorial Testing," IEEE Access, 10.14419/jacst.v4i1.4094. vol. PP, pp. 1-1, 02/13 2020, doi: [21] E. Curiel, "The many definitions of a 10.1109/ACCESS.2020.2973696. black hole," Nature Astronomy, vol. 3, [10] y. Lei, R. Kacker, D. Kuhn, V. Okun, and 01/08 2019, doi: 10.1038/s41550-018- J. Lawrence, IPOG: A general strategy 0602-1. for T-way software testing. 2007, pp. 549- [22] E.-G. Talbi, Metaheuristics: From 556. Design to Implementation. 2009. [11] C. Nie and H. Leung, "A Survey of Combinatorial Testing," ACM Comput. Surv., vol. 43, p. 11, 01/01 2011, doi: 10.1145/1883612.1883618. [12] A. Hatamlou, "Black hole: A new heuristic optimization approach for data clustering," Information Sciences, vol. 222, pp. 175–184, 02/01 2013, doi: 10.1016/j.ins.2012.08.023. [13] K. Zamli, "T-Way Strategies and Its Applications for Combinatorial Testing," International Journal of New Computer Architectures and their Applications (IJNCAA), vol. 2, pp. 459-473, 01/01 2011. [14] D. Y. Ulutas, "The Generation of Optimized Test Data: Preliminary Analysis of a Systematic Mapping Study," 2020. [15] C. Mao, X. Yu, J. Chen, and J. Chen, Generating Test Data for Structural Testing Based on Ant Colony Optimization. 2012, pp. 98-101. [16] N. Tracey, J. Clark, K. Mander, and J. McDermid, "An automated framework for structural test-data generation," in Proceedings 13th IEEE International Conference on Automated Software Engineering (Cat. No.98EX239), 13-16 Oct. 1998 1998, pp. 285-288, doi: 10.1109/ASE.1998.732680. [17] A. El-Serafy, G. El Sayed, C. Salama, and A. Wahba, Enhanced Genetic Algorithm for MC/DC test data generation. 2015, pp. 1-8. [18] K. Wang, Y. Wang, and L. Zhang, Software testing method based on improved simulated annealing algorithm. 2014, pp. 418-421. [19] C. Mao, X. Yu, and J. Chen, Swarm Intelligence-Based Test Data Generation for Structural Testing. 2012. 78