<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using genetic algorithm for generating optimal data sets to automatic testing the program code</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>K E Serdyukov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>T V Avdeenko</string-name>
          <email>tavdeenko@mail.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Novosibirsk State Technical University</institution>
          ,
          <addr-line>Marksa avenue, 20, Novosibirsk, Russia, 630073</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>173</fpage>
      <lpage>182</lpage>
      <abstract>
        <p>In this paper we propose an approach to the automatic generation of test data sets based on a genetic algorithm. We describe an original procedure for computing the weights of code operations; the fitness function is formulated as the sum of these weights. The ultimate objective in choosing the fitness function is to maximize the code coverage achieved by the generated test data set. The idea of the approach is that the most complex branches of the program code are the first to be accounted for in the fitness function. Once a branch has been taken into account, its weight is reset to zero in order to ensure maximum code coverage. By adjusting the algorithm, it is possible to make the automatic test data generation find the parts of the program code that are most distant from each other, so that a higher level of code coverage is attained. We give a detailed example illustrating the operation and advantages of the approach and suggest further improvements of the method.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>One of the most important stages in developing software products is testing. The ultimate goals of the
testing phase are to confirm that the developed program complies with the specified requirements, to ensure
correct logic during data processing and, as a result, to obtain correct final results.</p>
      <p>The growing scale of software development has led to huge software systems being built by
diverse development teams, each with its own programming style and competencies.
Although tools have appeared in parallel that support a high level of
collaborative development, change control and code quality checking, the
final product does not always meet the requirements specified at the planning stage.</p>
      <p>
        For this reason, the need for thorough, high-quality testing increases significantly. It is
necessary to find not only errors in the code but also logical inconsistencies. Testing both the
program as a whole and its parts as thoroughly as possible requires not only a team of testers but
also preparatory work: forming a set of input data that exercises specific parts of the
program [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Based on the above, we can conclude that automation of testing, or at least automatic test
generation, can significantly reduce not only the time but also the cost of development. It has other,
less obvious advantages as well: a high probability of finding small errors, transparency of test
development, testing in parallel with the development of the program, etc. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>Testing is not a standardized process; it depends on many factors, most of which vary from one
program to another. In addition, methods for automatic verification and validation of
program code improve quite slowly. Most test designs and templates are
still developed manually, without the use of any intelligent software. As a result, the testing process becomes
complex, time-consuming and costly if the ultimate goal is indeed a
high-quality software product. In such cases the testing phase can take up to 50% of the whole
development time. In this regard, it seems appropriate to use methods developed in the field of
artificial intelligence.</p>
      <p>One of the first problems to be solved is to identify one of the most
complex branches of the code. Based on the solution of this initial problem, we can then build an
algorithm for finding a test data set that covers as many of the most complex branches
of the code as possible. In this paper we seek to derive a solution to this local problem of finding
the branch of the program with the most operations.</p>
      <p>
        Various methods have been proposed for solving the problem of automatic test generation. The
paper [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] compares various methods for generating test data, including genetic algorithms, random
search, and other heuristic approaches. In [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] constraint logic programming and symbolic execution are proposed
for solving this problem. In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] constraint handling
rules are used to assist in the manual verification of problem points in the code.
      </p>
      <p>
        Some researchers automate the testing process using heuristic methods together with visualization
tools such as data flow diagrams. Automation methods based on these diagrams are
studied in [
        <xref ref-type="bibr" rid="ref6 ref7 ref8 ref9">6-9</xref>
        ]. The article [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] additionally proposes using genetic algorithms to
determine new input test data sets based on previously used ones.
      </p>
      <p>
        The articles [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13">10-13</xref>
        ] consider integrated approaches for generating test data. In [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] an approach is
used that combines strategies of random search and dynamic symbolic computations. The article [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]
proposes a theoretical description of a search testing strategy using a genetic algorithm. Approaches to
searching for local and global extrema on real programs are considered, and a hybrid approach to
generating test data, a memetic algorithm, is proposed.
      </p>
      <p>
        The approach in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] uses a hybrid intelligent search algorithm to generate test data. The proposed
algorithm combines the branch and bound method and hill climbing with intelligent
search.
      </p>
      <p>
        Approaches to generating test data based on machine learning have also been investigated [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
The proposed approach uses a neural network structure with user-configured clustering of input data
for sequential learning. There are also approaches based on the cuckoo search metaheuristic
[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>
        UML diagrams are also used to facilitate test data generation [
        <xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>
        ]. These articles
propose using genetic algorithms to generate triggers for UML diagrams, which makes it possible to find
the critical path in the program. In [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] an improved genetic algorithm-based method is
proposed for selecting test data for multiple parallel paths in UML diagrams.
      </p>
      <p>
        In addition to UML diagrams, a program can be represented using the classification-tree method
developed by Grochtmann and Grimm [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. The paper [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] discusses the problem of tree construction and
proposes an integrated classification-tree algorithm, and in [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] the ADDICT prototype
(AutomateD test Data generation using the Integrated Classification-Tree methodology) is
investigated as an implementation of the integrated approach.
      </p>
      <p>There is a large body of research on test data generation. Heuristic approaches are used most often
to solve this problem, since they make it possible to select data without a complete
enumeration of the possible options. The approach proposed in this article is based on a genetic algorithm
with a modified calculation of the fitness function, which makes it possible to generate data
from the program code alone, without reference to any testing or development system. Data can thus be
generated directly, by specifying only the restrictions on the input variables.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Genetic algorithm</title>
      <p>
        A genetic algorithm is a heuristic method, more precisely a type of evolutionary algorithm,
that borrows the ideas and terminology of natural evolution. Its goal is not to find the single
best solution, but one that is close enough to it. Therefore, a genetic algorithm is not
recommended when fast, well-developed optimization methods already exist for the problem. At the
same time, genetic algorithms perform well on non-standardized tasks, on problems
with incomplete data, and where other optimization methods cannot be used because of the complexity
of their implementation or the duration of their execution [
        <xref ref-type="bibr" rid="ref23 ref24">23, 24</xref>
        ].
      </p>
      <p>A genetic algorithm is considered complete when a given number of iterations have been performed (it is
desirable to limit the number of iterations, since the genetic algorithm works by trial and
error, which can be a long process), or when a satisfactory value of the fitness function has been obtained.
In general, a genetic algorithm solves a maximization or minimization problem, and the adequacy of
each candidate solution (chromosome) is assessed using the fitness function.</p>
      <p>
        A genetic algorithm works according to the following principle:
• Initialization. The fitness function is established and the initial population is formed. Classically, the initial
population is created by filling the genes of the chromosomes at random. However, to increase the
convergence rate, the initial population can be filled in a specific way, or the values can be
analyzed in advance to exclude definitely unsuitable genes.
• Evaluation of the population. Each chromosome is evaluated by the fitness function. Based
on the specified requirements, each chromosome receives a value that reflects how well it
solves the problem [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ].
• Selection. After each chromosome has obtained its value, the best chromosomes
are selected. Selection can be done by different methods, for example by taking the first n
chromosomes sorted by the value of the fitness function, or only the chromosomes with the maximum
value of the fitness function, etc.
• Crossover. This is the first significant difference from conventional methods and one of the most
important stages of the algorithm. The chromosomes selected as suitable for
solving the problem are crossed with each other: randomly chosen pairs of chromosomes generate
new chromosomes. Crossover consists in selecting a specific position in the two
chromosomes and mutually exchanging the parts. After the required number of
chromosomes for the new population has been produced, the algorithm proceeds to the next step [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ].
• Mutation. This step is also characteristic of genetic algorithms only. A randomly chosen gene
changes its value to a random one. The main purpose of mutation is the same as in biology:
to introduce genetic diversity into the population, that is, to obtain
solutions that could not be produced from the existing genes. This makes it possible, firstly, to avoid
falling into local extrema, since a mutation may send the algorithm down a completely
different path, and secondly, to “dilute” the population in order to avoid a situation where
the entire population consists of identical chromosomes that will not move towards the
global solution.
      </p>
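      <p>The steps listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the function name <monospace>run_ga</monospace>, keeping the better half of the population as parents, single-point crossover, and all parameter defaults are hypothetical choices made for the example. Chromosomes are tuples of at least two integers.</p>
      <preformat>
```python
import random

def run_ga(fitness, gene_bounds, pop_size=20, generations=50,
           mutation_rate=0.1, seed=0):
    """Maximize `fitness` over tuples of integers (illustrative sketch)."""
    rng = random.Random(seed)
    # Initialization: fill every gene with a random value inside its bounds.
    population = [tuple(rng.randint(lo, hi) for lo, hi in gene_bounds)
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluation and selection: sort by fitness, keep the better half
        # (an assumed selection scheme).
        population.sort(key=fitness, reverse=True)
        parents = population[:max(2, pop_size // 2)]
        children = []
        while pop_size > len(parents) + len(children):
            # Crossover: exchange the tails of two parents at a random position.
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, len(gene_bounds) - 1)
            child = list(a[:cut] + b[cut:])
            # Mutation: occasionally replace a gene with a random value.
            for g, (lo, hi) in enumerate(gene_bounds):
                if mutation_rate > rng.random():
                    child[g] = rng.randint(lo, hi)
            children.append(tuple(child))
        population = parents + children
    return max(population, key=fitness)

# Usage: three genes, each limited to the range 0..10, fitness = sum of genes.
best = run_ga(sum, [(0, 10)] * 3)
```
      </preformat>
      <p>Because the parents are carried over unchanged, the best fitness value in this sketch never decreases from one population to the next.</p>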
      <p>After all stages of the genetic algorithm have been completed, it is checked whether the
population has reached the desired accuracy of the solutions, or whether the specified number of
populations (generations) has been reached. If either condition is met, the algorithm stops.
Otherwise, the cycle is repeated with the new population until one of the conditions is met.</p>
    </sec>
    <sec id="sec-3">
      <title>3. The test data generation with genetic algorithm</title>
      <p>
        Using genetic algorithms in the testing process makes it possible to identify
the most complex parts of the program code, where the risk of errors is the
greatest. Evaluation is performed by the fitness function, whose parameters are
the weights of the individual operations [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ].
      </p>
      <p>To date, many types of diagrams have been developed that represent the structure of a
program code not as a sequence of actions but as a diagram with a specific structure. The most widely used
are control flow diagrams (graphs), which represent the whole variety of ways in which a
program can run. Such diagrams are mainly used when creating program code:
for determining program complexity, verifying the program logic, and writing the
code itself. From the point of view of test data generation, however, a diagram of this type, built from
already written program code, makes it possible to assess the quality of the developed code and, within the
scope of our task, the importance, or complexity, of particular program paths.</p>
      <p>Based on the possibility of presenting the program code in the structural form, an approach was
developed through which it would be possible to evaluate the program code and to determine such a
set of test data that would allow one to “walk” through the largest number of operations and the
greatest number of paths. The first step in the proposed method is to consider the structural elements
of the code. For ease of presentation, we can use flow diagrams to visualize the structure of the code
and to understand in what way the program is executed.</p>
      <p>
        Each operation of the code is assigned its own graph node, and the links between nodes show the direction in
which the code is executed. For example, a condition is represented by one graph node with two
outgoing branches. Each transition between graph nodes is assigned a
weight that depends on which part of the code the operation is in, whether any complex structural
elements precede it, etc. [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ].
      </p>
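      <p>One possible way to hold such a graph in memory is a mapping from each node to its outgoing transitions and their weights. The sketch below is only an illustration: the node names, the weights 100 and 80, and the helper <monospace>path_weight</monospace> are hypothetical and are not taken from the paper.</p>
      <preformat>
```python
# Hypothetical weighted control flow graph: node -> list of (next_node, weight).
cfg = {
    "start": [("cond", 100)],
    "cond": [("then", 80), ("after", 100)],  # a condition has two outgoing branches
    "then": [("after", 80)],
    "after": [],
}

def path_weight(graph, path):
    """Sum the weights of the transitions along a path given as node names."""
    total = 0
    for src, dst in zip(path, path[1:]):
        total += dict(graph[src])[dst]
    return total

through_branch = path_weight(cfg, ["start", "cond", "then", "after"])
skipping_branch = path_weight(cfg, ["start", "cond", "after"])
```
      </preformat>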
      <p>The problem of generating the input test data consists of three subproblems:
1) searching for input data that traverse one of the most complex code branches;
2) eliminating or reducing the weights of the operations lying on this branch when
subsequent branches are evaluated;
3) searching for a set of test data that traverses multiple branches at once.</p>
      <p>The limitation on the size of the input data set is established after the development stage and allows
one to concentrate on certain branches in which the largest number of operations is performed.</p>
      <p>The whole algorithm is executed cyclically. First the procedure of searching the input data for one
branch is started, then the operations in this branch are excluded from further computations and the
data search for one branch is started again.</p>
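      <p>The cycle described above can be sketched as follows. To keep the example short, the genetic search for one branch is replaced by a plain maximum over a fixed list of candidate data sets; this substitution, the function name <monospace>select_test_sets</monospace> and the callback <monospace>covered_ops</monospace> (which returns the operations executed for a data set) are assumptions made for illustration.</p>
      <preformat>
```python
def select_test_sets(candidates, covered_ops, weights, max_sets):
    """Pick a data set, zero the weights of the operations it covers, repeat."""
    remaining = dict(weights)            # operation -> current weight
    chosen = []
    for _ in range(max_sets):
        def fitness(x):
            return sum(remaining[op] for op in covered_ops(x))
        best = max(candidates, key=fitness)   # stand-in for the genetic search
        if fitness(best) == 0:                # nothing new would be covered
            break
        chosen.append(best)
        for op in covered_ops(best):          # exclude the branch just covered
            remaining[op] = 0
    return chosen

# Hypothetical operations, weights and coverage, for illustration only.
weights = {"a": 100, "b": 80, "c": 20}
coverage = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"a"}}
chosen = select_test_sets([1, 2, 3], lambda x: coverage[x], weights, max_sets=3)
```
      </preformat>
      <p>In this toy case data set 1 is chosen first; once its operations are zeroed, data set 2 is the only one that still adds weight, and the loop then stops.</p>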
      <p>The search for a single path in the program code works as follows:
• The first operation is assigned a weight, for example, 100.
• Each subsequent operation is also assigned a weight; if there are no conditions or cycles, the
weight is equal to the weight of the previous operation.
• The weight of operations inside a condition is assigned according to the following rule. If the condition
has only one branch (only if ...), the weight of each operation is multiplied by 0.8. If
the condition splits into several branches (if ... else ...), the weight is divided into
equal parts: 50 / 50 for two branches, 33 / 33 / 33 for three, etc.
• The weights of operations in a cycle remain the same, but can also be multiplied by a
coefficient if the significance of cycles during testing needs to be increased.
• All nested restrictions are taken into account; for example, for two nested conditions the
weight of the operations is 0.8 * 0.8 = 0.64, i.e. 64 percent of the initial weight.
In Figure 1 we present the example on which the algorithm was tested.</p>
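      <p>The weighting rules above can be illustrated with a short sketch. It assumes that the reduction for a single-branch condition is a multiplication by 0.8, which is consistent with the 64 percent quoted for two nested conditions. The encoding of the enclosing constructs and the name <monospace>operation_weight</monospace> are hypothetical.</p>
      <preformat>
```python
from fractions import Fraction

def operation_weight(nesting, base=100, if_factor=Fraction(4, 5)):
    """Weight of an operation given the constructs that enclose it.

    `nesting` lists the enclosing constructs from outermost to innermost:
    ("if",) for a condition with a single branch, ("branch", k) for one of
    k alternative branches, ("loop", f) for a cycle with multiplier f.
    """
    weight = Fraction(base)
    for construct in nesting:
        kind = construct[0]
        if kind == "if":
            weight *= if_factor          # single-branch condition: 80 percent
        elif kind == "branch":
            weight /= construct[1]       # k alternatives share the weight equally
        elif kind == "loop":
            weight *= construct[1]       # a multiplier of 1 leaves the weight unchanged
        else:
            raise ValueError(f"unknown construct: {kind}")
    return weight

top_level = operation_weight([])                       # 100
two_nested_ifs = operation_weight([("if",), ("if",)])  # 64
one_of_two = operation_weight([("branch", 2)])         # 50
```
      </preformat>
      <p>Exact fractions are used instead of floating-point numbers so that repeated multiplication by 0.8 does not accumulate rounding error.</p>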
      <p>As a result of the above procedure we obtain weights that can be used to develop test variants
with the genetic algorithm, that is, to estimate how much of the calculated weight falls on one or another
branch for given values of the input parameters.</p>
      <p>For convenience, we introduce the following notation:</p>
      <p>X – data sets; m – population size, i.e. the number of different variants of input data values; r(i) –
the weight of a single operation i; F (X) – the value of the fitness function for each data set depending
on the calculated weights.</p>
      <p>The problem is to select the maximum of the objective function, i.e.</p>
      <disp-formula id="eq-1">
        <label>(1)</label>
        <tex-math>F(X) = \sum_{i} r(i) \rightarrow \max</tex-math>
      </disp-formula>
      <p>where the sum is taken over the operations i executed for the data set X.</p>
      <preformat>public static void Main(int m, int i, int n) {
    // use_m is not declared in the original listing; declaring it here is an assumption.
    int sum = 0, avg = 0, use_m = 0;
    if (m &lt; i)
    {
        use_m = m;
    }
    int[] a = new int[m];
    Console.WriteLine("Enter the Array Elements ");
    for (int j = 0; j &lt; m; j++)
    {
        a[j] = i;
        if (a[j] &gt; n)
        {
            a[j] = n;
        }
    }
    for (int k = 0; k &lt; m; k++)
    {
        sum += a[k];
    }
    avg = sum / m;
    Console.WriteLine("Average is {0}", avg);
    Console.ReadLine();
}</preformat>
      <p>Figure 1. Example program code and its control flow graph annotated with operation weights (values appearing in the graph: 100, 80, 64, 20 and 16).</p>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>To examine the algorithm in more detail, we use simple test options and carry out each of the
steps manually with a detailed description of the process. At the initialization stage we generate the
following data sets: (10,5,12); (3,4,10); (25,30,11); (5,3,17).</p>
      <p>Table 1 presents these sets, the calculated target value of the fitness function and the rank
corresponding to the best set.</p>
      <p>From the table we can see that the best options for selection are the 2nd and 3rd. To
obtain two additional new variants, their values are mixed, with a certain probability of mutation.</p>
      <p>At the crossover stage the data sets are split at the first and second positions. For example,
crossing (R1, R2, R3) and (G1, G2, G3) yields the variants (R1, R2, G3), (R1, G2,
G3), (G1, R2, R3) and (G1, G2, R3). The parental values remain in the set in order to keep the crossing
clean, i.e. 2 new sets are added compared to the zero generation. Thus, in the subsequent
generations, six data sets are used. It is worth mentioning that, depending on the settings of the
genetic algorithm, the parental chromosomes can be excluded from consideration.</p>
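      <p>The crossing scheme from the example can be written down directly. The sketch below reproduces the four variants listed above for two three-value data sets; the function name <monospace>crossover</monospace> is hypothetical, and the parents in the usage line are the 2nd and 3rd initial data sets.</p>
      <preformat>
```python
def crossover(r, g):
    """Cross two three-value data sets at the first and second positions."""
    return [
        r[:2] + g[2:],   # (R1, R2, G3)
        r[:1] + g[1:],   # (R1, G2, G3)
        g[:1] + r[1:],   # (G1, R2, R3)
        g[:2] + r[2:],   # (G1, G2, R3)
    ]

children = crossover((3, 4, 10), (25, 30, 11))
```
      </preformat>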
      <p>Mutation occurs with a probability of 0.1 and changes a value by an amount from 1 to the
specified limit, in either direction. Under the conditions used here, the maximum possible change during a
mutation is 5. The data sets obtained as a result of crossing are shown in Table 2.</p>
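      <p>A mutation step with these settings might look as follows. The sketch assumes that the probability applies to each value independently and that the size of the change is drawn uniformly from 1 to the limit; neither detail is specified in the text. The name <monospace>mutate</monospace> and its parameters are hypothetical, and the result is not clamped to any range of admissible input values.</p>
      <preformat>
```python
import random

def mutate(data_set, rate=0.1, max_step=5, rng=random):
    """With probability `rate`, shift each value by 1..max_step up or down."""
    mutated = []
    for value in data_set:
        if rate > rng.random():
            value += rng.choice((-1, 1)) * rng.randint(1, max_step)
        mutated.append(value)
    return tuple(mutated)

# Usage with a seeded generator so that runs are repeatable.
variant = mutate((3, 4, 10), rng=random.Random(42))
```
      </preformat>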
      <p>As a result, two more variants will be added to the additional two parent sets - (3,4,13) and
(25,30,10). Table 3 shows all the new variations of the test data set.</p>
      <p>If two variants have the same rank, priority is given to the one from the newer
generation. In the last generation we obtained three data sets that test the majority of
the program paths: (25,30,11), (3,30,11) and (25,30,10). The first set was obtained in the first
generation, so it is excluded, leaving only two options: (3,30,11) and (25,30,10).</p>
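      <p>The tie-breaking rule can be expressed as a two-part sort key. In the sketch below each entry is a hypothetical triple of data set, fitness value and generation number; the fitness values are invented for illustration and are not the values from the tables.</p>
      <preformat>
```python
def rank(entries):
    """Order (data_set, fitness, generation) entries best first.

    Ties in fitness are broken in favour of the newer generation.
    """
    return sorted(entries, key=lambda e: (e[1], e[2]), reverse=True)

ranked = rank([
    ((25, 30, 11), 300, 0),   # illustrative fitness values only
    ((3, 30, 11), 300, 1),
    ((10, 5, 12), 200, 0),
])
```
      </preformat>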
      <p>Because of the small initial sample and the small size of the code, the data sets quickly converged to
overlapping values: 30 in the second position and 10 or 11 in the third. Continuing the
iterations therefore makes no sense: already in the next generation the data would consist mainly of
repeating sets.</p>
      <p>For the current program code you can use test data sets obtained in the latest generation. Priority
depends on rank received.</p>
      <p>Thus, using genetic algorithms, one can find initial test values that fully check
all the paths of the program.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Improving the algorithm</title>
      <p>The algorithm makes it possible to evaluate not only which program paths are exercised for given data, but also
how the data change and whether duplicate values are preserved for the best chromosomes between
populations.</p>
      <p>This makes it possible not only to determine the initial test suite but also, on the basis of data
analysis, to draw conclusions about whether the logic present in the program corresponds to the
planned one.</p>
      <p>Four additional tests were performed to present the results. In the algorithm, the first population is
formed randomly. The following settings were used for testing: each population contains 100
chromosomes, and the total number of populations is also 100. This makes it possible to form a
sufficient number of different variants.</p>
      <p>Table 4 presents 4 runs of the method with the first random population, two middle populations and
the final one, from which the first chromosome is taken and counted as the final generated data set.
For convenience, only the 5 "best" chromosomes in each population will be reflected.</p>
    </sec>
    <sec id="sec-6">
      <title>Population 0 20 50</title>
    </sec>
    <sec id="sec-7">
      <title>Final (100)</title>
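      <table-wrap id="tab-4">
        <label>Table 4</label>
        <caption>
          <p>Best chromosomes in the initial, intermediate and final populations (Test 1).</p>
        </caption>
        <table>
          <thead>
            <tr><th>Rank</th><th>Population 0</th><th>Population 20</th><th>Population 50</th><th>Final (100)</th></tr>
          </thead>
          <tbody>
            <tr><td>1</td><td>78, 23, 35</td><td>95, 64, 54</td><td>95, 64, 54</td><td>95, 64, 54</td></tr>
            <tr><td>2</td><td>62, 36, 95</td><td>95, 64, 29</td><td>95, 64, 29</td><td>95, 64, 29</td></tr>
            <tr><td>3</td><td>52, 35, 27</td><td>95, 64, 54</td><td>95, 64, 54</td><td></td></tr>
            <tr><td>4</td><td>17, 77, 73</td><td></td><td></td><td></td></tr>
            <tr><td>5</td><td>75, 9, 96</td><td></td><td></td><td></td></tr>
          </tbody>
        </table>
      </table-wrap>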
      <p>In each run at least two different final test data sets were formed for which the
operations in the considered program code have the greatest weight. In addition, there are certain
patterns in the results: the first value is always the maximum (random values are limited to a
maximum of 100 to speed up convergence), and the second value is less than the first but greater than the
third.</p>
      <p>For additional analysis, we check how the running time of the algorithm changes depending on the
genetic algorithm settings. The graphs below show how the duration of the program varies
with the size of the population, i.e. the number of chromosomes, and with the total number of
populations. When the population size was varied, the number of populations was
fixed at 100; conversely, when the number of populations was varied, the
number of chromosomes was fixed at 100.</p>
      <p>Figure 2 shows the dependence on the number of chromosomes in the population. It
can be concluded that as the number of chromosomes in a population increases, the duration of the
algorithm increases significantly and nonlinearly, in what appears to be exponential growth.</p>
      <p>Figure 3 shows the dependence on the number of populations.</p>
      <p>With a change in the number of populations, the duration of the algorithm also
increases, but linearly. At the same time, there are noticeable fluctuations of the increase in
both directions, which remain at approximately the same level.</p>
      <p>Although the total number of chromosomes is the same in both cases, the changes in the
running time differ considerably. An increase in population size increases the
running time significantly: when the number of chromosomes in the population grew 50 times, from 1,000 to 50,000,
the duration increased 750 times. When the number of populations changed 50 times, the time
increased only 14 times.</p>
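      <p>As a rough check on these figures, one can ask what exponent k a dependence of the form time ~ size^k would need to reproduce them. The power-law form is an assumption made only for this estimate; two end points alone cannot tell polynomial growth from exponential growth.</p>
      <preformat>
```python
import math

def scaling_exponent(size_ratio, time_ratio):
    """Exponent k in time ~ size**k implied by a pair of ratios."""
    return math.log(time_ratio) / math.log(size_ratio)

k_population_size = scaling_exponent(50, 750)   # roughly 1.69
k_populations = scaling_exponent(50, 14)        # roughly 0.67
```
      </preformat>
      <p>Under this assumption the running time grows faster than linearly in the population size and slower than linearly in the number of populations.</p>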
      <p>This is because the most computationally expensive operations
are calculating the fitness function and searching for the best chromosomes for crossing. Since the
search for the best chromosomes depends on the number of chromosomes in the population, the
current search algorithm, i.e. sorting the chromosomes in the population from best to worst, places a significant
load on the computer and makes the running time grow steeply with the size of the
population.</p>
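      <p>If only the best few chromosomes are needed for crossing, a full sort is not strictly required. The sketch below shows one possible alternative using <monospace>heapq.nlargest</monospace> from the Python standard library, which returns the k largest items, best first. Whether this would noticeably change the timings reported here has not been measured; the function name <monospace>best_chromosomes</monospace> is hypothetical.</p>
      <preformat>
```python
import heapq

def best_chromosomes(population, fitness, k):
    """Return the k fittest chromosomes, best first, without a full sort."""
    return heapq.nlargest(k, population, key=fitness)

population = [(10, 5, 12), (3, 4, 10), (25, 30, 11), (5, 3, 17)]
top_two = best_chromosomes(population, sum, 2)   # fitness = sum, for illustration
```
      </preformat>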
      <p>A larger number of chromosomes in one population provides a greater variety of options, i.e. a higher chance of
finding more suitable ones. An increase in the number of populations leads to a more accurate result,
but only with a large number of chromosomes. If the number of chromosomes is too small, the algorithm
will quickly converge to a single repeated value.</p>
      <p>Based on the foregoing, the number of chromosomes in one population has the greatest influence
on the result. At the same time, an increase in the number of chromosomes significantly increases the
execution time. But in order to ensure the better end result, the total number of populations should also
be increased with an increase in the number of chromosomes.</p>
    </sec>
    <sec id="sec-8">
      <title>6. Conclusion</title>
      <p>Evolutionary methods are designed to find good solutions to problems that are
impossible or too costly to solve using standard optimization methods. They do not always work quickly
or produce high-quality results, but they perform better on tasks that require non-standard approaches.</p>
      <p>The method based on the genetic algorithm automates the selection of input data
while significantly speeding up the search for it. The algorithm is fully automatic (with the
exception of some restrictive settings), so it does not require additional work from testers or developers. The
resulting data set can be used directly in the testing process and, if necessary, regenerated at no
additional cost.</p>
      <p>In the future, it is reasonable to conduct investigations of the influence of various metrics on the
final result and the amount of code coverage, in order to provide such data sets that would allow
testing the code as efficiently as possible and with the maximum number of operations.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgments</title>
      <p>The work was supported by a grant from the Ministry of Education and Science of the Russian
Federation in the framework of the project part of the state task, the project № 2.2327.2017/4.6.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Zanetti</surname>
            <given-names>M C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tessone</surname>
            <given-names>C J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scholtes</surname>
            <given-names>I</given-names>
          </string-name>
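          and
          <string-name>
            <surname>Schweitzer</surname>
            <given-names>F</given-names>
          </string-name>
          <year>2014</year>
          <article-title>Automated Software Remodularization Based on Move Refactoring: A Complex Systems Approach</article-title>
          <source>13th International Conference on Modularity</source>
          <fpage>73</fpage>
          -
          <lpage>83</lpage>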
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Crispin</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gregory</surname>
            <given-names>J</given-names>
          </string-name>
          <year>2010</year>
          <source>Agile Testing: A Practical Guide for Testers and Agile Teams</source>
          (
          <publisher-name>Pearson Education</publisher-name>
          ) p
          <fpage>576</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Maragathavalli</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Anusha</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Geethamalini</surname>
            <given-names>P</given-names>
          </string-name>
          and
          <string-name>
            <surname>Priyadharsini</surname>
            <given-names>S</given-names>
          </string-name>
          <year>2011</year>
          <article-title>Automatic Test-Data Generation for Modified Condition/Decision Coverage Using Genetic Algorithm</article-title>
          <source>International Journal of Engineering Science and Technology</source>
          <volume>3</volume>
          (
          <issue>2</issue>
          )
          <fpage>1311</fpage>
          -
          <lpage>1318</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Meudec</surname>
            <given-names>C</given-names>
          </string-name>
          <year>2001</year>
          <article-title>ATGen: Automatic Test Data Generation using Constraint Logic Programming and Symbolic Execution</article-title>
          <source>Software Testing, Verification and Reliability</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Gerlich</surname>
            <given-names>R</given-names>
          </string-name>
          <year>2014</year>
          <article-title>Automatic Test Data Generation and Model Checking with CHR</article-title>
          <source>11th Workshop on Constraint Handling Rules</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Moheb</surname>
            <given-names>R</given-names>
          </string-name>
          <year>2005</year>
          <article-title>Automatic Test Data Generation for Data Flow Testing Using a Genetic Algorithm</article-title>
          <source>Journal of Universal Computer Science</source>
          <volume>11</volume>
          (
          <issue>6</issue>
          )
          <fpage>898</fpage>
          -
          <lpage>915</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Weyuker</surname>
            <given-names>E J</given-names>
          </string-name>
          <year>1984</year>
          <article-title>The complexity of data flow criteria for test data selection</article-title>
          <source>Inf. Process. Lett.</source>
          <volume>19</volume>
          (
          <issue>2</issue>
          )
          <fpage>103</fpage>
          -
          <lpage>109</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Khamis</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bahgat</surname>
            <given-names>R</given-names>
          </string-name>
          and
          <string-name>
            <surname>Abdelaziz</surname>
            <given-names>R</given-names>
          </string-name>
          <year>2011</year>
          <article-title>Automatic test data generation using data flow information</article-title>
          <source>Dogus University Journal</source>
          <volume>2</volume>
          <fpage>140</fpage>
          -
          <lpage>153</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Singla</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            <given-names>D</given-names>
          </string-name>
          ,
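          <string-name>
            <surname>Rai</surname>
            <given-names>H M</given-names>
          </string-name>
          and
          <string-name>
            <surname>Singla</surname>
            <given-names>P</given-names>
          </string-name>
          <year>2011</year>
          <article-title>A hybrid PSO approach to automate test data generation for data flow coverage with dominance concepts</article-title>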
          <source>Journal of Advanced Science and Technology</source>
          <volume>37</volume>
          <fpage>15</fpage>
          -
          <lpage>26</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Luger</surname>
            <given-names>F G</given-names>
          </string-name>
          <year>2009</year>
          <source>Artificial Intelligence: Structures and Strategies for Complex Problem Solving</source>
          (University of New Mexico) p
          <fpage>679</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Berndt</surname>
            <given-names>D J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fisher</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnson</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pinglikar</surname>
            <given-names>J</given-names>
          </string-name>
          and
          <string-name>
            <surname>Watkins</surname>
            <given-names>A</given-names>
          </string-name>
          <year>2003</year>
          <article-title>Breeding Software Test Cases with Genetic Algorithms</article-title>
          <source>Proceedings of the Thirty-Sixth Hawaii International Conference on System Sciences</source>
          36
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Liu</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fang</surname>
            <given-names>C</given-names>
          </string-name>
          and
          <string-name>
            <surname>Shi</surname>
            <given-names>Q</given-names>
          </string-name>
          <year>2014</year>
          <article-title>Hybrid Test Data Generation</article-title>
          / State Key Laboratory for Novel Software Technology
          <source>Companion Proceedings of the 36th International Conference on Software Engineering</source>
          <fpage>630</fpage>
          -
          <lpage>631</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Harman</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McMinn</surname>
            <given-names>P</given-names>
          </string-name>
          <year>2010</year>
          <article-title>A Theoretical and Empirical Study of Search-Based Testing: Local, Global, and Hybrid Search</article-title>
          <source>IEEE Transactions on Software Engineering</source>
          <volume>36</volume>
          (
          <issue>2</issue>
          )
          <fpage>226</fpage>
          -
          <lpage>247</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Xing</surname>
            <given-names>Y</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gong</surname>
            <given-names>Y Z</given-names>
          </string-name>
          ,
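          <string-name>
            <surname>Wang</surname>
            <given-names>Y W</given-names>
          </string-name>
          and
          <string-name>
            <surname>Zhang</surname>
            <given-names>X Z</given-names>
          </string-name>
          <year>2015</year>
          <article-title>A Hybrid Intelligent Search Algorithm for Automatic Test Data Generation</article-title>
          <source>Mathematical Problems in Engineering</source>
          2015 15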
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Paduraru</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Melemciuc</surname>
            <given-names>M C</given-names>
          </string-name>
          <year>2018</year>
          <article-title>An Automatic Test Data Generation Tool using Machine Learning</article-title>
          <source>13th International Conference on Software Technologies, ICSOFT</source>
          <fpage>472</fpage>
          -
          <lpage>481</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Panda</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sarangi</surname>
            <given-names>P</given-names>
          </string-name>
          and
          <string-name>
            <surname>Dash</surname>
            <given-names>S</given-names>
          </string-name>
          <year>2015</year>
          <article-title>Automatic Test Data Generation using Metaheuristic Cuckoo Search Algorithm</article-title>
          <source>International Journal of Knowledge Discovery in Bioinformatics</source>
          <volume>5</volume>
          (
          <issue>2</issue>
          )
          <fpage>16</fpage>
          -
          <lpage>29</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Doungsa-ard</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dahal</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hossain</surname>
            <given-names>A G</given-names>
          </string-name>
          and
          <string-name>
            <surname>Suwannasart</surname>
            <given-names>T</given-names>
          </string-name>
          <year>2007</year>
          <article-title>An automatic test data generation from UML state diagram using genetic algorithm</article-title>
          IEEE Computer Society Press
          <fpage>47</fpage>
          -
          <lpage>52</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Sabharwal</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sibal</surname>
            <given-names>R</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sharma</surname>
            <given-names>C</given-names>
          </string-name>
          <year>2011</year>
          <article-title>Applying Genetic Algorithm for Prioritization of Test Case Scenarios Derived from UML Diagrams</article-title>
          <source>IJCSI International Journal of Computer Science</source>
          <volume>8</volume>
          (
          <issue>3</issue>
          /2)
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Doungsa-ard</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dahal</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hossain</surname>
            <given-names>A</given-names>
          </string-name>
          and
          <string-name>
            <surname>Suwannasart</surname>
            <given-names>T</given-names>
          </string-name>
          <year>2008</year>
          <article-title>GA-based Automatic Test Data Generation for UML State Diagrams with Parallel Paths</article-title>
          <source>Advanced design and manufacture to gain a competitive edge: New manufacturing techniques and their role in improving enterprise performance</source>
          <fpage>147</fpage>
          -
          <lpage>156</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
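          [20]
          <string-name>
            <surname>Grochtmann</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grimm</surname>
            <given-names>K</given-names>
          </string-name>
          <year>1993</year>
          <article-title>Classification trees for partition testing</article-title>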
          <source>Software Testing Verification and Reliability</source>
          <volume>3</volume>
          (
          <issue>2</issue>
          )
          <fpage>63</fpage>
          -
          <lpage>82</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Chen</surname>
            <given-names>T Y</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poon</surname>
            <given-names>P L</given-names>
          </string-name>
          and
          <string-name>
            <surname>Tse</surname>
            <given-names>T H</given-names>
          </string-name>
          <year>2000</year>
          <article-title>An integrated classification-tree methodology for test case generation</article-title>
          <source>International Journal of Software Engineering and Knowledge Engineering</source>
          <volume>10</volume>
          (
          <issue>6</issue>
          )
          <fpage>647</fpage>
          -
          <lpage>679</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Cain</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            <given-names>T Y</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grant</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poon</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tang</surname>
            <given-names>S</given-names>
          </string-name>
          and
          <string-name>
            <surname>Tse</surname>
            <given-names>T H</given-names>
          </string-name>
          <year>2004</year>
          <article-title>An Automatic Test Data Generation System Based on the Integrated Classification-Tree Methodology</article-title>
          <source>Software Engineering Research and Applications, Lecture Notes in Computer Science</source>
          <volume>3026</volume>
          15
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Serdyukov</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Avdeenko</surname>
            <given-names>T</given-names>
          </string-name>
          <year>2017</year>
          <article-title>Investigation of the genetic algorithm possibilities for retrieving relevant cases from big data in the decision support systems</article-title>
          <source>CEUR Workshop Proceedings</source>
          <volume>1903</volume>
          <fpage>36</fpage>
          -
          <lpage>41</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Yang</surname>
            <given-names>H L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            <given-names>C S</given-names>
          </string-name>
          <year>2008</year>
          <article-title>Two stages of case-based reasoning - Integrating genetic algorithm with data mining mechanism</article-title>
          <source>Expert Systems with Applications</source>
          <volume>35</volume>
          <fpage>262</fpage>
          -
          <lpage>272</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Mühlenbein</surname>
            <given-names>H</given-names>
          </string-name>
          <year>1992</year>
          <article-title>How genetic algorithms really work: Mutation and hill climbing</article-title>
          <source>Parallel Problem Solving from Nature 2</source>
          (North-Holland)
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Spears</surname>
            <given-names>W M</given-names>
          </string-name>
          <year>1993</year>
          <article-title>Crossover or mutation?</article-title>
          <source>Foundations of Genetic Algorithms 2</source>
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Praveen</surname>
            <given-names>R S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tai-hoon</surname>
            <given-names>K</given-names>
          </string-name>
          <year>2009</year>
          <article-title>Application of Genetic Algorithm in Software Testing</article-title>
          <source>International Journal of Software Engineering and Its Applications</source>
          <volume>3</volume>
          (
          <issue>4</issue>
          )
          <fpage>87</fpage>
          -
          <lpage>96</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>Coyle</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cunningham</surname>
            <given-names>P</given-names>
          </string-name>
          <year>2004</year>
          <article-title>Improving recommendation ranking by learning personal feature weights</article-title>
          <source>Proc. 7th European Conference on Case-Based Reasoning</source>
          <fpage>560</fpage>
          -
          <lpage>572</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>