<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Technology</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1007/s10664-022-10207-5</article-id>
      <title-group>
        <article-title>SmallEvoTest: Genetically Created Unit Tests for Smalltalk</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alexandre Bergel</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Geraldine Galindo-Gutiérrez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alison Fernandez-Blanco</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juan-Pablo Sandoval-Alcocer</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff2">
          <institution>RelationalAI, Switzerland</institution>
        </aff>
        <aff id="aff0">
          <label>0</label>
          <institution>CICEI, Universidad Católica Boliviana “San Pablo”</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, School of Engineering, Pontificia Universidad Católica de Chile</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <fpage>29</fpage>
      <lpage>31</lpage>
      <abstract>
        <p>Evolutionary test generation techniques have emerged as a popular approach in recent years for enhancing the testing of software systems. However, while these techniques have proven to be efficient in programming languages that support static type annotations, dynamically typed programming languages have not received significant attention from the automatic test generation community. This paper introduces an approach aimed at automatically generating fully executable unit tests suitable for dynamically typed programming languages. In particular, our approach is tuned for dynamically typed, class-based programming languages, and it is implemented for the Pharo and GToolkit platforms. To address the absence of static type annotations, our approach uses a type profiling mechanism and employs a genetic algorithm to drive the evolution of the unit tests.</p>
      </abstract>
      <kwd-group>
        <kwd>Automatic Test Suite Generation</kwd>
        <kwd>Genetic Algorithms</kwd>
        <kwd>Pharo Programming Language</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Automatic Test Suite Generation (ATSG) consists of creating executable unit tests for a
particular class. ATSG produces unit tests that exercise the methods of a given target class. Such
generated tests complement hand-written tests by (i) focusing on untested branches
or code portions or (ii) exercising corner-case scenarios. ATSG has been gaining popularity
thanks to EvoSuite and Randoop for Java [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1, 2, 3, 4</xref>
        ].
      </p>
      <p>EvoSuite treats automatic test generation as a mathematical optimization process
driven by an evolutionary algorithm and a fitness function [5, 6]. In particular, the evolution of
the unit tests being generated is designed to maximize the branch coverage of the class being
tested [7].</p>
      <p>This short paper presents SmallEvoTest, a tool for Pharo that automatically creates unit tests for a particular
class. Similarly to EvoSuite and Randoop, SmallEvoTest does not require a training
dataset and does not use any large language model such as ChatGPT. SmallEvoTest is available
online under the MIT license at https://github.com/bergel/GeneticallyCreatedTests.</p>
      <p>Outline. The paper is organized as follows. Section 2 gives a running example of SmallEvoTest;
Section 3 highlights a number of design aspects of our tool; Section 4 lists studies and
tools related to this paper. Section 5 concludes and outlines our future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. SmallEvoTest In A Nutshell</title>
      <p>SmallEvoTest is relatively easy to configure and use. To illustrate this, let us consider the class
GCPoint, which is defined as follows:</p>
      <preformat>
Object subclass: #GCPoint
    instanceVariableNames: 'x y'

GCPoint&gt;&gt;initialize
    super initialize.
    x := 0.
    y := 0

GCPoint&gt;&gt;x: xValue y: yValue
    x := xValue.
    y := yValue

GCPoint&gt;&gt;x
    ↑ x

GCPoint&gt;&gt;y
    ↑ y

GCPoint&gt;&gt;add: anotherPoint
    ↑ GCPoint new x: x + anotherPoint x y: y + anotherPoint y; yourself

GCPoint&gt;&gt;negated
    ↑ GCPoint new x: x negated y: y negated; yourself
      </preformat>
      <p>This simple class mimics the standard Point class, and we use it as a running example throughout this
paper. Using SmallEvoTest, unit tests for this class can be generated by executing the following
code:</p>
      <preformat>
SmallEvoTest new
    targetClass: GCPoint;
    generateTestNamed: #GCPointTest;
    numberOfTestsToBeCreated: 15;
    nbOfStatements: 8;
    executionScenario: [
        (GCPoint new x: 3 y: 10)
            add: (GCPoint new x: 1 y: 12) ];
    run.
      </preformat>
      <p>The class SmallEvoTest expects as arguments the target class (the GCPoint class for which we
want to generate unit tests), the name of the test case class to be generated (GCPointTest), and an
execution scenario block exercising the target class. The execution scenario block is meant to
provide hints about the argument types. In this example, the scenario block invokes x:y: and
add: with some arguments. Note that the result of the scenario is not used.</p>
      <sec id="sec-2-1">
        <p>As a result, the class GCPointTest is created and contains 15 test methods, each with 8
statements (excluding the assertions). Here is an example of what a generated test method looks like:</p>
        <preformat>
GCPointTest&gt;&gt;testGENERATED10
    | v1 v2 v3 v4 v5 v6 v7 v8 |
    v1 := GCPoint new.
    v2 := 4.
    v3 := v1 x: v2 y: v2.
    v4 := v3 negated.
    v5 := GCPoint new.
    v6 := v1 y.
    v7 := v3 negated.
    v8 := v5 add: v3.
    self assert: v4 printString equals: 'GCPoint(-4,-4)'.
    self assert: v6 equals: 4.
    self assert: v7 printString equals: 'GCPoint(-4,-4)'.
    self assert: v8 printString equals: 'GCPoint(4,4)'.
        </preformat>
        <p>This test was produced by genetic algorithms, and the generation process was guided
by the objective of maximizing the number of executed methods of the target class. Each of the
fifteen generated test methods has eight statements (the assignments to v1 through v8)
and a number of assertions. Note that SmallEvoTest uses a set of hyperparameters, including the
number of tests to be generated and the number of (non-assertion) statements contained
in each test.</p>
        <p>As the invocation of SmallEvoTest illustrates, three essential parameters must be provided to
generate tests. First, the class to be tested is specified using targetClass:; generated tests will
directly exercise the methods defined in this class. Second, the name given to generateTestNamed: determines
the class (here, GCPointTest) in which the generated test methods are kept. Third, the code provided as a block to executionScenario:
is meant to exercise the class under test and is solely used to extract argument types.</p>
        <p>A central aspect of SmallEvoTest is the use of a type profiling technique to infer the possible types
to be provided. In our example above, the scenario invokes (i) x:y: with two integers and (ii)
add: with another point. This type information is used to produce and use object examples during
the test generation. Such examples provide the necessary type information when
generating and composing statements through genetic operations. In particular:
• the fact that x:y: takes two integers leads to the creation of the statement defining v2, from which v3 is
then produced, and
• since add: takes another point as a parameter, the statement defining v8 is produced, using v3 as an argument.</p>
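The type-profiling step can be sketched outside of Pharo. The following Python sketch (all names hypothetical, not SmallEvoTest's actual API) instruments two methods of a Python stand-in for GCPoint and records the classes of the arguments observed while running the execution scenario from the paper:

```python
# Hypothetical sketch of scenario-based type profiling (not SmallEvoTest's API):
# run a scenario against instrumented methods and record, per selector, the
# classes of the arguments actually observed.
from collections import defaultdict

class TypeProfile:
    def __init__(self):
        # selector -> set of tuples of observed argument class names
        self.observed = defaultdict(set)

    def record(self, selector, args):
        self.observed[selector].add(tuple(type(a).__name__ for a in args))

PROFILE = TypeProfile()

class GCPoint:
    def __init__(self, x=0, y=0):
        self.x, self.y = x, y

    def set_xy(self, x, y):              # stands in for x:y:
        PROFILE.record('x:y:', (x, y))
        self.x, self.y = x, y
        return self

    def add(self, other):                # stands in for add:
        PROFILE.record('add:', (other,))
        return GCPoint(self.x + other.x, self.y + other.y)

# The execution scenario from the paper, transliterated to Python:
GCPoint().set_xy(3, 10).add(GCPoint().set_xy(1, 12))

print(sorted(PROFILE.observed['x:y:']))  # [('int', 'int')]
print(sorted(PROFILE.observed['add:']))  # [('GCPoint',)]
```

Running the scenario once is enough to learn that x:y: is called with two integers and add: with another point, which is exactly the information the generator needs.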
        <p>The next section describes some of the design aspects we had to consider when generating
unit tests for Smalltalk.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Design of SmallEvoTest</title>
      <sec id="sec-3-1">
        <title>3.1. Background: Genetic Algorithms</title>
        <p>Genetic algorithms are a family of search and optimization algorithms inspired by biological and natural
evolution principles. In genetic algorithms, a population is made of individuals, and each
individual has a chromosome. A chromosome is a linear sequence of values. Genetic algorithms
are commonly employed to mathematically optimize a function f(x) = y, i.e., finding a value
of x that maximizes y. The variable x is a datapoint in a multi-dimensional
domain, and the variable y is a number. The function f is called the fitness function, which indicates
how fit the individual x is. The function f may model an arbitrarily complex operation, such as
generating a unit test (obtained from the variable x) and measuring the coverage of
the target class (the y variable).</p>
        <p>Evolution with genetic algorithms happens by randomly selecting fit individuals from a given
population and breeding these individuals through genetic operations to produce a new population.
The population becomes fitter with each generation, thus increasing the likelihood of finding the
optimal solution (i.e., the highest possible value of y).</p>
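The evolutionary loop described above can be sketched in a few lines. The following Python toy (not SmallEvoTest's implementation) maximizes the number of ones in a bitstring, a stand-in for maximizing the number of covered methods; selection, crossover, and mutation are the operators discussed later in Section 3.3:

```python
# Minimal genetic algorithm sketch: evolve bitstrings to maximize the number
# of 1-bits (a toy stand-in for "maximize methods covered by a test").
import random

random.seed(42)
LENGTH, POP_SIZE, GENERATIONS = 20, 30, 60

def fitness(ind):                 # f(x) = y: here, the count of 1-bits
    return sum(ind)

def select(pop):                  # tournament selection of a fit parent
    return max(random.sample(pop, 3), key=fitness)

def crossover(a, b):              # single cutpoint, as in Section 3.3
    cut = random.randrange(1, LENGTH)
    return a[:cut] + b[cut:]

def mutate(ind):                  # replace one gene with a fresh value
    i = random.randrange(LENGTH)
    ind = ind[:]
    ind[i] = random.randint(0, 1)
    return ind

pop = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    pop = [mutate(crossover(select(pop), select(pop))) for _ in range(POP_SIZE)]

best = max(pop, key=fitness)
print(fitness(best))  # typically at or near the optimum, 20
```

The only problem-specific parts are the chromosome representation and the fitness function; the next two subsections describe what these look like when the individual is a unit test.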
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Genetic Encoding</title>
        <p>Applying genetic algorithms to produce a test implies that the content of the test must be
adequately encoded as a chromosome. We denote a test by t and the number of covered
methods of the target class by c. By optimizing f(t) = c, genetic algorithms search for
a test t with high code coverage.</p>
        <p>The test testGENERATED10 given above consists of two parts: (i) an initialization part,
made of object creations and message sends, and (ii) assertions. As produced by SmallEvoTest,
assertions do not contribute to increasing the test coverage; as such, we exclude assertion
generation from the genetic encoding and treat it separately (Section 3.4).</p>
        <p>The unit test t is a value in the space S^n, where S corresponds to the domain of statements
and n is the length of the test to be generated, in terms of number of statements. We consider
two kinds of statements: either an object creation or a message send. For example, if we arbitrarily
set n = 8 (i.e., each generated test method has eight statements, as in the example above),
then t is encoded as (s1, s2, ..., s8), in which each si is either an object creation or a
message send. In the example given above, s1 corresponds to the statement v1 := GCPoint new,
an object creation, while s3, s4, s6, s7, and s8 correspond to message sends.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Mutation and crossover</title>
        <p>Genetic algorithms employ two biologically-inspired operators, mutation and crossover. A
mutation consists in replacing a statement with another: a message send can be
replaced by another message send (e.g., v4 := v3 negated is replaced by v4 := v3 add: v1)
or by an object creation. A crossover replaces a segment in an individual with a segment from
another individual. Consider two tests t = {s1, s2, s3, s4, s5} and t′ = {s′1, s′2, s′3, s′4, s′5};
a possible result is crossover(t, t′) = {s1, s2, s′3, s′4, s′5}, assuming a cutpoint on the third
statement.</p>
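The two operators can be illustrated directly on statement sequences. This Python sketch (illustrative only) mirrors the crossover example above with a cutpoint on the third statement, and a mutation that replaces one statement with another:

```python
# Single-cutpoint crossover over two tests: keep the first `cut` statements
# of t, then take the rest from t'. Statements are plain strings here.
def crossover(t, t_prime, cut):
    return t[:cut] + t_prime[cut:]

def mutate(t, index, replacement):
    # replace one statement with another, e.g. a different message send
    result = list(t)
    result[index] = replacement
    return result

t = ['s1', 's2', 's3', 's4', 's5']
t_prime = ["s'1", "s'2", "s'3", "s'4", "s'5"]
print(crossover(t, t_prime, 2))  # ['s1', 's2', "s'3", "s'4", "s'5"]
print(mutate(t, 3, 'v4 := v3 add: v1'))
```

Both operators return fresh sequences rather than mutating their inputs, so a parent can safely be bred several times within one generation.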
        <p>After each genetic operation, variables used as message arguments may have to be readjusted
to satisfy the type requirements. For example, if s′3 was originally v3 := v2 add: v1, then it expects
v2 and v1 to be GCPoints. However, in the result of the crossover, v1 (defined in s1) and v2
(defined in s2) may have different types. The receiver and arguments of a message may then have to
be replaced by variables meeting the type requirements.</p>
        <sec id="sec-3-3-1">
          <p>Note that the decision to exclude assertions from the genetic encoding was also taken in EvoSuite.</p>
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Generating Assertions</title>
        <p>During the source code generation, assertions are appended to the statements' source code.
The test's statements are executed in a local environment, and assertions are produced by
determining simple equality relations against different variables.</p>
        <p>In the current version of SmallEvoTest, assertions are produced for leaf variables, i.e., variables not
used as an argument or receiver in other statements.</p>
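The leaf-variable rule can be sketched as follows. This Python snippet (a hypothetical model, not SmallEvoTest's code) marks a variable as a leaf when no statement uses it as a receiver or argument; applied to a model of testGENERATED10, it yields exactly the four variables asserted in the generated test above:

```python
# Sketch of the "leaf variable" rule for assertion generation: a variable is
# a leaf if no statement uses it as a receiver or argument. Statements are
# modeled as (defined_var, used_vars) pairs, mirroring testGENERATED10.
def leaf_variables(statements):
    defined = [d for d, _ in statements]
    used = {u for _, uses in statements for u in uses}
    return [d for d in defined if d not in used]

testGENERATED10 = [
    ('v1', ()),             # v1 := GCPoint new
    ('v2', ()),             # v2 := 4
    ('v3', ('v1', 'v2')),   # v3 := v1 x: v2 y: v2
    ('v4', ('v3',)),        # v4 := v3 negated
    ('v5', ()),             # v5 := GCPoint new
    ('v6', ('v1',)),        # v6 := v1 y
    ('v7', ('v3',)),        # v7 := v3 negated
    ('v8', ('v5', 'v3')),   # v8 := v5 add: v3
]
print(leaf_variables(testGENERATED10))  # ['v4', 'v6', 'v7', 'v8']
```

Restricting assertions to leaves avoids asserting on intermediate values whose state is already exercised, indirectly, by later statements.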
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Related Work</title>
      <p>In recent years, ATSG has gained popularity with the introduction of new or improved generation
tools [8, 9]. An example of this growth can be seen in the SBST Tool Contest, which reached
its 10th edition in 2022 in the Java Unit Testing category. Two of its participants,
EvoSuite and Randoop, have received awards over several years [10, 11].</p>
      <p>EvoSuite. Using a genetic algorithm to produce unit tests was pioneered by EvoSuite
(https://www.evosuite.org) [5]. EvoSuite evolves unit tests in a fashion similar to ours and targets the Java programming
language. SmallEvoTest reuses some of the ideas behind EvoSuite, such as test evolution, individual
encoding, and test generation. However, SmallEvoTest provides an explicit repository of type
information populated from a code example.</p>
      <p>Randoop. A popular alternative to EvoSuite is Randoop (https://randoop.github.io/randoop/), a Java unit test generator
based on feedback-directed random generation, which creates statements by combining
a randomly chosen method call with previous statements as arguments [13]. The result of
executing each new statement is then verified by the tool. EvoSuite uses evolutionary search to
generate test suites [5]. It is guided by multiple coverage criteria (e.g., branch distance, mutation
testing) [14]. Studies have shown that its latest search algorithm, DynaMOSA (Dynamic
Many-Objective Sorting Algorithm) [15, 16, 17], produces short tests with higher coverage than
previous algorithms (e.g., MOSA [18], WSA [19]).</p>
      <p>Pynguin. Besides test generation for the Java language, Lukasczyk et al. recently presented Pynguin
(Python General Unit Test Generator) [20, 21]. This tool uses evolutionary algorithms to
explore the challenge of test generation for dynamically typed languages. Unlike statically typed
languages, languages such as Python or Pharo face the problem of missing
type information during generation. Similar to the previous tools, our work focuses on test generation using
evolutionary search. However, we focus on Pharo, a dynamically typed language, and use a
running example to collect the type information used in the generation process.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>This paper presents SmallEvoTest, a tool to automatically generate unit tests for an arbitrary Pharo class.
SmallEvoTest relies on a code example to extract argument type information and
uses a just-in-time example collecting technique to combine method invocations and generate
assertions. SmallEvoTest is a proposed foundation for automatic test generation, and our
effort will be followed up on several fronts:
• Conducting case studies: Conducting case studies on representative classes of prominent
Pharo systems is an obvious next step. This will help us illustrate some limitations of our
approach and identify the actions to take to generate unit tests for large classes.
• Improving assertions: Many aspects of SmallEvoTest are based on immediate decisions
taken from ad-hoc examples. In particular, the generation of assertions can be significantly
improved. Incorporating assertions about collections or structural similarities seems a
reasonable step forward.
• Abstract templates: The objective of the generated tests is to cover a particular part of the
code given a particular budget, expressed in terms of test methods and statements. As
such, the generated tests differ from manually written tests. As future work, we plan to
incorporate abstract templates as a way to better structure the generated tests. Abstract
templates are meant to generate tests that follow a particular structure, e.g., accessors
must be invoked to properly initialize the object before invoking methods with business
logic.</p>
      <p>SmallEvoTest is available under the MIT License for the Pharo and GToolkit platforms.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>Juan Pablo Sandoval Alcocer thanks ANID FONDECYT Iniciación Folio 11220885 for supporting
this article.</p>
    </sec>
    <sec id="sec-7">
      <title>References</title>
      <p>[5] G. Fraser, A. Arcuri, EvoSuite: Automatic test suite generation for object-oriented software, in: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, ESEC/FSE '11, Association for Computing Machinery, New York, NY, USA, 2011, pp. 416-419. doi:10.1145/2025113.2025179.</p>
      <p>[6] A. Panichella, J. Campos, G. Fraser, EvoSuite at the SBST 2020 tool competition, in: Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops, ICSEW'20, Association for Computing Machinery, New York, NY, USA, 2020, pp. 549-552. doi:10.1145/3387940.3392266.</p>
      <p>[7] G. Fraser, A. Arcuri, Whole test suite generation, IEEE Transactions on Software Engineering 39 (2012) 276-291.</p>
      <p>[8] P. Tonella, Evolutionary testing of classes, in: Proceedings of the 2004 ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA '04, Association for Computing Machinery, New York, NY, USA, 2004, pp. 119-128. doi:10.1145/1007512.1007528.</p>
      <p>[9] A. Sakti, G. Pesant, Y.-G. Guéhéneuc, Instance generator and problem representation to improve object oriented code coverage, IEEE Transactions on Software Engineering 41 (2015) 294-313. doi:10.1109/TSE.2014.2363479.</p>
      <p>[10] S. Panichella, A. Gambi, F. Zampetti, V. Riccio, SBST tool competition 2021, in: 2021 IEEE/ACM 14th International Workshop on Search-Based Software Testing (SBST), 2021, pp. 20-27. doi:10.1109/SBST52555.2021.00011.</p>
      <p>[11] A. Gambi, G. Jahangirova, V. Riccio, F. Zampetti, SBST tool competition 2022, in: 2022 IEEE/ACM 15th International Workshop on Search-Based Software Testing (SBST), 2022, pp. 25-32. doi:10.1145/3526072.3527538.</p>
      <p>[12] G. Fraser, A. Arcuri, EvoSuite: Automatic test suite generation for object-oriented software, in: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering, ESEC/FSE '11, ACM, New York, NY, USA, 2011, pp. 416-419. doi:10.1145/2025113.2025179.</p>
      <p>[13] C. Pacheco, S. K. Lahiri, M. D. Ernst, T. Ball, Feedback-directed random test generation, in: Proceedings of the 29th International Conference on Software Engineering, ICSE '07, IEEE Computer Society, USA, 2007, pp. 75-84. doi:10.1109/ICSE.2007.37.</p>
      <p>[14] S. Vogl, S. Schweikl, G. Fraser, A. Arcuri, J. Campos, A. Panichella, EvoSuite at the SBST 2021 tool competition, in: 2021 IEEE/ACM 14th International Workshop on Search-Based Software Testing (SBST), 2021, pp. 28-29. doi:10.1109/SBST52555.2021.00012.</p>
      <p>[15] A. Panichella, F. M. Kifetew, P. Tonella, Automated test case generation as a many-objective optimisation problem with dynamic selection of the targets, IEEE Transactions on Software Engineering 44 (2018) 122-158. doi:10.1109/TSE.2017.2663435.</p>
      <p>[16] J. Campos, Y. Ge, N. Albunian, G. Fraser, M. Eler, A. Arcuri, An empirical evaluation of evolutionary algorithms for unit test suite generation, Information and Software Technology 104 (2018).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bacchelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ciancarini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rossi</surname>
          </string-name>
          ,
          <article-title>On the efectiveness of manual and automatic unit test generation</article-title>
          ,
          <source>in: 2008 The Third International Conference on Software Engineering Advances</source>
          ,
          <year>2008</year>
          , pp.
          <fpage>252</fpage>
          -
          <lpage>257</lpage>
          . doi:10.1109/ICSEA.2008.66.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Fraser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Staats</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>McMinn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Arcuri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Padberg</surname>
          </string-name>
          ,
          <article-title>Does automated unit test generation really help software testers? a controlled empirical study</article-title>
          ,
          <source>ACM Transactions on Software Engineering and Methodology (TOSEM) 24</source>
          (
          <year>2015</year>
          )
          <fpage>1</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Kracht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Z.</given-names>
            <surname>Petrovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Walcott-Justice</surname>
          </string-name>
          ,
          <article-title>Empirically evaluating the quality of automatically generated and manually written test suites</article-title>
          ,
          <source>in: 2014 14th International Conference on Quality Software</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>256</fpage>
          -
          <lpage>265</lpage>
          . doi:10.1109/QSIC.2014.33.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Rojas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Fraser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Arcuri</surname>
          </string-name>
          ,
          <article-title>Automated unit test generation during software development: A controlled experiment and think-aloud observations</article-title>
          ,
          <source>in: Proceedings of the 2015 International Symposium on Software Testing and Analysis, ISSTA 2015</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2015</year>
          , pp.
          <fpage>338</fpage>
          -
          <lpage>349</lpage>
          . doi:10.1145/2771783.2771801.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>