=Paper=
{{Paper
|id=Vol-2019/acm_src_2
|storemode=property
|title=MaRTS: A Model-Based Regression Test Selection Approach
|pdfUrl=https://ceur-ws.org/Vol-2019/acm_src_2.pdf
|volume=Vol-2019
|authors=Mohammed Al-Refai
|dblpUrl=https://dblp.org/rec/conf/models/Al-Refai17a
}}
==MaRTS: A Model-Based Regression Test Selection Approach==
<pdf width="1500px">https://ceur-ws.org/Vol-2019/acm_src_2.pdf</pdf>
<pre>
 MaRTS: A Model-Based Regression Test Selection
                 Approach
                                                       Mohammed Al-Refai
                                                    Computer Science Department
                                                      Colorado State University
                                                       Fort Collins, CO, USA
                                                   Email: al-refai@cs.colostate.edu


   Abstract—Models can be used to plan the evolution and              that does not affect the operation’s signature and contract [4].
runtime adaptation of a software system. Regression testing of        Fine-grained changes are those that can be made at a low
the evolved and adapted models is important to ensure that            level of abstraction, such as changes to a statement inside
the previously tested functionality is not broken. Regression
testing is performed under time and resource constraints.             a method implementation. Second, they do not support the
Thus, regression test selection (RTS) techniques are needed to        identification of changes to inherited and overridden operations
reduce the cost of regression testing. Existing model-based RTS       along the inheritance hierarchy [4], [7], [8], which leads to
approaches cannot detect all types of fine-grained changes that       situations where relevant test cases that traverse such inherited
can be made at a low level of abstraction, and they do not consider   and overridden methods are not selected for regression testing.
the impact of inheritance hierarchy changes on the selection of
test cases.                                                              We propose a model-based RTS approach called MaRTS
   We propose a model-based RTS approach called MaRTS that            to be used for regression testing of unanticipated fine-grained
classifies test cases based on changes performed to UML class and     adaptations performed at the model level. MaRTS uses UML
activity diagrams. It supports both fine-grained and inheritance      design class and activity diagrams to represent behaviors of
hierarchy changes. We compared MaRTS with two code-based              a software system and its test cases. MaRTS is based on
RTS approaches using four applications. MaRTS achieved results
comparable to a dynamic code-based RTS approach (DejaVu),             (1) static analysis of the UML class diagram to identify the
and outperformed a static code-based RTS approach (ChEOPSJ).          changes in the inheritance hierarchy, (2) fine-grained model
The fault detection ability of the selected test cases was equal to   comparison to identify changes performed to UML class and
that of the baseline test cases.                                      activity diagrams, and (3) dynamic analysis of the test case
   Index Terms—inheritance hierarchy, model-based adaptation,         execution at the model level to determine the coverage for
model-based regression test selection, UML activity diagram,
UML class diagram
                                                                      each test case.
                                                                         We evaluated MaRTS on four applications, and compared
                                                                      it with two code-based RTS approaches. We also evaluated
                       I. INTRODUCTION
                                                                      the fault detection ability of the reduced test sets achieved by
   Regression testing is one of the most expensive activities         MaRTS.
performed during the lifecycle of a software system [1],
[2]. Regression test selection (RTS) [3] is an approach that                                  II. APPROACH
improves regression testing efficiency and reduces regression            MaRTS classifies the test cases as obsolete, retestable
testing time by selecting a subset of the original test set for       or reusable. Obsolete test cases are invalid and cannot be
regression testing [3], [4].                                          executed on the modified version of the software system.
   RTS approaches can be based on the analysis of code or             Retestable test cases exercise the modified parts of the soft-
model-level changes of a software system. Model-based RTS             ware system, and need to be selected for regression testing.
has some advantages over code-based RTS. First, it enables            Reusable test cases only exercise unmodified parts of the
early estimation of the effort required for regression testing [4].   system, and they do not need to be re-executed for safe
Second, it can scale up better than code-based RTS approaches         regression testing [9]. A safe RTS technique must select all
for large scale software systems [5]. Third, model-based RTS          modification-traversing test cases for regression testing [10].
techniques can be more convenient for approaches that already         A test case is considered to be modification-traversing for a
apply evolution/adaptation at the model level because both            program 𝑃 if it executes changed code in 𝑃, or if it formerly
the evolution/adaptation and test selection processes can be          executed code that had been deleted in 𝑃 [5].
performed at the same level of abstraction [6].                          In prior work [6], we applied MaRTS within the context
   Existing model-based RTS approaches suffer from the fol-           of a Fine Grained Adaptation (FiGA) framework [11], [12]
lowing limitations. First, they cannot detect all types of fine-      that uses UML diagrams to support unanticipated and fine-
grained changes from UML class, sequence, and state machine           grained adaptations on running Java software systems. FiGA
diagrams used in these approaches [4], [7], [8]. An example of        uses ReverseR [13] to extract UML class and activity diagrams
such a change is a modification to an operation implementation        from Java source code, and JavAdaptor [14], [15] to update a
running Java program without stopping it. In FiGA, each indi-         in each activity diagram are executed. This information is used
vidual method is represented as an activity diagram. The UML          to obtain the activity-level and flow-level traceability matrices
activity diagram elements that are supported are initial and          that relate each test case to the activity diagrams and their
final nodes, action nodes, call behavior nodes, and decision and      flows that were traversed by the test case.
merge nodes. An activity diagram generated using ReverseR
                                                                      C. Model Change Identification
is executable, where each action node in the activity diagram
has a code snippet associated with it, and Java statements are           MaRTS uses RSA model comparison to identify the model
contained inside the code snippet. When the model execution           changes after developers adapt the class and activity dia-
flow reaches an action node, then the code snippet associated         grams. The class diagram changes that can be identified
with the action node is executed. Additionally, ReverseR maps         are addition/deletion/modification of interfaces, classes, class
a code-level method invocation statement to a call to the             attributes, operations, and generalization and realization
corresponding activity diagram. When the model execution flow         relations. The activity diagram changes that can be identified are
reaches such a call, the called activity diagram is executed [6],     addition/deletion/modification of nodes, transition flows, code
[16].                                                                 stored in a code snippet associated with an action node, and
   In MaRTS, each method of the software system is repre-             the boolean expression associated with a transition flow.
sented as a UML activity diagram. The same thing applies              D. Extraction of the Operations-Table from the Adapted Class
to each test case. These activity diagrams are executable.            Diagram
We exploit the Rational Software Architect (RSA) Simulation              When developers adapt the class diagram, the declared and
Toolkit 9.0 (footnote 1) to execute test cases at the model level.    inherited operations in each class might change. Therefore, an
   MaRTS consists of the following five steps:                        operations-table is extracted from the adapted class diagram.
  1) Extract operations-table from the original class diagram.        The information stored in the operations-tables that are ex-
  2) Calculate the traceability matrix.                               tracted from the original and adapted class diagrams is used
  3) Identify model changes.                                          to determine changes to inherited or overridden operations in
  4) Extract operations-table from the adapted class diagram.         each class.
  5) Classify test cases.
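
The operations-table used in steps 1 and 4 can be illustrated with a minimal sketch. It assumes a
simplified, hypothetical input format (a dictionary of classes with a single superclass and their
declared operations); it is not the MaRTS implementation, and it ignores interfaces and realization
relations. The names ClassInfo, TableEntry, and build_operations_table are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# (operation name, formal parameter types)
Signature = Tuple[str, Tuple[str, ...]]


@dataclass
class ClassInfo:
    """Hypothetical record for one class of the class diagram."""
    superclass: Optional[str]
    declared: Dict[Signature, str] = field(default_factory=dict)  # signature -> return type


@dataclass
class TableEntry:
    """Operations-table content for one class."""
    superclass: Optional[str]
    operations: Dict[Signature, Tuple[str, str]]  # signature -> (declaring class, return type)


def build_operations_table(diagram: Dict[str, ClassInfo]) -> Dict[str, TableEntry]:
    """Collect, for each class, the operations it declares and inherits."""
    table: Dict[str, TableEntry] = {}

    def resolve(name: str) -> TableEntry:
        if name in table:
            return table[name]
        info = diagram[name]
        ops: Dict[Signature, Tuple[str, str]] = {}
        if info.superclass is not None:
            # Start from everything visible in the superclass (inherited operations).
            ops.update(resolve(info.superclass).operations)
        for sig, ret in info.declared.items():
            # A declaration in this class adds a new operation or overrides an inherited one.
            ops[sig] = (name, ret)
        table[name] = TableEntry(info.superclass, ops)
        return table[name]

    for name in diagram:
        resolve(name)
    return table


# Illustrative input only; these classes are not taken from the subject programs.
original = {
    "Shape": ClassInfo(None, {("area", ()): "double", ("name", ()): "String"}),
    "Circle": ClassInfo("Shape", {("area", ()): "double"}),
}
ops_table = build_operations_table(original)
```

Building one such table before the adaptation and one after it gives two snapshots that can be compared
class by class to find operations whose declaring class changed or that are no longer available.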
                                                                      E. Test Case Classification
   MaRTS can scale up to large programs because all of its
steps are automated. MaRTS requires the UML models used                  We proposed a classification algorithm that takes the fol-
with it to be detailed and executable in order to obtain the          lowing inputs: (1) the operations-tables extracted from the
coverage of test cases at the model level. Therefore, MaRTS           original and adapted class diagrams, (2) the identified model
is not applicable to model-driven development approaches              differences, (3) the flow-level and activity-level traceability
that use models at a high level of abstraction and lack               matrices, (4) the set of UML activity diagrams representing
traceability links between the code-level test cases and the          the methods of the software system, and (5) the set of activity
models representing the software system.                              diagrams representing the baseline test cases. The algorithm
                                                                      classifies the test cases as obsolete, retestable, or reusable.
A. Extraction of the Operations-Table from the Original Class            Initially, all the test cases are assumed to be reusable. The
Diagram                                                               algorithm compares the operations-tables to identify which
   This step is performed before developers adapt the models.         operations were changed along the inheritance hierarchy. The
An operations-table is extracted from the class diagram. This         activity-level traceability matrix is used to determine each test
table stores for each class the operations that are declared and      case that is affected by those changes. The following rules are
inherited by the class. For each operation, the operations-table      applied:
stores the operation’s declaring class, name, formal parameter          1) If an operation op is initially declared or inherited by a
types, and return type. For each class in the table, the name               class C, and is now neither declared nor inherited by C,
of its superclass is also stored.                                           then, find each test case that traverses op on a receiver of
                                                                            type C. If a found test case directly calls op on a receiver
B. Traceability Matrix Calculation                                          of type C, then flag the test case as obsolete. Otherwise,
   This step is performed before developers adapt the models.               flag the test case as retestable.
The activity diagrams representing the test cases are executed          2) If an operation op is
with the activity diagrams representing the program methods                 a) initially inherited by a class C from an ancestor class
in order to obtain the coverage of test cases at the model level.              B, and is now overridden by C, or is inherited by C
   During model execution, four types of coverage information                  from one of its ancestors other than B, or
are collected for each test case: (1) what activity diagrams                b) initially declared by a class C, and is now inherited
are executed by the test case, (2) what activity diagrams are                  by C from one of its ancestors.
directly called by the test case, (3) what is the receiver object           then, flag any reusable test case that traverses op on a
type for each executed activity diagram, and (4) which flows                receiver of type C as retestable.
                                                                         Once the algorithm completes iterating over all entries of
  1 http://www-03.ibm.com/software/products/en/ratisoftarchsimutool   the operations-tables, the test cases that are still flagged as
reusable are classified based on the identified model dif-                the generated test cases for XML-security from this study,
ferences. If such a test case traverses deleted or modified               and only considered the existing test cases that come with the
transition flows and/or nodes, then the test case is flagged as           application.
retestable.
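
The classification rules of Section II-E can be sketched as follows. This is a simplified illustration
under stated assumptions, not the authors' implementation: operations-tables are reduced to a mapping
from class to {operation: declaring class}, the traceability matrices are reduced to per-test sets, and
all names and data are hypothetical. The sketch also assumes that a test case flagged as obsolete stays
obsolete, which the text does not state explicitly.

```python
def classify(original_ops, adapted_ops, traversed, direct_calls, covered, changed):
    """Classify each test case as 'obsolete', 'retestable', or 'reusable'.

    original_ops / adapted_ops: {class: {operation: declaring class}}
    traversed:    {test: set of (receiver class, operation) pairs executed by the test}
    direct_calls: {test: set of (receiver class, operation) pairs called directly by the test}
    covered:      {test: set of ids of flows and nodes traversed by the test}
    changed:      set of ids of deleted or modified flows and nodes
    """
    status = {test: "reusable" for test in traversed}  # initially, all test cases are reusable

    for cls, ops in original_ops.items():
        new_ops = adapted_ops.get(cls, {})  # a deleted class has no operations left
        for op, declaring in ops.items():
            affected = [t for t, pairs in traversed.items() if (cls, op) in pairs]
            if op not in new_ops:
                # Rule 1: op is no longer declared or inherited by cls.
                for t in affected:
                    if (cls, op) in direct_calls.get(t, set()):
                        status[t] = "obsolete"
                    elif status[t] != "obsolete":  # assumption: obsolete is never downgraded
                        status[t] = "retestable"
            elif new_ops[op] != declaring:
                # Rule 2: op is now overridden, inherited from another ancestor,
                # or inherited instead of declared, so its declaring class differs.
                for t in affected:
                    if status[t] == "reusable":
                        status[t] = "retestable"

    # Remaining reusable test cases are checked against the identified model differences.
    for t in status:
        if status[t] == "reusable" and covered.get(t, set()) & changed:
            status[t] = "retestable"
    return status


# Illustrative data: 'draw' was removed from Circle and 'area' is now overridden by Circle.
original_ops = {"Circle": {"area": "Shape", "draw": "Circle", "name": "Shape"}}
adapted_ops = {"Circle": {"area": "Circle", "name": "Shape"}}
traversed = {
    "t1": {("Circle", "draw")},
    "t2": {("Circle", "draw")},
    "t3": {("Circle", "area")},
    "t4": {("Circle", "name")},
    "t5": {("Circle", "name")},
}
direct_calls = {"t1": {("Circle", "draw")}}
covered = {"t4": {"name:flow1"}, "t5": {"name:flow2"}}
changed = {"name:flow2"}

result = classify(original_ops, adapted_ops, traversed, direct_calls, covered, changed)
```

In this example t1 calls the removed operation directly and becomes obsolete, t2 reaches it only
indirectly and becomes retestable, t3 is retestable because of the new override, t5 is retestable
because it traverses a modified flow, and t4 remains reusable.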
                                                                                                       TABLE II
                             III. CASE STUDY                                              ADAPTATIONS PERFORMED ON MODELS
   The goals of the evaluation were to (1) compare the in-                 Subject        Evolution                              Changes
clusiveness and precision of MaRTS with that of two code-                                                 classes &    generalizations   realizations   operations
based RTS approaches that support changes to the inheri-                                                  interfaces
tance hierarchy, and (2) evaluate the fault detection ability              JUNG           1.3.0 → 1.4.0       5              7               2            79
of the retestable test set with that of the original test set.             Siena          1.8 → 1.12          0              0               0             9
Inclusiveness measures the extent to which a regression test               Siena          1.8 → 1.14          0              0               0            11
selection technique selects modification-traversing test cases             Chess          0 → 1               1              6               6            56
                                                                           XML-security   2 → 3              52             37               2           311
for regression testing, and precision measures the extent to
which a regression test selection approach excludes test cases               We extracted class and activity diagrams from the original
that are non-modification-traversing [10].                                version of each subject program and its test cases. Then, we
   We compared MaRTS with DejaVu [2] and ChEOPSJ [17].                    adapted the class and activity diagrams from one version to
DejaVu detects fine-grained changes at the statement level, and           the following version in a systematic way. First, we identified
ChEOPSJ detects fine-grained changes to method invocations.               the code-level differences between the two versions. Second,
Both tools support the identification of changes to the inher-            we manually applied these differences at the model level. The
itance hierarchy, and support RTS for Java software systems.              changes at the model level involved additions and deletions of
We did not compare MaRTS with the existing model-based                    classes, interfaces, operations, generalization and realization
RTS approaches because they lack tool support (or tools are               relations, and modifications to method implementations by
unavailable).                                                             modifying the activity diagrams representing these methods.
                                                                          Table II summarizes the changes performed on models.
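
The inclusiveness and precision measures defined in this section can be made concrete with a short
worked example. The sketch below uses the counts reported for ChEOPSJ on Siena 1.8 → 1.12 in Tables III
and IV (107 test cases, 26 modification-traversing, 2 false negatives, 30 false positives). Treating an
empty denominator as 100% is an assumption of this sketch, and the function names are illustrative.

```python
def inclusiveness(modification_traversing: int, false_negatives: int) -> float:
    """Percentage of modification-traversing test cases that were selected."""
    if modification_traversing == 0:
        return 100.0  # assumption: nothing to select counts as fully inclusive
    selected = modification_traversing - false_negatives
    return 100.0 * selected / modification_traversing


def precision(non_modification_traversing: int, false_positives: int) -> float:
    """Percentage of non-modification-traversing test cases that were excluded."""
    if non_modification_traversing == 0:
        return 100.0  # assumption: nothing to exclude counts as fully precise
    excluded = non_modification_traversing - false_positives
    return 100.0 * excluded / non_modification_traversing


total_tests = 107
modification_traversing = 26   # test cases selected by the safe tool DejaVu
false_negatives = 2            # modification-traversing test cases missed by ChEOPSJ
false_positives = 30           # non-modification-traversing test cases selected by ChEOPSJ

inc = inclusiveness(modification_traversing, false_negatives)
prec = precision(total_tests - modification_traversing, false_positives)
print(f"inclusiveness = {inc:.1f}%, precision = {prec:.1f}%")
```

The computed values, about 92.3% and 63.0%, appear consistent with the 92% and 62% reported in
Section III-B if the reported percentages are truncated to whole numbers.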
A. Subject Programs and their Adaptations
                                                                             After the model-level adaptation process was completed, we
   We used four subject programs: (1) the graph package of the            applied MaRTS to classify test cases at the model level, and
Java Universal Network/Graph Framework (JUNG) (footnote 2),               applied DejaVu and ChEOPSJ at the code level.
(2) Siena (footnote 3), (3) XML-security (footnote 4), and
(4) a chess program, which is a
classroom project that only supports the functionality to create          B. Inclusiveness and Precision Results
a chessboard and move chess pieces. These programs were                      Table III shows the results of running the three RTS
implemented using Java 6 and 7. They do not use generic types             approaches. For example, MaRTS and DejaVu classified all
or multithreaded programming. Table I summarizes the data                 the 188 test cases of JUNG as retestable, and ChEOPSJ
for the original version of each subject.                                 classified 178 out of the 188 test cases as retestable. For
                                                                          the XML-security subject, MaRTS classified 10 out of 94
                                  TABLE I                                 test cases as obsolete, and classified the remaining 84 test
                            ORIGINAL PROGRAMS                             cases as retestable. We found that the 10 obsolete test cases
 Subject          Version     Num.         Num.         Num.       LOC    contain calls to deleted operations. DejaVu and ChEOPSJ do
                              classes      interfaces   methods           not address the identification of obsolete test cases. DejaVu
 JUNG             1.3.0       13           12           146        3655   classified all the 94 test cases as retestable. Therefore, we
 Chess            0           7            1            65         1074   excluded the 10 obsolete test cases from the calculations of
 Siena            1.8         9            0            95         1605   the inclusiveness, precision, false positives, and false negatives
 XML-security     2           173          6            1172      16800   for the three RTS tools.
                                                                             We did not get RTS results for ChEOPSJ when we ran it
   We used EvoSuite [18] to generate JUnit test cases for                 on the XML-security subject because of a bug in ChEOPSJ.
each of these versions. For JUNG, 188 test cases that achieve             It did not detect code changes that it is supposed to detect,
81% statement coverage were generated. For Siena, 107 test                and did not produce results. Table III and Table IV do not
cases that achieve 89% statement coverage were generated.                 show results for ChEOPSJ with respect to the XML-security
For chess, 130 test cases that achieve 96% statement coverage             subject.
were generated. The XML-security package has a JUnit test                    Table IV shows the number of false positives and false
suite that comes with it and achieves 31% statement coverage.             negatives for each of the studied RTS approaches. DejaVu
The generated test cases for XML-security did not improve                 is a safe tool and classifies all modification-traversing test
the coverage of the existing test suite. Therefore, we excluded           cases as retestable, and therefore, its inclusiveness was 100%
  2 http://jung.sourceforge.net/download.html                             for all the subject programs. The same set of test cases that
  3 http://sir.unl.edu/portal/bios/siena.php                              was classified as retestable by DejaVu was also classified as
  4 http://sir.unl.edu/portal/bios/xml-security.php                       retestable by MaRTS for all the subject programs (excluding
                                      TABLE III                                             Mutator, (4) Math Mutator, (5) Negate Conditionals Mutator,
                        TEST CASE CLASSIFICATION RESULTS                                    and (6) Void Method Calls Mutator. We configured PIT to
 Subject            Evolution             Number of           Retestable Test Cases         only mutate the adapted methods. We ran PIT with both the
                                          Test Cases       DejaVu    ChEOPSJ    MaRTS       original and retestable test sets on both the versions.
 JUNG               1.3.0 → 1.4.0        188                 188       178        188
 Siena              1.8 → 1.12           107                  26        54         26                                   TABLE V
 Siena              1.8 → 1.14           107                  36        59         36                          MUTATION TESTING RESULTS
 Chess              0 → 1                130                 130       126        130       Subject      Mutants       Full Test Set       Retestable Test Set
 XML-security       2 → 3                94                   94       N/A         84                                  size    score       size    score
                                                                                            Siena 1.12     134         107     29.8%        26     29.8%
                                                                                            Siena 1.14     136         107     30.9%        36     30.9%

the 10 obsolete test cases for XML-security). Therefore, the
inclusiveness of MaRTS was also 100%. ChEOPSJ missed                                           Table V shows the mutation testing results. Both the original
some modification-traversing test cases, and its inclusiveness                              and retestable test sets killed exactly the same set of mutants in
was 94% for JUNG, 96% for Chess, 92% for Siena version                                      both the versions. The fault detection ability of the retestable
1.12, and 88% for version 1.14. The reason is that ChEOPSJ                                  test set was equal to that of the original test set.
only records changes to method invocations, but not to other                                D. Threats to Validity
types of statements in method bodies.
                                                                                               We identify several threats to validity of the results of our
                             TABLE IV                                                       case study.
    NUMBER OF FALSE POSITIVES (FP) AND FALSE NEGATIVES (FN)                                    External validity. It is difficult to generalize from a study
                                            DejaVu       ChEOPSJ        MaRTS               of only four subject programs. However, we selected program
   Subject               Evolution          FP    FN     FP    FN     FP    FN              versions that incorporate various types of modifications, such
   JUNG                  1.3.0 → 1.4.0      0     0      0     10     0     0               as changes to classes, methods, inheritance hierarchy, and class
   Siena                 1.8 → 1.12         0     0      30    2      0     0               attributes.
   Siena                 1.8 → 1.14         0     0      28    4      0     0                  Internal validity. The unknown factors that might affect
   Chess                 0 → 1              0     0      0     4      0     0               the outcome of the analyses are possible errors in our algo-
   XML-security          2 → 3              0     0      N/A   N/A    0     0               rithm implementation, and that the test cases were generated
                                                                                            only using one test case generation tool. To control the first
   The precision was 100% for MaRTS and DejaVu because                                      factor, we tested the implementation of MaRTS on different
neither classified any non-modification-traversing test case as                             change scenarios. We also compared the results achieved by
retestable for any subject program. The precision of ChEOPSJ                                MaRTS for the case studies with those of DejaVu.
was 100% for JUNG and Chess, 62% for Siena version 1.12,                                       We used EvoSuite to generate JUnit test cases for the subject
and 60% for version 1.14. The reason is that ChEOPSJ is based                               programs. The results could change if other test generation
on static analysis of dependencies between modified code                                    tools were used or test sets with different coverage numbers
and test cases, which leads to classifying non-modification-                                were used. Additionally, the test cases generated for the Siena
traversing test cases as retestable.                                                        subject achieved low mutation scores. The fault detection
C. Fault Detection Ability Results                                                          ability results could change if other test sets that achieve
                                                                                            different mutation scores were used. We plan to evaluate the
   The results for MaRTS showed a reduction in the number of                                proposed approach on additional test suites generated by other
selected test cases only for the Siena subject for the adaptation                           test case generation tools.
from version 1.8 to 1.12, and from 1.8 to 1.14. We used                                        Another threat is that the same person selected the subject
mutation testing to evaluate the fault detection ability of these                           programs, generated the test cases, reverse engineered the
reduced test sets. We excluded the XML-security subject from                                models, performed the model-level adaptations, and executed
the fault detection ability evaluation because all of its test cases                        the RTS tools. There is a potential for getting different results
were selected by MaRTS (excluding the 10 test cases that were                               if different people worked on these steps. The test generation
classified as obsolete by MaRTS).                                                           process and RTS approaches were automated, and thus, having
   There are no tools (to the best of our knowledge) that                                   other people perform those steps would not make a difference
support systematic generation of mutations at the model level.                              if they used the same tool configurations. The adaptations are
Therefore, we used a code-level mutation testing tool. In                                   manual, which can lead to different modifications. However,
particular, we used PIT (footnote 5) to apply first-order method-level                      since we started from a particular version of code and finished
mutation operators to the code-level versions 1.12 and 1.14.                                at a well-defined version of code, the differences are not likely
The applied mutation operators (footnote 6) were (1) Conditionals Bound-                    to be significant.
ary Mutator, (2) Increments Mutator, (3) Invert Negatives                                      Construct validity. We used inclusiveness and precision
  5 http://pitest.org                                                                       to evaluate MaRTS. However, there are other metrics that can
  6 http://pitest.org/quickstart/mutators/                                                  be used to evaluate an RTS approach, such as its efficiency in
terms of reducing regression testing time. We plan to evaluate                               ACKNOWLEDGMENT
the efficiency of MaRTS in the future.                                This material is based upon work supported by the National
                      IV. RELATED WORK                            Science Foundation under Grant No. CNS 1305381.
   The RTS problem has been studied for over three                                               REFERENCES
decades [5], [19]. Most of the existing approaches are code-
                                                                     [1] G. Rothermel and M. J. Harrold, “A Safe, Efficient Regression Test
based [1], [2], [17], [20], [21], [22], and little work exists in        Selection Technique,” ACM Transactions on Software Engineering and
the literature on model-based RTS. We summarize the existing             Methodology, vol. 6, no. 2, pp. 173–210, Apr. 1997.
model-based RTS approaches and compare them with MaRTS.              [2] M. J. Harrold, J. A. Jones, T. Li, D. Liang, A. Orso, M. Pennings,
                                                                         S. Sinha, S. A. Spoon, and A. Gujarathi, “Regression Test Selection
   Chen et al. [23] use UML activity diagrams to perform                 for Java Software,” in Proceedings of the 16th Conference on Object-
specification-based black-box RTS. In their approach, an ac-             Oriented Programming, Systems, Languages, and Applications
tivity diagram represents the requirements of a system. In con-          (OOPSLA’01), J. Vlissides, Ed. Tampa, FL, USA: ACM, Oct. 2001, pp.
                                                                         312–326.
trast, MaRTS uses activity diagrams to represent fine-grained        [3] M. J. Harrold, “Testing Evolving Software,” Journal of Systems and
behaviors of a software system. Korel et al. [24] use control            Software, vol. 47, no. 2-3, pp. 173–181, Jul. 1999.
and data dependencies in an extended finite state machine to         [4] L. C. Briand, Y. Labiche, and S. He, “Automating Regression Test
                                                                         Selection Based on UML Designs,” Journal on Information and Software
identify the impact of model changes and perform RTS. This               Technology, vol. 51, no. 1, pp. 16–30, Jan. 2009.
approach does not support changes to the inheritance hierarchy       [5] S. Yoo and M. Harman, “Regression Testing Minimization, Selection
because it does not use UML class diagram.                               and Prioritization: A Survey,” Journal of Software Testing, Verification
                                                                         and Reliability, vol. 22, no. 2, pp. 67–120, Mar. 2012.
   Farooq et al. [7] use UML class and state machine models          [6] M. Al-Refai, S. Ghosh, and W. Cazzola, “Model-based Regression Test
for RTS. This approach does not support the identification of            Selection for Validating Runtime Adaptation of Software Systems,” in
(1) the addition and deletion of the generalization relations,           Proceedings of the 9th IEEE International Conference on Software Test-
                                                                         ing, Verification and Validation (ICST’16), L. Briand and S. Khurshid,
and (2) the overridden and inherited operations along the                Eds. Chicago, IL, USA: IEEE, 10th-15th of Apr. 2016, pp. 288–298.
inheritance hierarchy.                                               [7] Q.-u.-a. Farooq, M. Z. Z. Iqbal, Z. I Malik, and M. Riebisch, “A Model-
   Briand et al. [4] present an RTS approach based on UML                Based Regression Testing Approach for Evolving Software Systems with
                                                                         Flexible Tool Support,” in Proceedings of the 17th IEEE International
use case models, class models, and sequence models. Zech                 Conference and Workshops on Engineering of Computer-Based Systems
et al. [8] present a generic model-based RTS platform, which             (ECBS’10). Oxford, UK: IEEE, Mar. 2010, pp. 41–49.
is based on the model versioning tool, MoVE. The approach            [8] P. Zech, M. Felderer, P. Kalb, and R. Breu, “A Generic Platform for
                                                                         Model-Based Regression Testing,” in Proceedings of the 5th Inter-
consists of the three phases that are controlled by OCL queries,         national Symposium on Leveraging Applications of Formal Methods,
namely, change identification, impact analysis, and test case            Verification and Validation (ISoLA’12), ser. Lecture Notes in Computer
selection. The approaches of Briand et al. and Zech et al. can           Science 7609, T. Margaria and B. Steffen, Eds.         Heraclion, Crete:
                                                                         Springer, Oct. 2012, pp. 112–126.
identify the addition and deletion of generalization relations       [9] H. K. N. Leung and L. J. White, “Insights into Regression Testing,”
between classes. However, they do not identify the impact of             in Proceedings of Conference on Software Maintenance. Miami, FL,
such changes to the inherited and overridden operations along            USA: IEEE, Oct. 1989, pp. 60–69.
                                                                    [10] G. Rothermel and M. J. Harrold, “Analyzing Regression Test Selection
the inheritance hierarchy, which can result in missing some              Techniques,” IEEE Transactions on Software Engineering, vol. 22, no. 8,
retestable test cases.                                                   pp. 529–551, Aug. 1996.
   In contrast to the above mentioned model-based RTS ap-           [11] W. Cazzola, N. A. Rossini, M. Al-Refai, and R. B. France, “Fine-
                                                                         Grained Software Evolution using UML Activity and Class Models,”
proaches, MaRTS can identify changes along the inheritance               in Proceedings of the 16th International Conference on Model Driven
hierarchy and classify test cases accordingly.                           Engineering Languages and Systems (MoDELS’13), ser. Lecture Notes
                                                                         in Computer Science 8107, A. Moreira and B. Schätz, Eds. Miami,
            V. C ONCLUSIONS AND F UTURE W ORK                            FL, USA: Springer, Sep. 2013, pp. 271–286.
   In this work, we presented a model-based RTS approach that       [12] W. Cazzola, N. A. Rossini, P. Bennett, S. Pradeep Mandalaparty, and
                                                                         R. B. France, “Fine-Grained Semi-Automated Runtime Evolution,” in
supports fine-grained changes in method implementation and               MoDELS@Run-Time, ser. Lecture Notes in Computer Science 8378,
changes to the inheritance hierarchy, and takes into account             N. Bencomo, B. Chang, R. B. France, and U. Aßmann, Eds. Springer,
the impact of such changes on the selection of test cases.               Aug. 2014, pp. 237–258.
                                                                    [13] W. Cazzola, S. Pini, A. Ghoneim, and G. Saake, “Co-Evolving Applica-
MaRTS was evaluated on four subjects and compared with two               tion Code and Design Models by Exploiting Meta-Data,” in Proceedings
code-based RTS approaches, DejaVu and ChEOPSJ, which                     of the 22nd Annual ACM Symposium on Applied Computing (SAC’07).
consider changes to the inheritance hierarchy and support                Seoul, South Korea: ACM Press, Mar. 2007, pp. 1275–1279.
                                                                    [14] M. Pukall, A. Grebhahn, R. Schröter, C. Kästner, W. Cazzola, and
Java software. MaRTS outperformed ChEOPSJ and achieved                   S. Götz, “JavAdaptor: Unrestricted Dynamic Software Updates for
comparable results to DejaVu in terms of inclusiveness and               Java,” in Proceedings of the 33rd International Conference on Software
precision. MaRTS was able to identify a certain type of                  Engineering (ICSE’11). Waikiki, Honolulu, Hawaii: IEEE, on 21st-28th
                                                                         of May 2011, pp. 989–991.
obsolete test cases. DejaVu and ChEOPSJ do not address the          [15] M. Pukall, C. Kästner, W. Cazzola, S. Götz, A. Grebhahn, R. Schöter,
identification of obsolete test cases. The retestable test sets          and G. Saake, “JavAdaptor — Flexible Runtime Updates of Java
obtained by MaRTS achieved the same fault detection ability              Applications,” Software—Practice and Experience, vol. 43, no. 2, pp.
                                                                         153–185, Feb. 2013.
that was achieved by the full test sets.
   We will evaluate the inclusiveness and precision of MaRTS
on additional subject programs, and evaluate its efficiency in
terms of reducing regression testing time.
[16] M. Al-Refai, W. Cazzola, S. Ghosh, and R. France, “Using Models to
     Validate Unanticipated, Fine-Grained Adaptations at Runtime,” in
     Proceedings of the 17th IEEE International Symposium on High Assurance
     Systems Engineering (HASE’16), H. Waeselynck and R. Babiceanu, Eds.
     Orlando, FL, USA: IEEE, 7th-9th of Jan. 2016, pp. 23–30.
[17] Q. D. Soetens, S. Demeyer, A. Zaidman, and J. Pérez,
     “Change-Based Test Selection: An Empirical Evaluation,” Empirical Software
     Engineering, pp. 1–43, Nov. 2015.
[18] A. Arcuri, J. Campos, and G. Fraser, “Unit Test Generation During
     Software Development: EvoSuite Plugins for Maven, IntelliJ and
     Jenkins,” in Proceedings of the 9th IEEE International Conference on
     Software Testing, Verification and Validation (ICST’16), L. Briand and
     S. Khurshid, Eds. Chicago, IL, USA: IEEE, Apr. 2016, pp. 401–408.
[19] E. Engström, P. Runeson, and M. Skoglund, “A Systematic Review
     on Regression Test Selection Techniques,” Information and Software
     Technology, vol. 52, no. 1, pp. 14–30, Jan. 2010.
[20] L. J. White and K. Abdullah, “A Firewall Approach for Regression
     Testing of Object-Oriented Software,” in Proceedings of the 10th
     International Software Quality Week (QW’97), San Francisco, CA, USA,
     May 1997.
[21] D. C. Kung, J. Gao, P. Hsia, Y. Toyoshima, and C. Chen, “On Regression
     Testing of Object-Oriented Programs,” Journal of Systems and Software,
     vol. 32, no. 1, pp. 21–40, Jan. 1996.
[22] M. Skoglund and P. Runeson, “Improving Class Firewall Regression
     Test Selection by Removing the Class Firewall,” International Journal
     of Software Engineering and Knowledge Engineering, vol. 17, no. 3, pp.
     359–378, Jun. 2007.
[23] Y. Chen, R. L. Probert, and D. P. Sims, “Specification-Based Regression
     Test Selection with Risk Analysis,” in Proceedings of the Conference
     of the Centre for Advanced Studies on Collaborative Research
     (CASCON’02), D. A. Stewart and J. H. Johnson, Eds. IBM Press, Sep.
     2002, pp. 1–14.
[24] B. Korel, L. H. Tahat, and B. Vaysburg, “Model Based Regression Test
     Reduction Using Dependence Analysis,” in Proceedings of the
     International Conference on Software Maintenance (ICSM’02), G. Antoniol
     and I. D. Baxter, Eds. Montréal, Quebec, Canada: IEEE, Oct. 2002,
     pp. 214–223.

</pre>