Improving Model-Based Regression Test Selection
                                                       Mohammed Al-Refai
                                                    Computer Science Department
                                                      Colorado State University
                                                       Fort Collins, CO, USA
                                                   Email: al-refai@cs.colostate.edu


   Abstract—Existing model-based regression test selection ap-        lack of traceability is a known issue in model-based RTS
proaches are based on analyzing changes performed at the              approaches, and can severely limit the applicability of these
model level. These approaches have three limitations. First, they     approaches [9].
cannot detect all types of changes from design models. Second,
they do not identify the impact of changes to the inheritance                              II. R ELATED W ORK
hierarchy of the classes. Third, their applicability is limited due
to the abstraction gap between the code-level regression test cases      The RTS problem has been studied for over three
and the models that represent the software system at a high           decades [10], [9]. Most of the existing approaches are code-
level of abstraction. This paper discusses two model-based RTS        based [11], [12], [13], [14], [3], [4], and little work exists
approaches to overcome these limitations, the evaluation plan,
and the current status.                                               in the literature on model-based RTS. We summarizes the
   Index Terms—regression testing, model-based regression test        limitations of existing model-based RTS approaches. Farooq
selection, UML activity diagram, UML class diagram, fuzzy logic       et al. [7] used UML class and state machine models for
                                                                      RTS. This approach does not support the identification of
                                                                      (1) the addition and deletion of the generalization relations,
                          I. P ROBLEM                                 and (2) the overridden and inherited operations along the
   Models can be used to perform evolution and runtime                inheritance hierarchy. Briand et al. [6] presented an RTS
adaptation of a software system [1], [2]. Regression testing          approach based on UML use case models, class models, and
of the evolved and adapted models is needed to ensure                 sequence models. This approach can identify the addition and
that previously tested functionality is still correct. Regression     deletion of generalization relations between classes. Zech et
testing is one of the most expensive activities performed during      al. [8] presented a generic model-based RTS platform, which
the lifecycle of a software system [3], [4]. Regression test          is based on the model versioning tool, MoVE. Briand et al.
selection (RTS) [5] improves regression testing efficiency by         and Zech et al. do not support the identification of inherited
selecting a subset of test cases from the original test set           and overridden operations along the inheritance hierarchy.
for regression testing [5], [6]. Model-based RTS has several          Korel et al. [15] used control and data dependencies in an
advantages over code-based RTS. The effort for testing can be         extended finite state machine to identify the impact of model
estimated earlier [6], [7], [8], tools for regression testing can     changes and perform RTS. This approach does not use UML
be programming language independent [6], [8], and managing            class model. All of the mentioned approaches use design-
traceability can be more practical at the model level [6], [8].       time models, and require these models to contain enough
Model-based RTS can scale up better than code-based RTS               information to obtain the coverage of test cases at the model
for large software systems [9].                                       level, which is not always a common practice [6].
   Existing model-based RTS approaches suffer from the fol-
lowing three problems. First, they cannot detect all types                             III. P ROPOSED S OLUTION
of changes from UML class, sequence, and state machine                  In this work we propose two model-based RTS approaches.
diagrams that are used in these approaches [6], [7], [8]. Briand      The first approach called MaRTS addresses the first two
et al. [6] provided an example for such a change, which is            problems discussed in section I. The second approach called
a modification to an operation implementation that does not           FLiRTS uses fuzzy logic to address the third problem dis-
affect the operation’s signature and contract. Second, they           cussed in section I.
do not support the identification of changes to inherited and
overridden operations along the inheritance hierarchy [6], [7],       A. MaRTS
[8]. As a result, existing model-based RTS approaches can                MaRTS [16] is a model-based RTS approach that uses (1) a
miss relevant test cases that need to be reexecuted. The third        UML design class diagram to represent the static structure of
problem is the lack of traceability links between code-level          a software system, and (2) UML activity diagrams to represent
test cases and the models representing the software system.           the fine-grained behaviors of a software system. The class
The reason is that models are generally created at a high level       and activity diagrams are reverse engineered from the original
of abstraction and lack low-level details that are needed to          version of the software system. Each method of the software
obtain the coverage of test cases at the model level. This            system and each test case is represented as an activity diagram.
   The activity diagrams used in MaRTS are detailed and                 to an operation in the class diagram. This level of abstraction
executable. MaRTS exploits the Rational Software Architect              prevents obtaining the coverage of test cases at the model level.
(RSA) simulation toolkit 9.01 to execute test cases at the model        We propose two solutions for this problem.
level. Each action node has an associated code snippet that                Activity diagram-based solution. This solution is based
contains code statements. When the execution flow reaches an            on automatically generating detailed activity diagrams called
action node in a model, the code snippet associated with the            refinements from the abstract activity diagrams [18]. A refine-
action node is executed. Each code-level method invocation              ment is an activity diagram that contains more flows and nodes
statement is represented at the model level as a call to the            compared to the one that it is refining it. FLiRTS assumes
corresponding activity diagram. When the model execution                that a UML sequence diagram that represents all the use case
flow reaches such a call, the associated activity diagram is            scenarios of the system is available, and is used to control
executed [17], [16]. MaRTS is based on the following steps.             the refinement generation process to avoid the generation of
   Extract information from the UML class diagrams. An                  completely inconsistent and unrelated activity diagrams. Each
operations-table is extracted from the original class diagram.          activity diagram in the system model has several possible
This table stores for each class the operations that are declared       refinements. A refined system model is the system model where
and inherited by the class, and the name of its superclass.             each activity diagram is replaced by one of its refinements.
When developers adapt the class diagram, the declared and               Several combinations of the refinements are possible, which
inherited operations in each class might change. Therefore,             leads to the creation of several refined system models. A test
an operations-table is also extracted from the adapted class            case may or may not traverse activity diagrams in a refined
diagram.                                                                system model depending on which refinements are used, and
   Calculate traceability matrix. This step is performed                their correctness.
before adapting the models. The test cases are executed at                 Fuzzy logic is used to address this uncertainty. We define
the model level, and coverage information is collected for each         two input variables and set their crisp values in terms of (1) the
test case: (1) what activity diagrams are executed, and (2) what        extent to which a test case traverses modified activity diagrams
flows in each activity diagram are executed. This information           in a refined system model, and (2) the extent to which a test
is used to obtain the traceability matrix that relates each test        case traverses correct refinements in a refined system model.
case to the activity diagrams and the flows that were traversed         These values are provided as inputs to the fuzzy logic-based
by the test case.                                                       classifier. The final results of the classifier for each test case are
   Identify model changes. MaRTS uses RSA model compar-                 the probabilities for Retestable and Reusable associated with
ison2 to identify fine-grained model changes after developers           each refined system model. We use the most correct refined
adapt the class and activity diagrams. This tool can identify           system model to obtain the final classification for the test case.
fine-grained changes in the activity diagrams at the level                 Class diagram-based solution. First, FLiRTS reads the
of flows, nodes, and code statements associated with action             design class diagram and extracts relations between the classi-
nodes.                                                                  fiers. We consider the association, composition, generalization,
   Classify test cases. Our algorithm classifies the test cases         realization, and usage relations. The usage relation type that
as obsolete, retestable, or reusable. Initially, all the test cases     we consider is defined as the one from a class/interface C to
are classified as reusable. Next, the algorithm compares the            a class/interface D, where C has an operation with a return
operations-tables to identify which operations were changed             type and/or a parameter type of D.
along the inheritance hierarchy. The traceability matrix is                Second, FLiRTS uses the extracted relations to build a
used to determine each test case that is affected by those              relations graph. In this graph, the classifiers are nodes, and
changes, and the affected test cases are classified as retestable       the extracted relations are directed edges between the nodes.
or obsolete. The remaining test cases are classified based on           Third, FLiRTS uses the activity diagrams representing the
the identified model differences.                                       test cases along with the paths between the classifiers in the
B. FLiRTS                                                               relations graph to estimate the coverage of each test case.
                                                                        Based on the relations graph, there can be multiple paths
   FLiRTS is a fuzzy logic-based RTS approach that performs
                                                                        that start from a test case and end in adapted/evolved classes.
RTS using design models that are at a high level of abstraction.
                                                                        However, we are uncertain about which one of these paths is
FLiRTS uses a UML design class diagram to represent the
                                                                        traversed by the test case. We use fuzzy logic to address this
static structure of the system and UML activity diagrams to
                                                                        uncertainty for each test case by considering the number of
represent the behaviors of the system’s methods at a high level
                                                                        such paths, their length, and types of their edges.
of abstraction. In contrast to MaRTS, action nodes in these
activity diagrams are not executable, and do not contain code                  IV. P LAN FOR E VALUATION AND VALIDATION
statements. Each test case is modeled as an activity diagram
that includes call behavior nodes, each of which directly links           We plan to empirically evaluate MaRTS and FLiRTS using
                                                                        subjects as follows:
  1 http://www-03.ibm.com/software/products/en/ratisoftarchsimutool
  2 https://www.ibm.com/developerworks/rational/library/05/712_comp2/    1) Compare the inclusiveness and precision of MaRTS and
index.html                                                                  FLiRTS with that of two code-based RTS approaches.
 2) Evaluate the fault detection ability of the reduced test                               ACKNOWLEDGMENT
    sets.                                                           This material is based upon work supported by the National
 3) Identify generalizable thresholds for the fuzzy sets/rules    Science Foundation under Grant No. CNS 1305381.
    used in FLiRTS.
                                                                                               R EFERENCES
               V. E XPECTED C ONTRIBUTIONS
                                                                   [1] W. J. Dzidek, E. Arisholm, and L. C. Briand, “A realistic empirical
   This work will contribute to the modeling field by showing          evaluation of the costs and benefits of uml in software maintenance,”
that the proposed RTS techniques are feasible and produce              IEEE Transactions on software engineering, vol. 34, no. 3, pp. 407–432,
                                                                       2008.
results that are comparable to that of code-based RTS. We          [2] G. S. Blair, N. Bencomo, and R. B. France, “Models@run.time,” IEEE
expect that MaRTS will improve the safety and precision of             Computer, vol. 42, no. 10, pp. 22–27, Oct. 2009.
model-based RTS, and FLiRTS will improve the applicabil-           [3] G. Rothermel and M. J. Harrold, “A Safe, Efficient Regression Test
                                                                       Selection Technique,” ACM Transactions on Software Engineering and
ity, safety, and precision of model-based RTS when applied             Methodology, vol. 6, no. 2, pp. 173–210, Apr. 1997.
with models that are at a high level of abstraction and lack       [4] M. J. Harrold, J. A. Jones, T. Li, D. Liang, A. Orso, M. Pennings,
traceability with the test cases.                                      S. Sinha, S. A. Spoon, and A. Gujarathi, “Regression Test Selection
                                                                       for Java Software,” in Proceedings of the 16th Conference on Object-
   MaRTS and FliRTS can be used within the contexts of                 Oriented Programming, Systems, Languages, and Applications (OOP-
model-based evolution and runtime adaptation. For example,             SLA’01), J. Vlissides, Ed. Tampa, FL, USa: ACM, Oct. 2001, pp.
the approaches that use models to perform runtime adapta-              312–326.
                                                                   [5] M. J. Harrold, “Testing Evolving Software,” Journal of Systems and
tion [19], [20], [21], [22] can utilize MaRTS and FLiRTS to            Software, vol. 47, no. 2-3, pp. 173–181, Jul. 1999.
classify regression test cases based on model-level changes.       [6] L. C. Briand, Y. Labiche, and S. He, “Automating Regression Test
                                                                       Selection Based on UML Designs,” Journal on Information and Software
                    VI. C URRENT S TATUS                               Technology, vol. 51, no. 1, pp. 16–30, Jan. 2009.
   We implemented a prototype tool that automates the process      [7] Q. Farooq, M. Z. Z. Iqbal, Z. I. Malik, and M. Riebisch, “A model-based
                                                                       regression testing approach for evolving software systems with flexible
of MaRTS. We compared MaRTS with two code-based RTS                    tool support,” in 17th IEEE International Conference and Workshops
approaches, DejaVu [4] and ChEOPSJ [14], in terms of their             on the Engineering of Computer-Based Systems, ECBS 2010, Oxford,
inclusiveness, precision, the fault detection ability of the           England, UK, 22-26 March 2010, 2010, pp. 41–49.
                                                                   [8] P. Zech, M. Felderer, P. Kalb, and R. Breu, “A Generic Platform for
retestable test set. We used four subject programs: JUNG,              Model-Based Regression Testing,” in Proceedings of the 5th Inter-
Siena, XML-security, and chess, which is a classroom project.          national Symposium on Leveraging Applications of Formal Methods,
   We extracted class and activity diagrams from the original          Verification and Validation (ISoLA’12), ser. Lecture Notes in Computer
                                                                       Science 7609, T. Margaria and B. Steffen, Eds.         Heraclion, Crete:
version of each subject program and its test cases. We adapted         Springer, Oct. 2012, pp. 112–126.
the diagrams from one version to the following version. Then,      [9] S. Yoo and M. Harman, “Regression Testing Minimization, Selection
we applied MaRTS, DejaVu, and ChEOPSJ to classify the test             and Prioritization: A Survey,” Journal of Software Testing, Verification
                                                                       and Reliability, vol. 22, no. 2, pp. 67–120, Mar. 2012.
cases. We used PIT3 to apply method-level mutation operators      [10] E. Engström, P. Runeson, and M. Skoglund, “A Systematic Review
to the adapted versions at the code level. We ran PIT with             on Regression Test Selection Techniques,” Information and Software
both the original and retestable test sets on each version, and        Technology, vol. 52, no. 1, pp. 14–30, Jan. 2010.
                                                                  [11] L. J. White and K. Abdullah, “A Firewall Approach for Regression
reported the killed mutants by each test set. The inclusiveness        Testing of Object-Oriented Software,” in Proceedings of the 10th In-
of DejaVu and MaRTS was 100% for all the programs. The                 ternational Software Quality Week (QW’97), San Francisco, CA, USA,
inclusiveness of ChEOPSJ was 94% for JUNG, 96% for Chess,              May 1997.
                                                                  [12] D. C. Kung, J. Gao, P. Hsia, Y. Toyoshima, and C. Chen, “On Regression
and 92% for Siena. The precision was 100% for MaRTS and                Testing of Object-Oriented Programs,” Journal of Systems and Software,
DejaVu. The precision of ChEOPSJ was 100% for JUNG and                 vol. 32, no. 1, pp. 21–40, Jan. 1996.
Chess, and 62% for Siena. ChEOPSJ did not produce results         [13] M. Skoglund and P. Runeson, “Improving Class Firewall Regression
                                                                       Test Selection by Removing the Class Firewall,” International Journal
for XML-security because of a bug in this tool. The retestable         of Software Engineering and Knowledge Engineering, vol. 17, no. 3, pp.
test sets obtained by MaRTS achieved the same fault detection          359–378, Jun. 2007.
ability that was achieved by the full test sets.                  [14] Q. D. Soetens, S. Demeyer, A. Zaidman, and J. Pérez, “Change-
                                                                       Based Test Selection: An Empirical Evaluation,” Empirical Software
   We conducted a preliminary evaluation of FLiRTS. We                 Engineering, pp. 1–43, Nov. 2015.
obtained comparable results on inclusiveness and precision        [15] B. Korel, L. H. Tahat, and B. Vaysburg, “Model Based Regression
with DejaVu and MaRTS [18].                                            Test Reduction Using Dependence Analysis,” in Proceedings of the
                                                                       International Conference on Software Maintenance (SM’02). Montreal,
 VII. P ROPOSED T IMELINE FOR THE R EMAINING TASKS                     Quebec, Canada: IEEE, Oct. 2002, pp. 214–233.
                                                                  [16] M. Al-Refai, S. Ghosh, and W. Cazzola, “Model-based Regression Test
  August - October 2017:                                               Selection for Validating Runtime Adaptation of Software Systems,” in
  ∙ Collect subjects for the experiments of FLiRTS.                    Proceedings of the 9th IEEE International Conference on Software Test-
                                                                       ing, Verification and Validation (ICST’16), L. Briand and S. Khurshid,
  ∙ Complete the implementation of FLiRTS.                             Eds. Chicago, IL, USA: IEEE, 10th-15th of Apr. 2016, pp. 288–298.
  ∙ Perform the experiments using FLiRTS and DejaVu.              [17] M. Al-Refai, W. Cazzola, S. Ghosh, and R. France, “Using Models to
                                                                       Validate Unanticipated, Fine-Grained Adaptations at Runtime,” in Pro-
October - December 2017:                                               ceedings of the 17th IEEE International Symposium on High Assurance
  ∙ Analyze and report the results of the experiments.                 Systems Engineering (HASE’16), H. Waeselynck and R. Babiceanu, Eds.
  ∙ Write the dissertation.                                            Orlando, FL, USA: IEEE, 7th-9th of Jan. 2016, pp. 23–30.

 3 http://pitest.org
[18] M. Al-refai, W. Cazzola, and S. Ghosh, “A Fuzzy Logic Based Approach        vol. 42, no. 10, pp. 44–51, Oct. 2009.
     for Model-based Regression Test Selection,” in Proceedings of the      [21] T. Vogel and H. Giese, “Adaptation and Abstract Runtime Models,”
     ACM/IEEE 20th International Conference on Model Driven Engineering          in Proceedings of the ICSE Workshop on Software Engineering for
     Languages and Systems (MoDELS’17), Austin, TX, USA, 17th of Sep.-           Adaptive and Self-Managing Systems (SEAMS’10). Cape Town, South
     22th of Sep. 2017.                                                          Africa: ACM, May 2010, pp. 39–48.
[19] D. Garlan, S.-W. Cheng, A.-C. Huang, B. Schmerl, and P. Steenkiste,    [22] W. Cazzola, N. A. Rossini, P. Bennett, S. Pradeep Mandalaparty, and
     “Rainbow: Architecture-Based Self Adaptation with Reusable Infras-          R. B. France, “Fine-Grained Semi-Automated Runtime Evolution,” in
     tructure,” IEEE Computer, vol. 37, no. 10, pp. 46–54, Oct. 2004.            MoDELS@Run-Time, ser. Lecture Notes in Computer Science 8378,
[20] B. Morin, O. Barais, J.-M. Jézéquel, F. Fleurey, and A. Solberg,            N. Bencomo, B. Cheng, R. B. France, and U. Aßmann, Eds. Springer,
     “Models@Run.time to Support Dynamic Adaptation,” IEEE Computer,             Aug. 2014, pp. 237–258.