=Paper= {{Paper |id=Vol-1848/CAiSE2017_Forum_Paper8 |storemode=property |title=VarMeR – A Variability Mechanisms Recommender for Software Artifacts |pdfUrl=https://ceur-ws.org/Vol-1848/CAiSE2017_Forum_Paper8.pdf |volume=Vol-1848 |authors=Iris Reinhartz-Berger,Anna Zamansky |dblpUrl=https://dblp.org/rec/conf/caise/Reinhartz-Berger17 }} ==VarMeR – A Variability Mechanisms Recommender for Software Artifacts== https://ceur-ws.org/Vol-1848/CAiSE2017_Forum_Paper8.pdf
 VarMeR – A Variability Mechanisms Recommender for
                 Software Artifacts

                         Iris Reinhartz-Berger and Anna Zamansky

                Department of Information Systems, University of Haifa, Israel
                         iris@is.haifa.ac.il, annazam@is.haifa.ac.il



        Abstract. Software is typically not developed from scratch, and reuse of existing
        artifacts is a common practice. Consequently, variants of artifacts exist, challenging
        maintenance and future development. In this paper, we present a tool for identifying
        variants in object-oriented code artifacts (in Java) and guiding their systematic
        reuse. The tool, called VarMeR – a Variability Mechanisms Recommender, utilizes
        known variability mechanisms – techniques applied to adapt generic (reusable)
        artifacts to the context of particular products – both to identify variants and to
        recommend how to reuse them systematically. Building on ontological foundations for
        representing the variability of software behaviors, VarMeR visually presents the
        commonality and variability of the classes in different products, together with
        recommendations on suitable polymorphism variability mechanisms that increase
        systematic reuse.


        Keywords: Software Product Line Engineering, Variability Analysis, Variability
        Mechanisms, Polymorphism, Ontology


1       Introduction

In practice, software reuse often takes an ad-hoc form. Artifacts are frequently not
developed with reuse in mind; instead, reuse is later achieved by duplicating artifacts
and adapting them to particular needs (a clone-and-own approach). Such an approach is
intuitive and easy to follow, but it complicates maintenance and future development.
Various methods have thus been suggested to detect variants, mainly in code, e.g., [1],
[5]. To compare and evaluate clone detection tools, four types of clones are
distinguished in [2]: Type 1 – an exact copy without modifications (except for white
space and comments); Type 2 – a syntactically identical copy (only variable, type, or
function identifiers were changed); Type 3 – a copy with further modifications
(statements were changed, added, or removed); and Type 4 – a syntactically different
copy which performs the same computation. Taking clone detection one step further, a
method named ECCO (Extraction and Composition for Clone-and-Own) is introduced in [3]
to enhance the clone-and-own approach with reuse capabilities. Given a selection of
the desired features by the software engineer, ECCO finds the appropriate software
artifacts to reuse and also provides hints on whether they need adaptation. The
adaptation itself, however, is left to the software engineer.




X. Franch, J. Ralyté, R. Matulevičius, C. Salinesi, and R. Wieringa (Eds.):
CAiSE 2017 Forum and Doctoral Consortium Papers, pp. 57-64, 2017.
Copyright 2017 for this paper by its authors. Copying permitted for private and academic purposes.
   In order to guide the adaptation process and to extract reusable artifacts that make
future development and maintenance easier, we suggested in [11], [12], [13] a framework
for identifying variants of software artifacts and associating them with variability
mechanisms – techniques applied to adapt reusable artifacts to the context of
particular products in Software Product Line Engineering (SPLE) [8]. The framework is
based on ontological foundations in which software artifacts are viewed as things
exhibiting behavior. It allows us to identify similar behaviors (rather than cloned
realizations) and to associate different variability mechanisms based on the
characteristics of the related similarity mappings. To support this approach, we have
developed a tool called VarMeR – a Variability Mechanisms Recommender – which takes as
input object-oriented code artifacts (in Java) belonging to two products and produces
graphs that capture the commonality and variability of the classes of those products.
The tool further recommends how to increase reuse by applying suitable variability
mechanisms to similar classes. Currently, VarMeR supports recommendations on three
mechanisms related to polymorphism.
   The rest of this paper is structured as follows. Section 2 provides the background
of the approach, and Section 3 presents the capabilities of the VarMeR tool. Finally,
Section 4 summarizes the paper and outlines future development plans.


2        The Approach

The approach at the heart of VarMeR analyzes the commonality and variability of
product behaviors and provides reuse recommendations by associating polymorphism
mechanisms with classes that behave similarly (even if their realizations differ).
Accordingly, the approach is composed of three steps, which are shown in Figure 1 and
elaborated next: Extract Behaviors, Compare Behaviors, and Analyze Variability.

   [Figure 1 depicts the approach as a pipeline: the representations of two products,
P1 and P2, pass through the Extract Behaviors step (grounded in the ontological
foundation), the Compare Behaviors step (driven by similarity measures), and the
Analyze Variability step (driven by variability mechanisms); similar elements flow
between the steps, and the final output is a set of reuse recommendations.]

                     Figure 1. A high-level overview of the approach
   Extracting Behaviors. Referring to a software behavior as a triplet of initial state,
external event, and final state [11], this step extracts those behavioral components from
the operations of the different classes. Each class operation specifies some behavior of
the software product. We assume that the operation name captures the essence of the
behavior and thus can describe the external event, e.g., Borrow and Return of a Book
Copy class in a library management system.




   For extracting initial and final states, we distinguish between two levels: shallow
– which refers to the signature of the operation, and deep – which takes into
consideration the operation's behavior in terms of the attributes used and modified
throughout the operation1. The initial state of the behavior is composed of all the
parameters passed to the operation (part of shallow) and all the class attributes used
(read) by the operation (part of deep). The final state consists of the returned type
(part of shallow) and all the class attributes modified (set) by the operation (part of
deep). For the operation Borrow of the Book Copy class, we can consider the attributes
AvailabilityStatus and BorrowingPeriod for the initial state, as they are needed for
the operation to be executed. The attribute AvailabilityStatus is further modified as a
result of the operation execution and hence is considered part of the operation's final
state.
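   To make the extraction concrete, the Borrow behavior described above can be sketched as follows. This is a minimal illustration only: the BookCopy implementation, the string encoding of constituents, and the hard-coded read/write sets are our assumptions; VarMeR derives such sets from the code itself rather than hard-coding them.

```java
// A sketch of a behavior triplet (initial state, external event, final state)
// extracted from the Borrow operation of a hypothetical BookCopy class.
import java.util.Set;

public class BehaviorExtractionSketch {

    // A behavior as a triplet: initial state, external event, final state.
    record Behavior(Set<String> initialState, String externalEvent, Set<String> finalState) {}

    static class BookCopy {
        String availabilityStatus;   // read and modified by borrow
        int borrowingPeriod;         // read by borrow

        // External event "Borrow": reads availabilityStatus and borrowingPeriod,
        // modifies availabilityStatus, returns the borrowing period in days.
        int borrow(String memberId) {
            if (!"AVAILABLE".equals(availabilityStatus)) return -1;
            availabilityStatus = "BORROWED";
            return borrowingPeriod;
        }
    }

    // Hard-coded extraction result for borrow(), mirroring the paper's example:
    // shallow level = parameters and return type; deep level = attributes used/modified.
    static Behavior extractBorrow() {
        Set<String> initial = Set.of("param:memberId:String",   // shallow
                                     "attr:availabilityStatus", // deep (read)
                                     "attr:borrowingPeriod");   // deep (read)
        Set<String> fin = Set.of("return:int",                  // shallow
                                 "attr:availabilityStatus");    // deep (modified)
        return new Behavior(initial, "Borrow", fin);
    }

    public static void main(String[] args) {
        BookCopy copy = new BookCopy();
        copy.availabilityStatus = "AVAILABLE";
        copy.borrowingPeriod = 21;
        int due = copy.borrow("m-1");
        Behavior b = extractBorrow();
        System.out.println(b.externalEvent() + " -> due in " + due + " days; initial="
                           + b.initialState() + " final=" + b.finalState());
    }
}
```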
   Compare Behaviors. After extracting the behaviors and their shallow and deep
levels, a similarity mapping between their constituents is applied. This mapping can
be based on existing general-purpose or domain-specific similarity metrics, or on some
combination of such metrics. The metrics can take into account semantic
considerations, using semantic nets or statistical techniques to measure the distances
among words and terms [10]. Alternatively, they can use type or schematic
similarities, potentially ignoring the semantic roles or essence of the compared
elements [6]. The similarity mapping associates with each operation's constituent
(shallow or deep) all of its similar counterparts in the other operation (i.e., the
elements whose similarity with the given constituent exceeds some predefined
threshold).
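   The thresholded mapping can be sketched as follows. The prefix-overlap metric below is only a stand-in for the semantic metrics the approach actually relies on, and the constituent names are illustrative:

```java
// A sketch of the similarity mapping: each constituent of one operation is
// mapped to all constituents of the other whose similarity exceeds a threshold.
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SimilarityMappingSketch {

    // Stand-in similarity metric: normalized longest-common-prefix length.
    static double similarity(String a, String b) {
        int n = Math.min(a.length(), b.length()), i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) i++;
        return (double) i / Math.max(a.length(), b.length());
    }

    // Map each constituent of op1 to all of its similar counterparts in op2.
    static Map<String, List<String>> map(List<String> op1, List<String> op2, double threshold) {
        Map<String, List<String>> mapping = new LinkedHashMap<>();
        for (String c1 : op1) {
            List<String> counterparts = new ArrayList<>();
            for (String c2 : op2)
                if (similarity(c1, c2) > threshold) counterparts.add(c2);
            mapping.put(c1, counterparts);
        }
        return mapping;
    }

    public static void main(String[] args) {
        var m = map(List.of("availabilityStatus", "borrowingPeriod"),
                    List.of("availability", "loanPeriod"), 0.5);
        System.out.println(m);
    }
}
```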
   Analyze Variability. Suppose that the constituents of two operations o1 and o2 in
classes C1 and C2 respectively are mapped using a similarity mapping sim, namely, the
similarity of their constituents exceeded some predefined threshold. We can distinguish
between the following situations with respect to sim:
   1.   USE – each constituent of o1 has exactly one counterpart in o2 and vice versa.
   2.   REF (abbreviation for refinement) – at least one constituent in o1 has more than one
        counterpart in o2.
   3.   EXT (abbreviation for extension) – at least one constituent in o1 has no counterpart in
        o2.
   Note that REF and EXT are not mutually exclusive; we refer to a combination of
both as REF-EXT (abbreviation for refined extension).
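   The classification into these categories can be sketched as follows. This is a minimal sketch: it inspects only the mapping from o1 to o2, whereas the full USE condition also requires the inverse check from o2 back to o1, which is omitted here for brevity:

```java
// A sketch of classifying a similarity mapping into the USE, REF, EXT, and
// REF-EXT categories. The mapping associates each constituent of operation o1
// with its similar counterparts in operation o2.
import java.util.List;
import java.util.Map;

public class MappingClassifierSketch {

    enum Category { USE, REF, EXT, REF_EXT }

    static Category classify(Map<String, List<String>> sim) {
        boolean refined  = sim.values().stream().anyMatch(c -> c.size() > 1); // REF
        boolean extended = sim.values().stream().anyMatch(List::isEmpty);     // EXT
        if (refined && extended) return Category.REF_EXT;
        if (refined)  return Category.REF;
        if (extended) return Category.EXT;
        // Every constituent has exactly one counterpart (the inverse check
        // required by the full USE definition is omitted in this sketch).
        return Category.USE;
    }

    public static void main(String[] args) {
        System.out.println(classify(Map.of("a", List.of("x"), "b", List.of("y", "z"))));
    }
}
```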
   Aggregating the above notions from the level of operations to the level of classes,
we aim to recommend appropriate variability mechanisms. Our current focus is on the
common polymorphism mechanisms. Polymorphism is the provision of a single interface to
entities of different types. Therefore, cases of polymorphism are characterized by
similar signatures of operations (namely, the USE category at the shallow level of the
operations). We further focus on three types of polymorphism which are widely used in
industry [14]: subtype (inclusion) polymorphism (e.g., function pointers,
inheritance), parametric polymorphism (e.g., C++ templates), and overloading. Table 1
presents recommendations for those polymorphism mechanisms based on the reuse mapping
characteristics.


1 We consider only attributes and ignore local variables, as the latter may be defined
   for implementation and realization purposes and may obscure the essence of the
   operation's behavior.




              Table 1. Characteristics of Polymorphism Variability Mechanisms

  Shallow | Deep       | Description                     | Variability mechanism    | Recommendation
  --------|------------|---------------------------------|--------------------------|---------------------------------------------
  USE     | USE        | Both signatures and behaviors   | Parametric polymorphism  | Add the complete behavior or a behavior
          |            | are similar                     |                          | template as a core asset and utilize
          |            |                                 |                          | parametric polymorphism
  USE     | REF        | Signatures are similar and      | Subtype polymorphism     | Add the behavior as a core asset and
          |            | behavior is refined             |                          | utilize subtype polymorphism
  USE     | EXT        | Signatures are similar and      | Subtype polymorphism     | Add the behavior as a core asset and
          |            | behavior is extended            |                          | utilize subtype polymorphism; use
          |            |                                 |                          | (procedure) calls to the less extended code
  USE     | REF-EXT    | Signatures are similar and      | Subtype polymorphism     | As with refinement and extension
          |            | behavior is both refined and    |                          |
          |            | extended                        |                          |
  USE     | Not mapped | Signatures are similar and      | Overloading              | Add the behavior interface as a core asset
          |            | behavior is different           |                          | and utilize overloading
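   The three mechanisms that Table 1 recommends can be illustrated in Java as follows. The Repository, Borrowable, BookCopy, and Magazine names are illustrative examples of ours, not artifacts from the paper:

```java
// A minimal illustration of the three polymorphism mechanisms VarMeR
// recommends: parametric polymorphism (generics), subtype polymorphism
// (inheritance), and overloading.
import java.util.ArrayList;
import java.util.List;

public class PolymorphismSketch {

    // Parametric polymorphism: one generic core asset reused for any item type.
    static class Repository<T> {
        private final List<T> items = new ArrayList<>();
        void add(T item) { items.add(item); }
        int size() { return items.size(); }
    }

    // Subtype polymorphism: a common core asset from which variants inherit,
    // refining the shared behavior.
    static abstract class Borrowable {
        abstract int borrowingPeriod();   // refined per variant
    }
    static class BookCopy extends Borrowable {
        int borrowingPeriod() { return 21; }
    }
    static class Magazine extends Borrowable {
        int borrowingPeriod() { return 7; }
    }

    // Overloading: similar signatures with different, unmapped behaviors
    // behind a single interface name.
    static String describe(BookCopy b) { return "book:" + b.borrowingPeriod(); }
    static String describe(Magazine m) { return "magazine:" + m.borrowingPeriod(); }

    public static void main(String[] args) {
        Repository<Borrowable> repo = new Repository<>();
        repo.add(new BookCopy());
        repo.add(new Magazine());
        System.out.println(repo.size() + " items; " + describe(new BookCopy()));
    }
}
```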


3         The VarMeR Tool

We implemented the approach in Java. The main inputs of the tool – VarMeR – are jar
files implementing the software products (or applications) to be compared. Figure 2
presents the user interface of VarMeR: besides the names of and paths to the compared
jar files, the tool supports the selection of similarity-related information,
including thresholds, weights, and measures. Similarity measures define the way
similarity is calculated. VarMeR currently supports the text semantic similarity
metric of Mihalcea, Corley, and Strapparava (MCS) [10], which combines corpus-based
and knowledge-based measures; the Latent Semantic Analysis (LSA) metric [9]; and the
UMBC top-N similar words and phrase similarity metric [4]. The element and parameter
name weights define the ratio between the name and type similarities of elements
(operations or attributes) and parameters, respectively. These weights are taken into
consideration when comparing behaviors. As the names of parameters are more often
meaningless (compared to attribute and operation names), the tool supports separate
weights. Finally, the similarity threshold defines the minimal value above which
elements are considered similar.
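   The interplay of weights and threshold can be sketched as a weighted combination of name and type similarity. The combination formula, the weight values, and the 0.8 threshold below are illustrative assumptions, not VarMeR's actual settings:

```java
// A sketch of combining name and type similarity with a configurable weight,
// then applying the similarity threshold.
public class WeightedSimilaritySketch {

    // nameWeight in [0,1]: contribution of name similarity vs. type similarity.
    static double combined(double nameSim, double typeSim, double nameWeight) {
        return nameWeight * nameSim + (1 - nameWeight) * typeSim;
    }

    static boolean similar(double nameSim, double typeSim, double nameWeight, double threshold) {
        return combined(nameSim, typeSim, nameWeight) > threshold;
    }

    public static void main(String[] args) {
        // Parameter names are often meaningless, so their name weight is low
        // and type similarity dominates.
        System.out.println(similar(0.2, 1.0, 0.1, 0.8));
        // Element (operation/attribute) names carry meaning, so their name
        // weight is higher.
        System.out.println(similar(0.9, 0.5, 0.7, 0.8));
    }
}
```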
    The jar files of the compared products are reverse engineered into class diagrams
(in XMI format) and Program Dependence Graphs (PDGs)2 [7] (in JSON format). The
shallow and deep levels of the behaviors are extracted from those representations.
Then the behaviors are compared utilizing the similarity-related information provided
as input.



2   PDG explicitly represents the data and control dependencies of a program.




Finally, variability is analyzed using the features of the three types of polymorphism
(see Table 1).




                        Figure 2. A screenshot of VarMeR inputs
   The outcome of VarMeR is presented visually in the form of graphs. Each graph,
comparing two products, shows the classes of the two products in different colors (one
for the first product and one for the second product). The size of a node is
proportional to the number of operations in the corresponding class (the larger the
node, the more operations the class has). When hovering over a node with the mouse, a
tooltip showing the behavior appears, presenting the list of all operations of the
current class. This way the user (e.g., a programmer or a code reviewer) can get a
general idea of the role of each class, not just by its name, but also by the behavior
it is expected to support.
   The edges of the graphs (links between classes) represent recommendations on
variability mechanisms, where:
   • the label on the edge (link) indicates which variability mechanisms were
     identified: parametric, subtyping, and/or overloading.
   • the width of the edge (link), as well as its length, represents the degree of
     evidence (i.e., the number of operations related to a certain type of
     polymorphism; the thicker/longer the link is, the more evidence exists for using
     the recommended variability mechanism).
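   The aggregation of operation-level evidence into an edge recommendation can be sketched as follows. The category-to-mechanism grouping mirrors Table 1, while the category labels, threshold value, and counting scheme are our illustrative assumptions:

```java
// A sketch of aggregating evidence for an edge: count how many operation
// pairs fall into each (shallow/deep) category and recommend a mechanism
// once its evidence count reaches a per-mechanism threshold.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EvidenceAggregationSketch {

    // Count operation pairs per category, e.g. "USE/USE" pairs support
    // parametric polymorphism (see Table 1).
    static Map<String, Long> tally(List<String> pairCategories) {
        Map<String, Long> counts = new HashMap<>();
        for (String c : pairCategories) counts.merge(c, 1L, Long::sum);
        return counts;
    }

    static List<String> recommend(Map<String, Long> counts, long threshold) {
        List<String> recs = new ArrayList<>();
        if (counts.getOrDefault("USE/USE", 0L) >= threshold) recs.add("parametric");
        long subtype = counts.getOrDefault("USE/REF", 0L)
                     + counts.getOrDefault("USE/EXT", 0L)
                     + counts.getOrDefault("USE/REF-EXT", 0L);
        if (subtype >= threshold) recs.add("subtyping");
        if (counts.getOrDefault("USE/unmapped", 0L) >= threshold) recs.add("overloading");
        return recs;
    }

    public static void main(String[] args) {
        var counts = tally(List.of("USE/USE", "USE/USE", "USE/REF", "USE/EXT"));
        System.out.println(recommend(counts, 2));
    }
}
```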
   An example of VarMeR's output is depicted in Figure 3. The numbers in parentheses
in the tooltip indicate the numbers of operations with certain names (but different
signatures). The top of the screen includes controls that allow defining the
thresholds above which a given mechanism (parametric, subtyping, or overloading) will
be presented. In other words, these weights separately control the minimal numbers of
operation pairs that need to satisfy certain constraints (USE, REF, EXT, REF-EXT) so
that the given mechanism will be recommended. In the top right part of the screen, the
user can hide classes unrelated to classes in the other product (such as testFlight in
Figure 3). The bottom of the screen supports selecting colors for the classes of each
of the two products and for the links between them. Note that currently, VarMeR
compares classes from different products (and not classes from the same product, which
may further increase reuse). Hence, links connect nodes of alternating colors,
potentially connecting a single node to several nodes representing classes in the
other product (see scribeFilter in Figure 3 as an example).




                       Figure 3. An example of VarMeR output
   The tool further enables zooming into the relations (links) among classes, to gain
a better understanding of how the recommendations are generated and how to apply the
recommended variability mechanisms and systematize the reuse of the corresponding
classes. This option presents the related operations of the classes and the categories
to which the links among them belong: USE, REF, EXT, REF-EXT. Figure 4 zooms into the
subtyping link of Figure 3, presenting the EXT and REF-EXT relations among the
operations of the corresponding classes. This mapping can be used by a programmer to
create a class from which the two compared classes (ScribeGui and ScribeBinder) can
inherit through subtype polymorphism.




                     Figure 4. Zooming into a relation in VarMeR


4      Summary and Future Work

We presented a tool, called VarMeR – a Variability Mechanisms Recommender, which is
based on ontological foundations for representing the variability of the behaviors of
software products. The inputs of VarMeR are object-oriented code artifacts (jar files)
belonging to different software products. The outputs are graphs which can be used for
analyzing the commonality and variability of the classes in different products and for
recommending suitable polymorphism variability mechanisms to increase systematic
reuse.
    In the future, we intend to extend the tool support in several ways. First, VarMeR
will be extended to support further variability mechanisms besides polymorphism.
Another direction is incorporating VarMeR into existing programming environments, so
that it will be relatively easy to apply the recommendations generated by VarMeR.
Furthermore, we intend to empirically explore the usefulness of VarMeR and the quality
of its outcomes.

Acknowledgment. The authors would like to thank Jonathan Liberman, Alex Kogan, and
Asaf Mor for their help in the implementation of the VarMeR tool. The second author
was supported by the Israel Science Foundation under grant agreement 817/15.




References

[1] Baker, B. S. (2007). Finding Clones with Dup: Analysis of an Experiment. IEEE
     Transactions on Software Engineering 33 (9), pp. 608-621.
[2] Bellon, S., Koschke, R., Antoniol, G., Krinke, J., and Merlo, E. (2007).
     Comparison and Evaluation of Clone Detection Tools. IEEE Transactions on Software
     Engineering 33 (9), pp. 577-591.
[3] Fischer, S., Linsbauer, L., Lopez-Herrejon, R. E., and Egyed, A. (2014). Enhancing
     Clone-and-Own with Systematic Reuse for Developing Software Variants. IEEE
     International Conference on Software Maintenance and Evolution, pp. 391-400.
[4] Han, L., Kashyap, A., Finin, T., Mayfield, J., and Weese, J. (2013). UMBC
     EBIQUITY-CORE: Semantic Textual Similarity Systems. Proceedings of the Second
     Joint Conference on Lexical and Computational Semantics (Vol. 1), pp. 44-52.
[5] Kamiya, T., Kusumoto, S., and Inoue, K. (2002). CCFinder: A Multilinguistic
     Token-Based Code Clone Detection System for Large Scale Source Code. IEEE
     Transactions on Software Engineering 28, pp. 654-670.
[6] Kashyap, V. and Sheth, A. (1996). Semantic and Schematic Similarities between
     Database Objects: A Context-Based Approach. The VLDB Journal 5 (4), pp. 276-304.
[7] Krinke, J. (2001). Identifying Similar Code with Program Dependence Graphs. 8th
     Working Conference on Reverse Engineering, pp. 301-309.
[8] Lee, J. and Hwang, S. (2013). A Review on Variability Mechanisms for Product
     Lines. ICCA 2013, ASTL vol. 24, pp. 1-4.
[9] Landauer, T. K., Foltz, P. W., and Laham, D. (1998). Introduction to Latent
     Semantic Analysis. Discourse Processes 25, pp. 259-284.
[10] Mihalcea, R., Corley, C., and Strapparava, C. (2006). Corpus-Based and
     Knowledge-Based Measures of Text Semantic Similarity. American Association for
     Artificial Intelligence (AAAI'06), pp. 775-780.
[11] Reinhartz-Berger, I., Zamansky, A., and Wand, Y. (2016). An Ontological Approach
     for Identifying Software Variants: Specialization and Template Instantiation.
     35th International Conference on Conceptual Modeling (ER'2016), pp. 98-112.
[12] Reinhartz-Berger, I., Zamansky, A., and Kemelman, M. (2015). Analyzing
     Variability of Cloned Artifacts: Formal Framework and Its Application to
     Requirements. Enterprise, Business-Process and Information Systems Modeling,
     EMMSAD'2015, pp. 311-325.
[13] Reinhartz-Berger, I., Zamansky, A., and Wand, Y. (2015). Taming Software
     Variability: Ontological Foundations of Variability Mechanisms. 34th
     International Conference on Conceptual Modeling (ER'2015), LNCS 9381, pp.
     399-406.
[14] Zhang, B., Duszynski, S., and Becker, M. (2016). Variability Mechanisms and
     Lessons Learned in Practice. 1st International Workshop on Variability and
     Complexity in Software Design (VACE'2016), pp. 14-20.



