=Paper= {{Paper |id=Vol-2070/paper-05 |storemode=property |title=Analysis of a Clone-and-Own Industrial Automation System: An Exploratory Study |pdfUrl=https://ceur-ws.org/Vol-2070/paper-05.pdf |volume=Vol-2070 |authors=Nick Lodewijks }} ==Analysis of a Clone-and-Own Industrial Automation System: An Exploratory Study== https://ceur-ws.org/Vol-2070/paper-05.pdf
       Analysis of a Clone-and-Own Industrial Automation
                 System: An Exploratory Study

                                                  Nick Lodewijks
                                             University of Amsterdam,
                                                 The Netherlands
                                             nicklodewijks@gmail.com



                                                                   many practitioners, mainly because of its simplicity
                                                                   and availability (Dubinsky et al., 2013).
                       Abstract                                        While the general belief is that clone-and-own is
                                                                   a bad and unsustainable development technique, it
    In industry, the development of similar prod-                  has been used successfully for the development of
    ucts is often addressed by cloning and modi-                   the MES-Toolbox; a large (±1 million lines of Java
    fying existing artifacts. This so-called clone-                code) proprietary factory automation system. Over
    and-own approach is often considered to be a                   the past 17 years, for each new customer, an exist-
    bad practice but is perceived as a favorable                   ing system was cloned and modified in any possible
    and natural software reuse approach by many                    way to add, modify or remove functionality. With
    practitioners. Unfortunately, current litera-                  over 70 implementations of the systems running world-
    ture lacks quantitative information about the                  wide, the company now seeks to reduce maintenance
    positive and negative effects of clone-and-own.                 overhead. Unfortunately, the decision on how to move
    In this paper, we present the results of our ex-               forward from a successful clone-and-own approach is
    ploratory analysis of an industry system devel-                not straightforward.
    oped using the clone-and-own approach. We                          Over the past decade, several tools and techniques
    found that products from the same product                      for dealing with cloned product variants have been pro-
    family can vary significantly in change activ-                 posed. Some of them advocate the elimination of all
    ity over time, divergence from their origin and                clones by merging the variants into a single platform,
    synchronization activity. We will further in-                  and others propose to maintain multiple variants as-
    vestigate these factors to develop quantitative                is (Rubin, Czarnecki, and Chechik, 2013). What ap-
    measures for the assessment of clone-and-own                   proach works best for a given situation depends on the
    benefits and drawbacks.                                        domain and context of that situation. In some cases
                                                                   eliminating all clones and adopting an integrated plat-
1    Introduction                                                  form is neither possible nor beneficial (Antkiewicz et
                                                                   al., 2014). Eliminating clones will increase coupling,
Cloning is often considered to be a practice harm-                 and changing shared code may require re-testing of all
ful to the quality of source code, and potentially a               systems that use it (Dubinsky et al., 2013). If the suc-
cause of maintainability problems (Kapser and God-                 cess of the product highly depends on the benefits of
frey, 2006; Thummalapenta et al., 2010). Yet, in in-               clone-and-own, then its merits should be considered
dustry, the development of similar products is often               before moving away to a different approach.
addressed by cloning and modifying existing artifacts.
                                                                       The main objective of our study is to explore the
This so-called clone-and-own approach is perceived as
                                                                   evolution of MES-Toolbox systems and to gain in-
a favorable and natural software reuse approach by
                                                                   sight into how clone-and-own may have affected on-
                                                                   going project development and maintenance. In this
Copyright © by the paper’s authors. Copying permitted for
private and academic purposes.
                                                                   paper, we show how version control system metadata,
Proceedings of the Seminar Series on Advanced Techniques and
                                                                   source-code differencing, and visualization techniques
Tools for Software Evolution SATToSE 2017 (sattose.org).           can be used to identify clone-and-own related points
07-09 June 2017, Madrid, Spain.                                    of interest in the evolution of a product family.




2    Subject System                                                and Jager, 2015). For every new factory, a clone of
                                                                   the codebase of the latest platform release is realized
The system studied in this work is the MES-Toolbox,                by creating a branch with the Subversion version con-
a 17-year-old proprietary Java-based factory automa-               trol system. The clone is then configured and changed
tion system developed by ENGIE. The main purpose                   in any possible way by Application Engineers to add,
of the system is the automation of batch and con-                  modify or remove functionality. Each clone corre-
tinuous production processes. It can visualize, con-               sponds to the automation system of a factory some-
trol and register every step of an entire production               where around the world for some specific customer.
process. From the intake of raw material (unloading                Each clone is a variant of the base platform. We refer
from trucks, ships, bags, pallets, containers), prepara-           to the collection of all MES-Toolbox variants as the
tion (dosing, weighing, heating), processing (pressing,            MES-Toolbox product family.
grinding, mixing), storage, to the distribution of end                Between clones there exists a varying degree of com-
products to customers. Depending on what customers                 monality, and there is often no clear relation between
require for their production process, the system per-              the clones. Clones developed for the same customer
forms article and recipe management, quality regis-                might have more in common than clones developed
tration, production planning, tracking and tracing of              for different customers. For example, if a company
materials used in production, stock control, shift reg-            requires all their production facilities to have identi-
istration, production performance analysis and com-                cal branding and communication interfaces with third-
municates with ERP systems. To monitor and con-                    party systems. However, even clones that appear to be
trol physical production equipment (e.g., conveyors,               unrelated in terms of end-user requirements may still
mixers, weighers, buttons, lights), the MES-Toolbox                have some forms of commonality, such as the graphical
communicates with Programmable Logic Controllers                   user interface components or the configuration frame-
(PLCs) that perform the actual low-level control of                work that is used.
these physical devices.
    Over the past 17 years, the system has grown to                3   Research Questions
contain more than 6500 Java files, with a total of ap-
proximately 1 million lines of Java code. While the de-            Dubinsky et al. (2013) observed that independence
sign of the system has a modular structure and aims to             provided by clone-and-own is one of the major reasons
separate common code from customer implementation                  for considering cloning as an efficient reuse mechanism.
code as much as possible, it’s a monolithic application.           Developers can make any change required to satisfy
Nearly all source-code is contained in a single project,           customer requirements, without affecting other clones.
which is developed, built and versioned as a whole. In-            They do not have to collaborate with teams working
ternally, this project is called the Standard project, as          on other systems, that may have different priorities or
it is used as a basis for all new projects. This project,          scheduling constraints. These characteristics of clone-
which can be considered as the main platform of the                and-own have to be considered when new change mech-
product family, contains a constantly growing set of               anisms are introduced, since different techniques may
reusable core components and ready-to-use standard                 not provide the same degree of independence. But how
solutions.                                                         much independence is needed, and how can it be mea-
    Within the organization there is a clear distinction           sured? In this section, we describe three research ques-
between platform development and application devel-                tions that we will use to explore independence-related
opment, this distinction is often found in a Software              characteristics of the MES-Toolbox product family.
Ecosystem (SECO) (Lettner, Angerer, Grünbacher, et                 RQ1: Do MES-Toolbox systems change in parallel?
al., 2014). A small team of five developers is respon-             When cloning is used to develop systems independent
sible for the overall design, development, and mainte-             of each other, developers can decide when to change
nance of the system. The founder and writer of the                 the codebase of each individual system. The develop-
first line of code of this system is also still part of this       ment of each system can follow its own release and de-
team. Work of this team is focused on maintenance of               velopment schedule that is based on available resources
the core platform, development of complex customer-               and requirements for the system. The development of
specific features, standardization of functionality, de-           complex systems with many customer-specific modifi-
velopment of product configuration tools, and providing            cations may require and allow for months of continu-
support to application engineers.                                  ous, frequent change, while relatively straightforward
    Even though the system is highly configurable,                 and simple systems might have strict deadlines and
cloning is used to address the specificity and high de-            require only a few changes within the first weeks.
gree of variation of customer requirements often found                To explore whether MES-Toolbox systems may
in the domain of industrial automation (Schrock, Fay,              have benefited from this time aspect of independence,




we want to gain a rough understanding of the degree
of parallel change in the MES-Toolbox product family.
We hypothesize that a schedule-driven need for
independence may lead to a lack of parallel development,
whereas some relation between systems (e.g., systems
developed for the same customer) or a
collaboration-overhead driven need for independence may lead to
parallel change. For the purpose of this exploratory
study, we do not yet use a strict definition of parallel
change. Instead, we are interested in any form of
seemingly parallel change. Do systems change in
parallel every week, month or year? Do many systems
change at roughly the same time, or is this only the
case for specific systems?

[Figure 1 plot. Vertical axis: Number of Systems; horizontal axis: Platform Version.
Pre: 21 (35%); 7.1: 5 (8.3%); 7.2: 15 (25%); 7.2.1: 11 (18.3%); 7.3: 8 (13.3%).]
Figure 1: Distribution of System Versions
RQ2: How much do MES-Toolbox systems diverge
from their origin? Clone-and-own allows developers              nance overhead caused synchronization for these sys-
to add, remove or modify files without affecting their           tems. In the MES-Toolbox product family, synchro-
origin. These changes will inherently cause systems to          nization with products and their origin can occur in
diverge from their origins; they are no longer identical.       both ways. Bugs are often found and fixed on a prod-
As the product family grows it often becomes increas-           uct, after which the change is propagated to the plat-
ingly hard to keep an overview of the available func-           form project. From there, the change can be propa-
tionality (Stanciulescu, Schulze, and Wa̧sowski, 2015;          gated to all the other products derived from that plat-
Berger et al., 2014; Duc et al., 2014). We hypothesize          form version.
that the degree of divergence can be used to quan-
tify the complexity caused by cloning. Therefore, we            4                        Research Methodology
are interested to see how this property of the MES-
                                                                To explore the evolution of MES-Toolbox systems, we
Toolbox product family has evolved over time.
                                                                built a tool that retrieves changes to each system from
   A developer of the MES-Toolbox platform stated
                                                                the subversion (SVN) repository, performs source-code
that diverged Java files often make it difficult to prop-
                                                                differencing and exports the relevant information to a
agate changes, but expected that the Java codebase
                                                                CSV file for further analysis in R. Our tool is embed-
would not significantly diverge for most of the 7.2 and
                                                                ded in a modified version of JMeld1 , an open source
7.2.1 systems. Many of these systems are considered
                                                                differencing tool written in Java.
relatively simple, and hardly require any customer-
specific modification of the codebase.                             First, we make a local copy of the SVN repository
                                                                with the command svnadmin hotcopy, and verify its
RQ3: Have all MES-Toolbox systems been synchro-                 integrity with svnadmin verify in the analysis envi-
nized with their origin? Cloning is said to increase            ronment. This local repository is used for all data col-
maintenance overhead because changes to one clone               lection to ensure that the data source does not change
may have to be propagated to all clones. Studies                during subsequent analysis.
have shown however that change propagation is not al-
ways performed (Stanciulescu, Schulze, and Wa̧sowski,           4.1                          Selecting MES-Toolbox Systems
2015), which suggests that cloning does not necessarily
increase maintenance overhead due to change propa-              We extract all systems present in the local copy of the
gation.                                                         repository by scanning the output of svn ls2 for paths
   In the organization we study, changes are manu-              in the form of projecten/.*/trunk/$. We then man-
ally propagated at the discretion of teams developing           ually validated these paths and documented for each
the systems. Application engineers stated that they             system the platform version it was branched from, the
periodically merge changes from the platform release            name of the project, an anonymised name, the repos-
to customer systems, but only while they are still un-          itory path, and any unusual properties of the system
der active development. Because some systems are                that we have to consider during analysis. For example,
developed relatively fast, we expect that some sys-             development of some systems was discontinued and the
tems retrieve only very few changes from their origin,          systems were never put into production. We excluded
thus arguably not causing much maintenance over-                these systems from the analysis. Finally, we noted
head. Consequently, techniques that purely reduce                       1 https://sourceforge.net/projects/jmeld/

repetitive task would have limited effect on mainte-                     2 svn ls -R {svnRepo} | egrep "projecten/.*/trunk/$"




whether the system was directly branched from the
platform, or from another branch (its nesting depth).
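
The path selection described in Section 4.1 can be illustrated with a minimal Python sketch. It applies the same regular expression as the egrep call in footnote 2 to a listing of repository paths. The function name and the example paths are hypothetical and only serve the illustration; the manual validation step that follows in the study is not modelled.

```python
import re

# Same regular expression as the egrep call in footnote 2.
TRUNK_PATTERN = re.compile(r"projecten/.*/trunk/$")


def select_system_paths(listing_lines):
    """Keep only the listed paths that end in a system trunk directory."""
    selected = []
    for line in listing_lines:
        path = line.strip()
        if TRUNK_PATTERN.search(path):
            selected.append(path)
    return selected


# Hypothetical excerpt of `svn ls -R` output; these paths are invented.
example_listing = [
    "platform/trunk/",
    "projecten/customer-a/trunk/",
    "projecten/customer-a/trunk/src/Main.java",
    "projecten/customer-b/site-1/trunk/",
]

for path in select_system_paths(example_listing):
    print(path)
```

Only the two directory paths ending in trunk/ below projecten/ are kept; files inside a trunk and trunks outside projecten/ are filtered out.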
[Figure 2 plot: "Change Activity Over Time". Vertical axis: System (System 1 to System 4);
horizontal axis: Date (week), 01 to 05; dot size: number of commits (1 to 5).]
Figure 2: Example visualization for change activity. All systems exhibit different
change activities, with a varying degree of parallel change.

   There are currently four platform versions: 7.1, 7.2,
7.2.1 and 7.3. The first version of the platform (7.1)
was released on 7 March 2012 and was followed
relatively fast by the next release (7.2) on 18 December
2012. Version 7.2.1 of the platform was released on 7
October 2014, and version 7.3 on 14 December 2016.
Figure 1 shows the distribution of versions among
systems in the version repository. Twenty-one systems
pre-date the first platform release. There are five 7.1
systems, fifteen 7.2 systems, eleven 7.2.1 systems and
eight 7.3 systems. For this study, we mainly focus on
7.2 and 7.2.1 systems, as these have all been put into
production, and are derived from a comparable base
platform within the last five years. The main difference              4.3      Detecting Parallel Change
between these versions is the internationalization of all
text visible to the end-user. There are no significant
differences in terms of architecture or functionality.                To determine whether systems change in parallel, we
   We refer to the codebase of a specific platform ver-              are interested in the time aspect of change at system-
sion in the form of PL-VERSION, for example, we use                  level granularity. We decided to use a visualization
PL7.2 to refer to version 7.2 of the platform. The in-               which allows us to gain insight into whether (a) sys-
ternal name of the system can contain the name of the                tems change in parallel, (b) systems change continu-
customer, and the location of the production facility.               ously, periodically or at arbitrary moments in time,
Since this information is subject to confidentiality, we             and (c) to identify variance between systems.
manually defined an anonymised name for each system                     For this visualization we chose systems as the first
in the form of P-NUMBER. In this paper, we often refer               dimension and time as the second dimension. To pre-
to this name as Pn , which can be read as project n or               vent overplotting, we group data-points by week or
product n.                                                           month. By grouping data, we will not be able to dis-
                                                                     tinguish between systems that changed many times a
4.2   Mining Commit Metadata                                         week, or only once a month. To mitigate this effect we
For each system we selected, we extract the version                  introduce an additional dimension which is number of
history using a bash script. This bash script uses                   commits (proportional to the radius of the dot). This
the svn log 3 command to export the version history                  leads us to the view shown in Figure 2.
in xml format. For each system we collect all revi-                     The vertical axis represents the systems, and the
sions from its change history, and extract the relevant              horizontal axis the time of the changes. Each dot rep-
change metrics. We used the definition and format of                 resents a point in time when a system was changed.
the change metrics dataset published by Yamashita et                 The radius of the dot is proportional to the number
al. (2017) as a basis for our data set.                              of commits that occurred. In this example, we group
   We extract the revision number, author and date of                the data-points by week. Continuous change activity
each revision. Next, from the output of svn diff4 , we               will give rise to a sequence of horizontally aligned dots.
determine the full path of the files that were changed,              Changing a system twenty times a week will result in
the type of change (added, deleted or modified), and                 a thicker horizontal dot pattern compared to changing
calculate for each file how many lines were changed,                 a system only once a week. In Figure 2 we observe
added or deleted. From the full path of the files, we                that system 1 was under continuous maintenance, as
extract the file name and file extension. Note that we               it was changed every week. System 2 was changed
use svn diff to determine which files were changed,                  every other week, which appears to be more peri-
and not svn log. The reason for this is that when a                  odic but due to the week-based granularity may still be
directory is deleted, the output of svn log only con-                considered as continuous to some extent. The change
tains the name of the directory, and does not contain                activity for systems 3 and 4 is continuous for the first
the names of the files contained in the directory.                   three weeks, but declining for system 3 and increasing
  3 svn log --xml --stop-on-copy -v  >       for system 4. Finally, we see that systems 1, 2 and 3
-log.xml                                                    all changed in the first week, but system 3 has been
  4 svn diff -x -U0 -c {revisionNumber} {repositoryPath}              modified more frequently.
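
The weekly grouping behind the view in Figure 2 can be sketched in Python as follows. The sketch assumes the XML layout produced by svn log --xml (logentry elements with a date child) and counts commits per system and ISO week; that count is the value that would drive the dot radius. The function name, the system label and the sample log are hypothetical, and the plotting itself (done in R in the study) is left out.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from datetime import datetime


def commits_per_week(system, log_xml):
    """Count commits of one system per (system, ISO year, ISO week)."""
    counts = Counter()
    root = ET.fromstring(log_xml)
    for entry in root.iter("logentry"):
        # svn log --xml writes dates like 2014-03-10T09:15:00.000000Z
        stamp = datetime.strptime(entry.findtext("date"),
                                  "%Y-%m-%dT%H:%M:%S.%fZ")
        year, week, _weekday = stamp.isocalendar()
        counts[(system, year, week)] += 1
    return counts


# Hypothetical log excerpt; revisions, authors and dates are invented.
sample_log = """<log>
  <logentry revision="101"><author>dev1</author>
    <date>2014-03-10T09:15:00.000000Z</date></logentry>
  <logentry revision="102"><author>dev2</author>
    <date>2014-03-14T16:40:00.000000Z</date></logentry>
  <logentry revision="103"><author>dev1</author>
    <date>2014-03-17T08:05:00.000000Z</date></logentry>
</log>"""

weekly = commits_per_week("P-1", sample_log)
for key in sorted(weekly):
    print(key, weekly[key])
```

Grouping by month instead of week would only change the key that is built from the timestamp.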




4.4   Measuring Divergence                                          revision   system    file         diffDelta     diff
To measure how much systems have diverged from                      1          PL7.2.1   Main.java    5            5
their origin, we developed a tool that calculates how               2          P17       Main.java    -5           0
much the difference between each system and its ori-                 3          P17       Main.java    10           10
gin has changed over time. We do so by calculating                  4          P17       Main.java    5            15
the differences for each system, for each file, at every             5          PL7.2.1   Main.java    -15          0
revision that changed either the system or its origin.
We perform these measurements on a local copy of the            Table 1: Example data of divergence over time calculation.
actual codebase of the systems. For the platforms and
each system, we locally replay their change history by          versa. Subversion automatically registers the merged
sequentially updating the local working copy with svn           revision(s) and the origin of the merge in a so-called
update. After each update of a platform codebase,               svn:mergeinfo property attached to files and direc-
we re-calculate the differences with code-differencing            tories6 . We classify each revision commit as MERGE or
on all systems that have been derived from the plat-            NON_MERGE by scanning the output of svn diff for an
form. Similarly, after each update of the codebase of           occurrence of svn:mergeinfo.
a system, we re-calculate the differences between the               Unfortunately, we cannot blindly trust the validity
system and its origin. This technique is computation-           of Subversion properties. Subversion properties can be
ally intensive but does allow us to explore how much            changed by hand, developers might forget to commit
each revision has affected divergence.                           the changes to properties, or they could manually copy
    We measure differences at line-level granularity             changes between systems without using the merging
(number of lines different) with the Java implemen-              system. We aim to mitigate these issues by taking into
tation of GNU diff 5 . Using the line-level measure-             account whether revisions have caused convergence or
ments, we aggregate to file-level granularity.                   divergence. We expect that most changes to systems
By using a line-level granularity instead of a file-level       will cause them to diverge from their origin and that
granularity (number of files different), we will be able         merging these changes to their origin will cause them
to aggregate to file-level granularity and report on both       to converge. Similarly, we expect that changes to the
levels. We define the difference in number of lines as           origin of systems will cause them to diverge, and merg-
diff. During analysis, we keep track of how much the            ing the change to the systems will cause them to con-
difference has increased or decreased compared to the            verge. We manually validate a large sample of data to
previous revision, the diffDelta.                               ensure this is a reliable technique to detect synchro-
    We illustrate the divergence calculation on an arti-        nization.
ficial example in Table 1. In this example, PL7.2.1 is
the origin of system P17 . First, we update the local           5       Results and Analysis
copy of the codebase of PL7.2.1 to revision 1 and cal-
                                                                In this section, we present the results of our ex-
culate the differences between PL7.2.1 and P17 . We see
                                                                ploratory analysis.
that in revision 1, Main.java was modified on PL7.2.1 ,
causing a difference of five lines. Next, we update P17
to revision 2 and re-calculate the differences. We see           5.1     Parallel Change
that Main.java was changed, reducing the difference              RQ1. Do MES-Toolbox systems change in parallel?
by five lines. This pattern of increasing and decreasing        Figure 3 shows the change activity of the PL7.2 and
divergence is typically caused by change propagation            PL7.2.1 platforms, and all systems derived from these
when revision 1 is merged to system P17 in revision 2.          platforms that were included in our study. We see
As the measurements continue, we see that Main.java             that many systems appear to be modified almost con-
was modified two more times on P17 , increasing the             tinuously, even years after the first change was made.
difference by ten lines in revision 3 and five lines in          For example, systems P1 and P3 . Systems P4 , P8 and
revision 4. Finally, in revision 5 the difference was            P18 also appear to be changed continuously, but to a
reduced by fifteen lines by a change on PL7.2.1 .               lesser extent than the first group. The change activity
                                                                for these systems appears less dense and contains more
4.5   Detecting Synchronization                                 periods of inactivity. The longest period of inactivity
                                                                for these systems is approximately four months7 for
Systems typically retrieve changes from their origin, or con-
                                                                system P4 .
tribute changes to their origin, by merg-
ing the revision from the system to its origin or vice             6 http://svnbook.red-bean.com/en/1.7/svn.branchmerge.

                                                                basicmerging.html
  5 http://www.bmsi.com/java/#diff                                 7 124 days, 14 March 2014 to 16 July 2014
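The Main.java walkthrough (revisions 1 to 5) can be replayed as a small calculation. The sketch below is only an illustration, not the tooling used in the study: the `Measurement` record and both function names are hypothetical, and it assumes the two copies of Main.java were identical before revision 1. The per-revision deltas and the side on which each change was made are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One measurement step (hypothetical record, not from the paper's tooling)."""
    revision: int
    changed_on: str  # side that received the change at this revision
    delta: int       # change in differing lines for Main.java (+ diverges, - converges)


# Deltas as described in the walkthrough.
MEASUREMENTS = [
    Measurement(1, "PL7.2.1", +5),
    Measurement(2, "P17", -5),
    Measurement(3, "P17", +10),
    Measurement(4, "P17", +5),
    Measurement(5, "PL7.2.1", -15),
]


def cumulative_divergence(measurements):
    """Return (revision, total differing lines) after each step.

    Assumption: the difference is zero before the first measurement.
    """
    total = 0
    history = []
    for m in measurements:
        total += m.delta
        history.append((m.revision, total))
    return history


def propagation_candidates(measurements):
    """Revisions at which the difference shrinks.

    A shrinking difference only suggests change propagation; it does not prove it.
    """
    return [m.revision for m in measurements if m.delta < 0]


if __name__ == "__main__":
    for revision, lines in cumulative_divergence(MEASUREMENTS):
        print(f"revision {revision}: {lines} differing lines")
    print("possible propagation at revisions:", propagation_candidates(MEASUREMENTS))
```

Under these assumptions the difference returns to zero after revisions 2 and 5, which are the two points where the text reports a reduction.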




[Figure 3: dot plot titled "Change Activity Over Time". Vertical axis
(System), grouped by platform: PL7.2 with systems P1 to P15, and PL7.2.1
with systems P16 to P26. Horizontal axis (Date): 2013 to 2017. Dot size
indicates the number of commits (legend: 20, 40, 60, 80).]

Figure 3: The change activity of PL7.2 and PL7.2.1 systems.
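The periods of inactivity read from Figure 3 can also be computed directly from commit dates. The following is a minimal sketch under stated assumptions: `longest_inactivity` is a hypothetical helper, and the example list is invented except for the two dates reported for system P4 (14 March 2014 and 16 July 2014).

```python
from datetime import date


def longest_inactivity(commit_dates):
    """Return (days, start, end) for the longest gap between consecutive commit dates.

    Returns None when fewer than two distinct dates are given.
    """
    ordered = sorted(set(commit_dates))
    if len(ordered) < 2:
        return None
    gaps = [
        ((later - earlier).days, earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
    ]
    # Compare on the gap length only; ties keep the earliest gap.
    return max(gaps, key=lambda gap: gap[0])


# Illustrative data: only the middle two dates come from the text.
EXAMPLE_DATES = [
    date(2014, 2, 20),
    date(2014, 3, 14),
    date(2014, 7, 16),
    date(2014, 8, 1),
]

if __name__ == "__main__":
    days, start, end = longest_inactivity(EXAMPLE_DATES)
    print(f"longest inactivity: {days} days, from {start} to {end}")
```

For the two reported dates the gap is 124 days, which matches the figure given for system P4.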
   The majority of the systems show an initial burst of activity at the
beginning of the project, followed by a varying amount of activity afterward.
This seems similar to the change frequency of Keba, an industrial automation
ecosystem studied by Lettner, Angerer, Grünbacher, et al. (2014). In the Keba
ecosystem, the change frequency reportedly depends largely on customer
requirements, and most changes happen within the first three to four weeks of
a customer project. In our case, for many systems, most change activity does
appear to occur in the first period of the project, but this period is much
longer (2-4 months). Manual examination of some of the changes that occurred
after this initial period suggests that they are often (critical) bug fixes or
minor changes requested by the customer. For example, P5 was changed on
31 July 2015 after being inactive for almost a year (311 days). Manual
analysis shows that this change was triggered by a customer request after the
physical production line was modified.
   The change activity of MES-Toolbox systems seems consistent with
observations by Lettner, Angerer, Grünbacher, et al. (2014), who stressed the
importance of platform quality characteristics like stability and backward
compatibility, and long-term platform evolution in the domain of industrial
automation. The oldest system we analyzed was 11 years old and was still being
changed continuously. Some systems were inactive for years before becoming
active again due to new customer demands. This is not necessarily the case for
other systems developed with clone-and-own. Stanciulescu, Schulze, and
Wąsowski (2015) found forks in the Marlin ecosystem, an open source firmware
for 3D printers, to be characterized by a short maintenance lifetime (101 days
on average).
   With regard to whether and to what extent MES-Toolbox systems have been
changed in parallel, we clearly see that multiple MES-Toolbox systems are
changed at roughly the same time. However, the degree of parallel change is
not the same for all systems, nor is it constant over time. Many systems
appear to




[Figure 4 plots: two grids of panels, one panel per system (P-1 to P-12); x-axis: Date (2013 to 2017); y-axis of the left grid: % of Java Files Diverged (0 to 20); y-axis of the right grid: Number of Lines Diverged (.java files) (0 to 150000).]

                           Figure 4: Divergence over time for a subset of PL7.2 and PL7.2.1 systems in percentage of files and number of lines.
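
As a rough illustration of the two measures plotted in Figure 4, the following Python sketch compares the Java files of a system with those of its origin and reports the percentage of files that differ and the number of differing lines. It is a sketch under assumptions, not the measurement procedure used in this study: the in-memory snapshot format (a mapping from file path to a list of lines), the function name `divergence`, and the choice to count every added or removed line reported by `difflib` are all hypothetical.

```python
import difflib


def divergence(origin, system):
    """Return (percent_files_diverged, lines_diverged) for .java files.

    `origin` and `system` are hypothetical snapshots: dicts mapping a
    file path to that file's content as a list of lines.
    """
    paths = {p for p in set(origin) | set(system) if p.endswith(".java")}
    files_diverged = 0
    lines_diverged = 0
    for path in paths:
        old = origin.get(path, [])  # a missing file is treated as empty
        new = system.get(path, [])
        if old == new:
            continue
        files_diverged += 1
        # Assumption: a diverged line is any line added or removed
        # relative to the origin, as reported by a line-based diff.
        for line in difflib.ndiff(old, new):
            if line.startswith(("+ ", "- ")):
                lines_diverged += 1
    percent = 100.0 * files_diverged / len(paths) if paths else 0.0
    return percent, lines_diverged


# Made-up snapshots: the system adds one line to A.java and deletes Module.java.
origin = {
    "A.java": ["class A {", "}"],
    "B.java": ["class B {", "}"],
    "Module.java": ["class Module {", "  int x;", "}"],
    "README.txt": ["notes"],
}
system = {
    "A.java": ["class A {", "  int y;", "}"],
    "B.java": ["class B {", "}"],
    "README.txt": ["changed notes"],
}
percent, lines = divergence(origin, system)
print(round(percent, 1), lines)
```

In this sketch, deleting a file adds its full length to the line count but only one to the file count, so the two measures can move very differently for the same change.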
be changed in parallel initially until the development of one system is done
and they no longer change in parallel. For example, if we look at systems P4
and P5, we see a major reduction in change activity of system P5 after June
2013, but the development of system P4 continues. This type of pattern is what
we would expect to see due to a schedule-driven need for independence.
   Furthermore, we observe at least two vertically aligned dot patterns. These
patterns occur if multiple systems are changed at roughly the same time, while
many of those systems did not change before or after that time. Manual
inspection of these patterns shows that both instances were critical bug
fixes, manually merged to most systems on the same day, regardless of the
development schedule of the systems. The fact that we do not see many of these
vertical line patterns suggests that mass-synchronization of many systems at
once does not happen often in the MES-Toolbox product family.

5.2 Divergence

RQ2. How much do MES-Toolbox systems diverge from their origin? In this
research question, we calculate how much MES-Toolbox systems differ from their
origin, and explore how this property has changed over time. Figure 4 shows
the divergence measurements over time, in terms of percentage of files and
number of lines.
   It is clear that, while divergence tends to increase over time, both the
degree and the rate of divergence vary. In the first year of the history of
systems P1, P2, P3, and P7, the proportion of diverged Java files appears to
be highly volatile compared to the other systems. This can also be seen in the
divergence in number of lines, but less clearly.
   In terms of percentage of Java files, all systems at some point in time
diverged between 7% and 22.5% from their origin. This suggests that all
systems, even those that do not frequently change, can diverge significantly.
In terms of the number of diverged lines, most systems did not exceed 50,000
lines (<5%), and only two systems diverged by more than 75,000 lines.
   Overall, we see that divergence measured in percentage of Java files can be
significantly different from divergence measured in number of lines. In 2014
the number of diverged lines for system P6 rapidly increased from less than
25,000 lines to more than 140,000 lines. We do not see this growth in the
file-based measurement. Manual analysis of this anomaly shows that a developer
deleted a module from the codebase which was not required for the project but
was causing merge conflicts.
   Even though the codebase of many systems reportedly hardly required any
customer-specific modifications, they still diverged significantly. For these
systems, this divergence was not caused by changes to the systems, but by the
lack of synchronization of changes




[Figure 5 plot, titled "Synchronization Change Activity": two panels, "Contributed to Origin" and "Retrieved from Origin"; y-axis: System (P-1 to P-26, grouped into 7.2 and 7.2.1); x-axis: Date (2013 to 2017); legend: Commits (1, 2, 5, 10, 20).]

                                                                                              Figure 5: Synchronizing Changes of PL7.2 and PL7.2.1 systems.
from their origin to the system. This is a form of independent evolution, a pattern of
commits where clones diverge throughout the studied time interval. However, some clones
were eventually synchronized, which is a form of late propagation, a pattern of commits
where clones diverge, and later in time converge again after changes are propagated
(Schmorleiz and Lammel, 2016).
   Thummalapenta et al. (2010) studied clone evolution patterns for cloning in-the-small,
and confirmed the possibility of late propagation being misclassified as independent
evolution. However, they found that late propagation patterns always took place in much
less time than their total time interval of observations, and thus concluded that such
misclassification would occur only rarely. Our data suggest that cloning in-the-large
may be much more susceptible to misclassification, as in our case the systems are often
synchronized at arbitrary points in time. System P8 did not retrieve any new changes
from its origin for almost a year, after which a bulk of changes was propagated at once,
reducing the proportion of diverged Java files from 7.5% to less than 4%.
   The maintenance overhead caused by divergence due to late propagation is arguably
different from the overhead caused by divergence due to customer-specific modifications.
This raises the question: how do we distinguish between these types of divergence, and
how do they affect analysis tools and techniques? Analyzing differences between variants
is the primary activity performed when migrating to a more structured software product
line approach. Based on these differences, variants can be merged into a single variant,
or points where variation is needed can be identified. In the context of variation
analysis, differences caused by late propagation are not necessarily relevant.

5.3   Synchronization

RQ3. Have all MES-Toolbox systems been synchronized with their origin?
To detect whether a change to either PL7.2.1 or PL7.2 was contributed by one of the
systems, we identify revisions that caused at least one system to converge by at least
one line. In the combined history of PL7.2.1 and PL7.2, there were 501 revisions for
which this was the case. We manually inspected these revisions and found that 372
revisions (74%) were correctly classified as changes contributed by the converging
system(s). Out of these 372 revisions, 17 revisions did not have merge-info. A detection
strategy solely based on the presence of merge-info would have missed these revisions.
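The detection rule described above can be sketched as follows. This is a minimal illustration and not the tooling used in the study: the `Revision` record, its field names and the sample history are hypothetical. A revision counts as a candidate contribution when it reduces the number of diverged lines of at least one system, whether or not it carries merge-info; the sketch also lists the candidates that a merge-info-only strategy would miss.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    """Hypothetical record of one revision in the history of an origin."""
    number: int
    has_merge_info: bool
    # Change in the number of diverged lines per system caused by this
    # revision; a negative value means the system converged.
    divergence_delta: dict = field(default_factory=dict)


def converging_systems(rev):
    """Return the systems that converged by at least one line."""
    return sorted(s for s, d in rev.divergence_delta.items() if d <= -1)


def candidate_contributions(history):
    """Return revisions that caused at least one system to converge."""
    return [r for r in history if converging_systems(r)]


def missed_by_merge_info(candidates):
    """Return candidates that a merge-info-only strategy would not find."""
    return [r for r in candidates if not r.has_merge_info]


# Made-up sample history, for illustration only.
history = [
    Revision(101, True, {"P1": -12, "P3": 4}),
    Revision(102, False, {"P1": 3}),
    Revision(103, False, {"P4": -1}),
    Revision(104, True, {}),
]

candidates = candidate_contributions(history)
missed = missed_by_merge_info(candidates)
```

In this made-up history, revision 103 converges system P4 without merge-info, which is the kind of revision the combined check is meant to catch. Candidates found this way still need manual inspection, as convergence alone does not prove that the change came from the converging system.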




   To detect whether systems retrieved changes from their origin, we identify, for each
system, all revisions that have merge-info and caused at least one Java file to converge
with the origin of the system. The change history of system P4 contained 18 revisions
with merge-info, of which 14 caused convergence. Out of these 14 revisions, 12 (86%)
were correctly classified as changes retrieved from its origin.
   Figure 5 shows the synchronizing changes over time. We see that all systems retrieved
changes from their origin at least once, and most but not all systems contributed
changes to their origin. This is different from Marlin forks, as Stanciulescu, Schulze,
and Wa̧sowski (2015) found that 15% of all forks, and 34% of all active forks,
synchronized at least once with the main Marlin repository.
   While all systems retrieve changes from their origin, some do so significantly more
frequently than others. Systems P1 and P3 retrieved changes from their origin 202 and 89
times, respectively. Furthermore, we see that the period of time between subsequent
synchronizations can be relatively long. For example, system P6 retrieved changes from
its origin on 22 July 2013, and next on 24 March 2014, 8 months later. This is
consistent with the results in the previous section, where we identified long
time-interval late propagation in the visualization of divergence over time.
   Finally, we observe at least two instances of vertically aligned dots. These patterns
can be caused by multiple systems retrieving changes from their origin at roughly the
same time. Manual inspection of these patterns shows that both instances were critical
bug fixes, manually merged to most systems on the same day. The fact that we do not see
many of these vertical line patterns suggests that mass-synchronization of many systems
at once does not happen often in the MES-Toolbox product family.

6   Threats to Validity

Internal Validity. During our study, the MES-Toolbox product family continued to change.
To prevent this change from affecting our results, we obtained a local copy of the
repository. This local copy of the repository was used throughout the study.
   We used the merge-info property to determine whether a commit was a merge. Since this
property can be incorrect, we additionally checked whether commits caused systems to
converge. We cross-checked the precision of this technique by manually inspecting
revisions: 372 of 501 candidate contributions (74%) and, for system P4, 12 of 14
candidate retrievals (86%) were correctly classified (Section 5.3).
   While the experience of the author as a developer of the system may provide a
detailed interpretation of fine-grained changes, this can cause some bias. We aimed to
reduce this threat as much as possible by providing quantitative data to support our
findings and by collaborating with an external supervisor.
External Validity. Development practices in other organizations that use clone-and-own
might have different effects on the evolution of the system, which may lead to different
observations. However, some of our findings are consistent with those of other,
independent studies.
   In our analysis of synchronizing changes, we looked at the number of synchronizing
commits. The number of commits can be affected by the behavior of individual developers.
Developers can choose to merge each individual revision, or merge a large number of
revisions at once. The first style clearly results in a higher number of commits
compared to the latter, but arguably requires more effort too.

7   Related Work

Clone Evolution Patterns

Thummalapenta et al. (2010) proposed an approach for the identification of the evolution
of cloned code fragments over time and categorized the evolution patterns as
(a) Consistent Evolution, (b) Late Propagation, (c) Delayed Propagation, and
(d) Independent Evolution. In our study, we used these patterns to characterize some of
the change patterns we observed in the evolution of the product family. For example,
Delayed Propagation was used as a strategy to validate the correctness of changes on
some variants, before propagating them to all variants. Independent Evolution was used
to keep the variant as-is after the project had been commissioned and the testing phase
had already finished.
   Similar characteristics were found by Stanciulescu, Schulze, and Wa̧sowski (2015) in a
study on the advantages and disadvantages of forking using the case of Marlin, an open
source firmware for 3D printers. They found that important bug-fixes were not propagated
and functionality was sometimes developed more than once. Intuitively, one may consider
these findings to be bad practices and drawbacks of clone-and-own. However, there are
situations where this may be desirable, as the authors found that “Once the firmware is
configured and running on the printer, new changes are not desired”.
   In an environment where the potential cost of an error can be significant, systems
are changed as little as possible when maintained (Cordy, 2003). In a clone-and-own
based system, this characteristic can be detected by looking for patterns like
Independent Evolution, the lack of synchronization with the origin, or redundant code.
This is in line with some of the cloning patterns described by Kapser and Godfrey
(2006). They argued that code duplication can also have benefits, and described the pros
and cons in a catalog of cloning patterns used in real-world systems.

Software Ecosystem Characteristics

Lettner, Angerer, Grünbacher, et al. (2014) studied the relevance of characteristics of
Software Ecosystems in the domain of industrial automation and found some additional
characteristics that according to them are of particular importance in the industrial
automation domain. For example, platform quality characteristics like stability and
backward compatibility, and long-term platform evolution seemed to be essential to the
success of the studied system. One of the reasons for this conclusion was that
“application engineer B reported that he had to update a ten-year-old version of the
platform software because an important customer had decided to leave out several
platform releases and then requested a new feature. This led to significant difficulties
in merging the old software version with the new functionality.” Developers of the
system we study have reported similar issues with upgrading customer systems to a new
release.
   In a later study by Lettner, Angerer, Prähofer, et al. (2014), the change
characteristics and software evolution challenges of the same ecosystem were
investigated. The software change taxonomy of Buckley et al. (2005) was used to describe
qualitatively when, where, and how changes were made in different parts of the system
and what was affected by changes. The authors found that the ecosystem is subject to
both continuous and periodic evolution. The core platform is continuously changed to
include new features and bug-fixes, while those changes are only periodically released
to platform users. The granularity of these changes is reportedly primarily coarse for
customer requirements, and fine for bug fixes. Propagation of changes is done by hand,
and change impact analysis is performed manually, based on expert knowledge.
   The system we study is in the same domain and seems to be developed similarly. Our
study is different in the sense that we support our findings with visual representations
of the evolution of the system. For example, we know that in this case changes are also
propagated by hand, so we developed a technique to show how frequently this is actually
done in the MES-Toolbox product family.

Crosscutting Concerns

A possible area of interest in the analysis of clone-and-own evolution is the presence
and development of crosscutting concerns in the system. A crosscutting concern is a
feature whose implementation is spread across many modules (Marin, Deursen, and Moonen,
2007). If product variants, or clones, exhibit a high degree of variation in the
implementation of crosscutting concerns, we expect that this may also affect the extent
to which changes are propagated, and how the code-bases diverge.
   Marin, Moonen, and Deursen (2005) propose a classification system for crosscutting
concerns in terms of sorts, where a sort is a description based on a number of
distinctive properties. A sort we expect to find often in this case study is Entangled
Roles. In object-oriented terminology, this sort is defined as “Implement a method with
(entangled) functionality that belongs to a different concern than the main concern of
that method”. A characteristic of clone-and-own is that it allows application engineers
to make these kinds of fine-grained changes quickly. For example, suppose a customer
wants to be notified when stock levels exceed a certain value. If there is no such
monitoring system in place, then the fastest solution can be to add this functionality
to a method that deals in some way with stock-control. Implementation of a generic
solution may exceed the level of expertise of the application engineer, and waiting for
a platform engineer to develop the solution may take too much time.
   Figueiredo et al. (2009) describe 13 patterns of crosscutting concerns identified in
three case studies, one of which was a software product line. The authors found that
some patterns consistently emerged in situations with the frequent use of inheritance.
They found that this was often the case in product lines because “Program families rely
extensively on the use of abstract classes and interfaces in order to implement
variabilities. The inappropriate modularization of such crosscutting concerns might lead
to future instabilities in the design of the varying modules”.
   Detection of crosscutting concerns is called aspect mining. Various aspect mining
techniques have been proposed (Kellens, Mens, and Tonella, 2007; Tourwé and Mens, 2004;
Ceccato et al., 2006). For example, fan-in analysis looks for crosscutting functionality
by detecting methods that are explicitly invoked from many methods scattered throughout
the code (Marin, Deursen, and Moonen, 2007). History-based concern mining techniques
analyze change-history to detect which program entities change together frequently (Breu
and Zimmermann, 2006; Adams, Jiang, and Hassan, 2010). Hashimoto and Mori (2012)
developed a tool that improves history-based concern mining by combining it with
fine-grained change analysis based on abstract syntax tree differencing.
   In future work, we intend to use these tools and techniques to gain a deeper
understanding of the change and divergence patterns we found.
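To make the fan-in idea mentioned above concrete, the following minimal sketch counts, for each method, the number of distinct methods that invoke it and reports those at or above a threshold. The call relations, the method names and the threshold are made up for illustration; a real analysis would first have to extract call relations from the Java code and would need further filtering, which is not shown here.

```python
from collections import defaultdict


def fan_in(call_edges):
    """Map each callee to the set of distinct methods that invoke it."""
    callers = defaultdict(set)
    for caller, callee in call_edges:
        if caller != callee:  # ignore direct recursion
            callers[callee].add(caller)
    return callers


def high_fan_in(call_edges, threshold):
    """Return (callee, fan-in) pairs with fan-in >= threshold, highest first."""
    counts = {m: len(c) for m, c in fan_in(call_edges).items()}
    selected = [(m, n) for m, n in counts.items() if n >= threshold]
    return sorted(selected, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical (caller, callee) relations, for illustration only.
edges = [
    ("Order.create", "Log.write"),
    ("Stock.update", "Log.write"),
    ("Batch.close", "Log.write"),
    ("Stock.update", "Stock.notify"),
    ("Stock.update", "Log.write"),  # second call site, same caller
]

result = high_fan_in(edges, threshold=3)
```

Callers are kept in a set so that several call sites within one method count once, which matches the notion of a method being invoked from many different places rather than many times from one place.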




Clone-and-Own in Product Line Engineering

Dubinsky et al. (2013) studied the processes and perceived advantages and disadvantages
of the clone-and-own approach of six industrial software product lines. They show that
cloning is perceived as a favorable and natural reuse approach by the majority of
practitioners in the studied companies, mainly because of its simplicity and
availability. They found that practitioners lack awareness and knowledge of other forms
of reuse, and many alternative approaches fail to convince them that they yield better
results.
   Rubin, Czarnecki, and Chechik (2013) proposed a framework to organize knowledge
related to the development, maintenance and merge-refactoring of product lines realized
via cloning. This framework is a step towards a recommender system that can assist users
in selecting tools and techniques that are useful in their situation.
   Hetrick, Krueger, and Moore (2006) report on the experience of a structured,
incremental transition from a clone-and-own approach to software product line practices.
They show that it is possible to make this transition without a significant upfront
investment and disruption of the ongoing production schedules. The authors indicate that
the file branch factor gradually reduced during the transition, to a point where all
branches from product line core assets were completely eliminated. This metric is
defined as the average number of branched files per product, normalized by the number of
products. Our study shows that the number of branched files per product can vary
significantly between systems and over time. Hence, care has to be taken when using the
average. Furthermore, we found that products with a similar percentage of diverged files
can vary significantly in terms of the total number of diverged lines.
   Antkiewicz et al. (2014) propose an incremental and minimally invasive strategy for
adoption of product-line engineering. The strategy is called virtual platform, and
should allow organizations to obtain incremental benefits from incremental changes to
the development approach. By studying the development practices of our industry case, we
gain insight into an industry context and the needs of practitioners. This may serve as
input for recommender systems and as requirements for the virtual platform, and can be
helpful to practitioners, researchers and tool developers.

8   Conclusion

In this work, we presented the results of our exploratory analysis of an industry
product family developed using a clone-and-own approach. The goal of this analysis was
to gain insight into how the product family has evolved, and to identify
clone-and-own-related points of interest. First, we explored whether MES-Toolbox systems
have changed in parallel. Next, we investigated how much the codebase of the systems
diverged from their origin, and to what extent this changed over time. Finally, we
studied the synchronization activity between systems and their origins.
   We observed that many MES-Toolbox systems are changed at roughly the same time, but
that the degree of parallel change is not the same for all systems, nor is it constant
over time. Many systems appear to be changed in parallel initially, until the
development of one system is done and they no longer change in parallel. This is
consistent with a schedule-driven need for independence. We further observed a
schedule-independent cause for parallel change, which was the need to propagate critical
bug fixes to many systems on the same day. This form of mass-synchronization appeared to
have occurred only twice in the history of the systems we analyzed.
   With regard to divergence, we found that all MES-Toolbox systems we analyzed,
including those which reportedly hardly required any customer-specific modifications,
diverged significantly from their origin. In terms of the proportion of Java files, all
systems diverged between 7% and 22.5% from their origin. In terms of the number of
diverged lines, most systems did not exceed 50,000 lines (<5%), and only two systems
diverged by more than 75,000 lines. We identified one case where the divergence measured
as a percentage of Java files was significantly different from the divergence measured
in number of lines.
   During our analysis of divergence over time, we were able to identify points in time
when systems were synchronized with their origin. Our analysis of synchronizing changes
confirms these findings, and we found that all systems we analyzed retrieved changes
from their origin at least once, but not all systems contributed changes back to their
origin.
   Overall, these results show that products from the same product family can vary
significantly in terms of change activity over time, divergence from their origin and
synchronization activity. It is important to keep this in mind when studying product
families realized via clone-and-own, as these variations may play an important role in
reducing maintenance overhead. In future work, we will further investigate these factors
to develop quantitative measures for the assessment of clone-and-own benefits and
drawbacks.

Acknowledgements

We thank prof. dr. J.J. Vinju, the reviewers and other participants of the SATToSE 2017
seminar for their helpful input on related literature and the direction of this study.




References

Adams, B., Z. M. Jiang, and A. E. Hassan (2010). “Identifying Crosscutting Concerns
   Using Historical Code Changes”. In: Proceedings of the 32nd ACM/IEEE International
   Conference on Software Engineering - ICSE ’10. Vol. 1. ACM, pp. 305–314.
Antkiewicz, M. et al. (2014). “Flexible Product Line Engineering with a Virtual
   Platform”. In: Companion Proceedings of the 36th International Conference on Software
   Engineering - ICSE Companion 2014. ACM, pp. 532–535.
Berger, T. et al. (2014). “Three Cases of Feature-Based Variability Modeling in
   Industry”. In: Lecture Notes in Computer Science (including subseries Lecture Notes
   in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 8767. Springer,
   pp. 302–319.
Breu, S. and T. Zimmermann (2006). “Mining Aspects from Version History”. In: Automated
   Software Engineering, 2006. ASE’06. 21st IEEE/ACM International Conference on. IEEE,
   pp. 221–230.
Buckley, J. et al. (2005). “Towards a Taxonomy of Software Change”. In: Journal of
   Software Maintenance and Evolution: Research and Practice 17.5, pp. 309–332.
Ceccato, M. et al. (2006). “Applying and Combining Three Different Aspect Mining
   Techniques”. In: Software Quality Journal 14.3, pp. 209–231.
Cordy, J. R. (2003). “Comprehending Reality - Practical Barriers to Industrial Adoption
   of Software Maintenance Automation”. In: Program Comprehension, 2003. 11th IEEE
   International Workshop on. IEEE, pp. 196–205.
Dubinsky, Y. et al. (2013). “An Exploratory Study of Cloning in Industrial Software
   Product Lines”. In: Proceedings of the European Conference on Software Maintenance
   and Reengineering, CSMR, pp. 25–34.
Kellens, A., K. Mens, and P. Tonella (2007). “A Survey of Automated Code-Level Aspect
   Mining Techniques”. In: Transactions on Aspect-Oriented Software Development IV.
   Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 143–162.
Lettner, D., F. Angerer, P. Grünbacher, et al. (2014). “Software Evolution in an
   Industrial Automation Ecosystem: An Exploratory Study”. In: Software Engineering and
   Advanced Applications (SEAA), 2014 40th EUROMICRO Conference on. IEEE, pp. 336–343.
Lettner, D., F. Angerer, H. Prähofer, et al. (2014). “A Case Study on Software Ecosystem
   Characteristics in Industrial Automation Software”. In: Proceedings of the 2014
   International Conference on Software and System Process - ICSSP 2014. ACM, pp. 40–49.
Marin, M., A. van Deursen, and L. Moonen (2007). “Identifying Crosscutting Concerns
   Using Fan-In Analysis”. In: ACM Transactions on Software Engineering and Methodology
   (TOSEM) 17.1, pp. 1–37.
Marin, M., L. Moonen, and A. van Deursen (2005). “A Classification of Crosscutting
   Concerns”. In: 21st IEEE International Conference on Software Maintenance (ICSM’05).
   IEEE, pp. 673–676.
Rubin, J., K. Czarnecki, and M. Chechik (2013). “Managing Cloned Variants: A Framework
   and Experience”. In: Proceedings of the 17th International Software Product Line
   Conference - SPLC ’13. ACM, p. 101.
Schmorleiz, T. and R. Lammel (2016). “Similarity management of ’cloned and owned’
   variants”. In: Proceedings of the 31st Annual ACM Symposium on Applied Computing -
   SAC ’16. New York, New York, USA: ACM Press, pp. 1466–1471.
Schrock, S., A. Fay, and T. Jager (2015). “Systematic interdisciplinary reuse within the
   engineering of automated plants”. In: Systems Conference (SysCon), 2015 9th Annual
   IEEE International, pp. 508–515.
Duc, A. N. et al. (2014). “Forking and coordination in
                                                                    Stanciulescu, S., S. Schulze, and A. Wa̧sowski (2015).
   multi-platform development: a case study”. In: Proceed-
                                                                       “Forked and Integrated Variants in an Open-Source
   ings of the 8th ACM/IEEE International Symposium
                                                                       Firmware Project”. In: 2015 IEEE International Con-
   on Empirical Software Engineering and Measurement -
                                                                       ference on Software Maintenance and Evolution (IC-
   ESEM ’14. New York, New York, USA: ACM Press,
                                                                       SME). IEEE, pp. 151–160.
   pp. 1–10.
                                                                    Thummalapenta, S. et al. (2010). “An Empirical Study on
Figueiredo, E. et al. (2009). “Crosscutting Patterns and
                                                                       the Maintenance of Source Code Clones”. In: Empirical
   Design Stability: An Exploratory Analysis”. In: IEEE
                                                                       Software Engineering 15.1, pp. 1–34.
   International Conference on Program Comprehension,
                                                                    Tourwé, T. and K. Mens (2004). “Mining Aspectual Views
   pp. 138–147.
                                                                       using Formal Concept Analysis”. In: Source Code Analy-
Hashimoto, M. and A. Mori (2012). “Enhancing History-
                                                                       sis and Manipulation, Fourth IEEE International Work-
   Based Concern Mining with Fine-Grained Change Anal-
                                                                       shop on. IEEE Comput. Soc, pp. 97–106.
   ysis”. In: 2012 16th European Conference on Software
                                                                    Yamashita, A. et al. (2017). “Software Evolution and
   Maintenance and Reengineering. IEEE, pp. 75–84.
                                                                       Quality Data from Controlled, Multiple, Industrial
Hetrick, W. A., C. W. Krueger, and J. G. Moore (2006).
                                                                       Case Studies”. In: Proceedings of the 14th International
   “Incremental Return on Incremental Investment: En-
                                                                       Conference on Mining Software Repositories. IEEE,
   genio’s Transition to Software Product Line Practice”.
                                                                       pp. 507–510.
   In: International Conference on Object-Oriented Pro-
   gramming, Systems, Languages and Applications. ACM,
   pp. 798–804.
Kapser, C. and M. Godfrey (2006). “"Cloning Considered
   Harmful" Considered Harmful”. In: 2006 13th Working
   Conference on Reverse Engineering. IEEE, pp. 19–28.



