<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Analysis of a Clone-and-Own Industrial Automation System: An Exploratory Study</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Nick Lodewijks University of Amsterdam</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In industry, the development of similar products is often addressed by cloning and modifying existing artifacts. This so-called clone-and-own approach is often considered to be a bad practice but is perceived as a favorable and natural software reuse approach by many practitioners. Unfortunately, current literature lacks quantitative information about the positive and negative effects of clone-and-own. In this paper, we present the results of our exploratory analysis of an industry system developed using the clone-and-own approach. We found that products from the same product family can vary significantly in change activity over time, divergence from their origin and synchronization activity. We will further investigate these factors to develop quantitative measures for the assessment of clone-and-own benefits and drawbacks.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Cloning is often considered to be a practice
harmful to the quality of source code, and potentially a
cause of maintainability problems
        <xref ref-type="bibr" rid="ref13 ref23 ref4">(Kapser and
Godfrey, 2006; Thummalapenta et al., 2010)</xref>
        . Yet, in
industry, the development of similar products is often
addressed by cloning and modifying existing artifacts.
This so-called clone-and-own approach is perceived as
a favorable and natural software reuse approach by
many practitioners, mainly because of its simplicity
and availability
        <xref ref-type="bibr" rid="ref8">(Dubinsky et al., 2013)</xref>
        .
      </p>
      <p>While the general belief is that clone-and-own is
a bad and unsustainable development technique, it
has been used successfully for the development of
the MES-Toolbox, a large (±1 million lines of Java
code) proprietary factory automation system. Over
the past 17 years, for each new customer, an
existing system was cloned and modified in any possible
way to add, modify or remove functionality. With
over 70 implementations of the system running
worldwide, the company now seeks to reduce maintenance
overhead. Unfortunately, the decision on how to move
forward from a successful clone-and-own approach is
not straightforward.</p>
      <p>
        Over the past decade, several tools and techniques
for dealing with cloned product variants have been
proposed. Some of them advocate the elimination of all
clones by merging the variants into a single platform,
and others propose to maintain multiple variants
as-is
        <xref ref-type="bibr" rid="ref19">(Rubin, Czarnecki, and Chechik, 2013)</xref>
        . What
approach works best for a given situation depends on the
domain and context of that situation. In some cases
eliminating all clones and adopting an integrated
platform is neither possible nor beneficial
        <xref ref-type="bibr" rid="ref2 ref3 ref9">(Antkiewicz et
al., 2014)</xref>
        . Eliminating clones will increase coupling,
and changing shared code may require re-testing of all
systems that use it
        <xref ref-type="bibr" rid="ref8">(Dubinsky et al., 2013)</xref>
        . If the
success of the product highly depends on the benefits of
clone-and-own, then its merits should be considered
before moving away to a different approach.
      </p>
      <p>The main objective of our study is to explore the
evolution of MES-Toolbox systems and to gain
insight into how clone-and-own may have affected
ongoing project development and maintenance. In this
paper, we show how version control system metadata,
source-code differencing, and visualization techniques
can be used to identify clone-and-own related points
of interest in the evolution of a product family.</p>
    </sec>
    <sec id="sec-2">
      <title>Subject System</title>
      <p>The system studied in this work is the MES-Toolbox,
a 17-year-old proprietary Java-based factory
automation system developed by ENGIE. The main purpose
of the system is the automation of batch and
continuous production processes. It can visualize,
control and register every step of an entire production
process, from the intake of raw material (unloading
from trucks, ships, bags, pallets, containers),
preparation (dosing, weighing, heating), processing (pressing,
grinding, mixing) and storage, to the distribution of end
products to customers. Depending on what customers
require for their production process, the system
performs article and recipe management, quality
registration, production planning, tracking and tracing of
materials used in production, stock control, shift
registration and production performance analysis, and
communicates with ERP systems. To monitor and
control physical production equipment (e.g., conveyors,
mixers, weighers, buttons, lights), the MES-Toolbox
communicates with Programmable Logic Controllers
(PLCs) that perform the actual low-level control of
these physical devices.</p>
      <p>Over the past 17 years, the system has grown to
contain more than 6500 Java files, with a total of
approximately 1 million lines of Java code. While the
design of the system has a modular structure and aims to
separate common code from customer implementation
code as much as possible, it is a monolithic application.
Nearly all source-code is contained in a single project,
which is developed, built and versioned as a whole.
Internally, this project is called the Standard project, as
it is used as a basis for all new projects. This project,
which can be considered as the main platform of the
product family, contains a constantly growing set of
reusable core components and ready-to-use standard
solutions.</p>
      <p>
        Within the organization there is a clear distinction
between platform development and application
development; this distinction is often found in a Software
Ecosystem (SECO)
        <xref ref-type="bibr" rid="ref15 ref16 ref2 ref3 ref9">(Lettner, Angerer, Grünbacher, et
al., 2014)</xref>
        . A small team of five developers is
responsible for the overall design, development, and
maintenance of the system. The founder, who wrote the
first line of code of this system, is still part of this
team. The work of this team is focused on maintenance of
the core platform, development of complex
customer-specific features, standardization of functionality,
development of product configuration tools, and
support for application engineers.
      </p>
      <p>
        Even though the system is highly configurable,
cloning is used to address the specificity and high
degree of variation of customer requirements often found
in the domain of industrial automation
        <xref ref-type="bibr" rid="ref21 ref22">(Schrock, Fay,
and Jager, 2015)</xref>
        . For every new factory, a clone of
the codebase of the latest platform release is realized
by creating a branch with the Subversion version
control system. The clone is then configured and changed
in any possible way by Application Engineers to add,
modify or remove functionality. Each clone
corresponds to the automation system of a factory
somewhere around the world for some specific customer.
Each clone is a variant of the base platform. We refer
to the collection of all MES-Toolbox variants as the
MES-Toolbox product family.
      </p>
      <p>Between clones there exists a varying degree of
commonality, and there is often no clear relation between
the clones. Clones developed for the same customer
might have more in common than clones developed
for different customers, for example if a company
requires all its production facilities to have
identical branding and communication interfaces with
third-party systems. However, even clones that appear to be
unrelated in terms of end-user requirements may still
have some forms of commonality, such as the graphical
user interface components or the configuration
framework that is used.</p>
    </sec>
    <sec id="sec-3">
      <title>Research Questions</title>
      <p>
        <xref ref-type="bibr" rid="ref8">Dubinsky et al. (2013)</xref>
        observed that the independence
provided by clone-and-own is one of the major reasons
for considering cloning as an efficient reuse mechanism.
Developers can make any change required to satisfy
customer requirements, without affecting other clones.
They do not have to collaborate with teams working
on other systems, which may have different priorities or
scheduling constraints. These characteristics of
clone-and-own have to be considered when new change
mechanisms are introduced, since different techniques may
not provide the same degree of independence. But how
much independence is needed, and how can it be
measured? In this section, we describe three research
questions that we will use to explore independence-related
characteristics of the MES-Toolbox product family.
      </p>
      <p>
        RQ1: Do MES-Toolbox systems change in parallel?
When cloning is used to develop systems independently
of each other, developers can decide when to change
the codebase of each individual system. The
development of each system can follow its own release and
development schedule that is based on available resources
and requirements for the system. The development of
complex systems with many customer-specific
modifications may require and allow for months of
continuous, frequent change, while relatively straightforward
and simple systems might have strict deadlines and
require only a few changes within the first weeks.
      </p>
      <p>
        To explore whether MES-Toolbox systems may
have benefited from this time aspect of independence,
we want to gain a rough understanding of the degree
of parallel change in the MES-Toolbox product family.
We hypothesize that a schedule-driven need for
independence may lead to a lack of parallel development,
whereas some relation between systems (e.g.,
systems developed for the same customer) or a
collaboration-overhead-driven need for independence may lead to
parallel change. For the purpose of this exploratory
study, we do not yet use a strict definition of
parallel change. Instead, we are interested in any form of
seemingly parallel change. Do systems change in
parallel every week, month or year? Do many systems
change at roughly the same time, or is this only the
case for specific systems?
      </p>
      <p>
        RQ2: How much do MES-Toolbox systems diverge
from their origin? Clone-and-own allows developers
to add, remove or modify files without affecting their
origin. These changes will inherently cause systems to
diverge from their origins; they are no longer identical.
As the product family grows it often becomes
increasingly hard to keep an overview of the available
functionality
        <xref ref-type="bibr" rid="ref2 ref2 ref21 ref22 ref3 ref3 ref9 ref9">(Stanciulescu, Schulze, and Wąsowski, 2015;
Berger et al., 2014; Duc et al., 2014)</xref>
        . We hypothesize
that the degree of divergence can be used to
quantify the complexity caused by cloning. Therefore, we
are interested to see how this property of the
MES-Toolbox product family has evolved over time.
      </p>
      <p>A developer of the MES-Toolbox platform stated
that diverged Java files often make it difficult to
propagate changes, but expected that the Java codebase
would not significantly diverge for most of the 7.2 and
7.2.1 systems. Many of these systems are considered
relatively simple, and hardly require any
customer-specific modification of the codebase.</p>
      <p>
        RQ3: Have all MES-Toolbox systems been
synchronized with their origin? Cloning is said to increase
maintenance overhead because changes to one clone
may have to be propagated to all clones. Studies
have shown, however, that change propagation is not
always performed
        <xref ref-type="bibr" rid="ref21 ref22">(Stanciulescu, Schulze, and Wąsowski,
2015)</xref>
        , which suggests that cloning does not necessarily
increase maintenance overhead due to change
propagation.
      </p>
      <p>In the organization we study, changes are
manually propagated at the discretion of the teams developing
the systems. Application engineers stated that they
periodically merge changes from the platform release
to customer systems, but only while they are still
under active development. Because some systems are
developed relatively fast, we expect that some
systems retrieve only very few changes from their origin,
thus arguably not causing much maintenance
overhead. Consequently, techniques that purely reduce
repetitive tasks would have limited effect on
maintenance overhead caused by synchronization for these
systems. In the MES-Toolbox product family,
synchronization between products and their origin can occur in
both directions. Bugs are often found and fixed on a
product, after which the change is propagated to the
platform project. From there, the change can be
propagated to all the other products derived from that
platform version.</p>
      <p>[Figure 1: Number of systems per platform version. Pre: 21 (35%); 7.1: 5 (8.3%); 7.2: 15 (25%); 7.2.1: 11 (18.3%); 7.3: 8 (13.3%).]</p>
    </sec>
    <sec id="sec-4">
      <title>Research Methodology</title>
      <p>To explore the evolution of MES-Toolbox systems, we
built a tool that retrieves changes to each system from
the Subversion (SVN) repository, performs source-code
differencing and exports the relevant information to a
CSV file for further analysis in R. Our tool is
embedded in a modified version of JMeld<sup>1</sup>, an open source
differencing tool written in Java.</p>
      <p>First, we make a local copy of the SVN repository
with the command svnadmin hotcopy, and verify its
integrity with svnadmin verify in the analysis
environment. This local repository is used for all data
collection to ensure that the data source does not change
during subsequent analysis.
      </p>
      <sec id="sec-4-1">
        <title>Selecting MES-Toolbox Systems</title>
        <p>We extract all systems present in the local copy of the
repository by scanning the output of svn ls<sup>2</sup> for paths
in the form of projecten/.*/trunk/$. We then
manually validate these paths and document for each
system the platform version it was branched from, the
name of the project, an anonymised name, the
repository path, and any unusual properties of the system
that we have to consider during analysis. For example,
development of some systems was discontinued and the
systems were never put into production. We exclude
these systems from the analysis. Finally, we note
whether the system was directly branched from the
platform, or from another branch (its nesting depth).</p>
        <p><sup>1</sup> https://sourceforge.net/projects/jmeld/</p>
        <p><sup>2</sup> svn ls -R {svnRepo} | egrep "projecten/.*/trunk/$"</p>
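        <p>The path selection step can be sketched in Python as follows. This is a minimal illustration and not the tool used in the study: the function name select_system_paths is hypothetical, and only the regular expression is taken from the svn ls filter shown in the footnote.</p>
        <preformat><![CDATA[
```python
import re

# Same pattern as the egrep filter: any path below "projecten/" ending in "trunk/".
TRUNK_PATTERN = re.compile(r"projecten/.*/trunk/$")


def select_system_paths(svn_ls_output):
    """Return the lines of recursive `svn ls` output that look like system trunks."""
    selected = []
    for line in svn_ls_output.splitlines():
        path = line.strip()
        # egrep matches anywhere in the line, so re.search is the equivalent call.
        if TRUNK_PATTERN.search(path):
            selected.append(path)
    return selected
```
]]></preformat>
        <p>The selected paths would still have to be validated by hand, as described above, because the pattern alone cannot tell a production system from a discontinued one.</p>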
        <p>There are currently four platform versions: 7.1, 7.2,
7.2.1 and 7.3. The first version of the platform (7.1)
was released on 7 March 2012 and was followed
relatively fast by the next release (7.2) on 18 December
2012. Version 7.2.1 of the platform was released on 7
October 2014, and version 7.3 on 14 December 2016.
Figure 1 shows the distribution of versions among
systems in the version repository. Twenty-one systems
pre-date the first platform release. There are five 7.1
systems, fifteen 7.2 systems, eleven 7.2.1 systems and
eight 7.3 systems. For this study, we mainly focus on
7.2 and 7.2.1 systems, as these have all been put into
production, and are derived from a comparable base
platform within the last five years. The main difference
between these versions is the internationalization of all
text visible to the end-user. There are no significant
differences in terms of architecture or functionality.</p>
        <p>We refer to the codebase of a specific platform
version in the form of PL-VERSION; for example, we use
PL7.2 to refer to version 7.2 of the platform. The
internal name of a system can contain the name of the
customer and the location of the production facility.
Since this information is subject to confidentiality, we
manually defined an anonymised name for each system
in the form of P-NUMBER. In this paper, we often refer
to this name as Pn, which can be read as project n or
product n.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Mining Commit Metadata</title>
        <p>
          For each system we selected, we extract the version
history using a bash script. This bash script uses
the svn log<sup>3</sup> command to export the version history
in XML format. For each system we collect all
revisions from its change history, and extract the relevant
change metrics. We used the definition and format of
the change metrics dataset published by
          <xref ref-type="bibr" rid="ref25">Yamashita et
al. (2017)</xref>
          as a basis for our data set.
        </p>
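        <p>As an illustration of this step, the following Python sketch reads the XML produced by svn log --xml -v and collects the revision number, author, date and changed paths of each revision. It is an assumption-laden sketch rather than the script used in the study: the function name parse_svn_log and the dictionary keys are hypothetical, and the resulting records are not the format of the referenced dataset.</p>
        <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET
from datetime import datetime


def parse_svn_log(xml_text):
    """Extract revision number, author, date and changed paths per revision."""
    revisions = []
    for entry in ET.fromstring(xml_text).iter("logentry"):
        date_text = entry.findtext("date", default="")
        revisions.append({
            "revision": int(entry.get("revision")),
            "author": entry.findtext("author", default=""),
            # svn writes UTC timestamps such as 2013-07-22T09:15:00.000000Z
            "date": (datetime.strptime(date_text, "%Y-%m-%dT%H:%M:%S.%fZ")
                     if date_text else None),
            # Each path element carries an action attribute (A, D, M or R).
            "paths": [(p.get("action"), p.text) for p in entry.iter("path")],
        })
    return revisions
```
]]></preformat>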
        <p>We extract the revision number, author and date of
each revision. Next, from the output of svn diff<sup>4</sup>, we
determine the full path of the files that were changed,
the type of change (added, deleted or modified), and
calculate for each file how many lines were changed,
added or deleted. From the full path of the files, we
extract the file name and file extension. Note that we
use svn diff to determine which files were changed,
and not svn log. The reason for this is that when a
directory is deleted, the output of svn log only
contains the name of the directory, and does not contain
the names of the files contained in the directory.</p>
        <p><sup>3</sup> svn log --xml --stop-on-copy -v &lt;variant.repositoryPath&gt; &gt; &lt;variant&gt;-log.xml</p>
        <p><sup>4</sup> svn diff -x -U0 -c {revisionNumber} {repositoryPath}</p>
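        <p>Counting changed lines per file from unified svn diff output could look like the sketch below. It rests on two assumptions that are ours and not stated in the text: a modified line is counted as one deleted plus one added line, and property-change sections are ignored. The function name count_changed_lines is hypothetical.</p>
        <preformat><![CDATA[
```python
def count_changed_lines(diff_text):
    """Map each file in a unified svn diff to [added, deleted] line counts."""
    counts = {}
    current = None
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("Index: "):
            # svn starts the section of every changed file with an Index line.
            current = line[len("Index: "):]
            counts[current] = [0, 0]
            in_hunk = False
        elif line.startswith("Property changes on: "):
            # Property diffs follow the text hunks and are not counted.
            in_hunk = False
        elif line.startswith("@@"):
            in_hunk = True
        elif current is not None and in_hunk:
            if line.startswith("+"):
                counts[current][0] += 1
            elif line.startswith("-"):
                counts[current][1] += 1
    return counts
```
]]></preformat>
        <p>The in_hunk flag keeps the two file header lines that precede the first hunk from being counted as a deleted and an added line.</p>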
        <p>[Figure 2: Example of the change-activity view. Vertical axis: System (System 1 to System 4); horizontal axis: Date (week), weeks 01 to 05.]</p>
        <p>To determine whether systems change in parallel, we
are interested in the time aspect of change at
system-level granularity. We decided to use a visualization
that allows us to gain insight into whether (a)
systems change in parallel, (b) systems change
continuously, periodically or at arbitrary moments in time,
and (c) there is variance between systems.</p>
        <p>For this visualization we chose systems as the first
dimension and time as the second dimension. To
prevent overplotting, we group data-points by week or
month. By grouping data, we will not be able to
distinguish between systems that changed many times a
week, or only once a month. To mitigate this effect we
introduce an additional dimension, the number of
commits (proportional to the radius of the dot). This
leads us to the view shown in Figure 2.</p>
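        <p>The grouping behind this view can be sketched as follows, assuming each commit is available as a pair of system name and commit date. The function name commits_per_week and the input layout are hypothetical; the count per group is the value that would be mapped to the dot radius.</p>
        <preformat><![CDATA[
```python
from collections import Counter
from datetime import date


def commits_per_week(commits):
    """Count commits per (system, ISO year, ISO week)."""
    counts = Counter()
    for system, day in commits:
        iso_year, iso_week, _weekday = day.isocalendar()
        counts[(system, iso_year, iso_week)] += 1
    return counts
```
]]></preformat>
        <p>Grouping by month instead of week only requires replacing the ISO week in the key with the month of the commit date.</p>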
        <p>The vertical axis represents the systems, and the
horizontal axis the time of the changes. Each dot
represents a point in time when a system was changed.
The radius of the dot is proportional to the number
of commits that occurred. In this example, we group
the data-points by week. Continuous change activity
will give rise to a sequence of horizontally aligned dots.
Changing a system twenty times a week will result in
a thicker horizontal dot pattern compared to changing
a system only once a week. In Figure 2 we observe
that system 1 was under continuous maintenance, as
it was changed every week. System 2 was changed
every other week, which appears to be more
periodic but, due to the week-based granularity, may still be
considered as continuous to some extent. The change
activity for systems 3 and 4 is continuous for the first
three weeks, but declining for system 3 and increasing
for system 4. Finally, we see that systems 1, 2 and 3
all changed in the first week, but system 3 has been
modified more frequently.</p>
        <p>To measure how much systems have diverged from
their origin, we developed a tool that calculates how
much the difference between each system and its
origin has changed over time. We do so by calculating
the differences for each system, for each file, at every
revision that changed either the system or its origin.
We perform these measurements on a local copy of the
actual codebase of the systems. For the platforms and
each system, we locally replay their change history by
sequentially updating the local working copy with svn
update. After each update of a platform codebase,
we re-calculate the differences with code-differencing
on all systems that have been derived from the
platform. Similarly, after each update of the codebase of
a system, we re-calculate the differences between the
system and its origin. This technique is
computationally intensive but does allow us to explore how much
each revision has affected divergence.</p>
        <p>We measure differences at line-level granularity
(number of lines different) with the Java
implementation of GNU diff<sup>5</sup>. By measuring at
line-level granularity instead of file-level
granularity (number of files different), we are able
to aggregate the measurements to file-level granularity and report on both
levels. We define the difference in number of lines as
diff. During analysis, we keep track of how much the
difference has increased or decreased compared to the
previous revision, the diffDelta.</p>
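        <p>A line-level diff measurement can be approximated with the sketch below. It uses Python's difflib instead of the Java implementation of GNU diff, so the counts may deviate from those of our tool. The counting rule (a replaced block counts once per line on its longer side) and the function name line_diff are assumptions made for illustration only.</p>
        <preformat><![CDATA[
```python
import difflib


def line_diff(origin_lines, system_lines):
    """Number of lines that differ between two versions of a file."""
    matcher = difflib.SequenceMatcher(a=origin_lines, b=system_lines, autojunk=False)
    differing = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Deleted, inserted and replaced blocks all add to the difference.
            differing += max(i2 - i1, j2 - j1)
    return differing
```
]]></preformat>
        <p>Aggregating to file-level granularity then amounts to counting the files for which this value is greater than zero.</p>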
        <p>We illustrate the divergence calculation on an
artificial example in Table 1. In this example, PL7.2.1 is
the origin of system P17. First, we update the local
copy of the codebase of PL7.2.1 to revision 1 and
calculate the differences between PL7.2.1 and P17. We see
that in revision 1, Main.java was modified on PL7.2.1,
causing a difference of five lines. Next, we update P17
to revision 2 and re-calculate the differences. We see
that Main.java was changed, reducing the difference
by five lines. This pattern of increasing and decreasing
divergence is typically caused by change propagation
when revision 1 is merged to system P17 in revision 2.
As the measurements continue, we see that Main.java
was modified two more times on P17, increasing the
difference by ten lines in revision 3 and five lines in
revision 4. Finally, in revision 5 the difference was
reduced by fifteen lines by a change on PL7.2.1.
</p>
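        <p>The relation between diff and diffDelta in this example can be written down as a short sketch. It assumes that P17 and PL7.2.1 were identical before revision 1, so the first diffDelta equals the first diff; the function name diff_deltas is hypothetical.</p>
        <preformat><![CDATA[
```python
def diff_deltas(diffs):
    """Change in diff relative to the previous measurement, starting from 0."""
    deltas = []
    previous = 0
    for current in diffs:
        deltas.append(current - previous)
        previous = current
    return deltas
```
]]></preformat>
        <p>For Main.java in the example, the diff values 5, 0, 10, 15 and 0 after revisions 1 to 5 give the diffDelta values 5, -5, 10, 5 and -15.</p>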
      </sec>
      <sec id="sec-4-3">
        <title>Detecting Synchronization</title>
        <p>Systems often retrieve changes from their origin, or
contribute changes to their origin, by
merging the revision from the system to its origin or vice
versa. Subversion automatically registers the merged
revision(s) and the origin of the merge in a so-called
svn:mergeinfo property attached to files and
directories<sup>6</sup>. We classify each committed revision as MERGE or
NON_MERGE by scanning the output of svn diff for an
occurrence of svn:mergeinfo.</p>
        <p>Unfortunately, we cannot blindly trust the validity
of Subversion properties. Subversion properties can be
changed by hand, developers might forget to commit
the changes to properties, or they could manually copy
changes between systems without using the merging
system. We aim to mitigate these issues by taking into
account whether revisions have caused convergence or
divergence. We expect that most changes to systems
will cause them to diverge from their origin and that
merging these changes to their origin will cause them
to converge. Similarly, we expect that changes to the
origin of systems will cause them to diverge, and
merging the change to the systems will cause them to
converge. We manually validate a large sample of data to
ensure this is a reliable technique to detect
synchronization.
</p>
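        <p>Combining the merge-info check with convergence can be sketched as follows. The label names, the function name classify_synchronization and the input layout are hypothetical. The rule that at least one Java file must converge follows the criterion we apply in the results. Interpreting a converging merge committed on a system as retrieval, and one committed on the origin as contribution, is our reading of the expectations stated above.</p>
        <preformat><![CDATA[
```python
def classify_synchronization(svn_diff_output, committed_on, java_file_deltas):
    """Classify one revision as a synchronizing change, or not.

    committed_on: "system" or "origin" (where the revision was committed).
    java_file_deltas: maps each Java file path to its diffDelta in this revision.
    """
    has_merge_info = "svn:mergeinfo" in svn_diff_output
    # A negative diffDelta means the file moved closer to its counterpart.
    converged = any(delta < 0 for delta in java_file_deltas.values())
    if not (has_merge_info and converged):
        return "NO_SYNC"
    if committed_on == "system":
        return "RETRIEVED_FROM_ORIGIN"
    return "CONTRIBUTED_TO_ORIGIN"
```
]]></preformat>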
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Results and Analysis</title>
      <p>In this section, we present the results of our
exploratory analysis.
</p>
      <sec id="sec-5-1">
        <title>Parallel Change</title>
        <p>RQ1. Do MES-Toolbox systems change in parallel?
Figure 3 shows the change activity of the PL7.2 and
PL7.2.1 platforms, and all systems derived from these
platforms that were included in our study. We see
that many systems, for example P1 and P3, appear to be modified almost
continuously, even years after the first change was made.
Systems P4, P8 and
P18 also appear to be changed continuously, but to a
lesser extent than the first group. The change activity
for these systems appears less dense and contains more
periods of inactivity. The longest period of inactivity
for these systems is approximately four months<sup>7</sup> for
system P4.</p>
        <p>[Figure: divergence over time, one panel per system. Panel labels: P-4 to P-15; vertical axis: Number of Lines Diverged (.java files), 0 to 150000; horizontal axis: Date, 2013 to 2017.]</p>
        <p>To detect whether systems retrieved changes from
their origin, we identify for each system, all revisions
that have merge-info, and caused at least one Java
file to converge with the origin of the system. The
change history of system P4 contained 18 revisions
with merge-info, of which 14 caused convergence. Out
of these 14 revisions, 12 (85%) were correctly classified
as changes retrieved from its origin.</p>
        <p>Figure 5 shows the synchronizing changes over time.
We see that all systems retrieved changes from their
origin at least once, and most but not all systems
contributed changes to their origin. This is different from
Marlin forks, as Stanciulescu, Schulze, and Wąsowski
(2015) found that 15% of all forks, and 34% of all
active forks synchronized at least once with the main
Marlin repository.</p>
        <p>While all systems retrieve changes from their origin,
some do so significantly more frequently than others.
Systems P1 and P3 retrieved changes from their origin
202 and 89 times, respectively. Furthermore, we see
that the period of time between subsequent
synchronizations can be relatively long. For example, system
P6 retrieved changes from its origin on 22 July 2013,
and 8 months later on 24 March 2014. This is
consistent with the results in the previous section, where
we identified long time-interval late propagation in the
visualization of divergence over time.</p>
        <p>Finally, we observe at least two instances of
vertically aligned dots. These patterns can be caused by
multiple systems retrieving changes from their origin
roughly at the same time. Manual inspection of these
patterns shows that both instances were critical bug
fixes, manually merged to most systems on the same
day. The fact that we do not see many of these
vertical line patterns suggests that mass-synchronization
of many systems at once does not happen often in the
MES-Toolbox product family.
</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Threats to Validity</title>
      <p>Internal Validity. During our study, the
MES-Toolbox product family continued to change. To
prevent this change from affecting our results, we obtained
a local copy of the repository. This local copy of the
repository was used throughout the study.</p>
      <p>We used the merge-info property to determine
whether a commit was a merge. Since this
property can be incorrect, we additionally checked whether
commits caused systems to converge. We
cross-checked the precision of this technique by manually
inspecting revisions, and achieved a good precision.</p>
      <p>While the experience of the author as a developer
of the system may provide a detailed interpretation
of fine-grained changes, this can cause some bias. We
aimed to reduce this threat as much as possible by
providing quantitative data to support our findings and
by collaborating with an external supervisor.</p>
      <p>External Validity. Development practices in other
organizations that use clone-and-own might have
different effects on the evolution of the system, which
may lead to different observations. However, some of
our findings are consistent with those of other,
independent studies.</p>
      <p>In our analysis of synchronizing changes, we looked
at the number of synchronizing commits. The number
of commits can be affected by the behavior of
individual developers. Developers can choose to merge each
individual revision, or merge a large number of revisions
at once. The first style clearly results in a higher
number of commits compared to the latter, but arguably
requires more effort too.</p>
    </sec>
    <sec id="sec-7">
      <title>Related Work</title>
      <sec id="sec-7-1">
        <title>Clone Evolution Patterns</title>
        <p>
          <xref ref-type="bibr" rid="ref23">Thummalapenta et al. (2010)</xref>
          proposed an approach
for the identification of the evolution of cloned code
fragments over time and categorized the evolution
patterns as (a) Consistent Evolution, (b) Late
Propagation, (c) Delayed Propagation, and (d) Independent
Evolution. In our study, we used these patterns to
characterize some of the change patterns we observed
in the evolution of the product family. For example,
Delayed Propagation was used as a strategy to
validate the correctness of changes on some variants,
before propagating them to all variants. Independent
Evolution was used to keep the variant as-is after the
project had been commissioned and the testing phase
had already finished.
        </p>
        <p>Similar characteristics were found by Stanciulescu,
Schulze, and Wąsowski (2015) in a study on the
advantages and disadvantages of forking using the case
of Marlin, an open source firmware for 3D printers.
They found that important bug-fixes were not
propagated and functionality was sometimes developed more
than once. Intuitively, one may consider these findings
to be bad practices and drawbacks of clone-and-own.
However, there are situations where this may be
desirable, as the authors found that “Once the firmware
is configured and running on the printer, new changes
are not desired”.</p>
        <p>
          In an environment where the potential cost of an
error can be significant, systems are changed as
little as possible when maintained
          <xref ref-type="bibr" rid="ref7">(Cordy, 2003)</xref>
          . In
a clone-and-own based system, this characteristic can
be detected by looking for patterns like Independent
Evolution, the lack of synchronization with the
origin, or redundant code. This is in line with some of
the cloning patterns described by
          <xref ref-type="bibr" rid="ref13">Kapser and Godfrey
(2006)</xref>
          . They argued that code duplication can also
have benefits, and described the pros and cons in a
catalog of cloning patterns used in real-world systems.
        </p>
      </sec>
      <sec id="sec-7-2">
        <title>Software Ecosystem Characteristics</title>
        <p>Lettner, Angerer, Grünbacher, et al. (2014) studied
the relevance of characteristics of Software Ecosystems
in the domain of industrial automation and found some
additional characteristics that according to them are
of particular importance in the industrial automation
domain. For example, platform quality characteristics
like stability and backward compatibility, and
long-term platform evolution seemed to be essential to the
success of the studied system. One of the reasons for
this conclusion was that “application engineer B
reported that he had to update a ten-year-old version of
the platform software because an important customer
had decided to leave out several platform releases and
then requested a new feature. This led to significant
difficulties in merging the old software version with the
new functionality.”. Developers of the system we study
have reported similar issues with upgrading customer
systems to a new release.</p>
        <p>
          In a later study by Lettner, Angerer, Prähofer, et al.
(2014), the change characteristics and software
evolution challenges of the same ecosystem were
investigated. The software change taxonomy of
          <xref ref-type="bibr" rid="ref5">Buckley
et al. (2005)</xref>
          was used to describe qualitatively when,
where, and how changes were made in different parts
of the system and what was affected by changes. The
authors found that the ecosystem is subject to both
continuous and periodic evolution. The core platform
is continuously changed to include new features and
bug-fixes, while those changes are only periodically
released to platform users. The granularity of these
changes is reportedly primarily coarse for customer
requirements, and fine for bug fixes. Propagation of
changes is done by hand, and change impact analysis
is performed manually, based on expert knowledge.
        </p>
        <p>The system we study is in the same domain and
seems to be developed similarly. Our study differs
in that we support our findings with visual
representations of the evolution of the system. For
example, we know that in this case changes are also
propagated by hand, so we developed a technique to
show how frequently this is actually done in the
MES-Toolbox product family.</p>
      </sec>
      <sec id="sec-7-3">
        <title>Crosscutting Concerns</title>
        <p>
          A possible area of interest in the analysis of
clone-and-own evolution is the presence and development of
crosscutting concerns in the system. A crosscutting
concern is a feature whose implementation is spread
across many modules
          <xref ref-type="bibr" rid="ref14 ref17">(Marin, Deursen, and Moonen,
2007)</xref>
          . If product variants, or clones, exhibit a high
degree of variation in the implementation of
crosscutting concerns, we expect that this may also affect the
extent to which changes are propagated, and how the
code-bases diverge.
        </p>
        <p>
          <xref ref-type="bibr" rid="ref18">Marin, Moonen, and Deursen (2005</xref>
          ) propose a
classification system for crosscutting concerns in terms of
sorts, where a sort is a description based on a
number of distinctive properties. A sort we expect to find
often in this case study is Entangled Roles. In
object-oriented terminology, this sort is defined as: implement
a method with (entangled) functionality that belongs
to a different concern than the main concern of that
method. A characteristic of clone-and-own is that it
allows application engineers to make these kinds of
fine-grained changes quickly. For example, suppose a customer
wants to be notified when stock levels exceed a certain
value. If there is no such monitoring system in place,
then the fastest solution can be to add this
functionality to a method that deals in some way with
stock control. Implementing a generic solution may
exceed the level of expertise of the application engineer,
and waiting for a platform engineer to develop the
solution may take too much time.
        </p>
        <p>
          <xref ref-type="bibr" rid="ref10">Figueiredo et al. (2009)</xref>
          describe 13 patterns of
crosscutting concerns identified in three case studies,
one of which was a software product line. The authors
found that some patterns consistently emerged in
situations with the frequent use of inheritance. They
found that this was often the case in product lines
because “Program families rely extensively on the use
of abstract classes and interfaces in order to
implement variabilities. The inappropriate modularization
of such crosscutting concerns might lead to future
instabilities in the design of the varying modules”.
        </p>
        <p>
          Detection of crosscutting concerns is called aspect
mining. Various aspect mining techniques have been
proposed
          <xref ref-type="bibr" rid="ref14 ref24 ref6">(Kellens, Mens, and Tonella, 2007; Tourwé
and Mens, 2004; Ceccato et al., 2006)</xref>
          . For
example, fan-in analysis looks for crosscutting
functionality by detecting methods that are explicitly invoked
from many methods scattered throughout the code
          <xref ref-type="bibr" rid="ref14 ref17">(Marin, Deursen, and Moonen, 2007)</xref>
          . History-based
concern mining techniques analyze change-history to
detect which program entities change together
frequently
          <xref ref-type="bibr" rid="ref1 ref13 ref4">(Breu and Zimmermann, 2006; Adams, Jiang,
and Hassan, 2010)</xref>
          .
          <xref ref-type="bibr" rid="ref11">Hashimoto and Mori (2012)</xref>
          developed a tool that improves history-based concern
mining by combining it with fine-grained change analysis
based on abstract syntax tree differencing.
        </p>
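        <p>As a rough illustration of the two mining ideas described above, the following Python sketch counts distinct callers per method (a simplified form of fan-in analysis) and counts how often pairs of entities are changed in the same commit (a simplified form of history-based mining). It is not the implementation of any of the cited tools. The function names fan_in, high_fan_in and co_change_counts, the threshold, and the example call graph and commit history are hypothetical and chosen only for illustration.</p>
        <preformat>
```python
from collections import Counter
from itertools import combinations


def fan_in(call_graph):
    """Count distinct callers per callee.

    call_graph maps a caller name to the methods it invokes explicitly.
    """
    callers = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            if callee != caller:  # ignore direct recursion
                callers.setdefault(callee, set()).add(caller)
    return {callee: len(who) for callee, who in callers.items()}


def high_fan_in(call_graph, threshold):
    """Return callees with at least `threshold` distinct callers, sorted by name."""
    return sorted(m for m, n in fan_in(call_graph).items() if n >= threshold)


def co_change_counts(commits):
    """Count how often each unordered pair of entities occurs in the same commit."""
    pairs = Counter()
    for changed in commits:
        for a, b in combinations(sorted(set(changed)), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical call graph: caller -> explicitly invoked methods.
calls = {
    "Order.save": ["Audit.log", "Db.write"],
    "Stock.update": ["Audit.log", "Db.write"],
    "Report.print": ["Audit.log"],
}
print(high_fan_in(calls, threshold=3))

# Hypothetical change history: one list of changed files per commit.
history = [
    ["Order.java", "Audit.java"],
    ["Stock.java", "Audit.java"],
    ["Order.java", "Audit.java", "Report.java"],
]
print(co_change_counts(history).most_common(1))
```
        </preformat>
        <p>In this toy input, the method invoked from all three callers is the only candidate at the chosen threshold, and the most frequently co-changed pair of files is a natural starting point for manual inspection. Real aspect mining tools apply additional filtering that this sketch omits.</p>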
        <p>In future work, we intend to use these tools and
techniques to gain a deeper understanding of the
change and divergence patterns we found.</p>
      </sec>
      <sec id="sec-7-4">
        <title>Clone-and-Own in Product Line Engineering</title>
        <p>
          <xref ref-type="bibr" rid="ref8">Dubinsky et al. (2013)</xref>
          studied the processes and
perceived advantages and disadvantages of the
clone-and-own approach in six industrial software product lines.
They show that cloning is perceived as a favorable and
natural reuse approach by the majority of
practitioners in the studied companies, mainly because of its
simplicity and availability. They found that
practitioners lack awareness and knowledge of forms
of reuse, and that many alternative approaches fail to
convince them that they yield better results.
        </p>
        <p>Rubin, Czarnecki, and Chechik (2013) proposed a
framework to organize knowledge related to the
development, maintenance and merge-refactoring of
product lines realized via cloning. This framework is a step
towards a recommender system that can assist users in
selecting tools and techniques that are useful in their
situation.</p>
        <p>
          <xref ref-type="bibr" rid="ref12">Hetrick, Krueger, and Moore (2006</xref>
          ) report on the
experience of a structured, incremental transition from
a clone-and-own approach to software product line
practices. They show that it is possible to make this
transition without a significant upfront investment and
disruption of the ongoing production schedules. The
authors indicate that the file branch factor gradually
reduced during the transition, to a point where all
branches from product line core assets were completely
eliminated. This metric is defined as the average
number of branched files per product, normalized by the
number of products. Our study shows that the
number of branched files per product can vary significantly
between systems and over time. Hence, care has to
be taken when using the average. Furthermore, we
found that products with a similar percentage of files
diverged can vary significantly in terms of total
number of lines diverged.
        </p>
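        <p>The caveat about averages can be made concrete with a small Python sketch. All numbers below are hypothetical and are not measurements from our case or from Hetrick, Krueger, and Moore. The variable branch_factor follows one literal reading of the definition above (the average number of branched files per product, divided by the number of products), which is an assumption on our part.</p>
        <preformat>
```python
from statistics import mean, pstdev

# Hypothetical per-product data: (branched files, total files, diverged lines).
products = {
    "A": (70, 1000, 4000),
    "B": (70, 1000, 60000),
    "C": (220, 1000, 30000),
}

branched = [b for b, _, _ in products.values()]

# Average number of branched files per product.
avg_branched = mean(branched)

# Assumed literal reading of "normalized by the number of products".
branch_factor = avg_branched / len(products)

# Population standard deviation: how far products deviate from the average.
spread = pstdev(branched)

# Percentage of files diverged per product.
pct_files = {name: 100 * b / total for name, (b, total, _) in products.items()}

# Diverged lines per product, for comparison with the file percentage.
diverged_lines = {name: lines for name, (_, _, lines) in products.items()}

print(avg_branched, branch_factor, round(spread, 2))
print(pct_files)
print(diverged_lines)
```
        </preformat>
        <p>In this made-up data set, no product is close to the average number of branched files, and products A and B share the same percentage of diverged files while differing greatly in diverged lines. A single averaged metric would hide both effects.</p>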
        <p>
          <xref ref-type="bibr" rid="ref2">Antkiewicz et al. (2014)</xref>
          propose an incremental and
minimally invasive strategy for adoption of
product-line engineering. The strategy is called virtual
platform, and should allow organizations to obtain
incremental benefits from incremental changes to the
development approach. By studying the development
practices of our industry case, we gain insight into an
industry context and the needs of practitioners. This
may serve as input for recommender systems and as
requirements for the virtual platform, and can be helpful to
practitioners, researchers and tool developers.
        </p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Conclusion</title>
      <p>In this work, we presented the results of our
exploratory analysis of an industry product family
developed using a clone-and-own approach. The goal of
this analysis was to gain insight into how the
product family has evolved, and to identify clone-and-own
related points of interest. First, we explored whether
MES-Toolbox systems have changed in parallel. Next,
we investigated how much the codebase of the systems
diverged from their origin, and to what extent this
changed over time. Finally, we studied the
synchronization activity between systems and their origins.</p>
      <p>We observed that many MES-Toolbox systems are
changed roughly at the same time, but that the degree
of parallel change is not the same for all systems, nor
is it constant over time. Many systems appear to be
changed in parallel initially until the development of
one system is done and they no longer change in
parallel. This is consistent with a schedule-driven need
for independence. We further observed a
schedule-independent cause for parallel change, which was the
need to propagate critical bug fixes to many systems
on the same day. This form of mass-synchronization
appeared to have occurred only twice in the history of
the systems we analyzed.</p>
      <p>With regard to divergence, we found that all
MES-Toolbox systems we analyzed, including those which
reportedly hardly required any customer-specific
modifications, diverged significantly from their origin. In
terms of the proportion of Java files, all systems
diverged between 7% and 22.5% from their origin. In
terms of the number of diverged lines, most systems did not
exceed 50,000 lines (&lt;5%), and only two systems
diverged by more than 75,000 lines. We identified one case
where the divergence measured in percentage of Java
files was significantly different from divergence
measured in terms of number of lines.</p>
      <p>During our analysis of divergence over time, we were
able to identify points in time when systems were
synchronized with their origin. Our analysis of
synchronizing changes confirms these findings, and we found
that all systems we analyzed retrieved changes from
their origin at least once, but not all systems
contributed changes back to their origin.</p>
      <p>Overall, these results show that products from the
same product family can vary significantly in terms of
change activity over time, divergence from their
origin and synchronization activity. It is important to
keep this in mind when studying product families
realized via clone-and-own, as these variations may play
an important role in reducing maintenance overhead.
In future work, we will further investigate these factors
to develop quantitative measures for the assessment of
clone-and-own benefits and drawbacks.</p>
      <sec id="sec-8-1">
        <title>Acknowledgements</title>
        <p>We thank prof. dr. J.J. Vinju, the reviewers and other
participants of the SATToSE 2017 seminar for their
helpful input on related literature and the direction of
this study.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Adams</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Z. M.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Hassan</surname>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>“Identifying Crosscutting Concerns Using Historical Code Changes”</article-title>
          .
          <source>In: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - ICSE '10</source>
          . Vol.
          <volume>1</volume>
          . ACM, pp.
          <fpage>305</fpage>
          -
          <lpage>314</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Antkiewicz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          et al. (
          <year>2014</year>
          ).
          <article-title>“Flexible Product Line Engineering with a Virtual Platform”</article-title>
          .
          <source>In: Companion Proceedings of the 36th International Conference on Software Engineering - ICSE Companion 2014</source>
          . ACM, pp.
          <fpage>532</fpage>
          -
          <lpage>535</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Berger</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          et al. (
          <year>2014</year>
          ).
          <article-title>“Three Cases of Feature-Based Variability Modeling in Industry”</article-title>
          .
          <source>In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</source>
          . Vol.
          <volume>8767</volume>
          . Springer, pp.
          <fpage>302</fpage>
          -
          <lpage>319</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Breu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Zimmermann</surname>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>“Mining Aspects from Version History”</article-title>
          .
          <source>In: Automated Software Engineering, 2006. ASE '06. 21st IEEE/ACM International Conference on</source>
          . IEEE, pp.
          <fpage>221</fpage>
          -
          <lpage>230</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Buckley</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          et al. (
          <year>2005</year>
          ).
          <article-title>“Towards a Taxonomy of Software Change”</article-title>
          .
          <source>In: Journal of Software Maintenance and Evolution: Research and Practice 17.5</source>
          , pp.
          <fpage>309</fpage>
          -
          <lpage>332</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Ceccato</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          et al. (
          <year>2006</year>
          ).
          <article-title>“Applying and Combining Three Different Aspect Mining Techniques”</article-title>
          .
          <source>In: Software Quality Journal 14.3</source>
          , pp.
          <fpage>209</fpage>
          -
          <lpage>231</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Cordy</surname>
            ,
            <given-names>J. R.</given-names>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>“Comprehending Reality - Practical Barriers to Industrial Adoption of Software Maintenance Automation”</article-title>
          .
          <source>In: Program Comprehension, 2003. 11th IEEE International Workshop on</source>
          . IEEE, pp.
          <fpage>196</fpage>
          -
          <lpage>205</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Dubinsky</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          et al. (
          <year>2013</year>
          ).
          <article-title>“An Exploratory Study of Cloning in Industrial Software Product Lines”</article-title>
          .
          <source>In: Proceedings of the European Conference on Software Maintenance and Reengineering</source>
          , CSMR, pp.
          <fpage>25</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Duc</surname>
            ,
            <given-names>A. N.</given-names>
          </string-name>
          et al. (
          <year>2014</year>
          ).
          <article-title>“Forking and coordination in multi-platform development: a case study”</article-title>
          .
          <source>In: Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement - ESEM '14</source>
          . New York, New York, USA: ACM Press, pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Figueiredo</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          et al. (
          <year>2009</year>
          ).
          <article-title>“Crosscutting Patterns and Design Stability: An Exploratory Analysis”</article-title>
          .
          <source>In: IEEE International Conference on Program Comprehension</source>
          , pp.
          <fpage>138</fpage>
          -
          <lpage>147</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Hashimoto</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Mori</surname>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>“Enhancing History-Based Concern Mining with Fine-Grained Change Analysis”</article-title>
          .
          <source>In: 2012 16th European Conference on Software Maintenance and Reengineering. IEEE</source>
          , pp.
          <fpage>75</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Hetrick</surname>
            ,
            <given-names>W. A.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>C. W.</given-names>
            <surname>Krueger</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Moore</surname>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>“Incremental Return on Incremental Investment: Engenio's Transition to Software Product Line Practice”</article-title>
          .
          <source>In: International Conference on Object-Oriented Programming, Systems, Languages and Applications</source>
          . ACM, pp.
          <fpage>798</fpage>
          -
          <lpage>804</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Kapser</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Godfrey</surname>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>“"Cloning Considered Harmful" Considered Harmful”</article-title>
          .
          <source>In: 2006 13th Working Conference on Reverse Engineering. IEEE</source>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Kellens</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mens</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Tonella</surname>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>“A Survey of Automated Code-Level Aspect Mining Techniques”</article-title>
          .
          <source>In: Transactions on Aspect-Oriented Software Development IV</source>
          . Berlin, Heidelberg: Springer Berlin Heidelberg, pp.
          <fpage>143</fpage>
          -
          <lpage>162</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Lettner</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Angerer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Grünbacher</surname>
          </string-name>
          , et al. (
          <year>2014</year>
          ).
          <article-title>“Software Evolution in an Industrial Automation Ecosystem: An Exploratory Study”</article-title>
          .
          <source>In: Software Engineering and Advanced Applications (SEAA), 2014 40th EUROMICRO Conference on</source>
          . IEEE, pp.
          <fpage>336</fpage>
          -
          <lpage>343</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Lettner</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Angerer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Prähofer</surname>
          </string-name>
          , et al. (
          <year>2014</year>
          ).
          <article-title>“A Case Study on Software Ecosystem Characteristics in Industrial Automation Software”</article-title>
          .
          <source>In: Proceedings of the 2014 International Conference on Software and System Process - ICSSP 2014. ACM</source>
          , pp.
          <fpage>40</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Marin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>van Deursen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Moonen</surname>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>“Identifying Crosscutting Concerns Using Fan-In Analysis”</article-title>
          .
          <source>In: ACM Transactions on Software Engineering and Methodology (TOSEM) 17.1</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>37</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Marin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Moonen</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>van Deursen</surname>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>“A Classification of Crosscutting Concerns”</article-title>
          .
          <source>In: 21st IEEE International Conference on Software Maintenance (ICSM'05)</source>
          . IEEE, pp.
          <fpage>673</fpage>
          -
          <lpage>676</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Rubin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Czarnecki</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Chechik</surname>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>“Managing Cloned Variants: A Framework and Experience”</article-title>
          .
          <source>In: Proceedings of the 17th International Software Product Line Conference - SPLC '13. ACM</source>
          , p.
          <fpage>101</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Schmorleiz</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Lammel</surname>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>“Similarity management of 'cloned and owned' variants”</article-title>
          .
          <source>In: Proceedings of the 31st Annual ACM Symposium on Applied Computing - SAC '16</source>
          . New York, New York, USA: ACM Press, pp.
          <fpage>1466</fpage>
          -
          <lpage>1471</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Schrock</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fay</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Jager</surname>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>“Systematic interdisciplinary reuse within the engineering of automated plants”</article-title>
          .
          <source>In: Systems Conference (SysCon), 2015 9th Annual IEEE International</source>
          , pp.
          <fpage>508</fpage>
          -
          <lpage>515</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Stanciulescu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schulze</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Wąsowski</surname>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>“Forked and Integrated Variants in an Open-Source Firmware Project”</article-title>
          .
          <source>In: 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME)</source>
          . IEEE, pp.
          <fpage>151</fpage>
          -
          <lpage>160</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Thummalapenta</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          et al. (
          <year>2010</year>
          ).
          <article-title>“An Empirical Study on the Maintenance of Source Code Clones”</article-title>
          .
          <source>In: Empirical Software Engineering 15.1</source>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>Tourwé</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Mens</surname>
          </string-name>
          (
          <year>2004</year>
          ).
          <article-title>“Mining Aspectual Views using Formal Concept Analysis”</article-title>
          .
          <source>In: Source Code Analysis and Manipulation, Fourth IEEE International Workshop on. IEEE Comput. Soc</source>
          , pp.
          <fpage>97</fpage>
          -
          <lpage>106</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>Yamashita</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          et al. (
          <year>2017</year>
          ).
          <article-title>“Software Evolution and Quality Data from Controlled, Multiple, Industrial Case Studies”</article-title>
          .
          <source>In: Proceedings of the 14th International Conference on Mining Software Repositories. IEEE</source>
          , pp.
          <fpage>507</fpage>
          -
          <lpage>510</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>