<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Developer Oriented and Quality Assurance Based Simulation of Software Processes</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Copyright © 2016 by the paper's authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. In: A.H. Bagge, T. Mens (eds.): Postproceedings of SATToSE 2015 Seminar on Advanced Techniques and Tools for Software Evolution, University of Mons, Belgium</institution>
          ,
          <addr-line>6-8</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Verena Honsel, Daniel Honsel, Jens Grabowski, and Stephan Waack Institute of Computer Science Goldschmidtstr. 7 University of Goettingen</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Software process planning involves the consideration of process-based factors, e.g., development strategies, but also social factors, e.g., collaboration of developers. To support project managers in decision making during the project, we develop an agent-based simulation tool that allows them to test different alternative future scenarios. For this, it is indispensable to understand software evolution and the factors that influence it. We cover different aspects of software evolution with models tailored towards specific questions. For the investigation of system growth, developer networks, and file dependency graphs, we performed two case studies of open source projects. This way, we infer parameters close to reality and are able to compare empirical with simulated results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>In software process planning, decision making is a hard
but important task for project managers. Tool support
with which the manager can test the interplay of project
parameters and the resulting evolutionary scenarios can
be of great help. When deciding, e.g., about the team
constellation, relevant parameters may be the number of
developers, their bugfixing effort, and the expected
development time. The bugfixing effort depends on the
developers' roles, e.g., maintainers fix
more bugs. This testing process is repeated iteratively until
the project manager has sufficient information. The
intended feedback loop is depicted in Figure 1.</p>
      <p>
        To build an agent-based simulation tool that aids
software managers in planning software
development, it is important to gain a deep understanding
of software evolution processes. Several factors
influence how software evolves, what evolves, and why
it evolves. According to Lehman [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], finding answers
to these questions is the central research direction in
software evolution. To reduce complexity and the number of parameters,
we build different models reflecting different facets of
software evolution and related development processes.
Since humans, in the shape of developers, users, and
testers, constitute a major driver of software evolution,
an agent-based approach to the simulation of software
processes is reasonable. Agents are autonomous
individuals whose behavior is specified by certain rules [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
Developers can be considered active agents that
change the passive agents, i.e., software entities.
      </p>
      <p>
        When tracing aspects of software evolution, we first
have to identify the factors that influence the
question under investigation. To this end, we learn from
the past by analyzing open source software
repositories, which has itself become a large research
topic in recent years (e.g., MSR). In our approach
we combine software quality assurance issues with
social and process-controlled factors influencing software
development. For this, we examine
the contribution behavior of developers as well as the
nature of changes and the related error-proneness. The
knowledge we gain from mining is then transferred into
our agent-based simulation model, yielding
a concrete instance constructed to answer the
specific evolutionary question under investigation. This
paper summarizes our recent research on
system growth, developer collaboration and
behavior, and the evolution of software changes, as published in
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Only a few approaches employ simulation in the
context of software evolution and its prediction.
The work most closely related to ours is that
of Smith et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which proposes agent-based
simulation in the context of software evolution. There,
too, the developers serve as active agents and the
software entities (here modules) constitute the
passive agents. They also model requirements as a preliminary
stage of software modules. As metrics they consider
the complexity and fitness (of purpose) of modules, which
are changed by the actions of developers. In contrast
to their work, our environment is not defined on a grid,
where the developers can fall down when moving along.
Instead we use networks, which have the additional
advantage of storing dependencies between entities.
Quality, which Smith et al. model by
fitness, is modeled by the bug distribution in our case;
we do not consider complexity at the moment.
      </p>
      <p>
        Other studies touching on the topic are, for example, the
work of Wagstrom et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and Andersson et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>Approach</title>
      <p>In this section, we describe the background and
methods which represent the foundations of our work. We
comment on methods used for software mining and
explain the underlying agent-based model of our
approach.</p>
      <sec id="sec-3-1">
        <title>Software Mining and Analysis</title>
        <p>
          To estimate the simulation parameters, we
examine open source software projects, which are a
gold mine for researchers interested in software
evolution and software mining. Since we are interested in
different facets of the software development process,
we collect data from projects for which commit logs,
issue tracking systems, and
mailing lists are available. Once a project is selected,
we use tools from data mining and analysis, machine
learning, and visualization to first understand it and
later derive behavioral rules from it. For the analyses
we use the tools R [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] and Weka [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. The derived
rules and information then serve as input for our
simulation model.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Agent-Based Simulation Model</title>
        <p>
          The current agent-based simulation model for software
evolution is an extension of our previous publication [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]
which considers only the system growth of the software
under simulation. For modeling and simulation
purposes we use Repast Simphony 2.
        </p>
        <p>The model depicted in Figure 2 contains the
environment which knows all other instances and is
responsible for the creation of a configured number of
developers at simulation start-up. Furthermore, the
environment instantiates bugs at scheduled points in
time and assigns them to randomly selected software
entities, e.g., files, classes, or modules. The developer
is responsible for creating, updating, and deleting
entities. For the estimation of parameters we used K3b 3
in our initial case study. Through the mining process
we identified four different types of developers.</p>
        <p>The Core Developer is the initial contributor, who is
familiar with many entities and performs most
commits. The Maintainer is a person who primarily does
maintenance work, i.e., he fixes a large number
of bugs. Therefore, we assume he has good knowledge
of the entire project. The Major Developer knows
specific areas of the project and fixes most of the bugs
that occur in entities known to him. The Minor
Developer makes fewer than 100 commits and performs fewer
bugfixes. Minor developers might be specialists who
implement only one specific task or feature. In K3b there are one
core developer, one maintainer, 17 major developers,
and 106 minor developers.</p>
        <p>[Figure 2: class diagram of the simulation model. Environment (fileCount : Integer); Developer (numberOfFixes : Integer, numberOfCommits : Integer; createFiles(), updateFiles(), deleteFiles(), bugFix()); Bug (dateOfCreation : Real, dateOfClosing : Real; computeLifespan() : Real); SoftwareEntity (owner : Developer, numberOfChanges : Integer, numberOfAuthors : Integer, couplingDegree : Integer). Associations: CreatesAndContainsDeveloper, CreatesBug, Contains.]</p>
        <p>2 http://repast.sourceforge.net/ [last visited: 10.03.2016]</p>
        <p>3 http://www.k3b.org/ [last visited: 10.03.2016]</p>
        <p>To model dependencies between the agents, we have
created three networks: one represents dependencies
between developers and software entities, one stores
information about bugs and the modules they are
assigned to, and one represents dependencies between
software entities that are changed together several
times. The networks are briefly summarized in the following:</p>
        <list list-type="bullet">
          <list-item>
            <p>DeveloperEntityNetwork: This network
represents the dependencies between entities and
developers. An edge is added if a developer creates
an entity or if a developer changes an entity that
was not created by him. Hence, this network
also provides the number of authors.</p>
          </list-item>
          <list-item>
            <p>BugEntityNetwork: After the environment
has created a new bug, an edge is added to this network.
The edge contains the information whether the bug is
fixed or not. In future models an edge may
contain additional information about the bug, e.g.,
the number of fixing attempts or whether the bug has been
reopened.</p>
          </list-item>
          <list-item>
            <p>ChangeCouplingNetwork: This network
represents dependencies between software entities that
are changed together, including the number of
changes.</p>
          </list-item>
        </list>
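        <p>The three networks can be illustrated with plain dictionaries. The following is a minimal sketch under our own assumptions, not the Repast Simphony implementation: the class name SimulationNetworks, its method names, and the file names are hypothetical, and every co-change is counted without a minimum threshold.</p>

```python
from collections import defaultdict


class SimulationNetworks:
    """Hypothetical container for the three networks described above."""

    def __init__(self):
        # DeveloperEntityNetwork: entity -> set of developers who touched it
        self.developer_entity = defaultdict(set)
        # BugEntityNetwork: (bug, entity) -> True if the bug is fixed
        self.bug_entity = {}
        # ChangeCouplingNetwork: unordered entity pair -> number of co-changes
        self.change_coupling = defaultdict(int)

    def record_commit(self, developer, entities):
        """Register one commit that creates or changes the given entities."""
        unique = sorted(set(entities))
        for entity in unique:
            self.developer_entity[entity].add(developer)
        # Every pair of entities in the same commit counts as one co-change.
        for i, first in enumerate(unique):
            for second in unique[i + 1:]:
                self.change_coupling[(first, second)] += 1

    def add_bug(self, bug, entity):
        """Add an edge for a newly created, still unfixed bug."""
        self.bug_entity[(bug, entity)] = False

    def fix_bug(self, bug, entity):
        """Mark the bug edge as fixed."""
        self.bug_entity[(bug, entity)] = True

    def number_of_authors(self, entity):
        """Number of distinct developers connected to the entity."""
        return len(self.developer_entity[entity])


networks = SimulationNetworks()
networks.record_commit("dev 1", ["a.cpp", "b.cpp"])
networks.record_commit("dev 2", ["a.cpp", "b.cpp", "c.cpp"])
networks.add_bug("bug 1", "a.cpp")
networks.fix_bug("bug 1", "a.cpp")
```

        <p>Storing the fixed flag on the bug edge mirrors the description of the BugEntityNetwork; further edge attributes such as the number of fixing attempts could be added in the same place.</p>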
        <p>
          The creation and deletion of entities are responsible for
the growth of the software under simulation. The
growth depends on the number of developers, the
desired system size, and the simulation time. Since the
lifespan of K3b is 4044 days, we run 4044 simulation
rounds, i.e., one round per day. We assume that the number of entity changes
follows a geometric distribution [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
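        <p>To illustrate the assumption of geometrically distributed entity changes, the following sketch draws the number of changes for each of the 4044 simulation rounds. It is only an illustration: the success probability P_SUCCESS and the function names are our assumptions and are not parameters mined from K3b.</p>

```python
import math
import random


def sample_geometric(p, rng):
    """Draw the number of failures before the first success (0, 1, 2, ...)."""
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    # Inverse transform sampling for the geometric distribution.
    return int(math.log(u) / math.log(1.0 - p))


def simulate_changes(rounds, p, seed=0):
    """Return the number of entity changes drawn in each simulation round."""
    rng = random.Random(seed)
    return [sample_geometric(p, rng) for _ in range(rounds)]


ROUNDS = 4044      # one round per day of K3b's lifespan, as stated in the text
P_SUCCESS = 0.4    # illustrative value, not a parameter from the paper

changes_per_round = simulate_changes(ROUNDS, P_SUCCESS)
mean_changes = sum(changes_per_round) / ROUNDS
```

        <p>With this parameterization the expected number of changes per round is (1 - p) / p, so a smaller p yields more changes per round.</p>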
        <p>The likelihood of creating and deleting entities
decreases as the system grows. The
probabilities are bounded by the project size and
adjusted for each developer type with respect to the
average commit behavior shown in Table 1 and the
average add, update, and delete behavior per commit
presented in Table 2. We assume that developers are more likely
to update entities they already know.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Performed Analysis and Case Studies</title>
      <p>We briefly summarize two of our case studies and
results in this section. These studies include the
system growth in number of files, developer collaboration
depicted in developer-file networks, and the evolution
of software dependencies represented in file networks
based on change coupling. An approach for the
learning of developer experience and project involvement is
also presented.</p>
      <sec id="sec-4-1">
        <title>System Growth and Developer Collaboration</title>
        <p>
          One aspect of our preliminary case study [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] is the
growth of software systems. We measure growth in
number of files, which is reflected by the creations,
modifications, and deletions the developers perform.
For this purpose, we selected K3b with a development
time of over ten years, 125 developers and more than
6000 commits. We observed a super-linear growth
trend for K3b and used this to build a statistical model
for the growth based on changes made by developers.
Using geometric distributions for file creations,
modifications, and deletions, we were able to reproduce the
growth of K3b in number of files, which we validated by
comparing empirical and simulated results. Given the
parameter set for K3b, the simulated curve fits both the trend
and the concrete values.
        </p>
        <p>
          Moreover, we build developer-file networks, in which a
dependency between a developer node and a file node
is added if the developer worked on that file. The
evolution of the graph depicted in Figure 3 shows that
there is one main contributor, the project
creator (red node), whose central status is taken over by the
maintainer (blue node on the left) after 2006.
Moreover, the modularity is low in 2006 and 2012,
i.e., the network cannot be partitioned into clusters.
In these years the work depends too much on a single
developer. How and why these networks evolve for other
projects is of interest for simulating developer
collaboration behavior. As Foucault et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] stated,
developer turnover can have a high impact on software
quality. Identifying such turnover patterns could also
improve the simulation of software processes.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Software Dependency Analysis</title>
        <p>
          In our latest work [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], we analyzed change coupling
dependency graphs to understand the evolution of file
dependencies. The change coupling [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] degree describes
how often software entities are changed together. By
calculating the average degree as well as the average
weighted degree over time, we trace the evolution
of the files not only in terms of the number of
dependencies on other files, but also in terms of the strength
of these relationships. For this, we used K3b for the
estimation of parameters and Log4j 4 for the validation of
our results. The two projects are similar in the number
of files and duration, but differ in the effort spent by
the developers. For example, K3b has more developers,
especially minor developers.
        </p>
        <p>
          For the estimation of parameters we build change
coupling dependency graphs from commit logs. If a
file is changed together with another file more than
twice, an edge is created in the network. To describe
the evolution of software dependencies in the
simulation, we compared the resulting graphs for each year
in terms of degree, modularity, and diameter. The
degree of a node (file) indicates the importance of the
node. The modularity indicates how well a network
can be divided into clusters, whereas the diameter
is the longest shortest path between any pair of
nodes [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
        </p>
        <p>4 https://logging.apache.org/ [last visited: 10.03.2016]</p>
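        <p>The construction of a change coupling graph from a commit log can be sketched as follows. This is a minimal illustration under our own assumptions: the function names and the toy commit log are hypothetical, and the averages are taken over all files, including files without any edge.</p>

```python
from collections import Counter
from itertools import combinations


def build_change_coupling_graph(commits, min_cochanges=3):
    """Return {(file_a, file_b): weight} for pairs changed together often enough.

    The text adds an edge when two files are changed together more than
    twice, which corresponds to min_cochanges=3.
    """
    pair_counts = Counter()
    for files in commits:
        for pair in combinations(sorted(set(files)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_cochanges}


def average_degrees(edges, all_files):
    """Return (average degree, average weighted degree) over all files."""
    degree = Counter()
    weighted = Counter()
    for (a, b), weight in edges.items():
        for node in (a, b):
            degree[node] += 1
            weighted[node] += weight
    n = len(all_files)
    return sum(degree.values()) / n, sum(weighted.values()) / n


# Toy commit log: each commit is the list of files it touched.
commits = [
    ["a.cpp", "b.cpp"],
    ["a.cpp", "b.cpp", "c.cpp"],
    ["a.cpp", "b.cpp"],
    ["b.cpp", "c.cpp"],
    ["d.cpp"],
]
files = sorted({f for commit in commits for f in commit})
edges = build_change_coupling_graph(commits)
avg_degree, avg_weighted_degree = average_degrees(edges, files)
```

        <p>In this toy log only a.cpp and b.cpp are changed together three times, so the graph contains a single edge of weight 3. Computing the graph once per year would give the yearly series of average degrees discussed in the text.</p>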
        <p>Figure 4 illustrates the evolution of the file dependency
networks of K3b and Log4j for selected years.
For K3b, the average degree is quite high,
which means that there are fewer weakly
connected entities and the networks remain quite
dense. The modularity, in contrast, is lower than for
Log4j, so the separation into clusters is
worse. For Log4j, several small independent clusters
are visible, which correspond, for example, to tests.</p>
        <p>
          The empirical behavior (red) of the average change
coupling degree is shown in Figure 5. To model this
trend we used linear regression and obtained the best
fit for a second-order model (black) with an adjusted
R-squared value of 0.97 [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. The simulated average
degree of the software agents is depicted as well.
Comparing the real with the simulated behavior shows
that the simulation exhibits a similar trend, but its values are
about three too low.
        </p>
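        <p>A second-order fit and its adjusted R-squared value can be computed with ordinary least squares. The sketch below is self-contained and does not reproduce the R analysis; the function names are hypothetical and the yearly values are made up for illustration only.</p>

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    # Power sums s[k] = sum(x**k) and moments t[k] = sum(y * x**k).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system, solved by Gauss-Jordan elimination with pivoting.
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [a - factor * b for a, b in zip(m[row], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def adjusted_r_squared(xs, ys, coeffs):
    """Adjusted R-squared for a model with len(coeffs) - 1 predictors."""
    n, p = len(ys), len(coeffs) - 1
    mean_y = sum(ys) / n
    predicted = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Made-up yearly average degrees, for illustration only (not data from the paper).
years = [0, 1, 2, 3, 4, 5, 6, 7]
avg_degree = [1.1, 1.9, 3.2, 4.8, 7.1, 9.7, 13.2, 16.8]
coeffs = fit_quadratic(years, avg_degree)
adj_r2 = adjusted_r_squared(years, avg_degree, coeffs)
```

        <p>The adjusted value penalizes the two additional terms of the quadratic model, which makes it a fairer basis for comparing models of different order than the plain R-squared value.</p>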
        <p>For validation, we ran the simulation built on
the knowledge gained from K3b with a parameter set
adapted to the properties of Log4j, which has
one core developer, five major developers, and
fourteen minor developers. With the adapted
distribution of developer types and system size, we were
able to simulate growth trends and network properties.</p>
        <p>Figure 6 shows the empirical, fitted, and simulated
average coupling degree over the years. We also
validated the quadratic trend and the
average coupling degree for Log4j. Thus, the simulation
works for projects that are
similar in size and duration. The projects under
examination are also written in the same programming
language.</p>
        <p>The roles of developer types are currently
static. However, we observed in our work, e.g., in the
evolution of developer-file networks (Figure 3), that the
roles and importance of developers can change over
time. We therefore study the developer contribution behavior in
more detail.</p>
        <p>Since our work exposed the need for a more
fine-grained model of developer behavior, we created a
learning model that helps us to understand
developer contribution behavior and the related experience. A
first improvement of our simulation was the introduction of
developer types, which are classified manually
according to their commit and bugfixing behavior. The
contribution behavior of developers reflects their
personal level of experience and involvement in the
project. In the analysis of contribution behavior, only
the output of this status is visible. To retrieve
information about the underlying states, we employ Hidden
Markov Models (HMMs), which are stochastic models
for discrete-time observations. In doing so, we
hope to gain valuable insights for the refinement of
developer types.</p>
        <p>
          The general method as described in [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] takes three
sources of data into account: version control data in the
form of developer commits and bugfixes, bug tracking
system information in terms of bug comments, and
mailing list data in terms of the number of threads opened and
answers written by each developer. For every contributor to
the project, these learning activities are collected for
each month. This requires an HMM that can handle
multi-dimensional observations, since we have four
observations for each point in time. To map them into
a comprehensible format, we use a threshold learner to classify these
observations as low, medium,
or high. After filtering out developers with fewer than
twenty commits, ten active contributors remained for
the project Rekonq 5.
        </p>
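        <p>The mapping of monthly counts to the levels low, medium, and high can be sketched as follows. This is not the threshold learner itself, which derives the cut points from the data. Here the cut points are fixed by hand: the commit values 14 and 35 and the bugfix value 6 echo the thresholds quoted in the text, whereas the remaining values and the dimension names are assumptions made for illustration.</p>

```python
def classify(count, medium_threshold, high_threshold):
    """Map a monthly count to 'low', 'medium', or 'high'."""
    if count >= high_threshold:
        return "high"
    if count >= medium_threshold:
        return "medium"
    return "low"


# (medium, high) cut points per observation dimension. Only the commit
# values and the high bugfix value appear in the text; the rest is made up.
THRESHOLDS = {
    "commits": (14, 35),
    "bugfixes": (1, 6),
    "bug_comments": (3, 10),
    "mail_threads": (2, 5),
}
DIMENSIONS = ("commits", "bugfixes", "bug_comments", "mail_threads")


def discretize(month):
    """Turn one month of raw counts into a tuple of activity levels."""
    return tuple(classify(month[key], *THRESHOLDS[key]) for key in DIMENSIONS)


observation = discretize(
    {"commits": 20, "bugfixes": 0, "bug_comments": 12, "mail_threads": 1}
)
```

        <p>A sequence of such tuples, one per month, would then form the multi-dimensional observation sequence used to train the HMM of a developer.</p>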
        <p>
          Figures 7a to 7d [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] show the monthly contribution
behavior as well as the monthly mailing list activities of
these developers. The project has
one main contributor (dev 1). The communication
activity displayed in Figures 7c and 7d shows that
other developers also play an active part there. One possible
explanation is that less experienced
developers ask for help. The thresholds required
as input for the HMM learning are listed in Table 3 and
Table 4 [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>
          For example, a developer needs at least 14 commits
and no bugfixes to contribute at a medium level of activity,
and at least 35 commits and 6 bugfixes to contribute
at a high level. When the training of the HMMs
for each developer is finished, we obtain the
corresponding transition matrix containing the probabilities of
switching between the different learning stages. Via the
Viterbi algorithm [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] we obtain the most
probable sequence of learning states that produced the
observation sequence.
        </p>
        <p>5 https://rekonq.kde.org/ [last visited: 10.03.2016]</p>
        <p>In our case studies we showed the feasibility of
agent-based simulation for software processes. We
found that some aspects, such as the growth and the general
commit behavior, can be modeled well. Project-specific
phenomena of software evolution, like the
developer turnover mentioned in Section 4.1,
are more complex to describe.
Therefore, it could help to generate developer profiles
that store more information about the developers and
serve the project manager as an indicator of such an
incident. Moreover, the dependencies among agents are
intricate to describe. Although we were able to
simulate the co-changes of K3b successfully, we lost
precision in the results for Log4j. Possible solutions are
described in the next section. In summary, simulation
turns out to be a good method for studying software evolution,
but the whole process, including the selected parameters
and dependencies, needs to be considered carefully,
and describing it accurately is a strenuous process.
        </p>
        <p>During our experiments in simulating software
processes, we observed problems due to the lack of
development phases, which results in a more linear rep</p>
        <p>[Figure 7: monthly activity per developer (legend: dev 1 to dev 4); partially recoverable axis labels: commits, threads opened, responses.]</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Future Work</title>
      <p>To include different learning types of developers, we
plan to integrate the knowledge learned by the
HMMs into the simulation. From this, we hope to
refine the developer types and incorporate development
strategies according to the developers’ behavior. This
model will be transferred into a learning model for
development phases such as initial, development, or
maintenance. This is motivated by the fact that at different
points in time certain actions are more likely, e.g., at
the beginning of a project creations are more likely,
whereas at the end or before a release bugfixes are
common. What suitable parameters look like and whether
this is a promising approach for including
development phases in the simulation are open
questions for us.</p>
      <p>[Two tables with one row per developer (dev 1 to dev 10); the cell values are missing.]</p>
      <p>Furthermore, we plan to improve the strategy for
bug introduction. With every change to the
software, a bug can be introduced with a certain
probability. This probability depends on factors such as the
experience of the author of the change, the number of
previous changes, and the complexity of the artifact
under change. Since we already simulate such factors,
we plan to build a heuristic (e.g., a decision tree)
that uses this information to define the bug introduction
probability.</p>
      <p>
        Since our simulation also contains different graphs describing
software evolution, we plan to use
them for software quality prediction. Bhattacharya
et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] showed the correlation between different
kinds of networks concerning software evolution
(displaying developer collaboration, module dependencies,
and function calls) and quality factors such as defects and
maintenance effort. For such an investigation we also need
networks that display the structure of the software
in more detail. To this end, we are currently working on
mining abstract syntax trees (ASTs) retrieved
from the source code and examining their
evolution. This way, we hope to obtain a fine-grained picture
of software evolution, resulting in a simulation that
combines these networks and simulates the effect
on software quality.
      </p>
      <p>
        For the evaluation of empirical and simulated
networks we plan to use exponential random graph
models (ERGMs) [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], which allow us to obtain a structural
representation of graphs and thus facilitate
comparison. Instead of a set of metrics, we obtain a complete
picture of the structural properties.
      </p>
      <sec id="sec-5-1">
        <title>Acknowledgements</title>
        <p>We would like to thank the Simulation Science Center
Clausthal-Göttingen (SWZ), which funded parts of our
work.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V.</given-names>
            <surname>Honsel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Honsel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Grabowski</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Waack</surname>
          </string-name>
          .
          <article-title>Mining software dependency networks for agent-based simulation of software evolution</article-title>
          .
          <source>In Proceedings of the 4th International Workshop on Software Mining (SoftMine)</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.M.</given-names>
            <surname>Lehman</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.F.</given-names>
            <surname>Ramil</surname>
          </string-name>
          .
          <article-title>Towards a theory of software evolution - and its practical impact (working paper)</article-title>
          .
          <source>In Invited Talk, Proceedings Intl. Symposium on Principles of Softw. Evolution, ISPSE</source>
          <year>2000</year>
          ,
          <volume>1</volume>
          -2 Nov, pages
          <fpage>2</fpage>
          -
          <lpage>11</lpage>
          . Press,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Macal</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. J.</given-names>
            <surname>North</surname>
          </string-name>
          .
          <article-title>Introductory tutorial: Agent-based modeling and simulation</article-title>
          .
          <source>In Simulation Conference (WSC)</source>
          ,
          <source>Proceedings of the 2011 Winter</source>
          , pages
          <fpage>1451</fpage>
          -
          <lpage>1464</lpage>
          . IEEE,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V.</given-names>
            <surname>Honsel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Honsel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Grabowski</surname>
          </string-name>
          .
          <article-title>Software process simulation based on mining software repositories</article-title>
          .
          <source>In Proceedings of the Third International Workshop on Software Mining</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V.</given-names>
            <surname>Honsel</surname>
          </string-name>
          .
          <article-title>Statistical learning and software mining for agent based simulation of software evolution</article-title>
          .
          <source>In Doctoral Symposium at the 37th International Conference on Software Engineering</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>N.</given-names>
            <surname>Smith</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Ramil</surname>
          </string-name>
          .
          <article-title>Agent-based simulation of open source evolution</article-title>
          .
          <source>In Software Process Improvement and Practice</source>
          , pages
          <fpage>423</fpage>
          -
          <lpage>434</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.A.</given-names>
            <surname>Wagstrom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Herbsleb</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Carley</surname>
          </string-name>
          .
          <article-title>A social network approach to free/open source software simulation</article-title>
          .
          <source>Proceedings of the 1st International Conference on Open Source Systems, Genova, 11th-15th July</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Andersson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Karlsson</surname>
          </string-name>
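          ,
          <string-name>
            <given-names>Jo.</given-names>
            <surname>Nedstam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Höst</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B. I.</given-names>
            <surname>Nilsson</surname>
          </string-name>
          .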
          <article-title>Understanding software processes through system dynamics simulation: A case study</article-title>
          .
          <source>In ECBS</source>
          , pages
          <fpage>41</fpage>
          -.
          <source>IEEE Computer Society</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Ihaka</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Gentleman</surname>
          </string-name>
          . R:
          <article-title>a language for data analysis and graphics</article-title>
          .
          <source>Journal of Computational and Graphical Statistics</source>
          ,
          <volume>5</volume>
          (
          <issue>3</issue>
          ):
          <fpage>299</fpage>
          -
          <lpage>314</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Frank</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Holmes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Pfahringer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Reutemann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I. H.</given-names>
            <surname>Witten</surname>
          </string-name>
          .
          <article-title>The WEKA data mining software: an update</article-title>
          .
          <source>ACM SIGKDD Explorations Newsletter</source>
          ,
          <volume>11</volume>
          (
          <issue>1</issue>
          ):
          <fpage>10</fpage>
          -
          <lpage>18</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Foucault</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Palyart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Blanc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. C.</given-names>
            <surname>Murphy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.-R.</given-names>
            <surname>Falleri</surname>
          </string-name>
          .
          <article-title>Impact of developer turnover on quality in open-source software</article-title>
          .
          In
          <source>Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering</source>
          , ESEC/FSE 2015, pages
          <fpage>829</fpage>
          -
          <lpage>841</lpage>
          , New York, NY, USA,
          <year>2015</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ball</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Porter</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H. P.</given-names>
            <surname>Siy</surname>
          </string-name>
          .
          <article-title>If your version control system could talk</article-title>
          .
          In
          <source>ICSE Workshop on Process Modelling and Empirical Studies of Software Engineering</source>
          , volume
          <volume>11</volume>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Fortunato</surname>
          </string-name>
          .
          <article-title>Community detection in graphs</article-title>
          .
          <source>Physics Reports</source>
          ,
          <volume>486</volume>
          (
          <issue>3-5</issue>
          ):
          <fpage>75</fpage>
          -
          <lpage>174</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Rabiner</surname>
          </string-name>
          .
          <article-title>A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition</article-title>
          . In
          <source>Readings in Speech Recognition</source>
          , pages
          <fpage>267</fpage>
          -
          <lpage>296</lpage>
          . Morgan Kaufmann Publishers Inc., San Francisco, CA, USA,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Iliofotou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Neamtiu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Faloutsos</surname>
          </string-name>
          .
          <article-title>Graph-based analysis and prediction for software evolution</article-title>
          .
          In
          <source>Proceedings of the 34th International Conference on Software Engineering</source>
          , ICSE '12, pages
          <fpage>419</fpage>
          -
          <lpage>429</lpage>
          , Piscataway, NJ, USA,
          <year>2012</year>
          . IEEE Press.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Hunter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Goodreau</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Handcock</surname>
          </string-name>
          .
          <article-title>Goodness of fit for social network models</article-title>
          .
          <source>Journal of the American Statistical Association</source>
          , pages
          <fpage>248</fpage>
          -
          <lpage>258</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>