<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Skill-based Team Formation in Software Ecosystems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Daniel Schall</string-name>
          <email>daniel.schall@siemens.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Siemens Corporate Technology</institution>
          ,
          <addr-line>Siemensstrasse 90, 1211 Vienna</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper introduces novel techniques for the discovery and formation of teams in software ecosystems. Formation techniques have a wide range of applications including the assembly of expert teams in open development ecosystems, finding optimal teams for ad-hoc tasks in large enterprises, or working on complex tasks in crowdsourcing environments. Software development performance and software quality are affected by the skills and application domain experiences that the team members bring to the project. Team formation in software ecosystems poses new challenges because development activities are no longer coordinated by a single organization but rather evolve much more flexibly within communities. A suitable approach for finding optimal teams must consider expertise, user load, social distance and collaboration cost of team members. We have designed this model specifically for the analysis of large-scale software ecosystems wherein users perform development activities. We have studied our approach by analysing the R ecosystem and find that it is well suited for team discovery in software ecosystems.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Establishing a software ecosystem is becoming increasingly
important for a company's collaboration strategy with other
companies, open source developers and end users. The idea
behind software ecosystems differs from traditional
outsourcing techniques [
        <xref ref-type="bibr" rid="ref19 ref22">19, 22</xref>
        ]. The initiating actor does not
necessarily own the software produced by the contributing actors, nor
are contributing actors hired by the initiator (e.g., a firm).
All actors as well as software artefacts, however, coexist in an
interdependent way. For example, actors jointly develop
applications and thus there is a relationship among the actors.
Software components may depend on each other and thus there is a
relationship among the components. This parallels
natural ecosystems where the different members of the
ecosystem (e.g., plants, animals, or insects) are part of a food
network in which the existence of one species depends on the
rest. In contrast to natural ecosystems, some software
ecosystems may be mainly top-down controlled, with most changes
driven by change requests and bug reports coming from other
actors [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. Other software ecosystems may be controlled in
a bottom-up manner, primarily driven by input from their core
developers [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
      </p>
      <p>
        A key challenge in software ecosystems is to manage quality
of software [
        <xref ref-type="bibr" rid="ref15 ref8 ref9">8, 9, 15</xref>
        ] and to address non-functional
requirements (NFRs) in general [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Software vendors may require
their plug-in developers to maintain certain quality levels to
earn an approved status. As shown in earlier research,
individual as well as development team skill has a significant effect
on the quality of a software product [
        <xref ref-type="bibr" rid="ref12 ref5 ref7">5, 7, 12</xref>
        ]. The approach in
this work takes a socio-technical view on software ecosystems
wherein ecosystems are understood as the interplay between
the social system and the technical system [
        <xref ref-type="bibr" rid="ref11 ref13">11, 13</xref>
        ]. Software
development teams composed of members with prior joint
project experience may be more effective in coordinating
programmers' distributed expertise because they have developed
knowledge of who knows what [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. This paper addresses the
problem of team formation in open, dynamic software
ecosystems. In open source development, teams are more often than
not formed spontaneously (i.e., they 'emerge') based on
people's availability and willingness to collaborate and to contribute
to a certain task.
      </p>
      <p>Potential tasks for expert teams in software ecosystems and
open source development include:</p>
      <p>Come up with a software design and/or implementation of
a complex component or subsystem.</p>
      <p>Perform software architecture review of an existing
implementation or provide expert opinion about an emerging
technology.</p>
      <p>To give a concrete example of a potential high-level task, a
complex design or implementation may involve the analysis of
time series data including data extraction from a source
system, transformation of the data, storage, processing, and
visualization. Clearly, this task typically requires multiple people
with distinct skills such as data modelling, statistical
knowledge (uni-/multivariate data analysis), data persistence
management, and data visualization using various technologies
and toolkits. Indeed, the high-level task needs to be further
decomposed into smaller tasks. The goal of this work is to
find a team of experts given a set of high-level skills. Once
the team has been discovered, detailed task decomposition
and work planning can be performed, which is, however, not
the focus of this work.</p>
      <p>We provide the following key contributions:</p>
      <p>A novel approach supporting team formation in software
ecosystems based on user expertise, load, social distance
among team members, and collaboration cost.</p>
      <p>Support for the discovery of potential mediators if social
connectedness in teams is low.</p>
      <p>Analysis of user expertise to recommend the most suitable
team members.</p>
      <p>Evaluation of the concepts using data collected from the
Comprehensive R Archive Network (CRAN) - the largest
public repository of R packages.</p>
      <p>The remainder of this paper is organized as follows.
Section 2 gives an overview of related work and concepts. Section 3
introduces our team formation approach. Experiments are
detailed in Section 4. The paper is concluded in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        The success of a project depends not only on the expertise
of the people who are involved, but also on how effectively
they collaborate, communicate and work together as a team
[
        <xref ref-type="bibr" rid="ref17 ref26">17, 26</xref>
        ]. On the one hand, a team must contain the right set
of expertise, but on the other hand one should determine a
staffing level that, while comprising all the needed expertise,
minimizes the cost and contributes to meeting the project
deadline [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The most critical resource for knowledge teams
is expertise and specialized skills in using and handling tools.
But the mere presence of expertise in a team is insufficient
to produce high-quality work. A team must collaborate in an
effective manner. It has been found that prior collaborative
ties have a profound effect on developers' project joining
decisions [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        In software engineering, team formation is often needed to
perform a development or maintenance activity. A general
trend is the growing number of large-scale software projects,
with software development and maintenance activities demanding
the participation of larger groups [
        <xref ref-type="bibr" rid="ref14 ref6">6, 14</xref>
        ]. In [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], the authors
proposed assignment of experts to handle bug reports based
on previous activity of the expert. Social collaboration on
GitHub including team formation has been addressed in [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
The authors of [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] study the problem of online team formation.
In their setting, people possess different skills
and compatibility among potential team members is modelled
by a social network. It has to be noted that team formation in
social networks is an NP-hard problem. Thus, optimization
algorithms such as genetic algorithms [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] should be
considered in solving the team composition problem [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ].
      </p>
      <p>
        Expertise identification is one of the key challenges and
success factors for team work and collaboration [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. The
discovery of experts is becoming critical to ease the communication
between developers in case of global software development or
to better know members of large software communities [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ].
Network analysis techniques offer a rich set of theories and
tools to analyse the topological features and human behaviour
in online communities. We have extensively studied the
automatic extraction of expertise profiles in our prior research
(see [
        <xref ref-type="bibr" rid="ref24 ref25 ref27">24, 25, 27</xref>
        ]) and build upon our social network based
expertise mining framework.
      </p>
      <p>With regard to team formation in open source
communities as well as software ecosystems, there is still a gap in
related work and, to the best of our knowledge, there is no existing
approach that supports formation based on mined expertise
profiles.</p>
    </sec>
    <sec id="sec-3">
      <title>Formation in Ecosystems</title>
      <p>
        Here we present the overall team formation algorithm. We
employ a genetic algorithm that attempts to find the best
team. Genetic algorithms (GAs) mimic Darwinian forces of
natural selection to find optimal values of some function [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].
      </p>
      <p>For each team a single number denoting the team's fitness
is calculated (where larger values are better).</p>
    </sec>
    <sec id="sec-4">
      <title>Genetic Algorithm Outline</title>
      <p>There are multiple objectives that need to be optimized. The
objectives are to:</p>
      <p>maximize the average expertise score for the given skills,</p>
      <p>minimize the average cost, and</p>
      <p>minimize the average distance.</p>
      <p>
        The team with the best trade-off among these objectives
obtains the highest fitness and will be recommended as
the best fitting team. The required team skills are stated by
customers who wish to assign a specific complex task to a
team of experts, be it within a corporation or by outsourcing
a specific task to the crowd. Our assumption is that such
complex tasks demand the expertise of multiple team
members. Within our formation approach, additional constraints
can be considered, such as requiring that one person covers only one skill
and not multiple ones. In practice, however, people are familiar
with multiple topics and thereby cover multiple skills. Factors
such as matching user load with complexity, effort, and
deadline of a task are not in the focus of this work. The reader may
refer to [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] for further information on these topics.
      </p>
      <p>The main computational steps of our formation approach
are introduced below and elaborated in detail in Algorithm 1. The
relevant lines in Algorithm 1 are specified in parentheses.</p>
      <p>1. Based on the set \(S = \{s_1, s_2, \ldots, s_n\}\) of demanded skills,
prepare a mapping structure that holds skills, users U, and
expertise ranking scores (lines 6-9). In this step, current
user load is evaluated (based on previously assigned tasks)
and users with high load are filtered out.</p>
      <p>2. Initialize a population of individuals. An individual is a
team consisting of n team members where n can be configured.
The parameter n is given by the size of the skill set
S if each team member has to cover exactly one skill
(line 12).</p>
      <p>3. Loop until the maximum number of iterations has been reached and compute
the main portion of the genetic algorithm based search
heuristic. After this loop, select the team with the
highest fitness (lines 14-48).</p>
      <p>4. Depending on the ecosystem's community structure and the
demanded set of skills, teams may have good or poor
connectivity in terms of social links among team members.
Therefore, construct a subgraph of the social collaboration
graph GS containing only the nodes from the best team
and the edges between them. Analyse the
connectivity of this subgraph by computing the average number
of neighbours (lines 50-51).</p>
      <p>5. If connectivity is low, try to find a dedicated coordinator
who is ideally connected to all team members through
social links. The role of the coordinator is to mediate
communication among members and strengthen team cohesion
(lines 53-56).</p>
      <p>In the following we detail the steps in the algorithm and
explain additional functions that are invoked while executing
the formation algorithm.</p>
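      <p>The connectivity check in step 4 of the outline can be sketched as follows. This is a minimal illustration, not the paper's implementation: the adjacency-set graph layout, the function names, the example users, and the threshold of 1.5 are all assumptions made for the example.</p>
      <preformat><![CDATA[
```python
from typing import Dict, Iterable, Set

Graph = Dict[str, Set[str]]  # undirected graph as adjacency sets (assumed layout)


def induced_subgraph(graph: Graph, members: Iterable[str]) -> Graph:
    """Keep only the team members and the edges between them."""
    keep = set(members)
    return {u: graph.get(u, set()) & keep for u in keep}


def average_neighbours(subgraph: Graph) -> float:
    """Average number of neighbours per node; 0.0 for an empty graph."""
    if not subgraph:
        return 0.0
    return sum(len(nbrs) for nbrs in subgraph.values()) / len(subgraph)


# Hypothetical social collaboration graph G_S and best team.
g_s: Graph = {
    "ann": {"bob"},
    "bob": {"ann", "cat"},
    "cat": {"bob", "dan"},
    "dan": {"cat"},
}
team_graph = induced_subgraph(g_s, ["ann", "bob", "cat"])
connectivity = average_neighbours(team_graph)  # (1 + 2 + 1) / 3
# The threshold below is an arbitrary illustrative value, not taken from the paper.
needs_coordinator = connectivity < 1.5
```
]]></preformat>
      <p>In this toy graph the team forms a chain, so the average number of neighbours is below the illustrative threshold and a coordinator search (step 5) would be triggered.</p>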
      <sec id="sec-4-1">
        <title>Rank Expertise by Skills</title>
        <p>
          Expertise profiles are not created in a predefined, static
manner. Expertise is calculated based on actual community
contributions. However, we do not attempt to analyse detailed user
contributions in terms of software versioning control systems
but rather focus on 'high-level' package metadata, thereby
following a less privacy-intrusive approach. The method used for
determining the expertise scores is not within the focus of this
work due to space limits. The interested reader may refer to
[
          <xref ref-type="bibr" rid="ref27 ref28">27, 28</xref>
          ] for information on the basic approach.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Basic Selection Strategies</title>
        <p>An initial set of candidate solutions is created and their
corresponding fitness values are calculated. This set of solutions
is referred to as a population and each solution as an
individual (i.e., the team composed of team members). The
individuals with the best fitness values are selected and combined
randomly to produce offspring, which make up the next
population. The approach of selecting a subset of individuals with
the best fitness values is called elitism. Elitism is realized by
copying the fittest individuals to the next population.</p>
        <p>
          To maintain the demanded population of individuals,
individuals are selected and undergo crossover (mimicking genetic
reproduction). For the team formation approach this means
that team members are swapped between two teams. The
basic part of the selection process is to stochastically select from
one generation to create the basis of the next generation (see
rouletteWheelSelection in line 28). The requirement is that
the fittest individuals have a greater chance of survival than
weaker ones. This replicates nature in that fitter individuals
will tend to have a better probability of survival and will go
forward. Weaker individuals are not without a chance. In
nature such individuals may have genetic coding that may prove
useful to future generations [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
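        <p>Fitness-proportionate (roulette wheel) selection as described above can be sketched as follows. The function name, signature, and example data are assumptions for illustration; the paper only names rouletteWheelSelection and does not show its body.</p>
        <preformat><![CDATA[
```python
import random
from typing import Sequence, TypeVar

T = TypeVar("T")


def roulette_wheel_selection(
    population: Sequence[T], fitness: Sequence[float], rng: random.Random
) -> T:
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitness)
    if total <= 0:
        # Degenerate case: no fitness information, fall back to uniform choice.
        return rng.choice(list(population))
    pick = rng.random() * total  # uniform in [0, total)
    running = 0.0
    for individual, value in zip(population, fitness):
        running += value
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off


# Hypothetical teams with fitness values in [0, 1].
rng = random.Random(42)
teams = ["team_a", "team_b", "team_c"]
scores = [0.2, 0.5, 0.8]
parents = [roulette_wheel_selection(teams, scores, rng) for _ in range(4)]
```
]]></preformat>
        <p>Fitter teams occupy a larger slice of the wheel, yet a weak team with non-zero fitness can still be drawn, which matches the requirement that weaker individuals are not without a chance.</p>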
        <p>Individuals are also subject to random mutations. However,
the probability of mutation is low because otherwise the
genetic algorithm would be just a random search procedure. In
our work we apply a 'smart' approach to mutation and do
not select a replacement team member at random. Rather, a
new team member is predicted and voting is performed by
the existing team members.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Relationship-driven Mutation</title>
        <p>In traditional GAs, which are agnostic to the underlying
nature of the population, mutation takes place by exchanging a
gene for a random gene, thereby maintaining genetic diversity.
Here we predict edges between existing team
members and newly, randomly selected team members. If a
threshold is surpassed, the randomly suggested team
member is added to the team. This prediction approach ensures a
much higher likelihood that the new team member will work
effectively within the team. We propose random forests
for predicting edges between pairs of users. Random forests
are a simple yet effective and robust method for classification
problems. Prediction of edges between pairs of users is
performed by modelling features describing the relationship
between two nodes u and v. At a high level, these features
include common neighbours, the Jaccard similarity index, joint
community interest, and joint package dependencies.</p>
        <p>The last important ingredient of our GA based approach is
the design of a workable fitness function. A fitness function
is a type of objective function that is used to provide a
single number. The idea is to discard 'bad' team configurations
(i.e., individuals) and to breed new ones from the good
configurations. The search heuristic is terminated when either the
overall fitness converges or the maximum number of iterations
has been reached.</p>
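        <p>Two of the pair features named for relationship-driven mutation, common neighbours and the Jaccard similarity index, can be computed directly from the social graph. The sketch below is illustrative only: the adjacency-set layout, function names, and example nodes are assumptions, and the remaining features (joint community interest, joint package dependencies) as well as the random forest classifier itself are omitted.</p>
        <preformat><![CDATA[
```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected graph as adjacency sets (assumed layout)


def common_neighbours(graph: Graph, u: str, v: str) -> Set[str]:
    """Nodes adjacent to both u and v."""
    return graph.get(u, set()) & graph.get(v, set())


def jaccard_index(graph: Graph, u: str, v: str) -> float:
    """Size of the shared neighbourhood divided by the size of the combined one."""
    union = graph.get(u, set()) | graph.get(v, set())
    if not union:
        return 0.0
    return len(common_neighbours(graph, u, v)) / len(union)


def pair_features(graph: Graph, u: str, v: str) -> Dict[str, float]:
    """Feature vector for the candidate edge (u, v), as input to a classifier."""
    return {
        "common_neighbours": float(len(common_neighbours(graph, u, v))),
        "jaccard": jaccard_index(graph, u, v),
    }


# Hypothetical social graph: u and v share the neighbours b and c.
g_s: Graph = {
    "u": {"a", "b", "c"},
    "v": {"b", "c", "d"},
    "a": {"u"},
    "b": {"u", "v"},
    "c": {"u", "v"},
    "d": {"v"},
}
features = pair_features(g_s, "u", "v")
```
]]></preformat>
        <p>A classifier trained on such feature vectors would output an edge probability that is then compared against the mutation threshold.</p>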
        <p>The fitness function computes the value I as</p>
        <p>\[ I = w_1 \cdot \mathit{expertise} + w_2 \cdot \mathit{cost} + w_3 \cdot \mathit{distance} \qquad (1) \]</p>
        <p>where \(w_1 + w_2 + w_3 = 1\). Each of the input factors
expertise, cost, and distance needs to be scaled between 0 and
1. I is computed when invoking the function evaluate (see
Algorithm 1 line 45). Later on I is used to rank the
population by fitness (see Algorithm 1 line 17) as well as when
calling findBestIndividual (see Algorithm 1 line 48).</p>
        <p>Algorithm 2 Fitness function.</p>
        <preformat>
input: skills S, individual I, ranking score mapping M
output: fitness value I ∈ [0, 1] of individual I
metrics ← ∅                      # metrics as basis for fitness
score ← 0                        # team expertise score
popularity ← 0                   # popularity - to approximate cost
for Skill s ∈ S do
    u ← I[s]                     # get member by skill
    rs ← M[s][u]                 # expertise score by skill
    # perform feature scaling
    rs' ← 1 − (max(M[s]) − rs) / (max(M[s]) − min(M[s]))
    score ← score + rs'
    # get user degree in GS
    ku ← degree(GS, u)
    # perform feature scaling
    ku' ← (kmax − ku) / kmax
    popularity ← popularity + ku'
end for
# add average team expertise score to metrics
addMetric(metrics, score / |S|)
# add average popularity to metrics
# higher community popularity means higher cost
addMetric(metrics, popularity / |S|)
dist ← 0                         # social distance
Q ← queue(I)
while Q ≠ ∅ do
    u ← poll(Q)
    for v ∈ Q do
        # unweighted shortest path distance
        # d(u, v) computed in GS
        duv ← d(u, v)
        if duv ≠ null then
            # perform feature scaling
            # lower values are better
            # max(GS) is diameter of graph
            d'uv ← (max(GS) − duv) / (max(GS) − 1)
            dist ← dist + d'uv
        end if
    end for
end while
GI ← extractSubGraph(GS, I)
# add average distance to metrics
addMetric(metrics, dist / |edges(GI)|)
I ← 0
for m ∈ metrics do
    I ← I + wm · m
end for
</preformat>
        <p>Algorithm 2 details the computational steps of an
individual I's fitness as defined by Eq. 1. An approximation of a
cost factor is provided by community popularity in terms of
number of neighbours in the social graph GS. Thus, according
to this logic more popular users are also more expensive. A
perfect fitness score, given a set of skills S, would be 1. This
is however impossible to achieve because expertise is also
influenced by the user degree in GS. Thus, a suitable trade-off
among these factors has to be found. The individual with the
highest fitness within the population is then selected and
recommended as the best team.</p>
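        <p>A compact sketch of the fitness computation of Eq. 1, loosely following Algorithm 2, is given below. Several points are assumptions made for illustration: the data layout (a team as a skill-to-user mapping, the graph as adjacency sets), the graph diameter being passed in as a parameter, and all names and example values. The sketch also averages the scaled distance over all member pairs rather than over the edges of the team subgraph, so that the result stays within [0, 1]; this is a deliberate deviation from Algorithm 2.</p>
        <preformat><![CDATA[
```python
from collections import deque
from itertools import combinations
from typing import Dict, Optional, Sequence, Set, Tuple

Graph = Dict[str, Set[str]]  # undirected graph as adjacency sets (assumed layout)


def shortest_path(graph: Graph, source: str, target: str) -> Optional[int]:
    """Unweighted shortest-path length via BFS; None if target is unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nbr in graph.get(node, ()):
            if nbr == target:
                return dist + 1
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None


def fitness(
    skills: Sequence[str],
    team: Dict[str, str],                 # skill -> selected user
    ranking: Dict[str, Dict[str, float]],  # skill -> user -> expertise score
    graph: Graph,
    diameter: int,                        # longest shortest path in the graph
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    k_max = max((len(nbrs) for nbrs in graph.values()), default=0)
    expertise = 0.0
    cost = 0.0
    for skill in skills:
        user = team[skill]
        scores = ranking[skill].values()
        lo, hi = min(scores), max(scores)
        # Scale the expertise score to [0, 1]; the best-ranked user gets 1.
        expertise += 1.0 if hi == lo else 1.0 - (hi - ranking[skill][user]) / (hi - lo)
        # Popular (high-degree) users are treated as expensive and score low.
        cost += (k_max - len(graph.get(user, ()))) / k_max if k_max else 1.0
    pairs = list(combinations(sorted(set(team.values())), 2))
    dist = 0.0
    for u, v in pairs:
        d = shortest_path(graph, u, v)
        if d is not None:
            # Directly connected members score 1; members a diameter apart score 0.
            dist += (diameter - d) / (diameter - 1) if diameter > 1 else 1.0
    metrics = (
        expertise / len(skills),
        cost / len(skills),
        dist / len(pairs) if pairs else 0.0,  # assumption: average over member pairs
    )
    return sum(w * m for w, m in zip(weights, metrics))


# Hypothetical data: a triangle a-b-c with a pendant node d attached to a.
g_s: Graph = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
ranking = {"TimeSeries": {"a": 0.2, "b": 1.0}, "Bayesian": {"c": 0.8, "d": 0.4}}
team = {"TimeSeries": "b", "Bayesian": "c"}
value = fitness(["TimeSeries", "Bayesian"], team, ranking, g_s, diameter=2)
```
]]></preformat>
        <p>In this toy example both members are top-ranked for their skill and directly connected, but each has degree 2 out of a maximum of 3, so the cost term pulls the fitness below 1.</p>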
      </sec>
      <sec id="sec-4-4">
        <title>Coordinators</title>
        <p>Based on the set of demanded skills and the community structure,
it may not be possible to find teams with good
connectivity among the team members. The last steps in Algorithm 1
check the connectivity of I and find a dedicated
node that is ideally connected to all nodes in I to mediate
communication. All nodes in U matching this constraint are
then ranked based on their averaged expertise given the skill
set S. The discovered node, acting as potential team
coordinator, is added to the final team.</p>
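        <p>The coordinator search can be sketched as follows. The sketch assumes the strict variant in which a candidate must be linked to every team member; the data layout, function name, and example values are hypothetical and not taken from the paper.</p>
        <preformat><![CDATA[
```python
from typing import Dict, Iterable, Optional, Sequence, Set

Graph = Dict[str, Set[str]]  # undirected graph as adjacency sets (assumed layout)


def find_coordinator(
    graph: Graph,
    team: Iterable[str],
    ranking: Dict[str, Dict[str, float]],  # skill -> user -> expertise score
    skills: Sequence[str],
) -> Optional[str]:
    """Return the non-member linked to all team members with the best average expertise."""
    members = set(team)
    candidates = sorted(
        user
        for user, nbrs in graph.items()
        if user not in members and members <= nbrs
    )
    if not candidates or not skills:
        return None

    def average_expertise(user: str) -> float:
        # Users without a score for a skill are counted as 0 for that skill.
        return sum(ranking.get(s, {}).get(user, 0.0) for s in skills) / len(skills)

    return max(candidates, key=average_expertise)


# Hypothetical graph: x and y know both team members, z knows only one.
g_s: Graph = {
    "ann": {"x", "y", "z"},
    "bob": {"x", "y"},
    "x": {"ann", "bob"},
    "y": {"ann", "bob"},
    "z": {"ann"},
}
ranking = {"Bayesian": {"x": 0.2, "y": 0.9}, "TimeSeries": {"x": 0.4, "y": 0.5}}
coordinator = find_coordinator(g_s, ["ann", "bob"], ranking, ["Bayesian", "TimeSeries"])
```
]]></preformat>
        <p>If no node is connected to all members, the function returns None; relaxing the constraint to "connected to most members" would be one possible extension.</p>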
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Experimental Evaluation</title>
    </sec>
    <sec id="sec-6">
      <title>R Ecosystem</title>
      <p>In this section we present our experiments. We focus on one
part of the R ecosystem called the Comprehensive R Archive
Network (CRAN). Other R-based communities not considered
in this research are, for example, Bioconductor<sup>2</sup>. We have
implemented a Web crawler to download<sup>3</sup> and parse R software
package meta information available as HTML pages. This
information includes contributing authors and package
dependencies. In addition, CRAN provides so-called CRAN task
views, which are used in our analysis as skill or topic
information. As an example, the task view Bayesian Inference
would be a single skill.</p>
      <p>Figure 1 shows the package authorship graph GA.</p>
      <p>2 https://w</p>
      <p>3 https://c</p>
      <p>[Figures 1 to 4 are not reproduced here. Recoverable axis labels: Num Users, Num Nodes, Degree, Num Views.]</p>
      <p>In Fig. 1, only a subset is shown due to space limits. There
are many more small clusters like those at the bottom of the
figure. The community has one large cluster, the largest
connected component (LCC) of the graph, containing 8985 nodes
which are either users or packages. There are many smaller
clusters with few users contributing to packages and also
many users that contribute to only a single package.</p>
      <p>Figure 2 shows the number of clusters versus the number
of nodes they contain (log-log scale). The LCC is depicted by the dot
at the very bottom right corner of the figure. The majority of
clusters have only a few or just a single user-package tuple. This
also means that only users within the single largest connected
component will be relevant for our analysis. Since we heavily
rely on user degree in the social graph GS, many users in the
small clusters will have very low importance.</p>
      <p>Figure 3 shows the degree distribution of the user graph GS.
The low degree of many users is explained by the large number
of small clusters, which are mainly individual contributors
of single packages. The graph GS consists of 11189 users. A
fraction of 14% has a degree of 0 and 60% of users have a
degree smaller than or equal to 3. The median<sup>4</sup> degree is 85. A fraction
of 0.7% of users (74 users) have a degree larger than 85. Such
degree distributions are typical in online communities.</p>
      <p>Fig. 4 shows the relationship between CRAN task
views and users. These views are interpreted as skills and
expertise ranking is performed within the context of individual
task views. In total, 33 views exist. 7316 users are not
associated with any view (because their packages are not listed in
any view). Thus, those users will not be considered in the
formation algorithm. 3873 users are associated with one or more
views. The median value for the number of views within this
user segment is 11. This already provides a good diversity in
terms of users having different skills.</p>
      <p>The next step is to analyse the relationship between users
and software packages. Figure 5 shows the number of users
over packages. The median value for the number of software
packages is 20. The dependencies of packages are visualized
by Fig. 6. 4550 packages have exactly one dependency (the R
environment). The median value is 10.</p>
      <p>Finally, Fig. 7 shows the average number of dependencies
by the number of users. The median value for the average
number of dependencies is 2.</p>
      <p>4 The median is used to separate the higher half of the user
population from the lower half.</p>
      <p>[Figures 5 to 7 are not reproduced here. Recoverable axis labels: Num Users, Num Packages, Num Dependencies.]</p>
      <p>In our experiments, we sampled a random set of CRAN task
views representing the demanded team skills. The crossover
probability is set to 0.7 and the mutation probability to
0.05. The metric weights for fitness calculation are set to
\(w_{\mathit{score}} = w_{\mathit{cost}} = w_{\mathit{distance}} = \frac{1}{3}\). In each experiment run,
a population of 200 individuals plus 5 individuals for elitism
has been created. We evaluate the quality of the expertise
mining approach by checking key metrics such as degree of a
user, number of packages in a given view (where the user is
top-ranked), and number of all packages.</p>
    </sec>
    <sec id="sec-7">
      <title>Qualitative Evidence</title>
      <p>A team with the highest fitness for 5 skills is depicted by Fig. 8
and detailed in Table 1. The team has an average expertise
score of 0.8, a cost of 0.6, and distance 1.0. These are excellent
values. A perfect fitness of 1.0 is not attainable because there
is a trade-off between expertise and cost. Indeed, the maximum
achievable fitness depends on the selected skills.</p>
      <p>
        Table 1 shows further team member details. Rank is the
user rank within the specific task view (community rank for
the given skill). Score is the numeric ranking score based on
our advanced expertise mining model and Score (GS) is the
ranking score computed in GS using a standard PageRank.
This comparison essentially demonstrates the impact of our
advanced context-based ranking model (see [
        <xref ref-type="bibr" rid="ref27 ref28">27, 28</xref>
        ]). Ch. is
the ranking change (context-based vs. standard PageRank).
One can see that context has a high impact in terms of ranking
position. We positively validated these results by checking the
online profiles of the top-ranked users because most users have
public Web sites. Deg. (GS) depicts the degree (number of
co-authors) in GS. High degree typically means high
community standing. Pkg depicts the number of packages in a given
view and All Pkg all packages of the given user. Deg. (T)
is the team degree. Here we see a perfectly connected team
where each member is connected to all other members. The
user 'Brian D. Ripley' is selected as the coordinator.
      </p>
    </sec>
    <sec id="sec-8">
      <title>Performance Evaluation</title>
      <p>We show the convergence of fitness values for both entire
populations (depicted as P-fitness in Fig. 9) and for the best
individuals in each population (depicted as I-fitness in Fig. 10).
Convergence means that no major changes from one
iteration to the next are observed. 5 skills have been
selected randomly. Different colours depict 5 runs of the GA
formation algorithm. The best teams were identified after 7
iterations. Population fitness started to settle after 10 iterations.</p>
      <p>In the following we answer the question whether an
increasing number of skills results in more disconnected teams.
Fig. 11 compares different populations with an increasing
number of skills (from 5 to 9 skills depicted by S5 to S9).</p>
      <p>The x-axis shows the number of disconnected team
members and the y-axis the number of teams within the population
(each population contains 205 teams). The figures show that
by increasing the number of skills the number of teams where
only few members are disconnected decreases. More skills to
be satisfied by individual users increase the chance that team
members are disconnected from the rest of the team.</p>
    </sec>
    <sec id="sec-9">
      <title>Conclusions</title>
      <p>This work introduced team formation mechanisms for
software ecosystems. We apply a genetic algorithm including a
novel extension called relationship-driven mutation. In
development teams, performance and quality are affected by the
programming skills and domain experiences of the project's
team members. We apply an advanced expertise mining
approach to address this problem. Empirical results confirm the
applicability of our presented methods.</p>
      <p>Future work includes the following aspects. Estimating
collaboration cost is not trivial and may depend on many
factors, such as physical co-location, geographical distribution
(including issues related to time difference), or cultural
factors. We will analyse and model cost of collaboration. We
will perform further evaluations of the approach in two
directions. First, we want to validate results by engaging
community members of the R ecosystem to get direct feedback
on formation and expertise ranking results. Second, we will
broaden team formation experiments to other types of
software ecosystems including industrial and company internal
software ecosystems. An additional area of investigation is
the consideration of formal skill frameworks such as SFIA<sup>5</sup>.</p>
      <p>5 https://www.sfia-online.org</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] Newcastle University, Roulette wheel selection. http://goo.gl/5CGi8t, Jan.
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Aris</given-names>
            <surname>Anagnostopoulos</surname>
          </string-name>
          , Luca Becchetti, Carlos Castillo, Aristides Gionis, and Stefano Leonardi, `
          <article-title>Online team formation in social networks'</article-title>
          ,
          <source>in Proceedings of the 21st international conference on World Wide Web, WWW '12</source>
          , pp.
          <volume>839</volume>
          –
          <issue>848</issue>
          , New York, NY, USA, (
          <year>2012</year>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>John</given-names>
            <surname>Anvik and Gail C. Murphy</surname>
          </string-name>
          , `
          <article-title>Reducing the effort of bug report triage: Recommenders for development-oriented decisions'</article-title>
          ,
          <source>ACM Trans. Softw</source>
          . Eng. Methodol.,
          <volume>20</volume>
          (
          <issue>3</issue>
          ),
          <volume>10</volume>
          :1–
          <fpage>10</fpage>
          :
          <fpage>35</fpage>
          , (
          <year>August 2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Jakob</given-names>
            <surname>Axelsson</surname>
          </string-name>
          and Mats Skoglund, `
          <article-title>Quality assurance in software ecosystems: A systematic literature mapping and research agenda'</article-title>
          ,
          <source>Journal of Systems and Software</source>
          ,
          <volume>114</volume>
          , 69–
          <fpage>81</fpage>
          , (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Justin M.</given-names>
            <surname>Beaver</surname>
          </string-name>
          and Guy A. Schiavone, ‘
          <article-title>The effects of development team skill on software product quality</article-title>
          ’,
          <source>SIGSOFT Softw. Eng. Notes</source>
          ,
          <volume>31</volume>
          (
          <issue>3</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>5</lpage>
          , (May
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Christian</given-names>
            <surname>Bird</surname>
          </string-name>
          , David Pattison,
          <string-name>
            <given-names>Raissa</given-names>
            <surname>D'Souza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Vladimir</given-names>
            <surname>Filkov</surname>
          </string-name>
          , and Premkumar Devanbu, ‘
          <article-title>Latent social structure in open source projects</article-title>
          ’, in
          <source>Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering, SIGSOFT '08/FSE-16</source>
          , pp.
          <fpage>24</fpage>
          –
          <lpage>35</lpage>
          , New York, NY, USA, (
          <year>2008</year>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Barry W.</given-names>
            <surname>Boehm</surname>
          </string-name>
          , ‘
          <article-title>Improving software productivity</article-title>
          ’,
          <source>Computer</source>
          ,
          <fpage>43</fpage>
          –
          <lpage>47</lpage>
          , (
          <year>1987</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Jan</given-names>
            <surname>Bosch</surname>
          </string-name>
          , ‘
          <article-title>From software product lines to software ecosystems</article-title>
          ’, in
          <source>Proceedings of the 13th International Software Product Line Conference, SPLC '09</source>
          , pp.
          <fpage>111</fpage>
          –
          <lpage>119</lpage>
          , Pittsburgh, PA, USA, (
          <year>2009</year>
          ). Carnegie Mellon University.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Claes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mens</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Grosjean</surname>
          </string-name>
          , ‘
          <article-title>On the maintainability of CRAN packages</article-title>
          ’, in
          <source>Software Maintenance, Reengineering and Reverse Engineering, 2014 Software Evolution Week</source>
          , pp.
          <fpage>308</fpage>
          –
          <lpage>312</lpage>
          , (Feb
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Massimiliano</given-names>
            <surname>Di Penta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mark</given-names>
            <surname>Harman</surname>
          </string-name>
          , and Giuliano Antoniol, ‘
          <article-title>The use of search-based optimization techniques to schedule and staff software projects: An approach and an empirical study</article-title>
          ’,
          <source>Softw. Pract. Exper.</source>
          ,
          <volume>41</volume>
          (
          <issue>5</issue>
          ),
          <fpage>495</fpage>
          –
          <lpage>519</lpage>
          , (April
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R.P.</given-names>
            <surname>dos Santos</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.M.L.</given-names>
            <surname>Werner</surname>
          </string-name>
          , ‘
          <article-title>Treating social dimension in software ecosystems through ReuseECOS approach</article-title>
          ’, in
          <source>Digital Ecosystems Technologies (DEST), 2012 6th IEEE International Conference on</source>
          , pp.
          <fpage>1</fpage>
          –
          <lpage>6</lpage>
          , (June
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Jungpil</given-names>
            <surname>Hahn</surname>
          </string-name>
          , Jae Y. Moon, and Chen Zhang, ‘
          <article-title>Emergence of new project teams from open source software developer networks: Impact of prior collaboration ties</article-title>
          ’,
          <source>Information Systems Research</source>
          ,
          <volume>19</volume>
          (
          <issue>3</issue>
          ),
          <fpage>369</fpage>
          –
          <lpage>391</lpage>
          , (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Geir K.</given-names>
            <surname>Hanssen</surname>
          </string-name>
          , ‘
          <article-title>A longitudinal case study of an emerging software ecosystem: Implications for practice and theory</article-title>
          ’,
          <source>J. Syst. Softw.</source>
          ,
          <volume>85</volume>
          (
          <issue>7</issue>
          ),
          <fpage>1455</fpage>
          –
          <lpage>1466</lpage>
          , (July
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Qiaona</given-names>
            <surname>Hong</surname>
          </string-name>
          , Sunghun Kim,
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Cheung</surname>
          </string-name>
          , and Christian Bird, ‘
          <article-title>Understanding a developer social network and its evolution</article-title>
          ’, in
          <source>Proceedings of the 2011 27th IEEE International Conference on Software Maintenance, ICSM '11</source>
          , pp.
          <fpage>323</fpage>
          –
          <lpage>332</lpage>
          , Washington, DC, USA, (
          <year>2011</year>
          ). IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Jansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Finkelstein</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Brinkkemper</surname>
          </string-name>
          , ‘
          <article-title>A sense of community: A research agenda for software ecosystems</article-title>
          ’, in
          <source>Software Engineering - Companion Volume, 2009. ICSE-Companion 2009. 31st International Conference on</source>
          , pp.
          <fpage>187</fpage>
          –
          <lpage>190</lpage>
          , (May
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Aniket</given-names>
            <surname>Kittur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jeffrey V.</given-names>
            <surname>Nickerson</surname>
          </string-name>
          , Michael Bernstein, Elizabeth Gerber, Aaron Shaw, John Zimmerman, Matt Lease, and John Horton, ‘
          <article-title>The future of crowd work</article-title>
          ’, in
          <source>Proceedings of the 2013 Conference on Computer Supported Cooperative Work, CSCW '13</source>
          , pp.
          <fpage>1301</fpage>
          –
          <lpage>1318</lpage>
          , New York, NY, USA, (
          <year>2013</year>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Theodoros</given-names>
            <surname>Lappas</surname>
          </string-name>
          , Kun Liu, and Evimaria Terzi, ‘
          <article-title>Finding a team of experts in social networks</article-title>
          ’, in
          <source>Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '09</source>
          , pp.
          <fpage>467</fpage>
          –
          <lpage>476</lpage>
          , New York, NY, USA, (
          <year>2009</year>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Anirban</given-names>
            <surname>Majumder</surname>
          </string-name>
          , Samik Datta, and
          <string-name>
            <given-names>K.V.M.</given-names>
            <surname>Naidu</surname>
          </string-name>
          , ‘
          <article-title>Capacitated team formation problem on social networks</article-title>
          ’, in
          <source>Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '12</source>
          , pp.
          <fpage>1005</fpage>
          –
          <lpage>1013</lpage>
          , New York, NY, USA, (
          <year>2012</year>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Konstantinos</given-names>
            <surname>Manikas</surname>
          </string-name>
          and Klaus Marius Hansen, ‘
          <article-title>Software ecosystems - a systematic literature review</article-title>
          ’,
          <source>J. Syst. Softw.</source>
          ,
          <volume>86</volume>
          (
          <issue>5</issue>
          ),
          <fpage>1294</fpage>
          –
          <lpage>1306</lpage>
          , (May
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Tom</given-names>
            <surname>Mens</surname>
          </string-name>
          , Maëlick Claes, Philippe Grosjean, and Alexander Serebrenik, ‘
          <article-title>Studying evolving software ecosystems based on ecological models</article-title>
          ’, in
          <source>Evolving Software Systems</source>
          , eds., Tom Mens,
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Serebrenik</surname>
          </string-name>
          , and Anthony Cleve,
          <fpage>297</fpage>
          –
          <lpage>326</lpage>
          , Springer Berlin Heidelberg, (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Tom</given-names>
            <surname>Mens</surname>
          </string-name>
          and Philippe Grosjean, ‘
          <article-title>The ecology of software ecosystems</article-title>
          ’,
          <source>IEEE Computer</source>
          ,
          <volume>48</volume>
          (
          <issue>10</issue>
          ),
          <fpage>85</fpage>
          –
          <lpage>87</lpage>
          , (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>David G.</given-names>
            <surname>Messerschmitt</surname>
          </string-name>
          and
          <string-name>
            <given-names>Clemens</given-names>
            <surname>Szyperski</surname>
          </string-name>
          ,
          <source>Software Ecosystem: Understanding an Indispensable Technology and Industry</source>
          , MIT Press, Cambridge, MA, USA,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Melanie</given-names>
            <surname>Mitchell</surname>
          </string-name>
          ,
          <source>An Introduction to Genetic Algorithms</source>
          , MIT Press, Cambridge, MA, USA,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Schall</surname>
          </string-name>
          , ‘
          <article-title>Expertise ranking using activity and contextual link measures</article-title>
          ’,
          <source>Data Knowl. Eng.</source>
          ,
          <volume>71</volume>
          (
          <issue>1</issue>
          ),
          <fpage>92</fpage>
          –
          <lpage>113</lpage>
          , (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Schall</surname>
          </string-name>
          ,
          <source>Service Oriented Crowdsourcing: Architecture, Protocols and Algorithms</source>
          , Springer Briefs in Computer Science, Springer New York, New York, NY, USA,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Schall</surname>
          </string-name>
          , ‘
          <article-title>Formation and interaction patterns in social crowdsourcing environments</article-title>
          ’,
          <source>Int. J. Commun. Netw. Distrib. Syst.</source>
          ,
          <volume>11</volume>
          (
          <issue>1</issue>
          ),
          <fpage>42</fpage>
          –
          <lpage>58</lpage>
          , (June
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Schall</surname>
          </string-name>
          , ‘
          <article-title>Measuring contextual partner importance in scientific collaboration networks</article-title>
          ’,
          <source>Journal of Informetrics</source>
          ,
          <volume>7</volume>
          (
          <issue>3</issue>
          ),
          <fpage>730</fpage>
          –
          <lpage>736</lpage>
          , (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Schall</surname>
          </string-name>
          ,
          <source>Social Network-Based Recommender Systems</source>
          , Springer International Publishing,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>M.</given-names>
            <surname>Srinivas</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.M.</given-names>
            <surname>Patnaik</surname>
          </string-name>
          , ‘
          <article-title>Genetic algorithms: a survey</article-title>
          ’,
          <source>Computer</source>
          ,
          <volume>27</volume>
          (
          <issue>6</issue>
          ),
          <fpage>17</fpage>
          –
          <lpage>26</lpage>
          , (June
          <year>1994</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Cedric</given-names>
            <surname>Teyton</surname>
          </string-name>
          , Marc Palyart,
          <string-name>
            <given-names>Jean-Remy</given-names>
            <surname>Falleri</surname>
          </string-name>
          , Floreal Morandat, and Xavier Blanc, ‘
          <article-title>Automatic extraction of developer expertise</article-title>
          ’, in
          <source>18th International Conference on Evaluation and Assessment in Software Engineering, EASE '14</source>
          , pp.
          <fpage>8:1</fpage>
          –
          <lpage>8:10</lpage>
          , New York, NY, USA, (
          <year>2014</year>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Hyeongon</given-names>
            <surname>Wi</surname>
          </string-name>
          , Seungjin Oh,
          <string-name>
            <given-names>Jungtae</given-names>
            <surname>Mun</surname>
          </string-name>
          , and Mooyoung Jung, ‘
          <article-title>A team formation model based on knowledge and collaboration</article-title>
          ’,
          <source>Expert Systems with Applications</source>
          ,
          <volume>36</volume>
          (
          <issue>5</issue>
          ),
          <fpage>9121</fpage>
          –
          <lpage>9134</lpage>
          , (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>