=Paper=
{{Paper
|id=None
|storemode=property
|title=Analysing the evolution of social aspects of open source software ecosystems
|pdfUrl=https://ceur-ws.org/Vol-746/IWSECO2011-1-InvitedPaper-MensGoeminne.pdf
|volume=Vol-746
|dblpUrl=https://dblp.org/rec/conf/icsob/MensG11
}}
==Analysing the evolution of social aspects of open source software ecosystems==
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
Analysing the evolution of social aspects
of open source software ecosystems
Tom Mens1 and Mathieu Goeminne1
Service de Génie Logiciel, Faculté des Sciences, Université de Mons
Place du Parc 20, 7000 Mons, Belgique
{tom.mens,mathieu.goeminne}@umons.ac.be
informatique.umons.ac.be
Abstract. Empirical software engineering is concerned with statistical
studies that aim to understand and improve certain aspects of the soft-
ware development process. Many of these focus on the evolution and
maintenance of evolving software projects. They rely on repository min-
ing techniques to extract relevant data from software repositories or
other data sources frequently used by software developers. We enlarge
these empirical studies by exploring social software engineering, study-
ing the developer community, including the way developers work, coop-
erate, communicate and share information. The underlying hypothesis is
that social aspects significantly influence the way in which the software
project will evolve over time. We present some preliminary results of an
empirical study we are carrying out on the different types of activities of
the community involved in the GNOME open source ecosystem, and we
discuss suggestions for future work.
Key words: software evolution, open source software, empirical software engi-
neering, social software engineering, repository mining, software ecosystem
1 Introduction
This article accompanies an invited talk presented by the first author at the Third
International Workshop on Software Ecosystems (IWSECO 2011). It represents
our ongoing research in this emerging domain.
Since the start of the new millennium, the number of empirical studies on how
free / libre / open source software projects evolve has been steadily increasing.
The main reasons for this are: (i) the abundance and accessibility of software
projects for which historical data is freely available; (ii) the increasing popularity
of open source software, even in industry; (iii) the ability to publish scientific
results about these systems and to allow other researchers to verify and reproduce
the obtained results.
Nevertheless, the majority of empirical studies on open source software evo-
lution focus on the technical aspects, such as source code artefacts. These studies
largely ignore the social aspect, i.e., the impact that user and developer com-
munities (and their interaction) have on the evolution of the software project.
1
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
Important changes in the community (such as the unexpected departure of a
key person, the takeover of the project by a new community) or in the software
product (such as, a major restructuring or a replacement or addition of a sub-
stantial part of the code base) may significantly influence the way in which the
software project will continue to evolve over time.
We therefore propose to extend empirical studies of evolving open source
projects by taking into account information about the communities that are
involved in this project. In particular, we wish to analyse and understand how
the interaction and communication within and across communities influences the
evolution of the software product and vice versa.
A better understanding of this impact will allow us, at a medium term,
to come up with prediction models, guidelines and best practices that allow
communities to improve upon their current practices, and tools that can be used
by the community to control and improve upon their current work processes, to
communicate more effectively, and to make the software more attractive to its
developers and users.
In addition to this, the focus of our study is not the evolution of individual
software projects, but rather coherent collections of projects developed by the
same community. In this respect, we adopt Lungu’s view, who defines a software
ecosystem as “a collection of software projects which are developed and evolve
together in the same environment” [1].
2 Social aspects
The developer community of a software project is composed of persons that
create and modify software artefacts. Programmers modify the source code, ex-
tend the software functionaility and fix bugs. Other technical artefacts of the
software product are modified by documenters, architects, testers, and so on.
Persons involved in a developer community are often structured into subgroups,
each focusing on a specialised activity in order to better respond to the needs of
the development process.
Good communication is an essential success factor for any software project [2,
3]. This is especially true for open source projects that are often developed in
a geographically distributed way. For these types of projects it is also, in most
cases, easier to become involved in the development team, implying that the
team structure needs to be more flexible in order to accommodate the easy
integration of newcomers and to deal with the frequent departure of developers.
Open source projects rely on a number of tools accessible by the developer
community to share and exchange information, to communicate and to coordi-
nate their work. In practice, these tools make heavy use of the Internet. The
main tools employed by these communities are version control systems (such as
Subversion or Git), bug tracking systems (such as Bugzilla), mailing lists and
developer forums. A number of researchers have started to analyse the social
aspects of evolving software projects [4, 5, 6, 7, 8]. To this extent, they make use
of the information extracted from the aforementioned tools.
2
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
3 Experimental setup
3.1 Research methodology
Our main research goal consists in studying communities surrounding open
source software development in order to understand how their communication
and interaction impacts the software evolution process.
To reach this goal, we rely on the scientific method that is common prac-
tice for research in empirical software engineering. Based on the Goal-Question-
Metrics paradigm, we define specific research questions, formulate one or more
research hypotheses for each question, and define and use metrics to verify the
hypotheses. To achieve this, we select a representative set of open source software
ecosystems (and a subset of projects for each considered ecosystem) on which
to verify our hypotheses. For these selected projects, we combine data extracted
from version repositories, bug trackers, mailing lists. This data is cleansed to deal
with possible inconsistencies or incompleteness, to merge data corresponding to
the same identity in different data sources, and to convert the data into a format
that is easier to analyse. The converted data is then analysed using a combina-
tion of visual analysis, statistical analysis and data mining techniques. Whenever
sufficient statistical evidence is found for a particular research hypothesis, the
research questions are further refined and new hypotheses are formulated and
verified in an incremental manner.
3.2 Tools
A wide variety of tools are used during our experiments: the Libresoft tools1 for
mining relevant social data from the repositories, a FLOSSMetrics-compliant
SQL database for storing and querying the data2 , the R software environment3
for statistical analysis, the WEKA tool for data mining4 , and various other
tools for visual analysis of the results. All of these tools are being integrated in
a layered Java framework that we presented in earlier work [9].
3.3 Research questions
Some of the research questions that will be the focus of our attention are listed
below. Most of these questions are still open, so only partial answers to them
will be provided in this article:
– How is the activity within a project distributed across different persons, how
does this change over time, and how does this vary across different projects
belonging to the same ecosystem?
1
tools.libresoft.es
2
www.flossmetrics.org
3
www.r-project.org
4
www.cs.waikato.ac.nz/ml/weka
3
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
– How is the activity of a person distributed across different projects belonging
to the same ecosystem and how does this change over time?
– How is a software community structured and how does this change over time?
Can we observe recurring or emerging patterns, phases and trends of commu-
nication, collaboration, organisation and activity in the project team?
3.4 Selected projects
For the purpose of this article, we have decided to study the GNOME ecosys-
tem5 as a case. The GNOME community develops a free and popular desktop
environment for GNU/Linux and UNIX-type operating systems.
We will study several GNOME projects within this ecosystem. They have
been selected based on the following factors: popularity, age, size, number of
people involved, availability of the necessary data sources for analysis. The names
and characteristics of some of the selected projects are presented in Table 1.
The data for all these projects is stored in different git repositories, a free and
open source distributed version control system. Observe that the number of
committers and number of authors reported in Table 1 differ, since not all authors
have commit rights.
project ID A B C D E
project name Banshee Rhythmbox Tomboy Evince Brasero
age (in years) 5.9 8.9 6.6 12.1 4.2
date of last commit 8/5/2011 9/5/2011 9/5/2011 4/5/2011 22/11/2010
# commits 8427 7979 5791 5024 4129
# committers 160 232 211 274 145
# authors 268 364 290 381 193
# files in most recent version 2700 937 766 699 797
# files during project’s life 13388 2767 5075 2701 2223
Table 1: Main characteristics of 5 selected GNOME projects.
4 Empirical study
Based on the experimental setup of section 3, we are carrying out three different
studies. It is important to note that these studies are still ongoing, and in this
article we present only preliminary results without statistically validating any
hypotheses.
The first study in subsection 4.1 focuses on individual GNOME projects, and
aims to correlate data from different data sources: the git code repository, the
5
www.gnome.org
4
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
bug tracker, and the developer mailing list. The second study in subsection 4.2
takes a more fine-grained view on the activity patterns of authors in the code
repository only, and aims to find correlations between different types of activity.
The third study in subsection 4.3 aims to correlate activities across different
GNOME projects.
4.1 First study
The first study aims to relate information about the open source project com-
munity by analysing three different data sources for a single Gnome project: the
code repository, the bug tracker and the mailing list. These results have been
reported in a previous article [10].
Although we have carried out the analysis for each of the Gnome projects of
Table 1, we only present the results for Evince here, a freely available document
writer that is mainly developed in C and C++, and has more than 11 years
of development’s history. Evince is a small project, with roughly five thousand
commits, nearly two thousands of e-mails and about a thousand bug reports (for
the time period we studied).
We wish to understand how the activities of this software project are dis-
tributed among the project’s contributors. For this, we use the information re-
lated to three categories of activity concerning the same person: the “commits”
done, the mails sent, and the modifications made to bug reports.
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
(a) (b)
Fig. 1: Activity analysis of the community surrounding Evince (November 2010).
1a: The cumulative distribution marked with blue circles corresponds to the com-
mit activity. The cumulative distribution marked with red squares corresponds
to the mail sending activity. The cumulative activity marked with green trian-
gles corresponds to bug report change activity. 1b: Intersection of the activity
categories for the top 20 most active persons in each category.
5
Figure 1a shows the cumulative distribution for these three categories of
activity. The distribution is very unbalanced: a small number of persons is re-
sponsible for the majority of the activities. 20% of all committers are responsible
for 80% of the total commit activity in the version control repository; 20% of all
mailers contribute to 70% of all mails sent; and 20% of all bug report changers
take part in 88% of all bug report changes.
Figure 1b offers a more detailed overview of the same data, matching persons
who simultaneously contribute to the three considered categories of activity.
For each category, only the top 20 of most active persons has been considered.
The figure clearly shows that most active persons contribute to several activity
categories. For example, the two most active committers (15% and 15%) are
also very active in mail sending (13% and 6%) and bug report changes (22% and
7%).
The imbalance in the activity distribution can be summarised by economet-
rical aggregation indices, like Gini, Theil or Hoover [11, 12, 13]. A zero value
for these indices implies a uniform distribution, which means that each person
has the same activity rate. A value of 1 means that a single person carries out
all the work and the others do nothing. We can compute the indices on several
dates in order to visualise how the distribution imbalance evolves over time.
Figure 2 shows this evolution for Evince, using the Gini index for the three
considered activity categories. In each case, after a startup phase where the index
is very rapidly increasing, the index tends to stabilise arround a high value
(around 0.8), signifying an important imbalance in the activity distribution.
For the mail sending activity this imbalance is less significant as the coefficient
stabilises around 0.6, which means there are more persons regularly involved in
the sending of mails.
$"
$"
!#,"
!#,"
!#+"
!#+"
!#*"
!#*"
!#)"
!#)"
!#("
!#("
!#'" 1233456" !#'" .4556/7"
!#&" 37486" !#&" 58697"
!#%" 9:;"/<.2/5"1=7>;<6" !#%" :;"0".<8=>?7"
!#$" !#$"
!" !"
-./0,," -./0!!" -./0!$" -./0!%" -./0!&" -./0!'" -./0!(" -./0!)" -./0!*" -./0!+" -./0!," -./0$!" -./0!)" 1230!*" -./0!*" 1230!+" -./0!+" 1230!," -./0!," 1230$!" -./0$!"
(a) Evince (b) Brasero
Fig. 2: Comparison of the Gini index for Evince, since April 1999 for the commits
(continuous blue line), since January 2005 for the mails sent (dashed red line),
and since August 2004 for the bug report changes (dotted green line).
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
4.2 Second study
While the previous study informs us about the way in which the members of a
project community contribute to and participate in different types of repositories,
we can also analyse the activity patterns of persons within a single repository.
Following Robles et al. [14], the idea is that the type of activity a person involved
in can be approximated by the types of files this person is contributing to (i.e.,
adding, modifying or deleting files) in the version control repository.
Table 2 shows how we defined activity types based on the structure of the file
names and file extensions. File extensions are typically used for most of the file
types. The matching rules are build thanks to the extensions commonly used in
software development as well as the extensions observed in the studied projects.
Activity type File type
coding *.c, *.h, *.cc, *.pl, *.java, *.s, *.ada, *.cpp, *.chh, *.py
development documentation readme*, *changelog*, todo*, hacking*
documenting *.html, *.txt, *.ps, *.tex, *.sgml, *.pdf
translating *.po, *.pot, *.mo, *.charset
Table 2: Activity types and their corresponding file types. Only the most impor-
tant activity types (i.e., those that occured most often in all considered projects)
are listed here.
Analysing the git repository of Evince, we observe that a minority of authors,
132 out of 381 (i.e., 34.6%) are active in the coding activity (meaning that they
are involved in at least one commit of a coding related file), whereas the total
number of commits for these types of files represents 46.2% of all the file commits.
We can therefore conclude that coders are among the most active persons in the
project community. This is not very surprising, since version control repositories
are specifically aimed to manage the evolution of a software project’s source
code.
The second most important activity for Evince is development documenta-
tion, with 19.0% of all committed files attributable to this activity. Compared
with source code files many more authors, namely 241 out of 381 (i.e., 63.3%)
are involved in this activity. The third most important activity, with 12.6% of
commits done on the associated files, is software translation. Translating is also
a very popular activity, since it involves 248 authors out of 381 (i.e., 65.1%). All
these results are visually summarised in Figure 3, taking into account the data
over the project’s entire lifetime.
Figure 4 studies whether the same persons are involved in different activities.
We only show this for the activities of coding, translation and development
documentation. The Venn diagram illustrates how many persons are involved in
1, 2 or 3 of these activities over the project’s lifetme. We observe that most of
the coders (97 out of 132, i.e., 73.5%) are also development documentalists and
many translators (109 out of 248) are also development documentalists. We also
7
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
#+0:08,;/014# "+9721-4#
+,-./01462,1# 6.4<,57#
#-,+913## "4,.-<0+#
=807#
4561786213#
#+,-./01213##
!!"# $!"# %!"# &!"# '!"# (!"# )!"# *!"#
Fig. 3: Percentage of git authors involved in different Evince activities (in dark
gray), and percentage of files touched for each activity (in light gray).
C
97
1 129
21
34 15
A B
76
Fig. 4: Intersection of git authors for Evince involved in 3 types of project ac-
tivities: A = coding, B = development-documentation, C = translating.
remark that few coders (14 out of 132) are involved in translation and vice versa.
This disparity reveals the importance to take into account the type of activity
while analysing historical data from version repositories.
Based on the results displayed in Figure 4, we wish to understand the amount
of work that persons taking part in two different activities (corresponding to the
intersections between two circles in the Venn diagram) are involved in. Figure 5
shows scatterplots for each pair of activities. Each point represents the amount of
files touched by a given author for the two considered activities. Figure 5a shows
that only a few coders are also involved in translation, and no particular trend
can be observed. Figure 5b compares coders and persons involved in development
documentation. We observe a clear trend: more active coders are also more active
in development documentation too. Figure 5c compares the translators with the
8
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
persons involved in development documentation: again, more active translators
appear to be more active in development documentation as well, and vice versa.
#!!"
#!"
#"
!" #" #!" #!!" #!!!" #!!!!"
(a) coders versus translators
!"
#!!!"
#!!"
#!"
#"
!" #" #!" #!!" #!!!" #!!!!"
(b) coders versus development documental-
ists !"
#!!!"
#!!"
#!"
#"
!" #" #!" #!!"
(c) translators versus de-
!"
velopment documentalists
Fig. 5: Scatterplot of all authors involved in two different types of activity in
the Evince gitrepository. Each dot represents the number of files touched by a
particular author for two out of three considered activity types: coding, developer
documentation and translation.
9
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
4.3 Third study
Our third study extends the second to the level of the GNOME software ecosys-
tem. More precisely, we study the collection of selected GNOME projects as a
whole (as opposed to individual projects), and we try to find correlations be-
tween certain project activities. As in the second study, we restrict ourselves to
analysing the data stored in the version control repositories. In other words, we
rely on the information contained in the super-repository 6 of GNOME. From a
technical point of view, for the case of GNOME, this super-repository is basically
a collection of distinct git repositories (one for each GNOME project).
*<+-/<5# -?.02?8#
89:,;/# ;<=/>2#
"=:->,;2#+?2.5<-#
753451# :86784#
"25?;./=#@A/-#
0.12.3456# 314516789#
"@=0A/>5#.B518?0#
*+,-.//# -./0122# "58B>12@#CD20#
!!"# $!"# %!"# &!"# '!"# (!"# )!"# !!"# $!"# %!"# &!"# '!"# (!"# )!"# *!"# +!"# ,!"#
(a) Coding (b) Translation
Fig. 6: Percentages of git authors (dark gray) and files touched (light gray) for
the activities of coding (6a) and translation (6b) for the five selected GNOME
projects of Table 1.
To start with, we computed the results of Figure 3 for all selected GNOME
projects, and displayed them in Figure 6 for the activities of coding and trans-
lation, respectively. This corroborates what we already observed in Figure 3:
the number of authors that contribute to the coding activity is fairly low (be-
tween 11% and 43%) while they touch a significant number of files in the version
repository (between 33% and 58%). The most striking result is found for Brasero,
where only 11% of the authors touched 58% of all files in the version repository.
The activity of translation lies on the other extreme of the spectrum: a lot of
authors are involved in the activity (between 56% and 90%) while the percentage
of files touched remains very small (between 3% and 16%).
Figure 7 illustrates how authors involved in these two activities are involved in
multiple GNOME projects. We observe that the pattern of collaboration across
projects is very different for these two types of activities.
Coders seem to stick to a single project. It is rarely the case that a coder is
involved in two different GNOME projects, and even more rare for a coder to be
6
Lungu [1] defines a super-repository as “a collection of version control repositories
of the projects of an ecosystem.”
10
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
B B
106 66
C C
A 9 0
A 9 9
0
1 0 5 2 17
0 46 6 32
0 1 0 5
96 37 10 21
0 0 3 17
10 0 2 1 69 24
0 4
2 6
0 1
1 1
3 2
0 1 1 0
0 0 4 5
5 5 3 7
2 6
107 49
6 23
E D D
E
(a) Intersection of coders (b) Intersection of translators
Fig. 7: Intersection of git authors contributing to the five selected GNOME
projects of Table 1. 7a shows the authors involved in coding and 7b shows the
authors involved in translating.
involved in more than 2 GNOME projects. The intersection over all five selected
projects is even empty. For the activity of translation, the picture is very differ-
ent. More often than not, translators are involved in multiple GNOME projects.
For the 5 selected projects we find that 69 different translators contribute to
each of them at least once.
We can thus conclude that the way in which persons cooperate across projects
heavily depends on the type of activity they are involved in. This may either be
due to the intrinsic characteristics of the type of activity (for example, translat-
ing text from one language to another is less time consuming and requires less
project-specific knowledge than coding), or to the presence of external tools and
mechanisms used by the community to share and distribute work. In the case
of GNOME the main reason is the presence of the GNOME Live! Translation
project7 that manages and structures the way in which translations are carried
out across GNOME projects. For the different supported languages, translation
teams exist. Basically, this implies that, for each language, there is a group of
translators that take care of translating files to this language across all GNOME
projects. It is clear that this tool helps to increase the collaboration across
GNOME projects. If a similar mechanism would be available for coders, it is
likely that cross-project cooperation between coders would also increase.
7
live.gnome.org
11
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
5 Discussion
While our preliminary results reveal interesting patterns that encourage us to
pursue this line of research further, it is clear that a lot of work remains to be
done.
To start with, we need to perform a sound statistical analysis of the obtained
results, over all considered types of activities and over all projects involved in
the GNOME ecosystem. We need to refine and extend the types of activities
considered, and we also need to take into account activities that can be found
from the data stored in the bug tracker and mailing list. All the results we find
need to be validated on, or generalised to, other software ecosystems as well.
For study 2 and 3, we need to include the evolution dimension, by studying
how the discovered activity patterns evolve over time. We also wish to extend the
studies towards evolutionary patterns from the viewpoint of individual authors
(or coherent groups of authors): how does the way in which an author contributes
to a software project community evolves over time? Although generally commit-
ters seem to be only concerned by only one software project, a follow-up study
will analyse if committers can be involved in some related projects, such as a
library and its associated graphical user interface. We will also like refine our
studies by distinguishing those authors that created new files from those that
only edit existing files. We also wish to study whether the core groups of au-
thors (i.e. those that are most active for a particular activity) tend to be stable
or whether they evolve over time.
Another interesting point of study is the exploration of the relation between
the social and the technical dimension of open source software development. In
particular, we are interested in how software quality is influenced by the way the
community interacts, and vice versa. We are also interesting in the migration of
authors: in the case of a fork, can we predict who will be the authors that will
migrate from the original project to the new one? Are the authors simultaneously
working on all the projects they are involved in, or is there a migration effect
from one project to another over time? Can we observe the same patterns when
we consider the several branches of a single project?
For all of the above studies, we wish to use a wide range of different mecha-
nisms, coming from a variety of domains such as data mining, statistical analysis,
economy, software visualisation, social network analysis, and system dynamics.
To facilitate the empirical studies, we need to provide more tool support, and
improve existing tools for data extraction and analysis. While an important part
of the work has been automated, there is still quite some amount of manual in-
tervention involved that is amenable to automation. At a medium term, we wish
to come up with prediction models, guidelines and tools that allow communities
involved in software ecosystems to communicate and interact more effectively.
Prospective users and developers may also rely on such information to make a
more informed choice on whether or not to get involved in such an ecosystem.
12
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
6 Conclusion
Social aspects have a significant impact on the way software ecosystems (i.e.,
coherent collections of software products) evolve over time. Empirical studies of
software evolution must therefore take into account the community surrounding
the software as well as the way this community influences the software evolution.
This article has only scratched the surface of what can be done, by illustrating
some initial empirical studies on the different types of activities the community
members are involved in. Considerably more work is needed to get a deeper
understanding of how this affects the way the software product evolves, and how
this varies from one project to another, in order to come to tool support and
guidelines that can help the software community to optimise their work processes
and produce high quality code more effectively.
Acknowledgment
The research is partially supported by (i) F.R.S.-FNRS FRFC project 2.4515.09
“Research Center on Software Adaptability”; (ii) the European Regional Devel-
opment Fund (ERDF) and Wallonia; (iii) Action de Recherche Concertée project
AUWB- 08/12-UMH “Model-Driven Software Evolution”, financed by the Min-
istère de la Communauté française - Direction générale de l’Enseignement non
obligatoire et de la Recherche scientifique, Belgium.
References
1. Lungu, M., Lanza, M., Gı̂rba, T., Robbes, R.: The small project observatory:
Visualizing software ecosystems. Science of Computer Programming 75 (2010)
264–275
2. Brooks, Frederick P., J.: The Mythical Man-Month: Essays on Software Engineer-
ing. Addison-Wesley (1975)
3. DeMarco, T., Lister, T.: Peopleware: productive projects and teams. Dorset House
Publishing (1987)
4. Madey, G., Freeh, V., Tynan, R.: The open source software development phe-
nomenon: An analysis based on social network theory. In: Eighth Americas Con-
ference on Information Systems. (2002) 1806–1813
5. Mockus, A., Fielding, R.T., Herbsleb, J.D.: Two case studies of open source soft-
ware development: Apache and mozilla. ACM Trans. Softw. Eng. Methodol. 11(3)
(2002) 309–346
6. Nakakoji, K., Yamamoto, Y., Nishinaka, Y., Kishida, K., Ye, Y.: Evolution patterns
of open-source software systems and communities. In: Proc. Int’l Workshop on
Principles of Software Evolution, New York, NY, USA, ACM (2002) 76–85
7. Ye, Y., Nakakoji, K., Yamamoto, Y., Kishida, K.: The co-evolution of systems
and communities in free and open source software development. In Koch, S., ed.:
Free/Open Source Software Development. IDEA Group Publishing (2005) 59–82
13
Eds: Jansen, Bosch, Ahmed, and Campell Proceedings of the Workshop on Software Ecosystems 2011
8. Weiss, M., Moroiu, G., Zhao, P.: Evolution of open source communities. In Dami-
ani, E., Fitzgerald, B., Scacchi, W., Scotto, M., Succi, G., eds.: Open Source Sys-
tems. Volume 203 of IFIP International Federation for Information Processing.
Springer Boston (2006) 21–32
9. Goeminne, M., Mens, T.: A framework for analysing and visualising open source
software ecosystems. In: Proceedings International Workshop on Principles of Soft-
ware Evolution (IWPSE-EVOL), ACM Press (September 2010) 42–47
10. Goeminne, M., Mens, T.: Evidence for the pareto principle in open source soft-
ware activity. In Bruntink, M., Kontogiannis, K., eds.: CSMR 2011 Workshop on
Software Quality and Maintainability (SQM). Volume 701., CEUR-WS.org (2011)
74–82
11. Vasa, R., Lumpe, M., Branch, P., Nierstrasz, O.: Comparative analysis of evolving
software systems using the Gini coefficient. In: Proc. Int’l Conf. Software Mainte-
nance. (2009) 179–188
12. Serebrenik, A., van den Brand, M.: Theil index for aggregation of software metrics
values. In: IEEE International Conference on Software Maintenance, Los Alamitos,
CA, USA, IEEE Computer Society (2010) 1–9
13. Poncin, W., Serebrenik, A., van den Brand, M.: Process mining software reposi-
tories. In Mens, T., Kanellopoulos, Y., Winter, A., eds.: CSMR ’11: Proceedings
of the European Conference on Software Maintenance and Reengineering., IEEE
Computer Society (2011) 5–14
14. Robles, G., González-Barahona, J.M., Izquierdo-Cortazar, D., Herraiz, I.: Tools
for the study of the usual data sources found in libre software projects. IJOSSP
1(1) (2009) 24–45
14