=Paper=
{{Paper
|id=None
|storemode=property
|title=Debian Packages Repositories as Software Product Line Models. Towards Automated Analysis
|pdfUrl=https://ceur-ws.org/Vol-688/acota2010_paper5_galindo.pdf
|volume=Vol-688
|dblpUrl=https://dblp.org/rec/conf/acota/GalindoBS10
}}
==Debian Packages Repositories as Software Product Line Models. Towards Automated Analysis==
ACoTA 2010 First International Workshop on Automated Tailoring and Configuration of Applications
Debian Packages Repositories as Software Product
Line Models. Towards Automated Analysis
José A. Galindo, David Benavides and Sergio Segura
Department of Computer Languages and Systems
Av Reina Mercedes S/N, 41012 Seville, Spain
{jagalindo, benavides, sergiosegura} at us.es
Abstract—The automated analysis of variability models in realistic models are highly desirable in order to assess the
general and feature models in particular is a thriving research scalability of analysis tools.
topic. There have been numerous contributions along the last
twenty years in this area including both, research papers and In order to look for realistic and large–scale variability mod-
tools. However, the lack of realistic variability models to evaluate els, we decided to look into the open source community. We
those techniques and tools is recognized as a major problem
by the community. To address this issue, we looked for large– found that Debian based linux distributions have a language
scale variability models in the open source community. We found to describe dependencies in software packages repositories.
that the Debian package dependency language can be interpreted They are used in Debian individual installations in order to
as software product line variability model. Moreover, we found add, remove or update packages and manage their possible
that those models can be automatically analysed in a software conflicts and dependencies [16]. We found that this language
product line variability model-like style. In this paper, we take
a first step towards the automated analysis of Debian package can be interpreted as a variability language similar to those of
dependency language. We provide a mapping from these models feature models, OVMs or decision models. Thus, a Debian–
to propositional formulas. We also show how this could allow based linux distribution could be interpreted as a product line
us to perform analysis operations on the repositories like the and the individual installations could be therefore considered
detection of anomalies (e.g. packages that cannot be installed). the products of the product line. In this context, a product is a
I. I NTRODUCTION set of installed packages, i.e. a Debian installation. Similarly,
Software Product Line (SPL) engineering is about building we assume that a set of repositories define an SPL of Debian-
set of related software products. From existing assets, products based distributions. We may remark that this could also be
are assembled reducing production costs and time–to–market. applied to other dependency languages such as RPM [14], [10]
Variability Models are used to represent the set of products (previusly called Red-Hat Package Manager) but for simplicity
of an SPL. Typical variability models are feature models [3], they are out of scope of this paper.
orthogonal variability models [21] or decision models [9]. The main contribution of this paper is twofold. First, we
The automated analysis of variability models deals with the suggest that Debian packages dependency language can be
automated extraction of information from variability models considered as a variability language. Second, we provide a
[3]. In the last years, a large number of techniques haven been mapping from this language to propositional formulas enabling
proposed in the literature in order to automatically analyse their analysis by means of SAT solvers. As a part of our
variability models. In [3], Benavides et al. provide a detailed contribution, we present a motivating example showing how
literature review on twenty years of automated analysis of our approach could be used to detect uninstallable packages in
feature models identifying 30 different analysis operations the repositories, a task that is currently done by hand. Finally,
(e.g. Valid Product and Dead Features). These analyses are we explain how we plan to integrate our approach in the FaMa
mainly performed using logic paradigms such propositional Framework, a tool we developed, enabling an efficient and
logic, constraint programming or description logic. There are user-friendly extraction of Debian variability models and their
also several commercial and open source tools supporting the analysis .
analysis of variability models such as AHEAD tool suite [2],
DecisionKing [8], FaMa [4], pure::variants [20] S.P.L.O.T The rest of the paper is structured as follows. Section II
[17] and VarMod (OVM) [25]. describes the main elements of the Debian package depen-
The lack of realistic and large–scale variability models to dency language. Section III proposes a graphical view of
evaluate and compare the performance of analysis tools is Debian variability models. A motivation example is presented
acknowledged to be a challenge in the SPL community [1], in Section IV clarifying how the Debian packaging system
[23], [11], [24], [13]. Although there are references to real describes a SPL and how can the automated analysis help
feature models with hundreds or even thousands of features in the detection of inconsistencies and redundancies. We
[1], these are not available for the research community. This show a mapping from Debian Variability Modelling Language
has led authors to use trivially small feature models as case (DebianVML) to propositional formula in Section V to enable
studies or to generate the models randomly [23]. Therefore, the automated reasoning of it.
29
ACoTA 2010 First International Workshop on Automated Tailoring and Configuration of Applications
1 Package: openoffice.org-gnome
II. D EBIAN PACKAGE DEPENDENCY LANGUAGE 2 Essential: No
This section describes the main elements of the Debian 3 Version: 1:2.4.0-3ubuntu6
package dependency language. A Debian–based installation 4 Replaces: openoffice.org-common (<< 2.0.4˜
rc1-0),
is associated with a set of package repositories. A package 5 Provides: openoffice.org-gtk-gnome,
is a unit of software that can be installed and configured. openoffice.org2-gnome
Each package repository is associated with a configuration file 6 Depends: gconf2, libc6 (>= 2.1.3)
written in the Debian package dependency language including 7 Pre-Depends: dpkg (>= 1.14.12ubuntu3)
information about the software packages and their relation- 8 Suggests: openoffice.org-evolution
9 Conflicts: openoffice.org2-gnome (<<
ships/dependencies. 1:2.4.0-3ubuntu6)
Figure 1 shows a fragment of a repository configuration 10 Task: ubuntu-desktop, edubuntu-desktop,
file. For each package in the file, a set of properties are gobuntu-desktop
presented. A property is composed of a name and one or
more values associated with it. Properties may be divided
Figure 1. Sample Package Definition
into those providing basic information about the package (e.g.
name, version) and those describing the relationships with
other packages. These relationships can be divided into:
Variability Modelling Language (DebianVML). In this reduced
• Depends. Package A depends on package B if B must be notation we used the a concatenation of the name and version
installed for A to work properly. Sometimes, A depends of the packages as the name of the nodes. We omit in this
not only on B, but on a specific version of it. For instance, reduced visual notation the relationships of versions. In Figure
in Figure 1, line 6, describes that package openoffice.org- 2, we present the graphical symbols used to represent each
gnome depends on package libc6 with version greater relationship among packages: depends, pre-depends, conflicts,
or equal to 2.1.3. If the relationship requires package B suggests, recommends, replaces and provides. Notice that
to be properly configured for the installation of A, the some of these relationships are represented with the same
relationship is referred to as Pre-Depends. Note that a notation since they have similar meaning as described in
installed package could not be properly configured. previous Section.
• Suggests. Package A suggests package B if B contains Figure 3 shows an example using this notation. As il-
files that are related to (and usually enhance) the func- lustrated, the repository is represented as a graph in which
tionality of A. In Figure 1, line 8, package openoffice.org- nodes represent packages and edges the relationships among
gnome suggests package openoffice.org-evolution. When packages. Each package is represented as a rectangle labelled
the suggestion of the package is strong and it is intended with the name and version of the package. In this particular
to be accepted by most of the users this relationship is example, we also include a capital letter to simplify later
usually referred to as Recommends. Note that is a human references to the packages in the paper.
who determines if a the relation should be suggests or
recommends.
• Conflicts. Package A conflicts with package B when
A will not operate if B is installed on the system. In
most cases, conflicts are cases where A contains files
which are an improvement of those in B. For example, in
Figure 1, line 9, package openoffice.org-gnome conflicts
with package openoffice.org2-gnome in versions prior to
1:2.4.0-3ubuntu6.
• Replaces. Package A replaces package B when files
installed by B are removed and in some cases over-written
by files in A. In Figure 1, line 4, package openoffice.org-
gnome replaces package openoffice.org-common in ver-
sions prior to 2.0.4 rc1-0. When this relationship affects
only a part of the functionality of the package it is named
Provides.
Figure 2. DebianVML graphical view
III. A G RAPHICAL N OTATION FOR D EBIAN
D EPENDENCIES
In order to improve the understandability and readability IV. A MOTIVATING EXAMPLE
of the paper we propose a simplified graphical notation [18] Consider the example in Figure 3, it represents a valid De-
for the Debian package dependency language described in the bianVML instance composed by four packages and with three
previous section. We will refer to this notation as Debian different types of relationships: Depends, Conflicts, Replaces.
30
ACoTA 2010 First International Workshop on Automated Tailoring and Configuration of Applications
The model in Figure 3 represents the following products :
1) { }
2) {D}.
3) {C}.
4) {B, C}.
5) {B, D}
6) {A, B, D}
Figure 5. Valid Product
Figure 6 shows an example where A is a dead package.
Detecting dead packages in repositories increase the quality of
the user experience, preventing the user for an unsupported and
frustrating operation. Actually this operation is done manually
by the members of the different Debian based distributions.
Using the mechanism provided by automated analysis in
Figure 3. DebianVML example
Debian variability models this task can be done automatically
saving time into the development of those distributions.
Notice that the empty product is allowed in this variability
model as it represents a Linux distribution without installed
packages. Although this could be controversial, a distribution
completely void, even without the Linux Kernel, could be con-
sidered valid as even the kernel can be installed or uninstalled.
There are also other potential products, that are invalid, for
example the product {A} is not valid because the depends
relationship with B package is not satisfied.
An example of how the automated analysis could help in
this circumstance is shown if we apply the valid product
operation [28]. This analysis operation takes a feature model
and a product as inputs and determines whether the product
is valid for the model. Applying the same idea to Debian
configurations, we could determine that {A, B, C} (Figure 4)
is not a valid instantiation, because the inconsistency marked
by the conflicts relationship between packages A and C. In
contrast, the product {A, B, D} (Figure 5) is a valid product
Figure 6. DebianVML with dead package
since it does not violate any constraint.
In our preliminary tests using Ubuntu 8.04 [27] we detected
some of these kind of repository inconsistencies. Note that
Ubuntu distribution uses the Debian packaging system Figure
7 shows a real example of how a package, wysihtml-le, that
is in the universe repository can never be installed. Dead
Packages can be detected using similar techniques as we do
with other kinds of variability models so the knowledge can
be reused. The complexity of these variability models are
usually bigger than others used in the software product line
community having up to 24780 packages and 159812 packages
Figure 4. Not Valid Product
dependencies in Ubuntu 8.04 with main, multiverse, restricted
and universe repositories enabled.
Another analysis operation of feature models that can be
used in the Debian community is the so–called dead features.
V. M APPING D EBIAN VML TO P ROPOSITIONAL F ORMULA
A feature is dead when the feature does not appear in any
valid configuration. In the context of Debian, we can define a To enable the automated analysis of DebianVML we must
dead package as the one that is present in a repository but can define the mapping from DebianVML to constraints into an
never be installed because of the violation of dependencies. specific off–the–shelf solver such as Sat4j [5], Choco [6] or
31
ACoTA 2010 First International Workshop on Automated Tailoring and Configuration of Applications
Figure 7. Real demo of a dead package in Ubuntu 8.04
JaCoP [12]. Note that these solvers are widely used for the in TableI.
automated analysis of feature models [3].
Debian Variability Model Propositional formula
In this Section we do not refer to all the relationships as
A Depends B A ⇒ (B)
some of them have a very similar meaning and are treated in A Depends B A ⇒ (B∨
the same way as the relationships we describe here. Also there (being B replaced by C,D,...) C ∨ D ∨ ...)
are some relationships that do no affect to the variability in A Conflicts B A ⇒ (¬B)
A Replaces B Used to generate
the model and does not need to be added to the solver as con- the tuples
straints more exactly suggests and recommends relationships.
The algorithm we followed to translate DebianVML to Table I
constraints has the following general form: Each package D EBIAN R ELATIONS TO PROPOSITIONAL FORMULA
name make up the set of variables then, each relationship is
processed to make up the set of constraints (or propositional
clauses) of the problem.
VI. R ELATED WORK
The first relationship to be processed is Replaces. This
is a special case of relationship as it is needed to generate There are other contributions interested in solving the ex-
a set of tuples containing the replaced packages and their isting lack of realistic variability models in the SPL context.
replacements to be ready to process the depends relationships. Czarnecki et al.[13] present how to extract feature models from
This is needed because, after performing a general mapping the Linux kernel using the LKC (Linux Kernel Configurator)
from Debian dependencies to propositional clauses, we have to language that describes the relationships in a very similar way
change the output clauses according to the replacements that as feature models do obtaining feature models with about 6319
are stored in the tuples. For instance, in the example of Figure features. Our approach focuses on other variability dependency
3, we have the tuple (C,D) meaning that D is a replacement of language existing in open source community increasing even
C. Of course, in a real big example, we can have more than more the availability of realistic variability models up to 20000
a tuple referring to the same package (e.g. (C,X1), (C,X2),... packages to the SPL community. Although our first attempts
,(C,Xn) means that C is replaced by X1, X2, ..., Xn). tried to map DebianVML to feature models, we finally realised
After processing the Replaces relationship we can process that DebianVML can be considered a variability language by
the other relationships following the general rules presented it self.
32
ACoTA 2010 First International Workshop on Automated Tailoring and Configuration of Applications
Recently, Di Cosmo et al. [7] presented a way to describe 6 hours in a laptop machine with main, multiverse, re-
feature models using Debian packages dependencies and they stricted and universe repositories of Ubuntu 8.04 enabled
propose using Debian dependencies management tools to (24780 packages). We will work on the of the codding
analyse feature models. Our contribution here is the other to improve those results.
way around: we propose to use feature model analysis tools We consider that this work could foster discussion about
and operations to analyse Debian packages repositories. We the following topics:
think that Di Cosmo et al. contribution and ours are good
• Can open source models be considered as SPL models?
complements to investigate in this direction.
• Can the OS community benefit from the knowledge
A similar contribution to the problem of the analysis of
acquired by the SPL analysis community in the last 20
Debian repositories is presented by some of the results of
years [3]?
the EDOS project [19], [15]. Our contribution can benefit
• How advanced are Open Source configuration languages
their results and we have to study in depth at what extend
to abstract end–users from technical details?
they have performed analysis operations. We think that we
can complement their results providing our know–how in the ACKNOWLEDMENTS
analysis of feature models and software product lines [3].
We would like to thank the friends and co-worker’s that
VII. C ONCLUSIONS , FUTURE WORK AND DISCUSSIONS encouraged us in this research and the anonymous reviewers
In this paper, we propose using Debian repositories as a for their helpful comments.
good source of realistic and complex variability models. These This work has been partially supported by the European
can be used as motivations inputs problems for those tools Commission (FEDER) and Spanish Government under CICYT
dealing with the automation of variability models. At the same projects SETI (TIN2009-07366) and by the Andalusian Gov-
time, we consider the Debian community could benefit of ernment under ISABEL project (TIC-2533).
this automation to perform several analysis operations like
the detection of inconsistencies in repositories, a work that R EFERENCES
is currently done by hand (e.g. When an error is detected, it [1] D. Batory, D. Benavides, and A. Ruiz-Cortés. Automated analysis
is correctly manually). of feature models: Challenges ahead. Communications of the ACM,
December:45–47, 2006.
In our ongoing research, we are working to enable the
[2] Don Batory. Feature-oriented programming and the ahead tool suite.
analysis of DebianVML in FaMa, making FaMa, our tool to In ICSE ’04: Proceedings of the 26th International Conference on
analyse feature model, capable to perform this analysis and Software Engineering, pages 702–703, Washington, DC, USA, 2004.
helping the community of Debian based distributions, applying IEEE Computer Society.
[3] D. Benavides, S. Segura, and A. Ruiz-Cortés. Automated analysis of
the knowledge about feature models under Debian VM. feature models 20 years later: A literature review. Information Systems,
FaMa [26], [4] is a framework to build variability model 35(6), 2010. (In press).
analysis tools, that was designed with extensibility in mind. [4] D. Benavides, P. Trinidad, S. Segura, and A. Ruiz-Cortés. Fama
framework. http://www.isa.us.es/fama/. accessed May 2010.
It offers a platform to implement other kind of variability [5] D. Le Berre. Sat4j. http://www.sat4j.org/. accessed May 2010.
models. It has already been used to analyse feature models [6] CHOCO solver, http://choco.emn.fr/. accessed January 2010.
[26] and also OVMs [22]. Because the analysis of variability [7] Roberto Di Cosmo and Stefano Zacchiroli. Feature diagrams as package
dependencies. In 14th International Software Product Line Conference
models can be very expensive in terms of memory and time, (SPLC’10), 2010.
FaMa also offers a set of tools for benchmarking [23] over [8] Deepak Dhungana, Paul Grünbacher, and Rick Rabiser. Decisionking:
models. These features, coupled to our familiarity with the A flexible and extensible tool for integrated variability modeling. In 1st
International Workshop on Variability Modelling of Software-intensive
tool as members of the FaMa project, made FaMa as a good Systems, pages 119–128, 2007.
candidate to be extended in order to support the analysis of [9] Deepak Dhungana, Rick Rabiser, and Paul Grunbacher. Decision-
Debian variability models. This is part of our future work. oriented modeling of product line architectures. Software Architecture,
Working IEEE/IFIP Conference on, 0:22, 2007.
These are the main directions of our future work: [10] FOSDEM’08. http://en.opensuse.org/Package management/Sat solver.
• Address new operations. Using our experience on feature [11] Arnaud Hubaux, Andreas Classen, Marcı́lio Mendonça, and Patrick
Heymans. A preliminary review on the application of feature diagrams
models (e.g. Development of analysis tools, benchmark- in practice. In VaMoS, pages 53–59, 2010.
ing, detection of errors ...) we need to be able to detect [12] K. Kuchcinski and R. Szymanek. Jacop. http://jacop.osolpro.com/.
anomalies in Debian models such as conditionally dead accessed May 2010.
packages or redundancies, probably the operations with [13] Rafael Lotufo, Steven She, Thorsten Berger, Krzysztof Czarnecki, and
Andrzej Wasowski. Evolution of the linux kernel variability model. In
more than one variability model can be very useful to 14th International Software Product Line Conference (SPLC’10), 2010.
Debian community improving the usability of packaging [14] RPM Package Manager. http://www.rpm.org.
systems. Also the explanations of the errors to help De- [15] Fabio Mancinelli, Jaap Boender, Roberto di Cosmo, Jerome Vouillon,
Berke Durak, Xavier Leroy, and Ralf Treinen. Managing the complexity
bian community on its daily work could help to improve of large free and open source package-based software distributions. In
the end user experience. ASE ’06: Proceedings of the 21st IEEE/ACM International Conference
• Benchmarking and optimization. In our preliminary tests on Automated Software Engineering, pages 199–208, Washington, DC,
USA, 2006. IEEE Computer Society.
we have detected that the Debian based models automated [16] Debian Policy Manual. http://www.debian.org/doc/debian-policy/
analysis are computationally very expensive, more than ch-relationships.html.
33
ACoTA 2010 First International Workshop on Automated Tailoring and Configuration of Applications
[17] Marcilio Mendonca, Moises Branco, and Donald Cowan. S.p.l.o.t.:
software product lines online tools. OOPSLA ’09: Proceeding of the 24th
ACM SIGPLAN conference companion on Object oriented programming
systems languages and applications, pages 761–762, 2009.
[18] D. Moody. The physics of notations: Toward a scientific basis for con-
structing visual notations in software engineering. Software Engineering,
IEEE Transactions on, 35(6):756 –779, nov.-dec. 2009.
[19] EDOS project. http://www.edos-project.org.
[20] Pure-systems. pure::variants. http://www.pure-systems.com/. accessed
May 2010.
[21] Fabricia Roos-Frantz. A preliminary comparison of formal properties
on orthogonal variability model and feature models. In VaMoS, pages
121–126, 2009.
[22] Fabricia Roos-Frantz and Sergio Segura. Automated analysis of orthog-
onal variability models. a first step. In Steffen Thiel and Klaus Pohl,
editors, First Workshop on Analyses of Software Product Lines (ASPL
2008). SPLC’08, pages 243–248, Limerick, Ireland, September 2008.
[23] S. Segura and A. Ruiz-Cortés. Benchmarking on the automated analyses
of feature models: A preliminary roadmap. In Third International
Workshop on Variability Modelling of Software-intensive Systems, pages
137–143, Seville, Spain, 2009.
[24] Steven She, Rafael Lotufo, Thorsten Berger, Andrzej Wasowski, and
Krzysztof Czarnecki. The variability model of the linux kernel. In
Fourth International Workshop on Variability Modelling of Software-
intensive Systems (VAMOS’10), Linz, Austria, January 2010.
[25] VarMod-PRIME Tool-Environment. http://www.sse.uni-essen.de/wms/
en/index.php.
[26] P. Trinidad, D. Benavides, A. Ruiz-Cortés, S. Segura, and A.Jimenez.
Fama framework. In 12th Software Product Lines Conference (SPLC),
2008.
[27] Ubuntu. http://www.ubuntu.com.
[28] J. White, B. Doughtery, D. Schmidt, and D. Benavides. Automated
reasoning for multi-step software product-line configuration problems.
In Proceedings of the Sofware Product Line Conference, pages 11–20,
2009.
34