Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank                                                                       117


                          Towards Anomaly Explanation in Feature Models ∗
                            A. Felfernig1 , D. Benavides2 , J. Galindo2 , and F. Reinfrank1
                                    1
                                      Graz University of Technology, Graz, Austria
                                 {alexander.felfernig, florian.reinfrank}@ist.tugraz.at
                                              2
                                                University of Seville, Spain
                                             {benavides,jagalindo}@us.es

                           Abstract                                   concepts and algorithms can be applied to advanced feature
                                                                      model representations as well.
        Feature models are a wide-spread approach to vari-
        ability and commonality management in software                   Developing and maintaining large and potentially com-
        product lines. Due to the increasing size and com-            plex feature models is an error-prone activity which can
        plexity of feature models, anomalies in terms of              be explained by the cognitive overload of software engi-
        inconsistencies and redundancies can occur which              neers and domain experts [Trinidad et al., 2008; Benavides
        lead to increased efforts related to feature model            et al., 2013]. In order to tackle this challenge, feature
        development and maintenance. In this paper we in-             model development and maintenance processes have to be
        troduce knowledge representations which serve as              supported by intelligent techniques and tools which help to
        a basis for the explanation of anomalies in feature           identify anomalies which become manifest in different types
        models. On the basis of these representations we              of inconsistencies and redundancies [Batory et al., 2006;
        show how explanation algorithms can be applied.               Benavides et al., 2010]. An approach to the identification
        The results of a performance analysis show the ap-            of dead features (features not part of any configuration) is
        plicability of these algorithms for anomaly detec-            presented by Trinidad et al. [Trinidad et al., 2008]. The au-
        tion in feature models. We conclude the paper with            thors also introduce concepts to solve the problem of void
        a discussion of future research issues.                       feature models (no configuration exists that fulfills all the
                                                                      constraints in the feature model). For the identification of
                                                                      faulty relationships in the feature model (in these scenarios)
1       Introduction                                                  the authors define a corresponding diagnosis task which is
Similar to component-oriented configuration models [Felfer-           based on the concepts introduced by [Reiter, 1987]. As an
nig et al., 2000; Felfernig, 2007], Feature Models (FM)               alternative to the approach of [Trinidad et al., 2008], White
[Kang et al., 1990] are used to express variability properties        et al. [White et al., 2010] show how to transform feature
of highly-variant items [Mendonca and Cowan, 2010]. Ap-               models into a corresponding representation of a constraint
plications based on feature models help users to decide about         satisfaction problem (CSP) [Tsang, 1993]. On the basis of
relevant features and to learn about existing dependencies be-        this representation, diagnoses are directly determined by the
tween features. Feature models can be distinguished with              constraint solver without the support of an additional diag-
regard to the expressiveness of constraints defining the re-          nostic engine. An overview of analysis operations (for the
lationships between the different features contained in a fea-        identification of different inconsistencies and redundancies)
ture model [Benavides et al., 2010]. So-called basic feature          for feature models is provided in [Benavides et al., 2010;
models [Kang et al., 1990] will be used as a basis for the dis-       von der Massen and Lichter, 2004].
cussions in this paper. Such models allow the definition of              If we are interested in minimal explanations (diagnoses) for
basic relationships between features, for example, a feature          feature model anomalies, the performance of the underlying
f1 requires the inclusion of a feature f2 . Cardinality-based         algorithms becomes a challenge. An example explanation in
feature models [Czarnecki et al., 2005] extend basic ones by          this context would be the minimal set of constraints which
also allowing cardinalities with an upper bound > 1. Finally,         have to be adapted or deleted from an inconsistent feature
extended feature models [Batory, 2005] allow the inclusion            model (the determination of a configuration is not possible)
of additional information about features in terms of feature          such that the remaining constraints allow the calculation of
attributes. For presentation purposes we decided to use ba-           at least one configuration. Reiter [Reiter, 1987] introduced
sic feature models (see Section 2). However, the presented            a hitting set based approach to the determination of minimal
    ∗
    This work was supported, in part, by the Austrian Research Pro-   explanations (diagnoses) – these diagnoses are also of mini-
motion Agency under the project ICONE (827587), the European          mal cardinality since diagnosis search is performed on the ba-
Commission (FEDER), the Spanish Government under project SETI         sis of breadth-first search. The idea of applying the concepts
(TIN2009-07366), and by the Andalusian Government under project       of model-based diagnosis to inconsistent constraint sets has
THEOS (TIC-5906).                                                     first been introduced by Bakker et al. [Bakker et al., 1993].


                                                                                        Michel Aldanondo and Andreas Falkner, Editors
                                                                           Proceedings of the 15th International Configuration Workshop
                                                                                                     August 29-30, 2013, Vienna, Austria
118                                                                      Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank

Felfernig et al. [Felfernig et al., 2004] continued this work           Semantics of Feature Models. Our representation of FMs
by introducing an approach to the automated testing and de-         is based on the notation introduced in [Benavides et al.,
bugging of configuration knowledge bases where test cases           2010]. Relationships (constraints) in FMs are represented
are used to induce conflicts in the knowledge base. These           in terms of six different types of constraints [Batory, 2005;
conflicts are then resolved on the basis of the hitting set algo-   Benavides et al., 2010; Segura et al., 2010]: mandatory, op-
rithm [Reiter, 1987]. First experiences from the application        tional, alternative, or, requires, and excludes. FMs are rep-
of these testing and debugging approaches in industrial sce-        resenting configurable products which can be formalized in
narios are reported by Fleischanderl [Fleischanderl, 2002].         the form of a constraint satisfaction problem (CSP) [Tsang,
Junker [Junker, 2004] introduced the QuickXPlain (QX) al-           1993] where each variable fi has the assigned domain di =
gorithm. QX is an efficient divide-and-conquer based ap-            {true, f alse}. We define a feature model configuration task
proach to the determination of minimal conflicts which can          as follows (see Definition 1).
then be exploited for the determination of diagnoses. In this           Definition 1 (FM Configuration Task). A feature
paper we show how diagnosis and redundancy detection algo-          model (FM) configuration task is defined by the triple
rithms can be applied to support feature model analysis oper-       (F,D,C) where F = {f1 , f2 , ..., fn } is a set of features
ations [Benavides et al., 2010]. In this context we show how        fi , D = {dom(f1 ), dom(f2 ), ..., dom(fn )} (dom(fi ) =
to apply the diagnosis algorithm FAST D IAG [Felfernig et al.,      {true, f alse}) is the set of corresponding feature domains,
2012] (an algorithm with no need of determining conflict sets)      and C = CR ∪ CF is a set of constraints restricting the
and introduce the FMC ORE algorithm which allows the de-            possible configurations which can be derived from the fea-
tection of redundancies in feature models.                          ture model. In this context, CR = {c1 , c2 , ..., ck } repre-
   The work presented here is in the line of research dedicated     sents a set of requirements (of a specific user) and CF =
to the development of intelligent quality assurance mecha-          {ck+1 , ck+2 , ..., cm } a set of feature model constraints.
nisms for configuration knowledge bases [Felfernig et al.,              On the basis of this definition of an feature model configu-
2004]. The major contributions of this paper are the follow-        ration task, we now introduce the definition of a configuration
ing. First, we advance the state of the art in feature model        for a feature model (FM) configuration task (Definition 2).
anomaly detection by formalizing the anomaly types dis-                 Definition 2 (FM Configuration). A feature model (FM)
cussed in the feature modeling community on the basis of the        configuration for a given FM configuration task is a complete
concepts of inconsistency and redundancy. Second, we intro-         assignment of the variables fi ∈ F . Such a configuration is
duce the FMC ORE algorithm for the detection of redundant           consistent iff the constraints ci ∈ C are not contradicting with
constraints in feature models. Furthermore, we show how to          the variable assignment. Furthermore, an FM configuration is
apply the FAST D IAG algorithm [Felfernig et al., 2012] for ex-     valid, if it is consistent and complete.
plaining different types of inconsistencies in feature models.          Feature Model Constraint Types. Six basic types of
All anomaly types will be discussed in detail in Section 3 in       constraints can be included in CF [Benavides et al., 2010].
combination with corresponding explanation approaches.              These constraint types are the following – their representation
   The remainder of this paper is organized as follows. In          in a graphical feature model is shown in the example of Fig-
Section 2 we introduce a simple feature model (operating sys-       ure 1. In the following we introduce the semantics of these six
tem configuration) which will be used as working example            types of constraints – this semantics is based on the definition
throughout the paper. Furthermore, we introduce the def-            given in [Benavides et al., 2010].
initions of a feature model configuration task and a corre-
sponding feature model configuration. In Section 3 we in-
troduce different relevant forms of anomalies in feature mod-
els together with their formal definitions. The corresponding
anomaly detection algorithms FAST D IAG and FMC ORE are
explained in Section 4. The performance of these algorithms
is analyzed in Section 5 on the basis of selected feature mod-
els from the S.P.L.O.T.1 repository. A discussion of further
research issues and a conclusion is provided in Section 6.

2       Feature models
A feature model (FM) defines a set of possible products
of a domain in terms of features and the relationships be-
                                                                     Figure 1: Feature model (FM) with faulty model elements.
tween them [Wang et al., 2010]. Features are arranged hi-
erarchically (tree structure with one so-called root feature fr
(fr = true)) [Benavides et al., 2010] where the nodes are the          Mandatory: a feature f2 ∈ F is mandatory if it is in a
features and the edges are relationships (constraints) [Segura      mandatory relationship with another feature f1 ∈ F . This
et al., 2010]. For a more detailed overview of different feature    means, if f1 is part of the configuration, f2 must be part of
model representations we refer the reader to [Batory, 2005;         the configuration as well (and vice-versa). The formalization
Benavides et al., 2010].                                            of this constraint type (relationship) is realized on the basis
                                                                    of an equivalence: f1 ↔ f2 . In Figure 1 the feature gui is a
    1
        See www.splot-research.org.                                 mandatory feature connected to the feature ubuntu.


Michel Aldanondo and Andreas Falkner, Editors
Proceedings of the 15th International Configuration Workshop
August 29-30, 2013, Vienna, Austria
Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank                                                                        119

   Optional: a feature f2 ∈ F can (but must not) be included           3   Anomaly Patterns in Feature Models
in the configuration in the case that feature f1 ∈ F is part of        Anomalies can be defined as patterns in data that do not con-
the configuration. This type of constraint can be formalized           form to a well defined notion of normal behavior [Chandola
on the basis of an implication: f2 → f1 . In Figure 1 the              et al., 2009]. Trinidad et al. [Trinidad et al., 2008] are us-
feature games is an optional feature connected to ubuntu.              ing the term error for incorrect definitions of relationships,
   Alternative: only one feature fb ∈ F = {f1 , f2 , ..., fk }         i.e., the set of products described by a feature model does
can be selected if feature fa is selected. The property can be         not match the SPL (software product line) it describes. We
formalized as follows: f1 = true ↔ (f2 = f alse∧...∧fk =               interpret anomalies in the sense of [Trinidad et al., 2008]:
f alse ∧ fa = true) ∧ ... ∧ fk = true ↔ (f1 = f alse ∧ ... ∧           undesirable FM properties in terms of different facets of con-
fk − 1 = f alse ∧ fa = true). In Figure 1 an example of a              tradictory and redundant information contained in the FM.
feature fa is games, the subfeatures are gnuchess and glchess.            Handling Inconsistencies. Inconsistent feature models in-
   Or: at least one feature fb ∈ F = {f1 , f2 , ..., fk } must         clude contradictory constraints ci ∈ C that can not be sat-
be part of the configuration if feature fa is part of the con-         isfied at the same time, leading to no valid instances deriv-
figuration. This property can be formally defined with fa ↔            able from FMs [Wang et al., 2010]. For a given FM con-
{f1 , f2 , ..., fk }. In Figure 1 an example of a feature fa is gui,   figuration task this means that no solution can be identi-
the subfeatures are kde and gnome.                                     fied. In our working example (the FM of Figure 1) no so-
   Requires: a feature f2 must be included in a configuration          lution can be identified due to an inconsistent constraint set
if feature f1 is included. This requires relationship can be           C={c1 , c2 , ..., c10 }.2 Inconsistent sets of constraints can be
defined with f1 → f2 . In Figure 1 an example of a requires            defined on the basis of the concept of conflict sets [Junker,
relationship is games → gui.                                           2004] (see Definition 3).
   Excludes: it is not allowed to combine two features f1 and             Definition 3 (Conflict Set) A conflict set CS ⊆ C is a set
f2 in the same configuration, i.e., feature f1 excludes feature        of constraints s.t. CS is inconsistent. CS is minimal iff there
f2 and vice versa: ¬(f1 ∧ f2 ). In Figure 1 an example of              does not exist a conflict set CS 0 with (CS 0 ⊂ CS).
an excludes relationship is ¬(bash ∧ gui). Note that this is a
                                                                          Based on Definition 3, we can identify minimal sets of con-
possible faulty constraint to be detected by diagnosis.
                                                                       straints CSi ⊆ C, such that CSi is inconsistent. As long as
   Requires and excludes constraints are also denoted as
                                                                       there are conflicts in a given constraint set of a feature model,
cross-tree constraints. Finally, the set CR (customer require-
                                                                       no solutions for the underlying FM configuration task can be
ments) is an additional set of constraints to be taken into ac-
                                                                       identified. Our example feature model (see Figure 1) includes
count when determining configurations (solutions). The set
                                                                       two minimal conflict sets which are CS1 = {c1 , c2 , c6 } and
CR specifies a set of key features which have to be included
                                                                       CS2 = {c2 , c3 , c7 }. Each of these sets is a minimal set such
in the FM configuration for a specific user (customer).
                                                                       that (1) no solution (configuration) can be identified and (2)
   Example Feature Model. A simple example feature
                                                                       none of the subsets of CSi is inconsistent. As a consequence
model (from the domain of operating systems) is depicted in
                                                                       (due to their minimality property) conflicts (represented by
Figure 1. This model specifies a set of features relevant for
                                                                       conflict sets) can be resolved by simply deleting one con-
configuring an ubuntu operating system installation together
                                                                       straint from the set.
with constraints between the features. Note that faulty ele-
                                                                          The resolution of all conflicts (represented by conflict sets)
ments (constraints) are contained in this model – our goal in
                                                                       can be based on the determination of the corresponding hit-
the remainder of this paper will be to introduce algorithms
                                                                       ting sets (also denoted as diagnoses [Reiter, 1987]). The prob-
which help to identify and explain such faulty constraints.
                                                                       lem of identifying minimal sets of constraints which have to
   The CSP-based representation [Tsang, 1993] of the feature
                                                                       be adapted or deleted from the feature model such that the re-
model shown in Figure 1 is the following - a representation
                                                                       maining constraints become consistent can be represented as
as FM configuration task = (F,D,C=CR ∪ CF ).
                                                                       an FM diagnosis task (see Definition 4).
   • F = {ubuntu, texteditor, bash, gui, games, gedit,                    Definition 4 (FM Diagnosis Task) A feature model di-
      vi, kde, gnome, gnuchess, glchess}                               agnosis task (FM diagnosis task) is a tuple (S, AC) where
   • D = {dom(ubuntu) = {true, f alse}, dom(text−                      S ⊆ AC are constraints of the feature model. The task is to
      editor) = {true, f alse}, dom(bash) = {true,                     identify a minimal set of constraints which have to be deleted
      f alse}, dom(gui) = {true, f alse}, dom(games)                   from S s.t. consistency can be restored in the feature model.
      = {true, f alse}, dom(gedit) = {true, f alse},                      In this context, S helps us to focus our diagnostic activities,
      dom(vi) = {true, f alse}, dom(kde) = {true, f alse},             i.e., to focus on those model parts where we suspect faulty
      dom(gnome) = {true, f alse}, dom(gnuchess) =                     constraints. If no such suspects exist, S can be set to AC. An
      {true, f alse}, dom(glchess) = {true, f alse}                    FM diagnosis, i.e., a solution to an FM diagnosis task can be
   • CR = {c0 : ubuntu = true}                                         defined as follows (see Definition 5).
                                                                          Definition 5 (FM Diagnosis) A feature model diagnosis
   • CF = { c1 : ubuntu ↔ texteditor, c2 : ubuntu ↔                    (FM diagnosis) is a set of constraints ∆ ⊆ S with AC − ∆ is
      bash, c3 : ubuntu ↔ gui, c4 : games → ubuntu, c5 :               consistent. ∆ is minimal iff there does not exist a set ∆’ with
      texteditor ↔ gedit ∨ vi, c6 : ¬texteditor ∨ ¬bash,               ∆’ ⊂ ∆ and ∆’ has the diagnosis property as well.
      c7 : ¬bash ∨ ¬gui, c8 : gui ↔ kde ∨ gnome, c9 :
      games → gui, c10 : (gnuchess ↔ ¬glchess ∧ games)                    2
                                                                            Note that we interpret the constraint c0 : ubuntu = true as
      ∧ (glchess ↔ ¬gnuchess ∧ games)}                                 element of the (customer) requirements CR.


                                                                                         Michel Aldanondo and Andreas Falkner, Editors
                                                                            Proceedings of the 15th International Configuration Workshop
                                                                                                      August 29-30, 2013, Vienna, Austria
120                                                                        Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank

  The diagnoses for our example FM diagnosis task are                 neer and {c6 , c7 } have been deleted from the feature model.
∆1 = {c1 , c3 }, ∆2 = {c1 , c7 }, ∆3 = {c2 }, ∆4 = {c3 , c6 },            Dead feature fi . If a feature fi is not included in any of the
∆5 = {c6 , c7 }. These represent five ways to delete (adapt)          possible configurations (i.e., inconsistent(CF ∪ fi = true)),
constraints from (in) the feature model such that at least one        we are interested in solutions to the FM diagnosis task (S =
configuration can be determined. The calculation of all ∆i is         CF, AC = CF ∪ {c0 } ∪ {fi = true}). This way we are
sketched in Figure 2. The underlying assumption in this ex-           able to figure out the minimal sets of constraints that are re-
ample is that – conform to the algorithm introduced by Reiter         sponsible for the non-acceptance of fi . In our working ex-
[Reiter, 1987] – the search tree (hitting set directed acyclic        ample, there is no such dead feature (assuming that the con-
graph – HSDAG) is expanded in breadth-first manner.                   straints in ∆5 have been deleted from the feature model). If
                                                                      we would substitute the constraint c9 : games → gui with
                                                                      c9 : ¬gui ∨ ¬games, the feature games would be a dead
                                                                      feature. If we then want to make games a feature which is
                                                                      included in at least one configuration, the diagnoses for (S =
                                                                      CF, AC = CF ∪ {c0 } ∪ {games = true}) are ∆1 = {c3 }
                                                                      and ∆2 = {c9 }.
                                                                          Conditionally dead feature fi . Such a feature fi is not
                                                                      included in all of the possible configurations, i.e., consis-
                                                                      tent (CF ∪ {c0 } ∪ {fi =false}) and consistent (CF ∪ {c0 } ∪
                                                                      {fi =true}). If we want to have fi in each configuration, we
                                                                      have to add {fi = true} to the set CF. In our working exam-
                                                                      ple, games is a conditionally dead feature since there are also
                                                                      solutions with no inclusion of this feature. In order to make
Figure 2: Determination of diagnoses for a given inconsistent         games part of every possible feature model configuration, we
feature model (FM). The following discussion of anomaly               have to make this clear in the feature model. One way to
types assumes that ∆5 = {c6 , c7 } was chosen.                        achieve this would be to convert constraint c4 into a manda-
                                                                      tory constraint – this would have the same effect as adding
   One possible approach to determine the complete set of             games = true as an additional constraint to CF.
diagnoses is based on the hitting set directed acyclic graph              Full mandatory feature fi . A feature fi is fully mandatory
(HSDAG) algorithm introduced by Reiter [Reiter, 1987]. The            if it is included in every possible solution (configuration), i.e.,
basic idea of this algorithm is to determine a conflict (in the       inconsistent(CF ∪ {c0 } ∪ {fi = f alse}). If we want to adapt
example CS1 : {c1 , c2 , c6 }) and then to resolve this conflict.     the feature model in such a way that it also allows fi to be not
If this conflict is resolved (e.g., by deleting the constraint c1 )   included, we can determine the corresponding (minimal) sets
the algorithm checks whether further conflicts exist in the fea-      of responsible constraints by solving the FM diagnosis task
ture model. In our example this is the case and the next deter-       (S=CF, AC= CF ∪ {c0 } ∪ {fi = f alse}). In our working
mined conflict set is CS2 : {c2 , c3 , c7 }. If we delete, for ex-    example, the feature gui is a full mandatory feature since it
ample c3 from CS2 , we receive the diagnosis ∆1 = {c1 , c3 }.         is part of every possible configuration. If we want to allow
In a similar fashion all other diagnoses can be determined.           configurations where gui is not included, the only diagnosis
Note that {c1 , c2 } is not a (minimal) diagnosis since {c2 } is      for (S=CF, AC= CF ∪ {c0 } ∪ {gui = f alse}) is ∆1 = {c3 }.
already a diagnosis. The HSDAG algorithm is a traditional                 False optional feature fi . A false optional feature fi is in-
way of determining diagnoses – more efficient approaches              cluded in all configurations (e.g., products of a product line)
will be presented in Section 4.                                       although it has not been modeled as mandatory. If we replace
   Feature Model Anomaly Patterns. We can now discuss                 the constraint c9 : games → gui with c9 : gui → games,
in more detail different basic types of feature model anoma-          the feature games becomes a false optional feature since it is
lies. Ways to explain these anomalies and related algorithms          included in every possible configuration. An alternative in-
will then be discussed in detail in Section 4. An overview            terpretation of a false optional feature focuses on the optional
of these anomalies and related property checks is shown in            relationship between a feature fpar and fopt . If the consis-
Table 1. The following types of anomalies are taken from              tency check of (CF ∪ {c0 } ∪ {fpar = true ∧ fopt = f alse})
Benavides et al. [Benavides et al., 2010].                            returns false (and fpar = true), the feature fopt is not an op-
   Void feature model. If model constraints in CF are in-             tion. In our example (under the assumption that c9 is adapted
consistent (inconsistent(CF ∪ c0 )), we are interested in so-         as mentioned), the diagnosis for (S = CF, AC = CF ∪ {c0 } ∪
lutions to the FM diagnosis task (S=CF, AC = CF ∪ c0 ).               {ubuntu = true ∧ games = f alse}) is ∆1 = {c3 }.
In this case we want to figure out which are the minimal                  Redundant constraint ci . In our working example the con-
sets of constraints that are responsible for the given incon-         straint c9 : games → gui is redundant since gui is a full
sistency in the feature model. We do not include c0 (e.g.,            mandatory feature. If we check the consistency of {CF - {c9 }
c0 : ubuntu = true) in the set S since we are not interested          ∪ ¬CF} we see that c9 is redundant since the expression is in-
in changing this constraint. The feature model of our example         consistent. In other words, CF - {c9 } |= c9 , i.e., c9 logically
(see Figure 1) is an example of a void feature model.                 follows from CF - {c9 } – therefore it is redundant. The sec-
   Note that for the following discussions we assume that             ond redundant constraint in our working example is c4 since
∆5 = {c6 , c7 } (see Figure 2) has been chosen by the engi-           the feature ubuntu is a full mandatory feature as well. Con-


Michel Aldanondo and Andreas Falkner, Editors
Proceedings of the 15th International Configuration Workshop
August 29-30, 2013, Vienna, Austria
Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank                                                                         121

        Analysis operation       Property Check                                      Explanation (Diagnosis Task)
        Void feature model       inconsistent(CF ∪ {c0 })?                           FAST D IAG(CF,CF ∪ {c0 })
        Dead (fi )               inconsistent(CF ∪ {c0 } ∪ {fi =true})?              FAST D IAG(CF,CF ∪ {c0 } ∪ {fi = true})
        Conditionally            consistent(CF ∪ {c0 } ∪ {fi =false}) and            CF ← CF ∪ {fi =true}
        dead (fi )               consistent(CF ∪ {c0 } ∪ {fi =true})?
        Full mandatory (fi )     inconsistent(CF ∪ {c0 } ∪ {fi =false})?             FAST D IAG(CF,CF ∪ {c0 } ∪ {fi = f alse})
        False optional (fopt )   inconsistent(CF ∪ {c0 } ∪                           FAST D IAG(CF, CF ∪ {c0 } ∪
                                  {fpar =true ∧ fopt =false})?                         {fpar = true ∧ fopt = f alse})
        Redundant (ci )          inconsistent((CF ∪ {c0 } - {ci }) ∪ ¬(CF ∪ c0 ))?      / FMC ORE(CF ∪ {c0 })
                                                                                     ci ∈

Table 1: Feature model analysis operations, property checks, and related explanations. For example, figuring out whether a
feature model is void (no solution can be found) can be determined on the basis of a consistency check inconsistent (CF ∪
{c0 }). A related explanation can be determined by solving the FM diagnosis task (CF, CF ∪ {c0 }). The related diagnosis
(FAST D IAG) and redundancy detection algorithm (FMC ORE) are discussed in Section 4.


sequently, the constraints {c4 , c9 } can be deleted from the           Algorithm 2 D IAG(D, S = {s1 , ..., sr }, AC): ∆
feature model without changing the underlying semantics.3                 if D 6= ∅ and consistent(AC) then
   In the following section we focus on the presentation of                  return ∅;
two algorithms which help to determine explanations for the               end if
different feature model anomaly patterns.                                 if singleton(S) then
                                                                             return S;
4       Explaining Anomalies                                              end if
The two basic algorithms for determining diagnoses and                    k ← d 2r e;
redundancies are FAST D IAG and FMC ORE. FAST D IAG                       S1 ← {s1 , ..., sk }; S2 ← {sk+1 , ..., sr };
[Felfernig et al., 2012] is a divide-and-conquer algorithm                ∆1 ← DIAG(S2 , S1 , AC − S2 );
that supports the efficient determination of minimal diagnoses            ∆2 ← DIAG(∆1 , S2 , AC − ∆1 );
without the need of having conflict sets available. FMC ORE               return(∆1 ∪ ∆2 );
is an algorithm which focuses on the determination of mini-
mal cores, i.e., redundancy-free subsets of a constraint set.
   Determination of Diagnoses. In FAST D IAG (see Algo-
rithm 1), the set S represents the set of constraints where a           consistent, the diagnosis is searched in the other part and the
diagnosis should be searched, The set AC contains all con-              first part can be omitted (no constraints part of the diagnosis
straints of the feature model. For example, if we want to               will be found there). If a singleton constraint of S triggers
diagnose a void feature model (CF ∪ {c0 } is inconsistent –             an inconsistency, this constraint is considered a part of the
see Table 1), we would activate the algorithm with FAST-                diagnosis. FAST D IAG determines exactly one diagnosis at
D IAG(CF,CF ∪ {c0 }), i.e., S = CF and AC = CF ∪ {c0 }.                 a time. If we want to determine more than one or even the
We do not include c0 in the set of diagnosable constraints              complete set of diagnoses, we need to combine FAST D IAG
since c0 (the root constraint) is assumed to be correct (e.g.,          with a corresponding algorithm that supports the construc-
c0 : ubuntu = true). First, the algorithm (see Algorithm                tion of HSDAGs. The discussion of this approach is outside
1) checks whether the considered constraint set can be diag-            the scope of this paper. We want to refer the reader to the
nosed (if the set S is empty, no diagnosis will be found) and           work of Felfernig et al. [Felfernig et al., 2012]. Compared
whether the constraints in AC-S are inconsistent (in this case          to traditional diagnosis approaches, FAST D IAG needs in the
no diagnosis can be determined).                                        worst case 2d × log2 ( nd ) + 2d consistency checks where d is
                                                                        the number of constraints in the minimal diagnosis and n is
Algorithm 1 FAST D IAG(S, AC): ∆                                        the number of constraints in S [Felfernig et al., 2012]. The
                                                                        corresponding best case complexity in terms of the number of
  if isEmpty(S) or inconsistent(AC − S) then                            consistency checks is log2 ( nd +2d). A similar worst case (and
     return ∅;                                                          best case) complexity in traditional diagnosis approaches can
  else                                                                  be expected for each determination of a conflict set (see, e.g.,
     return DIAG(∅, S, AC)                                              Figure 2) [Felfernig et al., 2012].
  end if
                                                                           Determination of Redundancies. A constraint fi of a fea-
                                                                        ture model (represented by the constraint set CF) is redundant
   The major idea of FAST D IAG (and its subfunction D IAG              if its deletion from the model does not change the set of pos-
– see Algorithm 2) is to divide a set S of inconsistent con-            sible solutions. More formally, CF - {fi } |= fi which means
straints into two subsets S1 and S2 . If the first part becomes         that fi logically follows from CF - {fi } and therefore is re-
    3
    Note that redundancies can also be intended to achieve goals        dundant. An algorithm for redundancy detection should def-
such as improving understandability or increasing efficiency – a dis-   initely not check redundancy properties on the basis of con-
cussion of related issues is outside the scope of this paper.           crete configurations since such an approach becomes com-


                                                                                          Michel Aldanondo and Andreas Falkner, Editors
                                                                             Proceedings of the 15th International Configuration Workshop
                                                                                                       August 29-30, 2013, Vienna, Austria
122                                                                     Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank

                 Feature Model: Car Selection      #Variables: 72              #Constraints:96
                 # Diagnoses                               Inconsistency Rate
                                   2% (8 diagnoses)          5% (64 diagnoses)    7% (182 diagnoses)
                                FAST D IAG     HSDAG FAST D IAG HSDAG FAST D IAG HSDAG
                      1             452          874           561        1888      858        5366
                      2             749          890           920        1891      1638       5382
                      3            1045          921          1294        2138      2059       5506
                      4            1373          936          1653        2143      2324       5522
                      5            1529          968          1872        2262      2464       5544
                     10              –            –           2511        2418      2932       5709
                     20              –            –           2964        2450      3806       6162
                     all           1632         1027          4383        3339     11856       8860

             Table 2: Evaluation of FAST D IAG and HSDAG with the Car Selection feature model from S.P.L.O.T.

              Feature Model: SmartHome V. 2.2        #Variables: 61             #Constraints:63
              # Diagnoses                                 Inconsistency Rate
                                  2% (8 diagnoses)            5% (12 diagnoses)    7% (77 diagnoses)
                               FAST D IAG       HSDAG FAST D IAG HSDAG FAST D IAG HSDAG
                   1               297              920         312         952      577        2683
                   2               437              967         452         968      951        2684
                   3               609              983         592         983     1341        2686
                   4               734              998         733        1139     1762        2699
                   5               843             1014         842        1155     2090        2671
                  10                –                –          967        1529     2792        2715
                  20                –                –            –          –      3369        2746
                  all             1155             1061        1606        1545     6224        3151

           Table 3: Evaluation of FAST D IAG and HSDAG with the SmartHome V 2.2 feature model from S.P.L.O.T.

pletely inefficient even in the case of simple feature models.     Algorithm 3 FMC ORE(S): ∆
The basic idea of the FMC ORE algorithm is to iterate over           {S: the (redundant constraint set)}
the given set of constraints (S) and for each constraint ci ∈ S      {S: the complement of S}
to check whether the deletion of ci changes the semantics of         {∆: set of redundant constraints}
S. The assumption is that if ci is non-redundant, its deletion       Stemp ← S;
from S will change the semantics of S, i.e., S − {ci } ∪ S be-       for all ci in Stemp do
comes consistent. All these individual redundant constraints           if isInconsistent((Stemp − {ci }) ∪ {¬cj }) then
are deleted from Stemp (a temporal copy of S). Finally, the               Stemp ← Stemp − {ci };
algorithm returns the set Stemp which represents a minimal             end if
core, i.e., the original set S without redundant constraints.        end for
   Note that – instead of checking the inconsistency of CS −         return Stemp ;
{ci } ∪ S (see, e.g., [Felfernig et al., 2011]) – FMC ORE sys-
tematically reduces the number of constraints to be checked
in S. Given a configuration knowledge base S and its com-          gorithms on the basis of different feature models provided by
plement S, the (in)consistency check of S − {ci } ∪ S can be       the S.P.L.O.T. repository. The results of this analysis are pre-
reduced to the inconsistency check of S − {ci } ∪ S 0 where        sented in the following section.
S 0 = {¬ci }. If we assume that S = {c1 ∧c2 ∧..∧cm ∧cm+1 ∧
.. ∧ cn }, S = {¬c1 ∨ ¬c2 ∨ .. ∨ ¬cm ∨ ¬cm+1 ∨ .. ∨ ¬cn }, and     5   Performance Evaluation
γ = {cm+1 ∧ .. ∧ cn } then the consistency check of S − γ ∪ S
can be reduced to {c1 ∧ c2 ∧ .. ∧ cm } ∪ {¬cm+1 ∨ .. ∨ ¬cn }. In   For evaluation purposes we selected different feature models
FMC ORE (Algorithm 3) this property is taken into account.         offered by the S.P.L.O.T. repository: Car Selection (Table 2),
                                                                   SmartHome V. 2.2. (Table 3), and Xerox (Table 4). In order
   The number of consistency checks of FMC ORE in the best         to evaluate the performance of FAST D IAG, we randomly in-
case equals the number of consistency checks in the worst          serted additional cross-tree constraints in the feature models
case – in both cases the number of consistency checks needed       for inducing inconsistencies which could then be exploited
is exactly n (the number of constraints in S).                     for determining minimal diagnoses. For a systematic eval-
   In order to analyze the performance of FAST D IAG and           uation we generated different versions of the (inconsistent)
FMC ORE we conducted a performance analysis for both al-           feature models which differed in terms of their inconsistency


Michel Aldanondo and Andreas Falkner, Editors
Proceedings of the 15th International Configuration Workshop
August 29-30, 2013, Vienna, Austria
Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank                                                                      123

                      Feature Model: Xerox        #Variables: 172               #Constraints:205
                    # Diagnoses                             Inconsistency Rate
                                    2% (140 diagnoses)       5% (84 diagnoses)      7% (55 diagnoses)
                                 FAST D IAG HSDAG FAST D IAG HSDAG FAST D IAG HSDAG
                         1           1638        3354         1260        2996       1740          3023
                         2           2013        6646         1710        3167       2050          3203
                         3           2262       12106         1970        9454       2330          9544
                         4           2434       12355         2180        9536       2580          9654
                         5           2637       28111         2341       12044       2790         12165
                        10           3417       69950         2921       64631       3330         65240
                        20           4758       75317         3911       90715       5010         91726
                        all         46785      >100000       17301      >100000     10541        >100000

                   Table 4: Evaluation of FAST D IAG and HSDAG with the Xerox feature model from S.P.L.O.T.

                         Feature Model       #Variables    #Constraints   Redundancy Rate        Runtime (ms)
                          Car Selection         72             96              0.64                 5070
                        SmartHome V. 2.2        61             63              0.29                 1907
                             Xerox              172           205              0.71                 3261

                              Table 5: Evaluation of FMC ORE with selected S.P.L.O.T. feature models.

rate (see Formula 1) which was categorized in {2%, 5%, 7%}.         can be applied as well. Finally, we also evaluated the perfor-
We used a random variable to control the degree of generated        mance of the redundancy detection algorithm FMC ORE (see
inconsistencies (the number of conflicts) in a feature model.       Table 5). Our goal was to figure out for the selected feature
As reasoning engine we used the CHOCO constraint solving            models to which extent the constraints in the feature models
library.4 In order to import feature models to our environment      are redundant. We measured redundancy in the terms of the
we implemented a parser that generated CHOCO knowledge              redundancy rate (see Formula 2).
bases from S.P.L.O.T. SXFM based feature models.
                                                                                               #redundant constraints in F M
                                                                     Redundancy Rate =
                                  #conf licts in F M                                                 #constraints in F M
         Inconsistency Rate =                               (1)                                                                 (2)
                                 #constraints in F M
                                                                       The outcome of this analysis was that all the investigated
   The performance tests were executed within a Java appli-         feature models showed quite different degrees of redundancy
cation running on a 64bit Windows 7 desktop PC using 8GB            (see Table 5). However, we consider these as preliminary
RAM and an Intel(R) Core(TM) i5-2320 CPU with 3.0GHz.               results and further analyses have to be conducted, for exam-
Each run of the diagnosis algorithm for a specific setting          ple, we are interested in intra-constraint redundancies and the
has been repeated 10 times were in each run the ordering of         share of redundancy in cross-tree constraints with regard to
the constraints was randomized. For each setting we eval-           the overall number of constraints in the feature model.
uated the runtime (in ms) of both, the standard hitting set            Note that the FMC ORE algorithm is especially useful in
based approach to the termination of diagnoses [Reiter, 1987]       situations where models are developed by one or a few en-
(HSDAG) and FAST D IAG. As scenario we choose the diag-             gineers. In this case the degree of redundant constraints in
nosis of void feature models where we induced different de-         the model is low. For scenarios with high redundancy rate,
grees of inconsistency (based on the inconsistency rate mea-        alternative algorithms have already been developed (see, e.g.,
sure – see Formula 1). The upper bound for the evaluation           [Felfernig et al., 2011]).
time was set to 100.000 ms – in the case that this upper limit
was exceeded, the search was stopped.                               6     Conclusions
   If one or a few diagnoses are required (which is typical
for interactive settings) then FAST D IAG outperforms the stan-     In this paper we presented a consistency-based approach to
dard HSDAG approach in most of the cases. If all diagnoses          explaining anomalies in feature models. We introduced defi-
are required, for example, in situations where diagnoses are        nitions which are useful for the explanation of anomalies and
computed offline, the standard HSDAG approach seems to be           discussed the corresponding algorithms which help to deter-
the better choice. We want to emphasize that the presented di-      mine minimal diagnoses (FAST D IAG) and minimal sets of
agnosis algorithms are independent of the underlying reason-        non-redundant constraints (FMC ORE). Our future work will
ing mechanisms, i.e., beside using a basic constraint-based         focus on: (1) The definition of further anomaly patterns in
approach for supporting the reasoning tasks (mainly consis-         alternative knowledge representations such as advanced fea-
tency checking), description logics or SAT-based approaches         ture models [Batory, 2005] and UML models [Felfernig et
                                                                    al., 2000]. Due to higher expressiveness, these representa-
   4
       www.emn.fr/z-info/choco-solver.                              tions include further anomaly patterns such as multiplicity


                                                                                       Michel Aldanondo and Andreas Falkner, Editors
                                                                          Proceedings of the 15th International Configuration Workshop
                                                                                                    August 29-30, 2013, Vienna, Austria
124                                                                   Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank

bounds which can not represented by configurations, unsat-          mass customization. IEEE Transactions on Engineering
isfiable preconditions in constraints, and unexplained incom-       Management, 54:41–56, 2007.
patibilities. (2) The development of mechanisms for the auto-    [Fleischanderl, 2002] G. Fleischanderl.            Suggestions
mated generation of test cases for feature models. (3) Further      from the software engineering practice for applying
algorithms that enable the determination of diagnoses and re-       consistency-based diagnosis to configuration knowledge
dundancies on an intra-constraint level. (4) Evaluation of the      bases. In 13th Intl. Workshop on Principles of Diagnosis
developed algorithms with further benchmarks.                       (DX-02), pages 33–35, Semmering, Austria, 2002.
                                                                 [Junker, 2004] U. Junker. QuickXPlain: preferred explana-
References
                                                                    tions and relaxations for over-constrained problems. In
[Bakker et al., 1993] R. Bakker, F. Dikker, F. Tempelman,           Proceedings of the 19th National Conference on Artifical
   and P. Wogmim. Diagnosing and solving over-determined            Intelligence, AAAI 2004, pages 167–172. AAAI, 2004.
   constraint satisfaction problems. In Proceedings of IJCAI-
                                                                 [Kang et al., 1990] K. Kang, S. Cohen, J. Hess, W. No-
   93, pages 276–281. Morgan Kaufmann, 1993.
                                                                    vak, and S. Peterson. Feature-oriented Domain Analysis
[Batory et al., 2006] D. Batory, D. Benavides, and A. Ruiz-         (FODA) – Feasibility Study. TechnicalReport CMU – SEI-
   Cortes. Automated analysis of feature models: challenges         90-TR-21, 1990.
   ahead. Comm. of the ACM, 49:45–47, 2006.
                                                                 [Mendonca and Cowan, 2010] M. Mendonca and D. Cowan.
[Batory, 2005] D. Batory. Feature Models, Grammars, and             Decision-making coordination and efficient reasoning
   Propositional Formulas. In H. Obbink and K. Pohl, edi-           techniques for feature-based configuration. Science of
   tors, Software Product Lines Conference, volume 3714 of          Computer Programming, 75(5):311 – 332, 2010. Coordi-
   LNCS, pages 7–20. Springer, 2005.                                nation Models, Languages and Applications(SAC 2008).
[Benavides et al., 2010] D. Benavides, S. Segura, and            [Reiter, 1987] R. Reiter. A theory of diagnosis from first
   A. Ruiz-Cortes. Automated analysis of feature models             principles. Artificial Intelligence, 32(1):57–95, 1987.
   20 years later: A literature review. Information Systems,
   35:615–636, 2010.                                             [Segura et al., 2010] S. Segura, R. Hierons, D. Benavides,
                                                                    and A. Ruiz-Cortes. Automated test data generation on
[Benavides et al., 2013] D. Benavides,        A. Felfernig,         the analyses of feature models: A metamorphic testing ap-
   J. Galindo, and F. Reinfrank. Automated Analysis in              proach. In 3rd Intl. Conference on Software Testing, Veri-
   Feature Modelling and Product Configuration. In 13th             fication and Validation (ICST), pages 35–44, 2010.
   International Conference on Software Reuse (ICSR 2013),
   number 7925 in LNCS, pages 160–175, Pisa, Italy, 2013.        [Trinidad et al., 2008] P. Trinidad, D. Benavides, A. Duran,
                                                                    A. Ruiz-Cortez, and M. Toro. Automated error analysis
[Chandola et al., 2009] V. Chandola, A. Banerjee, and               for the agilization of feature modeling. Journal of Systems
   V. Kumar. Anomaly detection: A survey. ACM Computing             and Software, 81:883–896, 2008.
   Surveys, 41:15:1–15:58, July 2009.
                                                                 [Tsang, 1993] E. Tsang. Foundations of Constraint Satisfac-
[Czarnecki et al., 2005] K. Czarnecki, S.Helsen, and
                                                                    tion. Academic Press, London, 1993.
   U.Eisenecker. Formalizing Cardinality-based Feature
   Models and their Specialization.         SoftwareProcess:     [von der Massen and Lichter, 2004] T. von der Massen and
   Improvement and Practice, 10(1):7–29, 2005.                      H. Lichter. Deficiencies in Feature Models. In T. Mannisto
                                                                    and J. Bosch, editors, Workshop on Software Variability
[Felfernig et al., 2000] A. Felfernig, G. E. Friedrich, and
                                                                    Management for Product Derivation, 2004.
   D. Jannach. UML as Domain Specific Language for the
   Construction of Knowledge-based Configuration Systems.        [Wang et al., 2010] B. Wang, Y. Xiong, Z. Hu, H. Zhao,
   International Journal of Software Engineering and Knowl-         W. Zhang, and H. Mei. A dynamic-priority based approach
   edge Engineering, 10(4):449–469, 2000.                           to fixing inconsistent feature models. In D. Petriu, N. Rou-
                                                                    quette, and O. Haugen, editors, Model Driven Engineer-
[Felfernig et al., 2004] A. Felfernig, G. Friedrich, D. Jan-
                                                                    ing Languages and Systems, volume 6394 of LNCS, pages
   nach, and M. Stumptner. Consistency-based diagnosis              181–195. Springer Berlin, 2010.
   of configuration knowledge bases. Artificial Intelligence,
   152(2):213 – 234, 2004.                                       [White et al., 2010] J. White, D. Benavides, D. Schmidt,
                                                                    P. Trinidad, B. Dougherty, and A. Ruiz-Cortes. Automated
[Felfernig et al., 2011] A. Felfernig, C. Zehentner, and
                                                                    diagnosis of feature model configurations. Journal of Sys-
   P. Blazek. Corediag: Eliminating redundancy in constraint
                                                                    tems and Software, 83(7):1094–1107, 2010.
   sets. In 22nd International Workshop on Principles of Di-
   agnosis, pages 219–224, Murnau, Germany, 2011.
[Felfernig et al., 2012] A. Felfernig, M. Schubert, and C. Ze-
   hentner. An efficient diagnosis algorithm for inconsistent
   constraint sets. AI for Engineering Design, Analysis, and
   Manufacturing (AIEDAM), 26(1):53–62, 2012.
[Felfernig, 2007] A. Felfernig. Standardized configuration
   knowledge representations as technological foundation for


Michel Aldanondo and Andreas Falkner, Editors
Proceedings of the 15th International Configuration Workshop
August 29-30, 2013, Vienna, Austria