Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank 117 Towards Anomaly Explanation in Feature Models ∗ A. Felfernig1 , D. Benavides2 , J. Galindo2 , and F. Reinfrank1 1 Graz University of Technology, Graz, Austria {alexander.felfernig, florian.reinfrank}@ist.tugraz.at 2 University of Seville, Spain {benavides,jagalindo}@us.es Abstract concepts and algorithms can be applied to advanced feature model representations as well. Feature models are a wide-spread approach to vari- ability and commonality management in software Developing and maintaining large and potentially com- product lines. Due to the increasing size and com- plex feature models is an error-prone activity which can plexity of feature models, anomalies in terms of be explained by the cognitive overload of software engi- inconsistencies and redundancies can occur which neers and domain experts [Trinidad et al., 2008; Benavides lead to increased efforts related to feature model et al., 2013]. In order to tackle this challenge, feature development and maintenance. In this paper we in- model development and maintenance processes have to be troduce knowledge representations which serve as supported by intelligent techniques and tools which help to a basis for the explanation of anomalies in feature identify anomalies which become manifest in different types models. On the basis of these representations we of inconsistencies and redundancies [Batory et al., 2006; show how explanation algorithms can be applied. Benavides et al., 2010]. An approach to the identification The results of a performance analysis show the ap- of dead features (features not part of any configuration) is plicability of these algorithms for anomaly detec- presented by Trinidad et al. [Trinidad et al., 2008]. The au- tion in feature models. We conclude the paper with thors also introduce concepts to solve the problem of void a discussion of future research issues. feature models (no configuration exists that fulfills all the constraints in the feature model). For the identification of faulty relationships in the feature model (in these scenarios) 1 Introduction the authors define a corresponding diagnosis task which is Similar to component-oriented configuration models [Felfer- based on the concepts introduced by [Reiter, 1987]. As an nig et al., 2000; Felfernig, 2007], Feature Models (FM) alternative to the approach of [Trinidad et al., 2008], White [Kang et al., 1990] are used to express variability properties et al. [White et al., 2010] show how to transform feature of highly-variant items [Mendonca and Cowan, 2010]. Ap- models into a corresponding representation of a constraint plications based on feature models help users to decide about satisfaction problem (CSP) [Tsang, 1993]. On the basis of relevant features and to learn about existing dependencies be- this representation, diagnoses are directly determined by the tween features. Feature models can be distinguished with constraint solver without the support of an additional diag- regard to the expressiveness of constraints defining the re- nostic engine. An overview of analysis operations (for the lationships between the different features contained in a fea- identification of different inconsistencies and redundancies) ture model [Benavides et al., 2010]. So-called basic feature for feature models is provided in [Benavides et al., 2010; models [Kang et al., 1990] will be used as a basis for the dis- von der Massen and Lichter, 2004]. cussions in this paper. Such models allow the definition of If we are interested in minimal explanations (diagnoses) for basic relationships between features, for example, a feature feature model anomalies, the performance of the underlying f1 requires the inclusion of a feature f2 . Cardinality-based algorithms becomes a challenge. An example explanation in feature models [Czarnecki et al., 2005] extend basic ones by this context would be the minimal set of constraints which also allowing cardinalities with an upper bound > 1. Finally, have to be adapted or deleted from an inconsistent feature extended feature models [Batory, 2005] allow the inclusion model (the determination of a configuration is not possible) of additional information about features in terms of feature such that the remaining constraints allow the calculation of attributes. For presentation purposes we decided to use ba- at least one configuration. Reiter [Reiter, 1987] introduced sic feature models (see Section 2). However, the presented a hitting set based approach to the determination of minimal ∗ This work was supported, in part, by the Austrian Research Pro- explanations (diagnoses) – these diagnoses are also of mini- motion Agency under the project ICONE (827587), the European mal cardinality since diagnosis search is performed on the ba- Commission (FEDER), the Spanish Government under project SETI sis of breadth-first search. The idea of applying the concepts (TIN2009-07366), and by the Andalusian Government under project of model-based diagnosis to inconsistent constraint sets has THEOS (TIC-5906). first been introduced by Bakker et al. [Bakker et al., 1993]. Michel Aldanondo and Andreas Falkner, Editors Proceedings of the 15th International Configuration Workshop August 29-30, 2013, Vienna, Austria 118 Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank Felfernig et al. [Felfernig et al., 2004] continued this work Semantics of Feature Models. Our representation of FMs by introducing an approach to the automated testing and de- is based on the notation introduced in [Benavides et al., bugging of configuration knowledge bases where test cases 2010]. Relationships (constraints) in FMs are represented are used to induce conflicts in the knowledge base. These in terms of six different types of constraints [Batory, 2005; conflicts are then resolved on the basis of the hitting set algo- Benavides et al., 2010; Segura et al., 2010]: mandatory, op- rithm [Reiter, 1987]. First experiences from the application tional, alternative, or, requires, and excludes. FMs are rep- of these testing and debugging approaches in industrial sce- resenting configurable products which can be formalized in narios are reported by Fleischanderl [Fleischanderl, 2002]. the form of a constraint satisfaction problem (CSP) [Tsang, Junker [Junker, 2004] introduced the QuickXPlain (QX) al- 1993] where each variable fi has the assigned domain di = gorithm. QX is an efficient divide-and-conquer based ap- {true, f alse}. We define a feature model configuration task proach to the determination of minimal conflicts which can as follows (see Definition 1). then be exploited for the determination of diagnoses. In this Definition 1 (FM Configuration Task). A feature paper we show how diagnosis and redundancy detection algo- model (FM) configuration task is defined by the triple rithms can be applied to support feature model analysis oper- (F,D,C) where F = {f1 , f2 , ..., fn } is a set of features ations [Benavides et al., 2010]. In this context we show how fi , D = {dom(f1 ), dom(f2 ), ..., dom(fn )} (dom(fi ) = to apply the diagnosis algorithm FAST D IAG [Felfernig et al., {true, f alse}) is the set of corresponding feature domains, 2012] (an algorithm with no need of determining conflict sets) and C = CR ∪ CF is a set of constraints restricting the and introduce the FMC ORE algorithm which allows the de- possible configurations which can be derived from the fea- tection of redundancies in feature models. ture model. In this context, CR = {c1 , c2 , ..., ck } repre- The work presented here is in the line of research dedicated sents a set of requirements (of a specific user) and CF = to the development of intelligent quality assurance mecha- {ck+1 , ck+2 , ..., cm } a set of feature model constraints. nisms for configuration knowledge bases [Felfernig et al., On the basis of this definition of an feature model configu- 2004]. The major contributions of this paper are the follow- ration task, we now introduce the definition of a configuration ing. First, we advance the state of the art in feature model for a feature model (FM) configuration task (Definition 2). anomaly detection by formalizing the anomaly types dis- Definition 2 (FM Configuration). A feature model (FM) cussed in the feature modeling community on the basis of the configuration for a given FM configuration task is a complete concepts of inconsistency and redundancy. Second, we intro- assignment of the variables fi ∈ F . Such a configuration is duce the FMC ORE algorithm for the detection of redundant consistent iff the constraints ci ∈ C are not contradicting with constraints in feature models. Furthermore, we show how to the variable assignment. Furthermore, an FM configuration is apply the FAST D IAG algorithm [Felfernig et al., 2012] for ex- valid, if it is consistent and complete. plaining different types of inconsistencies in feature models. Feature Model Constraint Types. Six basic types of All anomaly types will be discussed in detail in Section 3 in constraints can be included in CF [Benavides et al., 2010]. combination with corresponding explanation approaches. These constraint types are the following – their representation The remainder of this paper is organized as follows. In in a graphical feature model is shown in the example of Fig- Section 2 we introduce a simple feature model (operating sys- ure 1. In the following we introduce the semantics of these six tem configuration) which will be used as working example types of constraints – this semantics is based on the definition throughout the paper. Furthermore, we introduce the def- given in [Benavides et al., 2010]. initions of a feature model configuration task and a corre- sponding feature model configuration. In Section 3 we in- troduce different relevant forms of anomalies in feature mod- els together with their formal definitions. The corresponding anomaly detection algorithms FAST D IAG and FMC ORE are explained in Section 4. The performance of these algorithms is analyzed in Section 5 on the basis of selected feature mod- els from the S.P.L.O.T.1 repository. A discussion of further research issues and a conclusion is provided in Section 6. 2 Feature models A feature model (FM) defines a set of possible products of a domain in terms of features and the relationships be- Figure 1: Feature model (FM) with faulty model elements. tween them [Wang et al., 2010]. Features are arranged hi- erarchically (tree structure with one so-called root feature fr (fr = true)) [Benavides et al., 2010] where the nodes are the Mandatory: a feature f2 ∈ F is mandatory if it is in a features and the edges are relationships (constraints) [Segura mandatory relationship with another feature f1 ∈ F . This et al., 2010]. For a more detailed overview of different feature means, if f1 is part of the configuration, f2 must be part of model representations we refer the reader to [Batory, 2005; the configuration as well (and vice-versa). The formalization Benavides et al., 2010]. of this constraint type (relationship) is realized on the basis of an equivalence: f1 ↔ f2 . In Figure 1 the feature gui is a 1 See www.splot-research.org. mandatory feature connected to the feature ubuntu. Michel Aldanondo and Andreas Falkner, Editors Proceedings of the 15th International Configuration Workshop August 29-30, 2013, Vienna, Austria Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank 119 Optional: a feature f2 ∈ F can (but must not) be included 3 Anomaly Patterns in Feature Models in the configuration in the case that feature f1 ∈ F is part of Anomalies can be defined as patterns in data that do not con- the configuration. This type of constraint can be formalized form to a well defined notion of normal behavior [Chandola on the basis of an implication: f2 → f1 . In Figure 1 the et al., 2009]. Trinidad et al. [Trinidad et al., 2008] are us- feature games is an optional feature connected to ubuntu. ing the term error for incorrect definitions of relationships, Alternative: only one feature fb ∈ F = {f1 , f2 , ..., fk } i.e., the set of products described by a feature model does can be selected if feature fa is selected. The property can be not match the SPL (software product line) it describes. We formalized as follows: f1 = true ↔ (f2 = f alse∧...∧fk = interpret anomalies in the sense of [Trinidad et al., 2008]: f alse ∧ fa = true) ∧ ... ∧ fk = true ↔ (f1 = f alse ∧ ... ∧ undesirable FM properties in terms of different facets of con- fk − 1 = f alse ∧ fa = true). In Figure 1 an example of a tradictory and redundant information contained in the FM. feature fa is games, the subfeatures are gnuchess and glchess. Handling Inconsistencies. Inconsistent feature models in- Or: at least one feature fb ∈ F = {f1 , f2 , ..., fk } must clude contradictory constraints ci ∈ C that can not be sat- be part of the configuration if feature fa is part of the con- isfied at the same time, leading to no valid instances deriv- figuration. This property can be formally defined with fa ↔ able from FMs [Wang et al., 2010]. For a given FM con- {f1 , f2 , ..., fk }. In Figure 1 an example of a feature fa is gui, figuration task this means that no solution can be identi- the subfeatures are kde and gnome. fied. In our working example (the FM of Figure 1) no so- Requires: a feature f2 must be included in a configuration lution can be identified due to an inconsistent constraint set if feature f1 is included. This requires relationship can be C={c1 , c2 , ..., c10 }.2 Inconsistent sets of constraints can be defined with f1 → f2 . In Figure 1 an example of a requires defined on the basis of the concept of conflict sets [Junker, relationship is games → gui. 2004] (see Definition 3). Excludes: it is not allowed to combine two features f1 and Definition 3 (Conflict Set) A conflict set CS ⊆ C is a set f2 in the same configuration, i.e., feature f1 excludes feature of constraints s.t. CS is inconsistent. CS is minimal iff there f2 and vice versa: ¬(f1 ∧ f2 ). In Figure 1 an example of does not exist a conflict set CS 0 with (CS 0 ⊂ CS). an excludes relationship is ¬(bash ∧ gui). Note that this is a Based on Definition 3, we can identify minimal sets of con- possible faulty constraint to be detected by diagnosis. straints CSi ⊆ C, such that CSi is inconsistent. As long as Requires and excludes constraints are also denoted as there are conflicts in a given constraint set of a feature model, cross-tree constraints. Finally, the set CR (customer require- no solutions for the underlying FM configuration task can be ments) is an additional set of constraints to be taken into ac- identified. Our example feature model (see Figure 1) includes count when determining configurations (solutions). The set two minimal conflict sets which are CS1 = {c1 , c2 , c6 } and CR specifies a set of key features which have to be included CS2 = {c2 , c3 , c7 }. Each of these sets is a minimal set such in the FM configuration for a specific user (customer). that (1) no solution (configuration) can be identified and (2) Example Feature Model. A simple example feature none of the subsets of CSi is inconsistent. As a consequence model (from the domain of operating systems) is depicted in (due to their minimality property) conflicts (represented by Figure 1. This model specifies a set of features relevant for conflict sets) can be resolved by simply deleting one con- configuring an ubuntu operating system installation together straint from the set. with constraints between the features. Note that faulty ele- The resolution of all conflicts (represented by conflict sets) ments (constraints) are contained in this model – our goal in can be based on the determination of the corresponding hit- the remainder of this paper will be to introduce algorithms ting sets (also denoted as diagnoses [Reiter, 1987]). The prob- which help to identify and explain such faulty constraints. lem of identifying minimal sets of constraints which have to The CSP-based representation [Tsang, 1993] of the feature be adapted or deleted from the feature model such that the re- model shown in Figure 1 is the following - a representation maining constraints become consistent can be represented as as FM configuration task = (F,D,C=CR ∪ CF ). an FM diagnosis task (see Definition 4). • F = {ubuntu, texteditor, bash, gui, games, gedit, Definition 4 (FM Diagnosis Task) A feature model di- vi, kde, gnome, gnuchess, glchess} agnosis task (FM diagnosis task) is a tuple (S, AC) where • D = {dom(ubuntu) = {true, f alse}, dom(text− S ⊆ AC are constraints of the feature model. The task is to editor) = {true, f alse}, dom(bash) = {true, identify a minimal set of constraints which have to be deleted f alse}, dom(gui) = {true, f alse}, dom(games) from S s.t. consistency can be restored in the feature model. = {true, f alse}, dom(gedit) = {true, f alse}, In this context, S helps us to focus our diagnostic activities, dom(vi) = {true, f alse}, dom(kde) = {true, f alse}, i.e., to focus on those model parts where we suspect faulty dom(gnome) = {true, f alse}, dom(gnuchess) = constraints. If no such suspects exist, S can be set to AC. An {true, f alse}, dom(glchess) = {true, f alse} FM diagnosis, i.e., a solution to an FM diagnosis task can be • CR = {c0 : ubuntu = true} defined as follows (see Definition 5). Definition 5 (FM Diagnosis) A feature model diagnosis • CF = { c1 : ubuntu ↔ texteditor, c2 : ubuntu ↔ (FM diagnosis) is a set of constraints ∆ ⊆ S with AC − ∆ is bash, c3 : ubuntu ↔ gui, c4 : games → ubuntu, c5 : consistent. ∆ is minimal iff there does not exist a set ∆’ with texteditor ↔ gedit ∨ vi, c6 : ¬texteditor ∨ ¬bash, ∆’ ⊂ ∆ and ∆’ has the diagnosis property as well. c7 : ¬bash ∨ ¬gui, c8 : gui ↔ kde ∨ gnome, c9 : games → gui, c10 : (gnuchess ↔ ¬glchess ∧ games) 2 Note that we interpret the constraint c0 : ubuntu = true as ∧ (glchess ↔ ¬gnuchess ∧ games)} element of the (customer) requirements CR. Michel Aldanondo and Andreas Falkner, Editors Proceedings of the 15th International Configuration Workshop August 29-30, 2013, Vienna, Austria 120 Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank The diagnoses for our example FM diagnosis task are neer and {c6 , c7 } have been deleted from the feature model. ∆1 = {c1 , c3 }, ∆2 = {c1 , c7 }, ∆3 = {c2 }, ∆4 = {c3 , c6 }, Dead feature fi . If a feature fi is not included in any of the ∆5 = {c6 , c7 }. These represent five ways to delete (adapt) possible configurations (i.e., inconsistent(CF ∪ fi = true)), constraints from (in) the feature model such that at least one we are interested in solutions to the FM diagnosis task (S = configuration can be determined. The calculation of all ∆i is CF, AC = CF ∪ {c0 } ∪ {fi = true}). This way we are sketched in Figure 2. The underlying assumption in this ex- able to figure out the minimal sets of constraints that are re- ample is that – conform to the algorithm introduced by Reiter sponsible for the non-acceptance of fi . In our working ex- [Reiter, 1987] – the search tree (hitting set directed acyclic ample, there is no such dead feature (assuming that the con- graph – HSDAG) is expanded in breadth-first manner. straints in ∆5 have been deleted from the feature model). If we would substitute the constraint c9 : games → gui with c9 : ¬gui ∨ ¬games, the feature games would be a dead feature. If we then want to make games a feature which is included in at least one configuration, the diagnoses for (S = CF, AC = CF ∪ {c0 } ∪ {games = true}) are ∆1 = {c3 } and ∆2 = {c9 }. Conditionally dead feature fi . Such a feature fi is not included in all of the possible configurations, i.e., consis- tent (CF ∪ {c0 } ∪ {fi =false}) and consistent (CF ∪ {c0 } ∪ {fi =true}). If we want to have fi in each configuration, we have to add {fi = true} to the set CF. In our working exam- ple, games is a conditionally dead feature since there are also solutions with no inclusion of this feature. In order to make Figure 2: Determination of diagnoses for a given inconsistent games part of every possible feature model configuration, we feature model (FM). The following discussion of anomaly have to make this clear in the feature model. One way to types assumes that ∆5 = {c6 , c7 } was chosen. achieve this would be to convert constraint c4 into a manda- tory constraint – this would have the same effect as adding One possible approach to determine the complete set of games = true as an additional constraint to CF. diagnoses is based on the hitting set directed acyclic graph Full mandatory feature fi . A feature fi is fully mandatory (HSDAG) algorithm introduced by Reiter [Reiter, 1987]. The if it is included in every possible solution (configuration), i.e., basic idea of this algorithm is to determine a conflict (in the inconsistent(CF ∪ {c0 } ∪ {fi = f alse}). If we want to adapt example CS1 : {c1 , c2 , c6 }) and then to resolve this conflict. the feature model in such a way that it also allows fi to be not If this conflict is resolved (e.g., by deleting the constraint c1 ) included, we can determine the corresponding (minimal) sets the algorithm checks whether further conflicts exist in the fea- of responsible constraints by solving the FM diagnosis task ture model. In our example this is the case and the next deter- (S=CF, AC= CF ∪ {c0 } ∪ {fi = f alse}). In our working mined conflict set is CS2 : {c2 , c3 , c7 }. If we delete, for ex- example, the feature gui is a full mandatory feature since it ample c3 from CS2 , we receive the diagnosis ∆1 = {c1 , c3 }. is part of every possible configuration. If we want to allow In a similar fashion all other diagnoses can be determined. configurations where gui is not included, the only diagnosis Note that {c1 , c2 } is not a (minimal) diagnosis since {c2 } is for (S=CF, AC= CF ∪ {c0 } ∪ {gui = f alse}) is ∆1 = {c3 }. already a diagnosis. The HSDAG algorithm is a traditional False optional feature fi . A false optional feature fi is in- way of determining diagnoses – more efficient approaches cluded in all configurations (e.g., products of a product line) will be presented in Section 4. although it has not been modeled as mandatory. If we replace Feature Model Anomaly Patterns. We can now discuss the constraint c9 : games → gui with c9 : gui → games, in more detail different basic types of feature model anoma- the feature games becomes a false optional feature since it is lies. Ways to explain these anomalies and related algorithms included in every possible configuration. An alternative in- will then be discussed in detail in Section 4. An overview terpretation of a false optional feature focuses on the optional of these anomalies and related property checks is shown in relationship between a feature fpar and fopt . If the consis- Table 1. The following types of anomalies are taken from tency check of (CF ∪ {c0 } ∪ {fpar = true ∧ fopt = f alse}) Benavides et al. [Benavides et al., 2010]. returns false (and fpar = true), the feature fopt is not an op- Void feature model. If model constraints in CF are in- tion. In our example (under the assumption that c9 is adapted consistent (inconsistent(CF ∪ c0 )), we are interested in so- as mentioned), the diagnosis for (S = CF, AC = CF ∪ {c0 } ∪ lutions to the FM diagnosis task (S=CF, AC = CF ∪ c0 ). {ubuntu = true ∧ games = f alse}) is ∆1 = {c3 }. In this case we want to figure out which are the minimal Redundant constraint ci . In our working example the con- sets of constraints that are responsible for the given incon- straint c9 : games → gui is redundant since gui is a full sistency in the feature model. We do not include c0 (e.g., mandatory feature. If we check the consistency of {CF - {c9 } c0 : ubuntu = true) in the set S since we are not interested ∪ ¬CF} we see that c9 is redundant since the expression is in- in changing this constraint. The feature model of our example consistent. In other words, CF - {c9 } |= c9 , i.e., c9 logically (see Figure 1) is an example of a void feature model. follows from CF - {c9 } – therefore it is redundant. The sec- Note that for the following discussions we assume that ond redundant constraint in our working example is c4 since ∆5 = {c6 , c7 } (see Figure 2) has been chosen by the engi- the feature ubuntu is a full mandatory feature as well. Con- Michel Aldanondo and Andreas Falkner, Editors Proceedings of the 15th International Configuration Workshop August 29-30, 2013, Vienna, Austria Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank 121 Analysis operation Property Check Explanation (Diagnosis Task) Void feature model inconsistent(CF ∪ {c0 })? FAST D IAG(CF,CF ∪ {c0 }) Dead (fi ) inconsistent(CF ∪ {c0 } ∪ {fi =true})? FAST D IAG(CF,CF ∪ {c0 } ∪ {fi = true}) Conditionally consistent(CF ∪ {c0 } ∪ {fi =false}) and CF ← CF ∪ {fi =true} dead (fi ) consistent(CF ∪ {c0 } ∪ {fi =true})? Full mandatory (fi ) inconsistent(CF ∪ {c0 } ∪ {fi =false})? FAST D IAG(CF,CF ∪ {c0 } ∪ {fi = f alse}) False optional (fopt ) inconsistent(CF ∪ {c0 } ∪ FAST D IAG(CF, CF ∪ {c0 } ∪ {fpar =true ∧ fopt =false})? {fpar = true ∧ fopt = f alse}) Redundant (ci ) inconsistent((CF ∪ {c0 } - {ci }) ∪ ¬(CF ∪ c0 ))? / FMC ORE(CF ∪ {c0 }) ci ∈ Table 1: Feature model analysis operations, property checks, and related explanations. For example, figuring out whether a feature model is void (no solution can be found) can be determined on the basis of a consistency check inconsistent (CF ∪ {c0 }). A related explanation can be determined by solving the FM diagnosis task (CF, CF ∪ {c0 }). The related diagnosis (FAST D IAG) and redundancy detection algorithm (FMC ORE) are discussed in Section 4. sequently, the constraints {c4 , c9 } can be deleted from the Algorithm 2 D IAG(D, S = {s1 , ..., sr }, AC): ∆ feature model without changing the underlying semantics.3 if D 6= ∅ and consistent(AC) then In the following section we focus on the presentation of return ∅; two algorithms which help to determine explanations for the end if different feature model anomaly patterns. if singleton(S) then return S; 4 Explaining Anomalies end if The two basic algorithms for determining diagnoses and k ← d 2r e; redundancies are FAST D IAG and FMC ORE. FAST D IAG S1 ← {s1 , ..., sk }; S2 ← {sk+1 , ..., sr }; [Felfernig et al., 2012] is a divide-and-conquer algorithm ∆1 ← DIAG(S2 , S1 , AC − S2 ); that supports the efficient determination of minimal diagnoses ∆2 ← DIAG(∆1 , S2 , AC − ∆1 ); without the need of having conflict sets available. FMC ORE return(∆1 ∪ ∆2 ); is an algorithm which focuses on the determination of mini- mal cores, i.e., redundancy-free subsets of a constraint set. Determination of Diagnoses. In FAST D IAG (see Algo- rithm 1), the set S represents the set of constraints where a consistent, the diagnosis is searched in the other part and the diagnosis should be searched, The set AC contains all con- first part can be omitted (no constraints part of the diagnosis straints of the feature model. For example, if we want to will be found there). If a singleton constraint of S triggers diagnose a void feature model (CF ∪ {c0 } is inconsistent – an inconsistency, this constraint is considered a part of the see Table 1), we would activate the algorithm with FAST- diagnosis. FAST D IAG determines exactly one diagnosis at D IAG(CF,CF ∪ {c0 }), i.e., S = CF and AC = CF ∪ {c0 }. a time. If we want to determine more than one or even the We do not include c0 in the set of diagnosable constraints complete set of diagnoses, we need to combine FAST D IAG since c0 (the root constraint) is assumed to be correct (e.g., with a corresponding algorithm that supports the construc- c0 : ubuntu = true). First, the algorithm (see Algorithm tion of HSDAGs. The discussion of this approach is outside 1) checks whether the considered constraint set can be diag- the scope of this paper. We want to refer the reader to the nosed (if the set S is empty, no diagnosis will be found) and work of Felfernig et al. [Felfernig et al., 2012]. Compared whether the constraints in AC-S are inconsistent (in this case to traditional diagnosis approaches, FAST D IAG needs in the no diagnosis can be determined). worst case 2d × log2 ( nd ) + 2d consistency checks where d is the number of constraints in the minimal diagnosis and n is Algorithm 1 FAST D IAG(S, AC): ∆ the number of constraints in S [Felfernig et al., 2012]. The corresponding best case complexity in terms of the number of if isEmpty(S) or inconsistent(AC − S) then consistency checks is log2 ( nd +2d). A similar worst case (and return ∅; best case) complexity in traditional diagnosis approaches can else be expected for each determination of a conflict set (see, e.g., return DIAG(∅, S, AC) Figure 2) [Felfernig et al., 2012]. end if Determination of Redundancies. A constraint fi of a fea- ture model (represented by the constraint set CF) is redundant The major idea of FAST D IAG (and its subfunction D IAG if its deletion from the model does not change the set of pos- – see Algorithm 2) is to divide a set S of inconsistent con- sible solutions. More formally, CF - {fi } |= fi which means straints into two subsets S1 and S2 . If the first part becomes that fi logically follows from CF - {fi } and therefore is re- 3 Note that redundancies can also be intended to achieve goals dundant. An algorithm for redundancy detection should def- such as improving understandability or increasing efficiency – a dis- initely not check redundancy properties on the basis of con- cussion of related issues is outside the scope of this paper. crete configurations since such an approach becomes com- Michel Aldanondo and Andreas Falkner, Editors Proceedings of the 15th International Configuration Workshop August 29-30, 2013, Vienna, Austria 122 Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank Feature Model: Car Selection #Variables: 72 #Constraints:96 # Diagnoses Inconsistency Rate 2% (8 diagnoses) 5% (64 diagnoses) 7% (182 diagnoses) FAST D IAG HSDAG FAST D IAG HSDAG FAST D IAG HSDAG 1 452 874 561 1888 858 5366 2 749 890 920 1891 1638 5382 3 1045 921 1294 2138 2059 5506 4 1373 936 1653 2143 2324 5522 5 1529 968 1872 2262 2464 5544 10 – – 2511 2418 2932 5709 20 – – 2964 2450 3806 6162 all 1632 1027 4383 3339 11856 8860 Table 2: Evaluation of FAST D IAG and HSDAG with the Car Selection feature model from S.P.L.O.T. Feature Model: SmartHome V. 2.2 #Variables: 61 #Constraints:63 # Diagnoses Inconsistency Rate 2% (8 diagnoses) 5% (12 diagnoses) 7% (77 diagnoses) FAST D IAG HSDAG FAST D IAG HSDAG FAST D IAG HSDAG 1 297 920 312 952 577 2683 2 437 967 452 968 951 2684 3 609 983 592 983 1341 2686 4 734 998 733 1139 1762 2699 5 843 1014 842 1155 2090 2671 10 – – 967 1529 2792 2715 20 – – – – 3369 2746 all 1155 1061 1606 1545 6224 3151 Table 3: Evaluation of FAST D IAG and HSDAG with the SmartHome V 2.2 feature model from S.P.L.O.T. pletely inefficient even in the case of simple feature models. Algorithm 3 FMC ORE(S): ∆ The basic idea of the FMC ORE algorithm is to iterate over {S: the (redundant constraint set)} the given set of constraints (S) and for each constraint ci ∈ S {S: the complement of S} to check whether the deletion of ci changes the semantics of {∆: set of redundant constraints} S. The assumption is that if ci is non-redundant, its deletion Stemp ← S; from S will change the semantics of S, i.e., S − {ci } ∪ S be- for all ci in Stemp do comes consistent. All these individual redundant constraints if isInconsistent((Stemp − {ci }) ∪ {¬cj }) then are deleted from Stemp (a temporal copy of S). Finally, the Stemp ← Stemp − {ci }; algorithm returns the set Stemp which represents a minimal end if core, i.e., the original set S without redundant constraints. end for Note that – instead of checking the inconsistency of CS − return Stemp ; {ci } ∪ S (see, e.g., [Felfernig et al., 2011]) – FMC ORE sys- tematically reduces the number of constraints to be checked in S. Given a configuration knowledge base S and its com- gorithms on the basis of different feature models provided by plement S, the (in)consistency check of S − {ci } ∪ S can be the S.P.L.O.T. repository. The results of this analysis are pre- reduced to the inconsistency check of S − {ci } ∪ S 0 where sented in the following section. S 0 = {¬ci }. If we assume that S = {c1 ∧c2 ∧..∧cm ∧cm+1 ∧ .. ∧ cn }, S = {¬c1 ∨ ¬c2 ∨ .. ∨ ¬cm ∨ ¬cm+1 ∨ .. ∨ ¬cn }, and 5 Performance Evaluation γ = {cm+1 ∧ .. ∧ cn } then the consistency check of S − γ ∪ S can be reduced to {c1 ∧ c2 ∧ .. ∧ cm } ∪ {¬cm+1 ∨ .. ∨ ¬cn }. In For evaluation purposes we selected different feature models FMC ORE (Algorithm 3) this property is taken into account. offered by the S.P.L.O.T. repository: Car Selection (Table 2), SmartHome V. 2.2. (Table 3), and Xerox (Table 4). In order The number of consistency checks of FMC ORE in the best to evaluate the performance of FAST D IAG, we randomly in- case equals the number of consistency checks in the worst serted additional cross-tree constraints in the feature models case – in both cases the number of consistency checks needed for inducing inconsistencies which could then be exploited is exactly n (the number of constraints in S). for determining minimal diagnoses. For a systematic eval- In order to analyze the performance of FAST D IAG and uation we generated different versions of the (inconsistent) FMC ORE we conducted a performance analysis for both al- feature models which differed in terms of their inconsistency Michel Aldanondo and Andreas Falkner, Editors Proceedings of the 15th International Configuration Workshop August 29-30, 2013, Vienna, Austria Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank 123 Feature Model: Xerox #Variables: 172 #Constraints:205 # Diagnoses Inconsistency Rate 2% (140 diagnoses) 5% (84 diagnoses) 7% (55 diagnoses) FAST D IAG HSDAG FAST D IAG HSDAG FAST D IAG HSDAG 1 1638 3354 1260 2996 1740 3023 2 2013 6646 1710 3167 2050 3203 3 2262 12106 1970 9454 2330 9544 4 2434 12355 2180 9536 2580 9654 5 2637 28111 2341 12044 2790 12165 10 3417 69950 2921 64631 3330 65240 20 4758 75317 3911 90715 5010 91726 all 46785 >100000 17301 >100000 10541 >100000 Table 4: Evaluation of FAST D IAG and HSDAG with the Xerox feature model from S.P.L.O.T. Feature Model #Variables #Constraints Redundancy Rate Runtime (ms) Car Selection 72 96 0.64 5070 SmartHome V. 2.2 61 63 0.29 1907 Xerox 172 205 0.71 3261 Table 5: Evaluation of FMC ORE with selected S.P.L.O.T. feature models. rate (see Formula 1) which was categorized in {2%, 5%, 7%}. can be applied as well. Finally, we also evaluated the perfor- We used a random variable to control the degree of generated mance of the redundancy detection algorithm FMC ORE (see inconsistencies (the number of conflicts) in a feature model. Table 5). Our goal was to figure out for the selected feature As reasoning engine we used the CHOCO constraint solving models to which extent the constraints in the feature models library.4 In order to import feature models to our environment are redundant. We measured redundancy in the terms of the we implemented a parser that generated CHOCO knowledge redundancy rate (see Formula 2). bases from S.P.L.O.T. SXFM based feature models. #redundant constraints in F M Redundancy Rate = #conf licts in F M #constraints in F M Inconsistency Rate = (1) (2) #constraints in F M The outcome of this analysis was that all the investigated The performance tests were executed within a Java appli- feature models showed quite different degrees of redundancy cation running on a 64bit Windows 7 desktop PC using 8GB (see Table 5). However, we consider these as preliminary RAM and an Intel(R) Core(TM) i5-2320 CPU with 3.0GHz. results and further analyses have to be conducted, for exam- Each run of the diagnosis algorithm for a specific setting ple, we are interested in intra-constraint redundancies and the has been repeated 10 times were in each run the ordering of share of redundancy in cross-tree constraints with regard to the constraints was randomized. For each setting we eval- the overall number of constraints in the feature model. uated the runtime (in ms) of both, the standard hitting set Note that the FMC ORE algorithm is especially useful in based approach to the termination of diagnoses [Reiter, 1987] situations where models are developed by one or a few en- (HSDAG) and FAST D IAG. As scenario we choose the diag- gineers. In this case the degree of redundant constraints in nosis of void feature models where we induced different de- the model is low. For scenarios with high redundancy rate, grees of inconsistency (based on the inconsistency rate mea- alternative algorithms have already been developed (see, e.g., sure – see Formula 1). The upper bound for the evaluation [Felfernig et al., 2011]). time was set to 100.000 ms – in the case that this upper limit was exceeded, the search was stopped. 6 Conclusions If one or a few diagnoses are required (which is typical for interactive settings) then FAST D IAG outperforms the stan- In this paper we presented a consistency-based approach to dard HSDAG approach in most of the cases. If all diagnoses explaining anomalies in feature models. We introduced defi- are required, for example, in situations where diagnoses are nitions which are useful for the explanation of anomalies and computed offline, the standard HSDAG approach seems to be discussed the corresponding algorithms which help to deter- the better choice. We want to emphasize that the presented di- mine minimal diagnoses (FAST D IAG) and minimal sets of agnosis algorithms are independent of the underlying reason- non-redundant constraints (FMC ORE). Our future work will ing mechanisms, i.e., beside using a basic constraint-based focus on: (1) The definition of further anomaly patterns in approach for supporting the reasoning tasks (mainly consis- alternative knowledge representations such as advanced fea- tency checking), description logics or SAT-based approaches ture models [Batory, 2005] and UML models [Felfernig et al., 2000]. Due to higher expressiveness, these representa- 4 www.emn.fr/z-info/choco-solver. tions include further anomaly patterns such as multiplicity Michel Aldanondo and Andreas Falkner, Editors Proceedings of the 15th International Configuration Workshop August 29-30, 2013, Vienna, Austria 124 Alexander Felfernig , D. Benavides, J. Galindo, F. Reinfrank bounds which can not represented by configurations, unsat- mass customization. IEEE Transactions on Engineering isfiable preconditions in constraints, and unexplained incom- Management, 54:41–56, 2007. patibilities. (2) The development of mechanisms for the auto- [Fleischanderl, 2002] G. Fleischanderl. Suggestions mated generation of test cases for feature models. (3) Further from the software engineering practice for applying algorithms that enable the determination of diagnoses and re- consistency-based diagnosis to configuration knowledge dundancies on an intra-constraint level. (4) Evaluation of the bases. In 13th Intl. Workshop on Principles of Diagnosis developed algorithms with further benchmarks. (DX-02), pages 33–35, Semmering, Austria, 2002. [Junker, 2004] U. Junker. QuickXPlain: preferred explana- References tions and relaxations for over-constrained problems. In [Bakker et al., 1993] R. Bakker, F. Dikker, F. Tempelman, Proceedings of the 19th National Conference on Artifical and P. Wogmim. Diagnosing and solving over-determined Intelligence, AAAI 2004, pages 167–172. AAAI, 2004. constraint satisfaction problems. In Proceedings of IJCAI- [Kang et al., 1990] K. Kang, S. Cohen, J. Hess, W. No- 93, pages 276–281. Morgan Kaufmann, 1993. vak, and S. Peterson. Feature-oriented Domain Analysis [Batory et al., 2006] D. Batory, D. Benavides, and A. Ruiz- (FODA) – Feasibility Study. TechnicalReport CMU – SEI- Cortes. Automated analysis of feature models: challenges 90-TR-21, 1990. ahead. Comm. of the ACM, 49:45–47, 2006. [Mendonca and Cowan, 2010] M. Mendonca and D. Cowan. [Batory, 2005] D. Batory. Feature Models, Grammars, and Decision-making coordination and efficient reasoning Propositional Formulas. In H. Obbink and K. Pohl, edi- techniques for feature-based configuration. Science of tors, Software Product Lines Conference, volume 3714 of Computer Programming, 75(5):311 – 332, 2010. Coordi- LNCS, pages 7–20. Springer, 2005. nation Models, Languages and Applications(SAC 2008). [Benavides et al., 2010] D. Benavides, S. Segura, and [Reiter, 1987] R. Reiter. A theory of diagnosis from first A. Ruiz-Cortes. Automated analysis of feature models principles. Artificial Intelligence, 32(1):57–95, 1987. 20 years later: A literature review. Information Systems, 35:615–636, 2010. [Segura et al., 2010] S. Segura, R. Hierons, D. Benavides, and A. Ruiz-Cortes. Automated test data generation on [Benavides et al., 2013] D. Benavides, A. Felfernig, the analyses of feature models: A metamorphic testing ap- J. Galindo, and F. Reinfrank. Automated Analysis in proach. In 3rd Intl. Conference on Software Testing, Veri- Feature Modelling and Product Configuration. In 13th fication and Validation (ICST), pages 35–44, 2010. International Conference on Software Reuse (ICSR 2013), number 7925 in LNCS, pages 160–175, Pisa, Italy, 2013. [Trinidad et al., 2008] P. Trinidad, D. Benavides, A. Duran, A. Ruiz-Cortez, and M. Toro. Automated error analysis [Chandola et al., 2009] V. Chandola, A. Banerjee, and for the agilization of feature modeling. Journal of Systems V. Kumar. Anomaly detection: A survey. ACM Computing and Software, 81:883–896, 2008. Surveys, 41:15:1–15:58, July 2009. [Tsang, 1993] E. Tsang. Foundations of Constraint Satisfac- [Czarnecki et al., 2005] K. Czarnecki, S.Helsen, and tion. Academic Press, London, 1993. U.Eisenecker. Formalizing Cardinality-based Feature Models and their Specialization. SoftwareProcess: [von der Massen and Lichter, 2004] T. von der Massen and Improvement and Practice, 10(1):7–29, 2005. H. Lichter. Deficiencies in Feature Models. In T. Mannisto and J. Bosch, editors, Workshop on Software Variability [Felfernig et al., 2000] A. Felfernig, G. E. Friedrich, and Management for Product Derivation, 2004. D. Jannach. UML as Domain Specific Language for the Construction of Knowledge-based Configuration Systems. [Wang et al., 2010] B. Wang, Y. Xiong, Z. Hu, H. Zhao, International Journal of Software Engineering and Knowl- W. Zhang, and H. Mei. A dynamic-priority based approach edge Engineering, 10(4):449–469, 2000. to fixing inconsistent feature models. In D. Petriu, N. Rou- quette, and O. Haugen, editors, Model Driven Engineer- [Felfernig et al., 2004] A. Felfernig, G. Friedrich, D. Jan- ing Languages and Systems, volume 6394 of LNCS, pages nach, and M. Stumptner. Consistency-based diagnosis 181–195. Springer Berlin, 2010. of configuration knowledge bases. Artificial Intelligence, 152(2):213 – 234, 2004. [White et al., 2010] J. White, D. Benavides, D. Schmidt, P. Trinidad, B. Dougherty, and A. Ruiz-Cortes. Automated [Felfernig et al., 2011] A. Felfernig, C. Zehentner, and diagnosis of feature model configurations. Journal of Sys- P. Blazek. Corediag: Eliminating redundancy in constraint tems and Software, 83(7):1094–1107, 2010. sets. In 22nd International Workshop on Principles of Di- agnosis, pages 219–224, Murnau, Germany, 2011. [Felfernig et al., 2012] A. Felfernig, M. Schubert, and C. Ze- hentner. An efficient diagnosis algorithm for inconsistent constraint sets. AI for Engineering Design, Analysis, and Manufacturing (AIEDAM), 26(1):53–62, 2012. [Felfernig, 2007] A. Felfernig. Standardized configuration knowledge representations as technological foundation for Michel Aldanondo and Andreas Falkner, Editors Proceedings of the 15th International Configuration Workshop August 29-30, 2013, Vienna, Austria