=Paper=
{{Paper
|id=None
|storemode=property
|title=Towards Hybrid Techniques for Efficient Declarative Configuration
|pdfUrl=https://ceur-ws.org/Vol-958/paper5.pdf
|volume=Vol-958
|dblpUrl=https://dblp.org/rec/conf/confws/Feinerer12
}}
==Towards Hybrid Techniques for Efficient Declarative Configuration==
Towards Hybrid Techniques for Efficient Declarative Configuration Ingo Feinerer1 Abstract. During the last decades configuration has been ex- other declarative formalisms (e.g. logics or answer-set programming) tensively employed in a wide range of application domains, im- as well. The following framework was implemented and tested with plemented by a multitude of techniques like logics, procedural, S WI-Prolog but should work equally well in any other dialect with object-oriented, or resourced-driven approaches. Especially declara- minor modifications. tive methods provide the foundation for precise and well-understood semantics for reasoning tasks and allow for a succinct representa- C M1 ..M2 N1 ..N2 D a1 tion of the underlying knowledge base. However, a drawback in us- ing such powerful declarative techniques lies in their computational Figure 1: Specification with two classes, an association with complexity. In this paper we present a simple declarative framework multiplicities, and the O CL constraint context C inv: for configuration in Prolog in order to show the advantages of logic- C.allInstances()->size() >= L. based techniques but also to identify some challenges for such for- malisms. We argue for hybrid systems which combine and utilize ef- ficient techniques from different configuration methodologies under Figure 1 depicts a minimal U ML class diagrams as typically a unified declarative interface. used for configuration purposes. It shows a specification with two classes C and D and one association a1 relating them. The multi- plicities restrict the number of valid links; each object of class C 1 INTRODUCTION must be connected with at least N1 and with at most N2 objects In the history of configuration research multiple high-level ap- of class D. Analogously, each object of class D must have links to proaches for the design and solution of configuration tasks have been M1 ..M2 objects of class C. The small arrow at the end of the associ- presented. Declarative systems have had strong proponents due to ation shows the direction (and not a hierarchy). The constraint in the their expressive semantics based on well investigated and understood Object Constraint Language [9] (O CL) for class C states that C must logics. Unfortunately the expressive power limits the applicability be instantiated with at least L objects to obtain a valid configuration. due to high computational costs, observable for a broad class of im- The concepts of a class and of an association can be directly trans- plementations [3]. Related topics like reconfiguration unveil further lated to interesting but often computational hard topics [6, 2] for declara- class(CN-L) :- atom(CN), L >= 0. tive formalisms. In this paper we identify some typical challenges for declarative frameworks and propose a combined hybrid approach assoc(CNs, AN-(C,I)>>(D,J)) :- which utilizes techniques from other fields, like integer linear pro- atom(AN), member(C, CNs), member(D, CNs), gramming (I LP), to overcome these. Section 2 presents a simple mult(I), mult(J). declarative framework in order to demonstrate the benefits but also the challenges (Section 3) of such a system. Section 4 shows three where each class has a name (CN) and a lower bound (L). Each as- strategies to deal with the discussed challenges. sociation has a name (AN) and includes information on the related classes (C, D) and the multiplicities (I, J). Further the class names are checked for validity against a predefined list of names (CNs). 2 DECLARATIVE FORMULATION Now we can define a specification as a tuple (Cs, As) of Object-oriented modeling languages provide a natural formalism for classes Cs and associations As the design of configuration systems. Historically entity-relationship spec((Cs, As)) :- diagrams had a strong toehold but during the recent decade class maplist(class, Cs), pairs_keys(Cs, CNs), diagrams in the Unified Modeling Language [8] (U ML) have seen maplist(assoc(CNs), As). a substantial growth as a domain-specific language for configura- tion. [5, 1] Consequently we use U ML class diagrams as our starting where maplist() applies a predicate to all arguments of a list and point and propose a declarative formulation translating its key fea- pairs_keys() provides access to the subterms of Pair-Key struc- tures into a declarative framework. We chose Prolog for this task as tures. it has an established and well-known semantics and provides several E.g., instantiating the multiplicities of a1 in Figure 1 with M1 = 1, efficient implementations. The main ideas can be easily transfered to M2 = 2, N1 = 3, and N2 = 4 and enforcing a lower bound L of 1 for the number of objects of class C this yields 1 Vienna University of Technology, Austria, email: Ingo.Feinerer@tuwien.ac.at :- spec(([c-1,d-0], [a1-(c,1..2)>>(d,3..4)])) A configuration can be modeled as a tuple (Os, Ls) consist- The predicate satisfies() checks whether (Os, Ls) is a valid ing of objects Os and of links Ls. Each object is identified by its configuration for the specification (Cs, As). The first forall() name O and its corresponding class C. Each link has a name L and checks the number of objects for each class against the correspond- stores information on the corresponding association A and on the ob- ing lower bound whereas the second forall() checks that each jects [O,P] it consists of. object O of the configuration is linked to the right number of part- ner objects as constrained by all participating associations and their is_object(CNs, O-C) :- multiplicities. E.g. atom(O), member(C, CNs). :- satisfies(([c1-c, d1-d, d2-d, d3-d], is_link(As, Os, L-(A,[O,P])) :- [l1-(a1,[c1,d1]),l2-(a1,[c1,d2]), atom(L), member(A-(C,_)>>(D,_), As), l1-(a1,[c1,d3])]), member(O-C, Os), member(P-D, Os). ([c-1,d-0], [a1-(c,1..2)>>(d,3..4)])). Although simplistic and minimal this framework allow us to We call a configuration an instance of a specification if all the model reasonable configurations and will serve the purpose of pre- objects and links have a corresponding class and association, respec- senting the advantages but also challenges for such a declarative tively. framework. It can be easily extended to cover multiple aspects which instance((Os, Ls), (Cs, As)) :- are necessary for a more complete handling of configurations in pairs_keys(Cs, CNs), real-world applications (and in fact many features are already im- maplist(is_object(CNs), Os), plemented in a prototype). maplist(is_link(As, Os), Ls). A central advantage we observe in the design phase is the clear and straightforward formulation of the underlying terminology. Specifi- For each object we check its class and for each link we identify a cations, configurations, and the notions of instance, validity and sat- matching association with compatible participating classes. isfiability can be defined with just a few lines of code in parallel In order to handle multiplicities as defined by the U ML standard, to the formal definitions of the underlying concepts. Prototyping is we define a predicate gamma() which counts for each association rapid and succinct; the visualization of graph structures and configu- and a given object the number of different partner objects induced by rations is straightforward. For the computation of configurations one the links of a configuration [4] of the main advantages of this declarative formulation is the semi- automatic search for solutions. Prolog provides efficient algorithms gamma(I, P, A, N, Ls) :- to explore and backtrack within the search space; further it has opti- reduct(I, Os, P), mizations for tail recursion. E.g., with trivial strategies for object and findall(Os, member(_-(A,Os), Ls), Bag), link creation list_to_set(Bag, Set), length(Set, N). :- gen_objects([c-2, d-3], where I denotes the “position” (index) of the partner objects to be [c_0-c, c_1-c, d_0-d, d_1-d, d_2-d]). counted for within the given binary or multiary association A, the list P contains the single object to be fixed (in general this could be any :- gen_links([a1-1, a2-2], partial link), and Ls is a list of links to be considered. The result [a1_0-(a1,_), a2_0-(a2,_), a2_1-(a2,_)]). (count) is unified with N. The reduct() predicate generates a list we can generate configurations on the fly (we implement the predi- of objects (to be used as partial links) such that P is the projection cate gen_instance() for this purpose) and check if they satisfy on all but the I-th component, and findall() collects all such a given specification objects forming a link in Ls. For example gen_model(C, S) :- :- gamma(2, [c1], a1, 3, gen_instance(C, S), satisfies(C, S). [l1-(a1, [c1, d1]), l2-(a1, [c1, d2]), For a small number of classes and associations this strategy works l3-(a1, [c1, d3])]). very well, is intuitive, and allow us to explore the whole search space fixes a single object c1 (of class C) and counts how many different without extra programming. objects (of class D corresponding to index 2) can be reached over association a1 when considering the provided links: three. 3 CHALLENGES Now we have all parts to define the notion of a valid configuration. We say a configuration satisfies a specification if it is an instance and For a declarative framework as presented in the previous section chal- both the lower bounds for each class and the multiplicities of each lenges typically arise when the search space is large and/or back- association are respected. tracking is expensive. The former is not a drawback of declarative formulations per se; other formalisms which need to deal with prob- satisfies((Os, Ls), (Cs, As)) :- lem instances with a large solution space will suffer from the same forall(member(C-LB, Cs), performance penalties. However, when backtracking is expensive, (findall(ON, member(ON-C, Os), ONs), i.e., when it takes some computational effort to find out that the cur- length(ONs, N), N >= LB)), rent unification state will never lead to a valid configuration, there forall((member(O-C, Os), is often room for improvement. First of all, it depends whether the (member(AN-(C,_)>>(_,L..U),As),I=2 computational complexity is inherent to the problem or just induced ;member(AN-(_,L..U)>>(C,_),As),I=1)), by the use of expressive logics or other declarative formalisms. Sec- (gamma(I, [O], AN, N, Ls), ond, there exists a plethora of methods and algorithms in computer between(L, U, N))). science which are tailored to specific problem classes. 3.1 Equations over association chains This subtle change makes a significant difference in runtime and backtrack behavior for our declarative formulation. Although the number of objects and links to be considered is still rather small C 1 1..2 D 2..1 1 E (8 and 10, respectively) the number of combinations to be explored a1 a2 before the equation can be checked is exponential. Backtracking is therefore very expensive as basically a full configuration needs to be 1 1 built before the equation can be checked. We see a major increase in a3 the runtime (and stopped measuring after 3600 seconds). Figure 2: Specification with three classes and the O CL constraint context C inv: C.allInstances()->size() >= 2. 3.2 Partial configurations We would like to use our framework not only for configuration but Consider the specification depicted in Figure 2. There are three also for reconfiguration in order to repair existing configurations. A classes C, D, and E, where C has a lower bound of two on the num- main challenge is a fast way to identify parts of an input configura- ber of instantiated objects. Each object of class C should be con- tion which cannot be taken as a subcomponent of the overall solution. nected via association a1 to one or two objects of class D which in This is important to avoid late backtracking where a lot of computa- turn are uniquely associated with an object of class E via a2 . Finally tion time has been used for building a variety of alternations which association a3 enforces a one-to-one relationship between objects of can never lead to a valid configuration. Clearly, for arbitrary com- class C and objects of class E. plex input configurations this problem is NP-hard, however for many This poses no severe problem for our declarative implementation basic tasks it is not (e.g., checking whether some links will violate yet. We can find a valid configuration by stepwise incrementing the certain sets of constraints later on). number of instantiated objects and the number of desired links and the satisfies() predicate will filter out all invalid combinations. However, the situation gets interesting if we add some constraints; 3.3 Cost functions the specification in Figure 2 probably tells not the full story of the intended semantics. We might want to ensure that each object of Another limiting factor of logic-based formalisms is finite domain class E which can be reached from an object of class C over inter- reasoning, especially notable when working with numbers in an in- mediary objects of class D via associations a1 and a2 is in fact the teger domain. Configuration tasks often need ways to express how same as linked by a3 . This models the concept of a modular com- expensive certain components or connections are. A natural imple- pound unit which consists of a set of interconnected subcomponents. mentation is to use cost functions which assign costs or other integer We can formalize such constraints by introducing equations on numbers to individual objects in a configuration. The aim is now to association chains. They restrict the links of valid configurations by minimize an objective function which takes into account all compo- imposing limits on valid objects involved. Association chains can be nents and their corresponding costs. modeled by classical tuple composition, similar to a join in database theory. E.g., for Figure 2 we would add the equation a1 ◦ a2 = a3 to 4 TOWARDS A HYBRID APPROACH achieve the semantics mentioned before. In the previous section we saw some typical challenges for declara- tive frameworks. These are by far not the only ones but give a sam- d2 ple of relevant problems of various kinds. Fortunately, there has been c1 d2 e1 tremendous progress in configuration research and artificial intelli- c1 d1 e1 gence in general which triggered specialized algorithms and strate- d1 gies for specific problems. In order to tackle the outlined challenges we propose to use specialized algorithms from other formalisms and d4 d4 to include them in a unified declarative framework. Such a hybrid ap- proach, taking the best from multiple worlds and combining them in c2 d3 e2 c2 d3 e2 a consolidated interface, allows us to attack the previously described challenges towards efficient declarative configuration. (a) A configuration (b) A configuration consisting of two dis- violating the con- 4.1 Integer linear programming joint partitions sat- straint a1 ◦ a2 = a3 . isfying the equation We start out with the observation that backtracking can be very ex- a1 ◦ a2 = a3 . pensive if the domain of variables is not restricted or bounded to a specific range. Ideally we would like to compute the exact number Figure 3: Two satisfying configurations for the specification in Fig- of needed components for a configuration and then our declarative ure 2, but only the left one adheres to the constraint a1 ◦ a2 = a3 . framework can concentrate on finding appropriate links to connect the parts. Clearly, this approach cannot always work as some special Figures 3a and 3b both show valid configurations for the specifi- constraints on the interconnection of components may enforce addi- cation 2 before adding the new constraint on the association chain. tional objects which are not required by pure associations (and their Once we enforce the constraint a1 ◦ a2 = a3 only Figure 3a remains multiplicities) in the underlying U ML class diagram. a satisfying instance; Figure 3b is not valid anymore as the configura- For the computation of the needed number of objects for each class tion violates the equation since there is a connection from c1 over d4 we use a highly efficient technique based on integer linear program- to e2 which is not reachable via a link of association a3 . ming. [7, 4] The idea is to translate a U ML class diagram to a system of inequalities. Its solution indicates whether an instantiation into a Such a strategy is especially suited for finding an initial linking for valid configuration is possible at all (satisfiability problem) but more a whole configuration. This can be both the basis for further investi- interestingly in our context gives also the number of objects for each gation regarding the challenge “equations on association chains” or component. E.g., for the specification in Figure 1 with association a1 provide initial solutions for a reconfiguration problem as necessary and the lower bound on C we obtain for the challenge “partial configurations”. M1 · |D| ≤ N2 · |C| |C| ≥ M1 |C| ≥ L 4.3 Generate and test N1 · |C| ≤ M2 · |D| |D| ≥ N1 Our final suggestion towards hybrid systems is a meta-strategy. Both expressing constraints enforced by the multiplicities (left), con- previous techniques also fall into this category as special cases. straints on the minimal number of objects due to the association se- Declarative systems have a strong standing with their efficient and mantics (middle), and the lower bound constraint on C (right). This native backtracking as it is at the heart of their operation. This prop- translation is performed for all classes and associations involved in a erty is extremely useful for configuration as exploring the solution specification forming an integer linear program. The objective func- space is one of the fundamental aspects in a configuration task. The tion is typically just the minimum over all classes involved. main advantage is that almost arbitrary constraints can be added with S WI-Prolog provides support for integer linear programming via minimal changes to the declarative implementation. This allows us its extension library “simplex” shipped with the standard installa- fast prototyping and complex constraint checking. We therefore ar- tion. We added a set of D CG (definite clause grammar) rules which gue in favor of such systems and the simple but effective strategy of generate I LP constraints out of each association: “generate and test”. The generation step is either done by more so- phisticated algorithms implemented by the user in Prolog (as shown assoc_constraint( in the previous subsections) or by calls to external tools which are _AN-(C,M1..M2)>>(D,N1..N2)) --> specialized on a specific tasks. The declarative framework can then constraint([M1*D, -N2*C] =< 0), take these parts and combine them into a configuration and test it for constraint([N1*C, -M2*D] =< 0). validity. Further constraints like lower bounds or the objective function are added separately forming the whole I LP program. 5 CONCLUSION The native integration of I LP into our declarative framework al- lows us to rule out a broad range of possibilities which do not need Declarative formalisms provide a natural environment for the im- to be explored any more. This has a significant effect on backtrack- plementation of configuration systems as it is easy to write succinct ing as it cuts the search space into more fine grained areas. With this and precise programs with integrated support to explore the solution technique we can scale our framework to a greater number of objects space with backtracking. We motivated this by a simple declarative and links when dealing with the challenges “equations on association framework but showed challenges arising from their use. We argue chains” and “partial configurations”. Costs or weights are an integral for hybrid systems which combine specialized techniques from other part of I LP and can thus easily be handled with this approach; this fields into a unified declarative interface. addresses several aspects of the challenge “cost functions”. REFERENCES 4.2 Procedural link generation [1] Andreas Falkner, Ingo Feinerer, Gernot Salzer, and Gottfried Schenner, ‘Computing product configurations via UML and integer linear program- So far we have only identified one possible way to efficiently com- ming’, Int. Journal of Mass Customisation, 3(4), 351–367, (2010). pute the number of necessary components. Still, it might take a long [2] Andreas Falkner, Gerhard Friedrich, Alois Haselböck, Anna Ryabokon, time to find valid links between these objects (as motivated in the Gottfried Schenner, and Herwig Schreiner, ‘(Re)configuration using an- challenge on association chains). Therefore it seems a promising ap- swer set programming’, in IJCAI 2011 Configuration Workshop, (2011). proach to optimize the way links are generated. One strategy we [3] Andreas Falkner, Alois Haselböck, Gottfried Schenner, and Herwig Schreiner, ‘Modeling and solving technical product configuration prob- found useful for certain classes of configurations is to “balance” links lems’, Artificial Intelligence for Engineering Design, Analysis and Man- between objects as far as possible. This can be seen as a procedural ufacturing, 25, 115–129, (2011). implementation to form a uniform distribution among the links be- [4] Ingo Feinerer and Gernot Salzer, ‘Consistency and minimality of UML tween the participating classes of a given association: class specifications with multiplicities and uniqueness constraints’, in Proceedings of the 1st IEEE/IFIP International Symposium on Theoret- seq(J, M, N, [X, Y]) :- ical Aspects of Software Engineering, June 6–8, 2007, Shanghai, China, pp. 411–420. IEEE Computer Society Press, (2007). J >= 0, M > 0, N > 0, [5] Alexander Felfernig, Gerhard Friedrich, and Dietmar Jannach, ‘UML as X is J mod M, domain specific language for the construction of knowledge-based con- lcm(M, N, LCM), figuration systems’, International Journal of Software Engineering and Y is (J + floor(J/LCM)) mod N. Knowledge Engineering, 10(4), 449–469, (2000). [6] Gerhard Friedrich, Anna Ryabokon, Andreas A. Falkner, The idea of the seq() predicate is to generate a sequence (with J as Alois Haselböck, Gottfried Schenner, and Herwig Schreiner, ‘(Re)configuration based on model generation’, in 2nd LoCoCo the index into it) of tuples [X,Y] for M objects of the first class and Workshop, volume 65 of EPTCS, pp. 26–35, (2011). N objects of the second class (for a binary association) which can be [7] Maurizio Lenzerini and Paolo Nobili, ‘On the satisfiability of depen- used as links and form a uniform distribution. E.g., for J = 0, . . . , 3 dency constraints in entity-relationship schemata’, Information Systems, and M = 2 and N = 6 we obtain the tuples (0, 0), (1, 1), (0, 2), 15(4), 453–461, (1990). and (1, 3). Consequently we would generate a link from object 0 of [8] Object Management Group, Unified Modeling Language 2.4.1, 2011. [9] Object Management Group, Object Constraint Language 2.3.1, 2012. class C to object 0 of class D, from object 1 of class C to object 1 of class D, and so on.