=Paper= {{Paper |id=None |storemode=property |title=Towards Hybrid Techniques for Efficient Declarative Configuration |pdfUrl=https://ceur-ws.org/Vol-958/paper5.pdf |volume=Vol-958 |dblpUrl=https://dblp.org/rec/conf/confws/Feinerer12 }} ==Towards Hybrid Techniques for Efficient Declarative Configuration== https://ceur-ws.org/Vol-958/paper5.pdf
                                Towards Hybrid Techniques for
                               Efficient Declarative Configuration
                                                              Ingo Feinerer1


Abstract. During the last decades configuration has been ex-            other declarative formalisms (e.g. logics or answer-set programming)
tensively employed in a wide range of application domains, im-          as well. The following framework was implemented and tested with
plemented by a multitude of techniques like logics, procedural,         S WI-Prolog but should work equally well in any other dialect with
object-oriented, or resourced-driven approaches. Especially declara-    minor modifications.
tive methods provide the foundation for precise and well-understood
semantics for reasoning tasks and allow for a succinct representa-                C          M1 ..M2              N1 ..N2           D
                                                                                                          a1
tion of the underlying knowledge base. However, a drawback in us-
ing such powerful declarative techniques lies in their computational    Figure 1: Specification with two classes, an association with
complexity. In this paper we present a simple declarative framework     multiplicities, and the O CL constraint context C inv:
for configuration in Prolog in order to show the advantages of logic-   C.allInstances()->size() >= L.
based techniques but also to identify some challenges for such for-
malisms. We argue for hybrid systems which combine and utilize ef-
ficient techniques from different configuration methodologies under        Figure 1 depicts a minimal U ML class diagrams as typically
a unified declarative interface.                                        used for configuration purposes. It shows a specification with two
                                                                        classes C and D and one association a1 relating them. The multi-
                                                                        plicities restrict the number of valid links; each object of class C
1     INTRODUCTION                                                      must be connected with at least N1 and with at most N2 objects
In the history of configuration research multiple high-level ap-        of class D. Analogously, each object of class D must have links to
proaches for the design and solution of configuration tasks have been   M1 ..M2 objects of class C. The small arrow at the end of the associ-
presented. Declarative systems have had strong proponents due to        ation shows the direction (and not a hierarchy). The constraint in the
their expressive semantics based on well investigated and understood    Object Constraint Language [9] (O CL) for class C states that C must
logics. Unfortunately the expressive power limits the applicability     be instantiated with at least L objects to obtain a valid configuration.
due to high computational costs, observable for a broad class of im-       The concepts of a class and of an association can be directly trans-
plementations [3]. Related topics like reconfiguration unveil further   lated to
interesting but often computational hard topics [6, 2] for declara-     class(CN-L) :- atom(CN), L >= 0.
tive formalisms. In this paper we identify some typical challenges
for declarative frameworks and propose a combined hybrid approach       assoc(CNs, AN-(C,I)>>(D,J)) :-
which utilizes techniques from other fields, like integer linear pro-     atom(AN), member(C, CNs), member(D, CNs),
gramming (I LP), to overcome these. Section 2 presents a simple           mult(I), mult(J).
declarative framework in order to demonstrate the benefits but also
the challenges (Section 3) of such a system. Section 4 shows three      where each class has a name (CN) and a lower bound (L). Each as-
strategies to deal with the discussed challenges.                       sociation has a name (AN) and includes information on the related
                                                                        classes (C, D) and the multiplicities (I, J). Further the class names
                                                                        are checked for validity against a predefined list of names (CNs).
2     DECLARATIVE FORMULATION                                              Now we can define a specification as a tuple (Cs, As) of
Object-oriented modeling languages provide a natural formalism for      classes Cs and associations As
the design of configuration systems. Historically entity-relationship   spec((Cs, As)) :-
diagrams had a strong toehold but during the recent decade class          maplist(class, Cs), pairs_keys(Cs, CNs),
diagrams in the Unified Modeling Language [8] (U ML) have seen            maplist(assoc(CNs), As).
a substantial growth as a domain-specific language for configura-
tion. [5, 1] Consequently we use U ML class diagrams as our starting    where maplist() applies a predicate to all arguments of a list and
point and propose a declarative formulation translating its key fea-    pairs_keys() provides access to the subterms of Pair-Key struc-
tures into a declarative framework. We chose Prolog for this task as    tures.
it has an established and well-known semantics and provides several        E.g., instantiating the multiplicities of a1 in Figure 1 with M1 = 1,
efficient implementations. The main ideas can be easily transfered to   M2 = 2, N1 = 3, and N2 = 4 and enforcing a lower bound L of 1
                                                                        for the number of objects of class C this yields
1 Vienna University of Technology, Austria,
    email: Ingo.Feinerer@tuwien.ac.at                                   :- spec(([c-1,d-0], [a1-(c,1..2)>>(d,3..4)]))
   A configuration can be modeled as a tuple (Os, Ls) consist-              The predicate satisfies() checks whether (Os, Ls) is a valid
ing of objects Os and of links Ls. Each object is identified by its         configuration for the specification (Cs, As). The first forall()
name O and its corresponding class C. Each link has a name L and            checks the number of objects for each class against the correspond-
stores information on the corresponding association A and on the ob-        ing lower bound whereas the second forall() checks that each
jects [O,P] it consists of.                                                 object O of the configuration is linked to the right number of part-
                                                                            ner objects as constrained by all participating associations and their
is_object(CNs, O-C) :-                                                      multiplicities. E.g.
  atom(O), member(C, CNs).
                                                                            :- satisfies(([c1-c, d1-d, d2-d, d3-d],
is_link(As, Os, L-(A,[O,P])) :-                                                [l1-(a1,[c1,d1]),l2-(a1,[c1,d2]),
  atom(L), member(A-(C,_)>>(D,_), As),                                          l1-(a1,[c1,d3])]),
  member(O-C, Os), member(P-D, Os).                                           ([c-1,d-0], [a1-(c,1..2)>>(d,3..4)])).
                                                                               Although simplistic and minimal this framework allow us to
   We call a configuration an instance of a specification if all the
                                                                            model reasonable configurations and will serve the purpose of pre-
objects and links have a corresponding class and association, respec-
                                                                            senting the advantages but also challenges for such a declarative
tively.
                                                                            framework. It can be easily extended to cover multiple aspects which
instance((Os, Ls), (Cs, As)) :-                                             are necessary for a more complete handling of configurations in
  pairs_keys(Cs, CNs),                                                      real-world applications (and in fact many features are already im-
  maplist(is_object(CNs), Os),                                              plemented in a prototype).
  maplist(is_link(As, Os), Ls).                                                A central advantage we observe in the design phase is the clear and
                                                                            straightforward formulation of the underlying terminology. Specifi-
For each object we check its class and for each link we identify a          cations, configurations, and the notions of instance, validity and sat-
matching association with compatible participating classes.                 isfiability can be defined with just a few lines of code in parallel
   In order to handle multiplicities as defined by the U ML standard,       to the formal definitions of the underlying concepts. Prototyping is
we define a predicate gamma() which counts for each association             rapid and succinct; the visualization of graph structures and configu-
and a given object the number of different partner objects induced by       rations is straightforward. For the computation of configurations one
the links of a configuration [4]                                            of the main advantages of this declarative formulation is the semi-
                                                                            automatic search for solutions. Prolog provides efficient algorithms
gamma(I, P, A, N, Ls) :-                                                    to explore and backtrack within the search space; further it has opti-
  reduct(I, Os, P),                                                         mizations for tail recursion. E.g., with trivial strategies for object and
  findall(Os, member(_-(A,Os), Ls), Bag),                                   link creation
  list_to_set(Bag, Set), length(Set, N).
                                                                            :- gen_objects([c-2, d-3],
where I denotes the “position” (index) of the partner objects to be           [c_0-c, c_1-c, d_0-d, d_1-d, d_2-d]).
counted for within the given binary or multiary association A, the list
P contains the single object to be fixed (in general this could be any      :- gen_links([a1-1, a2-2],
partial link), and Ls is a list of links to be considered. The result         [a1_0-(a1,_), a2_0-(a2,_), a2_1-(a2,_)]).
(count) is unified with N. The reduct() predicate generates a list
                                                                            we can generate configurations on the fly (we implement the predi-
of objects (to be used as partial links) such that P is the projection
                                                                            cate gen_instance() for this purpose) and check if they satisfy
on all but the I-th component, and findall() collects all such
                                                                            a given specification
objects forming a link in Ls. For example
                                                                            gen_model(C, S) :-
:- gamma(2, [c1], a1, 3,                                                      gen_instance(C, S), satisfies(C, S).
         [l1-(a1, [c1, d1]), l2-(a1, [c1, d2]),
                                                                            For a small number of classes and associations this strategy works
          l3-(a1, [c1, d3])]).
                                                                            very well, is intuitive, and allow us to explore the whole search space
fixes a single object c1 (of class C) and counts how many different         without extra programming.
objects (of class D corresponding to index 2) can be reached over
association a1 when considering the provided links: three.                  3   CHALLENGES
   Now we have all parts to define the notion of a valid configuration.
We say a configuration satisfies a specification if it is an instance and   For a declarative framework as presented in the previous section chal-
both the lower bounds for each class and the multiplicities of each         lenges typically arise when the search space is large and/or back-
association are respected.                                                  tracking is expensive. The former is not a drawback of declarative
                                                                            formulations per se; other formalisms which need to deal with prob-
satisfies((Os, Ls), (Cs, As)) :-                                            lem instances with a large solution space will suffer from the same
  forall(member(C-LB, Cs),                                                  performance penalties. However, when backtracking is expensive,
         (findall(ON, member(ON-C, Os), ONs),                               i.e., when it takes some computational effort to find out that the cur-
          length(ONs, N), N >= LB)),                                        rent unification state will never lead to a valid configuration, there
  forall((member(O-C, Os),                                                  is often room for improvement. First of all, it depends whether the
    (member(AN-(C,_)>>(_,L..U),As),I=2                                      computational complexity is inherent to the problem or just induced
    ;member(AN-(_,L..U)>>(C,_),As),I=1)),                                   by the use of expressive logics or other declarative formalisms. Sec-
         (gamma(I, [O], AN, N, Ls),                                         ond, there exists a plethora of methods and algorithms in computer
          between(L, U, N))).                                               science which are tailored to specific problem classes.
3.1    Equations over association chains                                     This subtle change makes a significant difference in runtime and
                                                                          backtrack behavior for our declarative formulation. Although the
                                                                          number of objects and links to be considered is still rather small
          C        1         1..2   D     2..1         1     E            (8 and 10, respectively) the number of combinations to be explored
                        a1                       a2                       before the equation can be checked is exponential. Backtracking is
                                                                          therefore very expensive as basically a full configuration needs to be
                   1                                   1                  built before the equation can be checked. We see a major increase in
                                    a3
                                                                          the runtime (and stopped measuring after 3600 seconds).
Figure 2: Specification with three classes and the O CL constraint
context C inv: C.allInstances()->size() >= 2.                             3.2    Partial configurations
                                                                          We would like to use our framework not only for configuration but
   Consider the specification depicted in Figure 2. There are three
                                                                          also for reconfiguration in order to repair existing configurations. A
classes C, D, and E, where C has a lower bound of two on the num-
                                                                          main challenge is a fast way to identify parts of an input configura-
ber of instantiated objects. Each object of class C should be con-
                                                                          tion which cannot be taken as a subcomponent of the overall solution.
nected via association a1 to one or two objects of class D which in
                                                                          This is important to avoid late backtracking where a lot of computa-
turn are uniquely associated with an object of class E via a2 . Finally
                                                                          tion time has been used for building a variety of alternations which
association a3 enforces a one-to-one relationship between objects of
                                                                          can never lead to a valid configuration. Clearly, for arbitrary com-
class C and objects of class E.
                                                                          plex input configurations this problem is NP-hard, however for many
   This poses no severe problem for our declarative implementation
                                                                          basic tasks it is not (e.g., checking whether some links will violate
yet. We can find a valid configuration by stepwise incrementing the
                                                                          certain sets of constraints later on).
number of instantiated objects and the number of desired links and
the satisfies() predicate will filter out all invalid combinations.
However, the situation gets interesting if we add some constraints;       3.3    Cost functions
the specification in Figure 2 probably tells not the full story of the
intended semantics. We might want to ensure that each object of           Another limiting factor of logic-based formalisms is finite domain
class E which can be reached from an object of class C over inter-        reasoning, especially notable when working with numbers in an in-
mediary objects of class D via associations a1 and a2 is in fact the      teger domain. Configuration tasks often need ways to express how
same as linked by a3 . This models the concept of a modular com-          expensive certain components or connections are. A natural imple-
pound unit which consists of a set of interconnected subcomponents.       mentation is to use cost functions which assign costs or other integer
   We can formalize such constraints by introducing equations on          numbers to individual objects in a configuration. The aim is now to
association chains. They restrict the links of valid configurations by    minimize an objective function which takes into account all compo-
imposing limits on valid objects involved. Association chains can be      nents and their corresponding costs.
modeled by classical tuple composition, similar to a join in database
theory. E.g., for Figure 2 we would add the equation a1 ◦ a2 = a3 to      4     TOWARDS A HYBRID APPROACH
achieve the semantics mentioned before.
                                                                          In the previous section we saw some typical challenges for declara-
                                                                          tive frameworks. These are by far not the only ones but give a sam-
                   d2                                                     ple of relevant problems of various kinds. Fortunately, there has been
                                         c1           d2   e1
                                                                          tremendous progress in configuration research and artificial intelli-
          c1       d1          e1                                         gence in general which triggered specialized algorithms and strate-
                                                      d1
                                                                          gies for specific problems. In order to tackle the outlined challenges
                                                                          we propose to use specialized algorithms from other formalisms and
                   d4                                 d4                  to include them in a unified declarative framework. Such a hybrid ap-
                                                                          proach, taking the best from multiple worlds and combining them in
          c2       d3          e2        c2           d3   e2             a consolidated interface, allows us to attack the previously described
                                                                          challenges towards efficient declarative configuration.

          (a) A configuration            (b) A configuration
          consisting of two dis-         violating the con-               4.1    Integer linear programming
          joint partitions sat-          straint a1 ◦ a2 = a3 .
          isfying the equation                                            We start out with the observation that backtracking can be very ex-
          a1 ◦ a2 = a3 .                                                  pensive if the domain of variables is not restricted or bounded to a
                                                                          specific range. Ideally we would like to compute the exact number
Figure 3: Two satisfying configurations for the specification in Fig-
                                                                          of needed components for a configuration and then our declarative
ure 2, but only the left one adheres to the constraint a1 ◦ a2 = a3 .
                                                                          framework can concentrate on finding appropriate links to connect
                                                                          the parts. Clearly, this approach cannot always work as some special
   Figures 3a and 3b both show valid configurations for the specifi-      constraints on the interconnection of components may enforce addi-
cation 2 before adding the new constraint on the association chain.       tional objects which are not required by pure associations (and their
Once we enforce the constraint a1 ◦ a2 = a3 only Figure 3a remains        multiplicities) in the underlying U ML class diagram.
a satisfying instance; Figure 3b is not valid anymore as the configura-      For the computation of the needed number of objects for each class
tion violates the equation since there is a connection from c1 over d4    we use a highly efficient technique based on integer linear program-
to e2 which is not reachable via a link of association a3 .               ming. [7, 4] The idea is to translate a U ML class diagram to a system
of inequalities. Its solution indicates whether an instantiation into a        Such a strategy is especially suited for finding an initial linking for
valid configuration is possible at all (satisfiability problem) but more    a whole configuration. This can be both the basis for further investi-
interestingly in our context gives also the number of objects for each      gation regarding the challenge “equations on association chains” or
component. E.g., for the specification in Figure 1 with association a1      provide initial solutions for a reconfiguration problem as necessary
and the lower bound on C we obtain                                          for the challenge “partial configurations”.

       M1 · |D| ≤ N2 · |C|           |C| ≥ M1           |C| ≥ L
                                                                            4.3    Generate and test
       N1 · |C| ≤ M2 · |D|           |D| ≥ N1
                                                                            Our final suggestion towards hybrid systems is a meta-strategy. Both
expressing constraints enforced by the multiplicities (left), con-          previous techniques also fall into this category as special cases.
straints on the minimal number of objects due to the association se-        Declarative systems have a strong standing with their efficient and
mantics (middle), and the lower bound constraint on C (right). This         native backtracking as it is at the heart of their operation. This prop-
translation is performed for all classes and associations involved in a     erty is extremely useful for configuration as exploring the solution
specification forming an integer linear program. The objective func-        space is one of the fundamental aspects in a configuration task. The
tion is typically just the minimum over all classes involved.               main advantage is that almost arbitrary constraints can be added with
   S WI-Prolog provides support for integer linear programming via          minimal changes to the declarative implementation. This allows us
its extension library “simplex” shipped with the standard installa-         fast prototyping and complex constraint checking. We therefore ar-
tion. We added a set of D CG (definite clause grammar) rules which          gue in favor of such systems and the simple but effective strategy of
generate I LP constraints out of each association:                          “generate and test”. The generation step is either done by more so-
                                                                            phisticated algorithms implemented by the user in Prolog (as shown
assoc_constraint(
                                                                            in the previous subsections) or by calls to external tools which are
    _AN-(C,M1..M2)>>(D,N1..N2)) -->
                                                                            specialized on a specific tasks. The declarative framework can then
  constraint([M1*D, -N2*C] =< 0),
                                                                            take these parts and combine them into a configuration and test it for
  constraint([N1*C, -M2*D] =< 0).
                                                                            validity.
Further constraints like lower bounds or the objective function are
added separately forming the whole I LP program.                            5     CONCLUSION
   The native integration of I LP into our declarative framework al-
lows us to rule out a broad range of possibilities which do not need        Declarative formalisms provide a natural environment for the im-
to be explored any more. This has a significant effect on backtrack-        plementation of configuration systems as it is easy to write succinct
ing as it cuts the search space into more fine grained areas. With this     and precise programs with integrated support to explore the solution
technique we can scale our framework to a greater number of objects         space with backtracking. We motivated this by a simple declarative
and links when dealing with the challenges “equations on association        framework but showed challenges arising from their use. We argue
chains” and “partial configurations”. Costs or weights are an integral      for hybrid systems which combine specialized techniques from other
part of I LP and can thus easily be handled with this approach; this        fields into a unified declarative interface.
addresses several aspects of the challenge “cost functions”.
                                                                            REFERENCES
4.2    Procedural link generation                                           [1] Andreas Falkner, Ingo Feinerer, Gernot Salzer, and Gottfried Schenner,
                                                                                ‘Computing product configurations via UML and integer linear program-
So far we have only identified one possible way to efficiently com-
                                                                                ming’, Int. Journal of Mass Customisation, 3(4), 351–367, (2010).
pute the number of necessary components. Still, it might take a long        [2] Andreas Falkner, Gerhard Friedrich, Alois Haselböck, Anna Ryabokon,
time to find valid links between these objects (as motivated in the             Gottfried Schenner, and Herwig Schreiner, ‘(Re)configuration using an-
challenge on association chains). Therefore it seems a promising ap-            swer set programming’, in IJCAI 2011 Configuration Workshop, (2011).
proach to optimize the way links are generated. One strategy we             [3] Andreas Falkner, Alois Haselböck, Gottfried Schenner, and Herwig
                                                                                Schreiner, ‘Modeling and solving technical product configuration prob-
found useful for certain classes of configurations is to “balance” links        lems’, Artificial Intelligence for Engineering Design, Analysis and Man-
between objects as far as possible. This can be seen as a procedural            ufacturing, 25, 115–129, (2011).
implementation to form a uniform distribution among the links be-           [4] Ingo Feinerer and Gernot Salzer, ‘Consistency and minimality of UML
tween the participating classes of a given association:                         class specifications with multiplicities and uniqueness constraints’, in
                                                                                Proceedings of the 1st IEEE/IFIP International Symposium on Theoret-
seq(J, M, N, [X, Y]) :-                                                         ical Aspects of Software Engineering, June 6–8, 2007, Shanghai, China,
                                                                                pp. 411–420. IEEE Computer Society Press, (2007).
  J >= 0, M > 0, N > 0,                                                     [5] Alexander Felfernig, Gerhard Friedrich, and Dietmar Jannach, ‘UML as
  X is J mod M,                                                                 domain specific language for the construction of knowledge-based con-
  lcm(M, N, LCM),                                                               figuration systems’, International Journal of Software Engineering and
  Y is (J + floor(J/LCM)) mod N.                                                Knowledge Engineering, 10(4), 449–469, (2000).
                                                                            [6] Gerhard Friedrich, Anna Ryabokon, Andreas A. Falkner,
The idea of the seq() predicate is to generate a sequence (with J as            Alois Haselböck, Gottfried Schenner, and Herwig Schreiner,
                                                                                ‘(Re)configuration based on model generation’, in 2nd LoCoCo
the index into it) of tuples [X,Y] for M objects of the first class and
                                                                                Workshop, volume 65 of EPTCS, pp. 26–35, (2011).
N objects of the second class (for a binary association) which can be       [7] Maurizio Lenzerini and Paolo Nobili, ‘On the satisfiability of depen-
used as links and form a uniform distribution. E.g., for J = 0, . . . , 3       dency constraints in entity-relationship schemata’, Information Systems,
and M = 2 and N = 6 we obtain the tuples (0, 0), (1, 1), (0, 2),                15(4), 453–461, (1990).
and (1, 3). Consequently we would generate a link from object 0 of          [8] Object Management Group, Unified Modeling Language 2.4.1, 2011.
                                                                            [9] Object Management Group, Object Constraint Language 2.3.1, 2012.
class C to object 0 of class D, from object 1 of class C to object 1 of
class D, and so on.