Specification of Decision Diagram Operations Alexandre Hamez1 , Steve Hostettler2 , Alban Linard2 , Alexis Marechal2 , Emmanuel Paviot-Adet3 , and Matteo Risoldi2 1 CNRS - LAAS, 7, avenue du Colonel Roche, 31077 Toulouse, France FirstName.LastName@laas.fr 2 Université de Genève, 7 route de Drize, 1227 Carouge, Switzerland FirstName.LastName@unige.ch 3 Université Pierre & Marie Curie, LIP6 - CNRS UMR 7606, 4 place Jussieu, 75252 Paris Cedex 05, France FirstName.LastName@lip6.fr Abstract. Decision Diagrams (DDs) are a well populated family of data structures, used for efficient representation and manipulation of huge data sets. Typically a given application requires choosing one particular category of DDs, like Binary Decision Diagrams (BDDs) or Data Decision Diagrams (DDDs), and sticking with it. Each category provides a language to specify its operations. For instance, the operation language of BDDs provides if-then-else, apply, etc. We focus on two main kinds of operation languages: BDD-like and DDD- like. They overlap: some operations can be expressed in both kinds of languages, while others are only available in one kind. We propose in this article a critical comparison of BDD-like and DDD- like languages. From the identified problems, we also propose a unified language for DD operations. It covers both BDD-like and DDD-like lan- guages, and even some operations that cannot be expressed in either. 1 Introduction Decision Diagrams (DDs) are now widely used in model checking as an extremely compact representations of state spaces [1]. Numerous DD categories have been developed over the past twenty years based on the same principles. Each category is adapted to a particular application domain and comes with a manipulation language, that is used to create and modify the DDs. Typically a given application requires choosing one particular category of DDs, like Binary Decision Diagrams (BDDs) or Data Decision Diagrams (DDDs), and sticking with it. Then, the user has to learn its operations, which might be a non-trivial task. DDs are used for both efficient memory representation and effi- cient computation time. But knowing which operation leads to better efficiency sometimes requires deep knowledge and understanding of the DD category. Two solutions try to circumvent this problem. The first one is to use high- level Domain Specific Languages (DSLs), specific to an application domain. For Recent Advances in Petri Nets and Concurrency, S. Donatelli, J. Kleijn, R.J. Machado, J.M. Fernandes (eds.), CEUR Workshop Proceedings, ISSN 1613-0073, Jan/2012, pp. 437–451. 438 Petri Nets & Concurrency Hamez et al. instance, CrocoPat4 completely hides the DDs and provides to the user a lan- guage for manipulating relational expressions. We do not consider high-level languages here. The second solution is automatic optimization of low-level oper- ations [2]. We are interested in this article only in low-level languages provided with the DDs. DSLs can then be translated to low-level programs. Section 2 first does a brief presentation of DDs. It presents the terminology we are using in the remainder of the article, and the categories of DDs we cover. We compare two kinds of languages, those used in BDD-like structures (BDDs, Algebraic Decision Diagrams (ADDs), Multi-valued Decision Diagrams (MDDs), for instance) in Section 3, and those used in DDD-like structures (DDDs, Set De- cision Diagrams (SDDs), Σ Decision Diagrams (ΣDDs)) in Section 4. They offer very different operations to their users. Moreover, their expressiveness differ. We think they are therefore good candidates for comparison. We then propose in Section 5 low-level operations that generalize both BDD- like and DDD-like operations. They only apply to DDs where the function’s re- sults are stored in terminal vertices, so Edge-Valued Decision Diagrams (EVDDs) are not covered in this article. Moreover, hierarchy as in SDDs and continuous input domains are not handled for simplicity. 2 Decision Diagram Principles Decision Diagrams are representations of functions using Directed Acyclic Graphs (DAGs) with maximal sharing from the leaves and roots. DD categories differ on the signatures of represented functions. We note B the Boolean domain and N the natural integers. Function signatures are for instance Bn → B for BDDs [3], Bn → N for ADDs [4], and N∗ → B (unbounded sequences of integers as inputs) for DDDs [5]. Categories also differ in the graph representation of functions. In most cases, the result is stored in terminal vertices, but Edge-Valued Bi- nary Decision Diagrams (EVBDDs) [6] and Edge-Valued Multi-Valued Decision Diagrams (EVMDDs) [7] store them along the paths. Represented functions are total. They return a value in their ouput domain for each value of their input domain. These domains can be infinite, either be- cause they are discrete but not bounded as in DDDs or ADDs [4], or because they are continuous as in Interval Decision Diagrams (IDDs) [8]. As the DD is a finite graph, not all functions have a DD counterpart. This excludes functions that effectively return an infinite number of different values, as they cannot be represented by a finite number of terminals in the DD categories presented here. A DD represents a function, which can also be seen as an association of each input element with an output one. We consider categories where every input element is a path of the graph, ending with a terminal vertex labeled by the function’s corresponding output value. Each vertex on the path is a variable of the function, and edges are labeled by their possible values. An order over 4 http://www.sosy-lab.org/~dbeyer/CrocoPat/ Specification of decision diagram operations Petri Nets & Concurrency – 439 variables specifies the allowed succession of vertices in the graph. For instance, the Boolean function x ∨ y is represented by the BDD in Figure 1. The function is described by the following y {0, 1} 1 set of paths: 1   x 1 x − 0 →y− 0 0 → 0, x − →y− → 1 1 0 y 0 0 x − 1 →y− 0 1 → 1, x − →y− → 1 1 Fig. 1. BDD for x ∨ y with variable order x < y, and its corresponding paths We compare operations for two categories of DDs that have deep differences. First, operations on BDDs and other categories with a fixed number of vari- ables are presented in Sec. 3. Then, Sec. 4 presents operations on DDDs where paths can use different variables, and even have different lengths. Sec.5 makes a synthesis of these two kinds, by proposing a new language for operations. 3 BDD-like Operations BDD-like operations are used on DDs representing functions with a finite and homogeneous input domain. Their signatures are thus of the form Xn → O, where X is a finite domain. This includes categories such as BDDs [3], ADDs [4], x1 xn MDDs [9,10], etc. Their paths are therefore of the form v1 −→ . . . vn −−→ t. Several operations are defined for BDDs in [3]. We give their extension to finite edges [10] and terminal [4] domains. In practice, they are used in BDD libraries, such as CUDD5 . The following operations form the BDD-like manipulation language. It is composed of functions, which take DDs (noted di ), variables (vi ), input val- ues (xi ), output values (ti ) or operators ( ) as parameters. We give – informally – the result of each operation as its set of returned paths. constant(t) creates a DD where all paths lead to the terminal value t. It repre- sents the constant n function that returnso t whatever its inputs are. x1 xn constant(t)= v1 −→ . . . vn −−→ t 6 make(v, x) creates a DD where only the variable v is relevant to the function’s result. The paths where v has value x lead to terminal value 1, and others to terminal 0, even for DD categories with unbounded terminals such as ADDs. We explain this restriction  in Problem 1.  n X\{x} o x1 xn x1 x xn make(v, x)= v1 −→ . . . v −−−−→ . . . vn − −→ 0 ∪ v1 −→ ...v −→ . . . vn −−→ 1 apply( , dl , dr ) computes dl dr . It takes two DDs dl and dr representing functions of signature Xn → O, and a binary operator : O × O → O on their output domain. It returns ( the DD of the function computed by: ) x1 xn x1 xn v1 −→ . . . vn −−→ t1 ∈ dl apply( , dl , dr )= v1 −→ . . . vn − −→ t1 t2 x1 xn ∧ v1 −→ . . . vn −−→ t2 ∈ dr 5 http://vlsi.colorado.edu/~fabio/CUDD/ 6 We omit universal quantification on xi for readability, as in other operations. 440 Petri Nets & Concurrency Hamez et al. restrict(v, x, d) restricts d to the value x of the variable v, and then ignores this variable’s values. The returned function’s result is similar to the original one, when v has value x, so this operation computes f |v=x . The resulting DD still has all variables,nincluding v, which has no effect on its result. o x1 X xn x1 x xn restrict(v, x, d)= v1 −→ . . . v − → . . . vn −−→ t v1 −→ ...v − → . . . vn −−→ t∈d compose(v, dl , dr ) substitutes the variable v in dl with the DD dr . Both DDs dl and dr must represent functions of type Xn → X. This operation can be expressed using apply and restrict, and does not appear in articles on BDDs other than [3]. satisfy-all(t, d) returns the set of paths with terminal value t. The result of this operation is not na DD but an iterable enumeration o of paths. x1 xn satisfy-all(t, d)= v1 −→ . . . vn − −→ t ∈ d satisfy-one(t, d) chooses one element from those returned by satisfy-all. Choice may be deterministic or not, depending on the implementation. satisfy-count(t, d) Instead of returning paths, this operation counts them. Again, the result is not na DD, it is an integer. o x1 xn satisfy-count(t, d)= v1 −→ . . . vn − −→ t ∈ d The language for BDD-like operations is widely used, in many libraries and applications. The example given below7 builds a BDD for x ∧ y ∧ z. It then restricts the function for variable x = 0 and counts the paths leading to ter- minal 1. Results of operations are given as comments, after //. Notice that all operations, except satisfy-count, return a DD containing all variables x, y, z. Their declaration is required before executing the operation. As the operation is composed of nested function calls, it should be read from bottom to top. The call to constant is common in BDD programming, but not useful here. satisfy - count // =⇒ 2 ( 1 0,1 0 0 , restrict // =⇒ x −−→ y − →z− → 1 ∪ ··· → 0 ( x, 0 0 0 0 , apply // =⇒ x − →y− →z− → 1 ∪ ··· → 0 ( ∧ 0,1 0,1 0 0,1 0,1 1 , make (z , 0) // =⇒ x −−→ y −−→ z − → 1 ∪ x −−→ y −−→ z − →0 0 0 0,1 , apply // =⇒ x − →y− → z −−→ 1 ∪ · · · → 0 ( ∧ 0,1 0 0,1 0,1 1 0,1 , make (y , 0) // =⇒ x −−→ y − → z −−→ 1 ∪ x −−→ y − → z −−→ 0 0 0,1 0,1 1 0,1 0,1 , apply // =⇒ x − → y −−→ z −−→ 1 ∪ x − → y −−→ z −−→ 0 ( ∧ 0 0,1 0,1 1 0,1 0,1 , make (x, 0) // =⇒ x − → y −−→ z −−→ 1 ∪ x − → y −−→ z −−→ 0 0,1 0,1 0,1 , constant (1) // =⇒ x −−→ y −−→ z −−→ 1 ) ) ) ) ) 7 Inspired by the documentation of CUDD. Specification of decision diagram operations Petri Nets & Concurrency – 441 This language for BDD-like operations has several drawbacks. We describe them in the remainder of this section. Note that these drawbacks do not prevent wide use of this language. Problem 1 (make restricts terminal values). Creating a DD with this oper- ation requires a default value for other paths. For instance, make(x, 0) creates a DD returning 1 whenever x = 0, but implicitly also 0 for the other cases. Boolean functions have a trivial implicit result, as only two terminal values are available. But it is not the case for ADDs, which represent functions Bn → N. Their implicit result is chosen as the identity element 0 of addition. With addi- tion, all terminal values can be obtained, as in apply(+, make(x, 0), make(x, 0)) 0 0,1 0,1 1 0,1 0,1 which returns x − → y −−→ z −−→ 2 ∪ x − → y −−→ z −−→ 0. So, the set of terminal values should be a monoid for BDD-like operations. It requires defining a + operation to generate all possible terminal values. Note that constant can then be also restricted to the terminals’ generator only. Problem 2 (The language is not minimalist). The compose operation is redundant. It can be expressed using two other operations, apply and restrict. Usually, this is not a problem, as redundant operations can be safely removed from the language to get a minimal one. But the operation is specifically de- fined because its algorithm is more efficient. So, this BDD-like language is not minimalist, but removing the redundant operation has an efficiency cost. The same task can be defined in several ways, that are not equivalent regard- ing efficiency. Writing a complex operation therefore requires knowledge from the user on how its evaluation internally works. Problem 3 (The language requires embedding). A usual operation in model checking is to compute the fixed point of next states computation. But the BDD-like language has no operations for iteration or recursion. So, it requires to be embedded into a general-purpose language, that can compute the fixed point. The BDD-like languages are thus truly DSLs. They describe an operation in a concise and readable manner, but are not as expressive as a general-purpose programming language. They can express only very simple computations. Problem 4 (Operations cannot be optimized). In [11] it has been shown that rewriting the operations can lead to huge performance improvements. This has been partially automated in DDD-like languages [2]. Because of embedding (see Problem 3), a library with BDD-like operations cannot see the whole opera- tion to perform. It only processes it in small operations. Only few optimizations are available for these, whereas most improvements require access to the full syntax tree of the operation. This problem is very similar to Problem 2, which occurs because the compose operation cannot be expressed efficiently enough using other operations. The lan- guage is not low-level enough to enable optimizations that could bring the same efficiency. Its operations take whole DDs as parameters and return a whole DD. So, the language lacks more intrusive operations, that apply to parts of the DDs. DDD-like operations presented in Section 4 define them, and have shown great optimizations thanks to lower-level operations. 442 Petri Nets & Concurrency Hamez et al. 4 DDD-like Operations The main difference between BDD-like and DDD-like structures is the input domain of represented functions. In [5], paths are defined as “sequences of as- signments”, because in the same DD, paths can be of various lengths, and the same variable can appear several times along a path. We cannot describe the do- main of functions represented by DDD-like structures with a cartesian product notation. We propose in [12] a specification of their domain. These categories include DDDs [5] and SDDs [13] which represent sets and have Boolean terminals, and MultiSet Decision Diagrams (MSDDs) [14] which represent multisets with natural integer terminals. We propose in [12] a unifi- cation of BDD-like and DDD-like structures, where they represent functions on complex data types rather than functions with multiple arities. We do not give importance to these subtleties in this article, so readers only need to be aware of paths with different lengths and variable repetition. DDD-like operations are partitioned in two languages: a language of binary set operations (Sec. 4.1) and a language of unary set transformation (Sec. 4.2). They are not totally disjoint, as bridges exist between them. 4.1 Binary Set Operations The language of set expressions is composed of the usual binary set opera- tions ∪, ∩, \ – or their multiset counterparts – applied to two DDs. These three operations are distinct, whereas they are all covered by apply in BDD-like op- erations. A general definition is difficult, because domains of variables can be unbounded. To preserve the finite graph representation of the function, an infi- nite number of paths must lead to terminal 0, whereas a finite number lead to other terminals. So, the operation applied to terminals should always return 0 when both its operands are 0 to ensure termination. A general apply could exist, given operator on terminals respects this constraint. These operations are a small subset of BDD-like operations. For instance, they are not sufficient to compute a state space. So, they are completed with a language for set transformations. 4.2 Unary Set Transformations The unary set transformations, initially proposed in [5], have nothing in common with BDD-like operations. They are close to the map functions found in most functional languages: for a DD category representing sets, an operation is applied to each path. But instead of returning a transformed path, application on each path returns a set of paths, each set represented by a DD. These transformations are homomorphisms. They must therefore enforce a constraint: all must be linear for a union ∪ operator defined in the DD category: h(d1 ∪ d2 ) = h(d1 ) ∪ h(d2 ). The linearity property means that every input DD can be split in several DDs, then the operation applied to each, and the result obtained by union of their results. For simplicity, operations given below do not Specification of decision diagram operations Petri Nets & Concurrency – 443 mention particular cases required by linearity. These cases are common to all operations, and are discussed in Problems 8 and 9. Note that these operations are applied on quasi-reduced DDs, i.e. without level skipping. Unary transformations are parametric: they take parameters that define their behavior in addition to the DD to transform. To describe the language of such operations, we show each one with both kinds of parameters. The first one, between brackets, contains all parameters except the DD to transform. The second one, between parentheses, is this DD. We thus clearly distinguish between parameters that define the operation, and the parameter to which it is applied. terminal[t] is the only operation that allows to create a DD outside unary set transformations. It is therefore crucial, as the basis for each operation. This nullary operation creates a DD terminal valued by t. It is close to the constant BDD-like operation. But they differ because the DD returned by terminal repre- sents the nullary function that returns t, whereas the DD returned by constant represents a n-ary function. terminal[t]()= {7→ t} constant[dc ] returns the constant DD dc , whatever its input DD is. This opera- tion is parameterized by the DD to return, and is applied to a DD that it ignores (almost, see Problem 8). The name of this operation can be misleading. The re- turned DD represents a function that can be constant or not: all its paths do not necessarily lead to the same terminal. But the constant operation returns this DD independently of its input. It thus differs from constant of BDD-like operations, which returns a constant function. constant[dc ](d) = dc identity[] returns its input with no modification. This operation can be ex- pressed using other operations, but is introduced in DDDs for efficiency. identity[](d) = d; x x →] adds a prefix v − make[v − → to a DD. It differs from the BDD-like make opera- tions, because it adds a variable to the represented function given as parameter, whereas the BDD-like  creates an n-ary function from scratch.  x x1 xn  v − → v 1 −→ . . . vn − −→ t  x x1 xn make[v −→](d)= v1 −→ . . . vn −−→ t∈d  v −−−−→ v − {y6 = x} x1 xn  1 → . . . vn − −→ 0 x x →, h] is the inverse operation of make. It selects paths where v − match[v − → is a prefix of the DD, and then applies another operation h to the subDD that prefix leads to. Other paths n return the identity DD. o x x1 xn x x1 xn match[v −→, h](d)= h v1 −→ . . . vn −−→ t v−→ v1 −→ . . . vn −−→ t∈d This operation is overloaded, as it also exists to match a terminal. match[t, h](d)= h ({t | t ∈ d}) composition[h1 , h2 ] computes the composition h1 ◦h2 , and applies it to its input DD. It has no relation with compose of BDD-like operations, which substitutes a variable with the result of a DD. composition[h1 , h2 ](d) = (h1 ◦ h2 )(d) = h1 (h2 (d)) 444 Petri Nets & Concurrency Hamez et al. union[h1 , h2 ] (respectively intersection) computes the union (resp. intersec- tion) of results of h1 and h2 applied to the input DD. This operation wraps some of the set operations presented in Sec. 4.1. Set difference \ is not wrapped because it is not a linear operation. Note that these two operations can be easily extended to n-ary versions as they are associative and commutative. union[h1 , h2 ](d) = h1 (d) ∪ h2 (d) intersection[h1 , h2 ](d) = h1 (d) ∩ h2 (d) fixpoint[h] has been introduced for model checking applications. Problem 3 shows that BDD-like languages lack operations for iteration. fixpoint provides such an operation. It computes the fixed point of h, applied to the input DD. fixpoint[h](d) = h∞ (d) = h(. . . h(d) . . . ) The DDD-like unary set transformations are much more expressive than the BDD-like operations. Figure 2 shows a Petri net and an operation that computes its state space, which will be improved later. The operation uses the fixpoint operator where BDD-like operations require to be embedded in a general purpose language. To encode an least fixed point, identity is used to keep previously computed states. We show how the operation works by applying the operation for transition t 1 2 0 to the DDD path q − →p− →r− → 1 representing the initial state of the Petri net. 1 1 2 0 →, . . .] (q − match[q − →p− →r− → 1) 0 2 0 =⇒ →, . . .] (p − make[q − →r− → 1) 0 2 2 0 =⇒ q−→ match[p − →, . . .] (p − →r− → 1) 0 0 0 =⇒ q−→ make[p − →, . . .] (r − → 1) 0 0 0 0 =⇒ q − →p− → match[r − →, . . .] (r − → 1) 0 0 1 =⇒ q−→p− → make[r − →, . . .] (1) 0 0 1 =⇒ q − →p− →r− → match[1, . . .] (1) 0 0 1 =⇒ q−→p− →r− → identity[] (1) 0 0 1 =⇒ q−→p− →r− →1 Problem 5 (Operations usually must be lazily defined). The operation given in Figure 2 has a bug, because the match operation for transition u requires exactly one token in p. So, it cannot be applied from the initial state. We have to enumerate all possible cases, by adding : 1 0 1 0 , match [q − → , make [q − → , match [p − → , make [p − →, 0 1 match [r − → , make [r − → , match [1, identity []. . .] // u We cannot express “add n tokens” or “remove n tokens”, only “set to n tokens”. In practice, an operation usually has an a priori unbounded number of match calls. They will be discovered during computation. Description of the operation would be infinite, so it is in practice rather defined lazily. Then match uses a function returning the operation associated with each encountered value. Problem 6 (Lazy operations cannot be optimized). The operations that compute the state space of Petri net in Figure 2 can be optimized. The operation Specification of decision diagram operations Petri Nets & Concurrency – 445 1 1 p t u q 2 1 1 1 r 1 1 1 v fixpoint [ union [ identity 1 0 2 0 , match [q − → , make [q − → , match [p − → , make [p − →, 0 1 match [r − → , make [r − → , match [1, identity []. . .] // t 1 0 1 0 , match [q − → , make [q − → , match [p − → , make [p − →, 0 1 match [r − → , make [r − → , match [1, identity []. . .] // u 0 1 0 1 → , make [q − , match [q − → , match [p − → , make [p − →, 1 0 match [r − → , make [r − → , match [1, identity []. . .] // v ] ] Fig. 2. A Petri net and the operation encoding its firing rule. The variable order used by operations is q < p < r. Transitions’ names are given as comments after their encoding. We use ]. . .] instead of numerous closing brackets, so indentation is meaningful. is rewritten to an equivalent but more efficient one, given below, by merging identical match calls. It is more efficient because some parts of the operation are merged, and thus require fewer application of the operations. The currently most advanced optimization technique, called “automatic saturation” has been introduced in [2]. It follows the same principle of rewriting an operation into a more efficient one. fixpoint [ union [ identity 1 0 , match [q − → , make [q − → , union [ 1 0 0 1 match [p − → , make [p − → , match [r − → , make [r − →, match [1, identity []. . .] 2 , match [p − → , union [ 0 0 1 make [p − → , match [r − → , make [r − → , match [1, identity []. . .] 1 0 1 make [p − → , match [r − → , make [r − → , match [1, identity []. . .] ]. . .] 0 1 0 1 , match [q − → , make [q − → , match [p − → , make [p − →, 1 0 match [r − → , make [r − → , identity []. . .] ] ] 446 Petri Nets & Concurrency Hamez et al. Non-optimized and optimized ones are expressed in the same language. So, the language is suitable for optimization by program transformation. It solves the Problem 4 of BDD-like languages. Lazy operations seen in Problem 5 are useful to deal with a priori unbounded structures. But they have a drawback: they cannot be optimized, because their suboperations are not known until execution. To circumvent the problem, new operations have been added in DDD-like languages. They are redundant with existing ones, but provide the missing information. However, as more optimiza- tions are added, more specific operations like these must also be added. This approach creates several redundant ways to define the same operations. Users have to know which one is better to enable most optimizations. Problem 7 (Weak typing). Each operation is designed to work on DDs of a given type, i.e. a variable order and a category that defines the variables’ domains and output domain. For instance, the following operation can only be 1 1 applied to DDDs of variable order x < y. It detects the path x − →y−→ 1. 1 1 match [x − → , match [y − → , match [1, constant [ terminal [1]]]]] Operations’ operands which are DDs are not explicitly typed. Only the user knows their type. So, an operation can discover at runtime that it is not applied to an intended operand. The operation then fails, but how this failure is reported is unclear in the DDD-like languages: return of a special terminal >, assertion to stop the program, exception that can be caught. . . or even return of terminal 0 which is in fact silent. Instead of giving a DD of the wrong type to an operation, typing errors can also appear because of DDs returned by inner operations. This leads to errors (known in DDDs as the “top” > terminal) that are hard to debug8 . Such a typing error occurs when operands of union or intersection do not return DDs of the same type. For instance, the following operation is erroneous because the first part of union returns a DDD with variable order x < y whereas the second has variable order y < x. 1 0 union [ make [x − → , make [y − → , constant [ terminal [1]]] 1 0 , make [y − → , make [x − → , constant [ terminal [1]]]] It seems easy here to check the typing problems. But lazy operations, required for Problem 5, prevent checking in most practical cases. Problem 8 (An identity DD is only transformed to an identity DD). The linearity constraint of homomorphisms requires that every DD can be split in several other DDs, operations applied to them, and their results merged. Each DD can be decomposed to itself and the identity DD (noted d0 ), where all paths lead to terminal 0. From the definition of linearity, a DD can be de- composed in itself and the identity, as in h(d) = h(d ∪ d0 ) = h(d) ∪ h(d0 ). So, every operation must return the identity DD when applied to the identity d0 . 8 Users that have used at least once a DDD library know the pain to look at thousands of lines of DD paths to search where something went wrong. Specification of decision diagram operations Petri Nets & Concurrency – 447 For instance, all operations applied to the DDD representing the empty set return the empty set. Even constant[dc ](d0 ) cannot return its DD parameter dc when applied to d0 . In this particular case, it must return the identity d0 . This constraint does not exist in BDD-like operations. They can return any DD when applied to the identity one. This requirement prevents the user of DDD-like operations to start its ap- plication with the identity DD, and then filling it through operations. In model checking applications using DDDs, we have to start therefore with an initially non-empty set of states, which is usually the initial state. Problem 9 (Operations on terminals must also be linear). This is an- other consequence of homomorphisms, as in Problem 8. Consider a DD repre- senting multisets. Each path represents an element of the multiset, its terminal represents its cardinality. The multiset union sums the terminals of identical paths. Because each DD can be decomposed, the result computed for a path leading to n must be the same as the sum of results for the same path ending with n1 and n2 , where n = n1 + n2 . For instance, Fig. 3 shows an operation that tries to count paths leading to terminal 1. Its result is computed by adding all subresults, given as destinations of red bold arrows. The final result is shown for a DD (left part) and for one ar- bitrary decomposition of the DD (right part). They differ, because the operation on terminals is not linear. 1 1 0 0 0 0 ⇔ 0 2 0 x 1 + x 1 + x 0 + ... x + 1 1 1 x 1 + x 0 + ... 1 1 1 0 =1 6= =3 Fig. 3. Counting paths leading to terminal 1 is not a linear operation Users specify operations only for paths leading to 1 in DDDs. This category has Boolean terminals, and homomorphisms always return the identity DD d0 when paths end with 0. For DDs representing multisets (MSDDs in [14]), letting the user specify the operation for any non-zero terminal value is dangerous, as it might break linearity. To ensure this property, user should still define the operation only for terminal 1. Problem 10 (Goal of restriction to linear operations is unclear). Even if we allow operations to return not only DDs, but also other data types such as integers, the linearity constraint is too restrictive for many operations. For instance, even a simple operation like counting the number of paths leading to a particular terminal value is not a linear operation (see Fig. 3 of Problem 9). DDD-like languages have an operation to count paths, but it cannot be used within linear operations. So, this special operation is doomed to appear only at the end of a computation. The reason why DDDs define their operations as homomorphisms is not explicitly given in [5]. They are well adapted for set transformation, but they restrict drastically the expressiveness of their language. We believe that the main justification for such constraints is efficiency. 448 Petri Nets & Concurrency Hamez et al. Linearity, h(d1 ∪ d2 ) = h(d1 ) ∪ h(d2 ), ensures not only that a DD can be decomposed to its paths to compute the operation’s result, but also that the subresults can always be combined before the next computation. Consider the operation (f ◦ g)(d1 ∪ d2 ). It can be rewritten as the union of two compositions (f ◦ g)(d1 ) ∪ (f ◦ g)(d2 ), or as f (g(d1 ) ∪ f (g(d2 )). In both cases, the result if the same, whether f is applied to two subresults g(d1 ) and g(d2 ), or to their union g(d1 ) ∪ g(d2 ). The second case is likely to do less computations9 than the first one, thus saving time and memory. This optimization is possible because of the linearity constraint. We propose in Sec. 5 to move this constraint where it should be: as an optimization enabled only when operations can be proven linear. Problem 11 (Unary operations are not enough). Some operations, such as set union ∪, intersection ∩ or difference \ for DDDs are not unary. As DDD- like set transformations are unary only, there is no way for a user to describe them inside the homomorphisms framework. Restriction to unary functions is useful to define operations as combina- 1 0 tors. For instance, composition[match[x − →, identity], make[y −→, identity]] is a unary operation, because each suboperation is also unary and returns ex- actly one result. The input DD argument of composed functions is implicit. Composition of n-ary operations does not allow such shortcuts, because sev- eral links can exist between the inputs and outputs of composed operations. But n-ary operations are the general case, that is missing in DDD-like operations. 5 Generalized Operations We propose in this section a language for generalized operations, that solves the problems identified in Sections 3 and 4. This language covers both BDD-like and DDD-like languages. Throughout the section, a simple program given in Fig. 4 is used as example. It counts the similar paths in two DDs. Note that the proposed operations are applied to quasi-reduced DDs, as in DDD-like unary operations. Missing parts must be rebuilt on-the-fly when the operations are applied. Program A program is a set of named operations, like check , check y , check x