=Paper=
{{Paper
|id=Vol-2220/13_CONFWS18_paper_29
|storemode=property
|title=Quasi-finite Domains: Dealing with the Infinite in Mass Customization
|pdfUrl=https://ceur-ws.org/Vol-2220/13_CONFWS18_paper_29.pdf
|volume=Vol-2220
|authors=Albert Haag
|dblpUrl=https://dblp.org/rec/conf/confws/Haag18
}}
==Quasi-finite Domains: Dealing with the Infinite in Mass Customization==
Quasi-Finite Domains: Dealing with the Infinite in Mass Customization Albert Haag1 Abstract. In this paper we propose to relax finiteness in relational used in assembling the variant3 . tables and tabular constraints in a controlled way. We preserve the MC adds customization to this setting by placing the emphasis syntactic representation of a row in a table as a tuple of symbols. on individualization, i.e. there will be many variants of a product Some of these symbols refer to an atomic value as usual. Others, and the business is prepared to produce only a single unit of each which we call quasi-finite symbols (QF-symbols), refer to infinite one on demand (lot size one). Accordingly, the product properties subsets of an underlying infinite domain. Practical examples for QF- that distinguish the variants are central and potentially numerous4 . symbols are references to (uncountable) real-valued intervals and In other work [13] we discuss the MC setting in more detail and wildcards representing countably infinite sets. Our goal is to provide show that compression of variant tables, tables listing combinations a simple and smooth extension of the tabular paradigm, predomi- of product features, is a key element in managing the exponential nant in business, that is compatible with compression of the table explosion of the number of variants caused by the increase in the to c-tuples [14] or to a variant decomposition diagram [11], and is number of customization choices, which production technology amenable to constraint processing, such as local propagation. now enables. Here, we propose to add to the expressivity of The approach is based on organizing the QF-symbols pertaining variant tables by presenting a quasi-finite (QF) framework that al- to each product property in a specialization relation [8, 9]. A spe- lows dealing with infinite sets of choices within the tabular paradigm. cialization relation is a partial ordering that expresses specificity of meaning. A QF-symbol can be ignored in the presence of a more If the domains of all descriptive properties are finite, then the num- special one. ber of variants is finite as well. Leaving the reference to the under- To ensure that the sets represented by two distinct QF-symbols lying generic MC product aside, each variant is defined by a value pertaining to the same domain are disjoint, we further require that it assignment to the properties, which we represent as a relational tuple must be possible to represent the intersection and set-differences of (r-tuple). If the number of offered variants is not large, these r-tuples QF-symbols. In order to be able to remove duplicates implicated by can be maintained as rows in a database table or spreadsheet, which a disjunction of QF-symbols from result sets of queries, we require then acts as a product model comprised of a single tabular constraint. that it is possible to represent their normalized set-union. If desired or needed, this overall variant table can conceptually be QF-symbols may refer to any objects as long as the above require- split into smaller tables that together form a constraint satisfaction ments are met, e.g. regular expressions (unary predicates), rectangles problem (CSP). Each CSP variable corresponds to a product prop- (geometric shapes), etc. erty and each CSP solution to an offered product variant5 . We refer to any tabular constraint on product properties as a variant table. Variant tables are a form of modeling that is very acceptable to 1 Introduction a business. Their downside is that they may not scale with a grow- ing number of choices for individualization. However, we show in This work expands on a common theme: that data in tabular form is a [13] that expected regularities in the product variants will allow a natural, non-proprietary medium for communicating between inter- compressed form of the table to scale. Here, our choice of the com- related business processes within an enterprise, as well as between pressed form is a variant decomposition diagram (VDD) [10, 11], enterprises. We focus on mass customization (MC), which we take to and the associated c-tuples, a term adopted from [14] and used there be mass production with a lot size of one. A non-configurable prod- for a Cartesian product of sets of values. uct, amenable to mass production, can have product variants2 . For However, infinite domains occur in MC practice, and neither ta- example, mass produced ballpoint pens come in several colors, but bles of r-tuples nor classic CSP approaches allow infinite sets. In are otherwise identical. The offered colors are in a one-to-one cor- this paper we propose to relax finiteness in variant tables, and by ex- respondence with the manufactured ballpoint variants. The business tension in the associated constraint processing, in a controlled way. attributes are maintained once for the generic ballpoint pen. Only one 3 The SAP tables relevant for configuration are listed in [6], Appendix A generic bill of materials that covers all possibilities needs to be main- 4 For simplicity of exposition, we disregard the possibility of needing to tained. For each variant the value of an additional product property, deal with variant structures, i.e. variants that have variants as parts. Our color, is needed to determine which ink filling and matching cap is approach here addresses tabular data in general and could be extended to variant structures if needed. 1 Product Management GmbH, Germany, email: albert@product- 5 In practice, product models are not limited to use only tabular constraints. management-haag.de However, the reasoning here shows that product variants could be exclu- 2 We use SAP terminology pertaining to the handling of products and product sively expressed in tables in the finite case. The single overall table listing variants, citing [6] as a general reference. A brief sketch of the history of all variants can be seen the the result of compiling the product model, e.g. the SAP Variant Configurator is given in [7] to a decision diagram [2]. Our goal is to provide a simple and smooth extension of the tabular As stated, the goal of this work is to smoothly extend the tabular paradigm that retains its acceptance in business, and, particularly, al- paradigm, not to compete with other dedicated problem solving lows compression to a VDD and c-tuples. We preserve the syntactic approaches, and we do not make any such comparisons here. representation of a row in a table as a tuple of symbols, while allow- The quasi-finite (QF) approach has not yet been tried in the field. ing some of these, which we call quasi-finite symbols (QF-symbols), Therefore, we cannot present results. Given that the VDD processing to refer to infinite sets. A symbol for a real-valued interval, which remains syntactically alike to the finite case, and given our positive is uncountably infinite by definition, is an example of a QF-symbol. experiences with specialization relations in other endeavors, we A wildcard symbol that refers to an infinite domain is also a QF- are confident that performance is not the issue. Instead, it will be symbol. In contrast, a wildcard for a finite domain is just an alias for a primary concern to establish usefulness in practice and evaluate the finite set of values. We treat this as “syntactic sugar” and equiva- acceptance by the business community. lent to the expanded set of values. The approach we take here is illustrated by example in Section 4 The paper is structured as follows: and based on the following ideas6 : • We summarize a database approach to configuration in Section 2 • Each property domain is defined by a finite set of symbols: and the topic of compression to VDDs and c-tuples in Section 3. • We illustrate all ideas using an extensive example based on an MC – a finite domain by its values, which we refer to as r-symbols, T-shirt in Section 4. – an infinite domain by one or more (disjoint) QF-symbols. • Constructing VDDs from QF-tuples is akin to constructing them • We represent a value assignment to the product properties as a from c-tuples. This topic is beyond the scope of this paper. How- tuple of symbols. If the tuple contains only r-symbols, it is an r- ever, we summarize the basic problem of ensuring disjoint c-tuples tuple. If it also contains QF-symbols, we call it a QF-tuple. Both (QF-tuples) in Section 5. r-tuples and QF-tuples are interpreted a special cases of a c-tuple, • We look at the motivating examples of QF-symbols and how they where an r-symbol in the tuple is treated as a singleton set. meet our requirements in Section 6 in some detail. • We adapt the concept of a specialization relation from [8, 9] to • We discuss queries to variant tables with QF-symbols in Section QF-symbols. When queries or constraint solving need to consider 7. two different QF-symbols for the same property simultaneously, • We present our ideas on specialization relations and their relation they can ignore both symbols and focus instead on the more spe- to constraint processing in Section 8. In particular we show that cial symbol for their set intersection. local propagation works seamlessly. • Only a finite number of QF-symbols is needed, which can be de- • We also believe that using QF-symbols (and perhaps c-tuples in rived in advance from the product model (e.g. the property do- general) directly in the definition of a product variant has business mains and the variant tables)7 . benefits, which we discuss in Section 9. • Compression to a VDD, and through that to c-tuples, can be done • We provide a summary and an outlook in Section 10. as in the finite case, if we can ensure certain requirements are met. 2 Configuration in the Database Paradigm One difference between a QF-symbol and an r-symbol is that the former still allows choice, i.e. it can be specialized or restricted The easiest MC business setting is when the business offering is a further when required. The consequence is that some constraints small finite set of product variants actually represented in extensional may need to be formulated in non-tabular form, e.g. to express that form in a relational database table or spreadsheet. Even when this is for two real-valued properties length and width it should hold that: not possible, due to the size such a table would have, tabular con- length ≥ width. These constraints can be seen as inter-property straints can be used to define the valid variants. predicates. Whereas we discuss unary intra-property predicates as The extensional form of a tabular constraint naturally supports var- QF-symbols, we will not deal with other inter-property predicates in ious data queries such as (1) and (2), here formulated in SQL, which this paper, except to note in passing that restricting real-valued in- are the most relevant for configuration as discussed in [11]8 . tervals with numeric linear (in)equalities is an established technique The query in (1) returns a result set of all variants matching the (see Section 8.2) that can be smoothly integrated with our intended user’s criteria9 . The k product properties are denoted by v1 , . . . , vk . processing. The variant table is denoted as hvtabi. hRj i denotes a subset of the We show how configuration queries over variant tables with QF- domain Dj for product property vj . The values of interest to a user symbols can be meaningfully supported. We also believe that the when configuring can be communicated in the WHERE clause. concept of specialization relations is an important bridge to con- SELECT * FROM hvtabi straint processing in general. The idea of defining a specialization relation via the subset-relation can be inverted: given a set of sym- WHERE hv1 i IN hR1 i AND . . . hvk i IN hRk i; (1) bols from the column of a table that correspond to elements of a The query in (2) returns the domain restriction for property vj partial order, such that a unique greatest successor and a unique least under the WHERE clause. common predecessor exists for any two elements, these symbols can be treated in a like manner to QF-symbols for purposes of queries SELECT DISTINCT hvj i FROM hvtabi and constraint processing, if we are willing to interpret the partial WHERE hv1 i IN hR1 i AND . . . hvk i IN hRk i; (2) order as a specialization relation. 8 While the approach here may be extended to cover further SQL queries, this 6 This extends the simple processing of real-valued intervals and wildcards is beyond the scope of this paper. proposed in [10, 11] for a set-labeled VDD. 9 In the SQL syntax, an IN term in the WHERE clause need not be specified 7 If further QF-symbols are generated dynamically externally, the special- where no restriction is intended. However, for purposes of representing a ization relation will have to be extended dynamically. Nevertheless, at any query condition as a c-tuple (see Section 3), we will substitute Rj = Dj given time a finite number of symbols will be needed. for an omitted IN term These queries can also be done to further filter the result sets of must denote disjoint sets, in order to allow a unique decision for a previous queries (see [11, 10]). node, given a value assignment. Nodes in an l-chain that all have a common HI-link represent the To sum up: tabular constraints in extensional form can be evalu- disjunction of their value assignments and could be merged into one ated using database queries. In [11] we have shown that this extends set-labeled node. In [10, 11] we introduced a VDD with set-labeled to tables represented as VDDs in a way that also guarantees the effi- nodes, where a node was labeled with a finite set of r-symbols rep- ciency of the queries. We now have to show here how to handle the resenting a disjunction of value assignments. Since any node labeled queries (1) and particularly (2) in conjunction with QF-symbols. with such a finite set can be re-expanded into an l-chain of regular VDD nodes nodes that assign the symbols one at a time, we do not propose to use VDDs with set-labeled nodes in practice. We use them 3 C-Tuples, Table Compression, and Decision here in Section 4 to simplify the exposition. Diagrams A VDD is functionally equivalent to the extensional form of the The discussion of compression in this section is illustrated with table it represents from the perspective of the queries (1) and (2) examples using a simple T-shirt in Section 4. relevant for configuration, see [11]. The extension of these queries to include QF-symbols is the topic of Section 7. A VDD can also In the finite case a variant can be represented as an r-tuple. If we support counting the number of tuples in a table or a result set of a substitute sets for values in this tuple, the tuple is no longer relational, query and access a tuple directly by its position in the table/result set. but represents the Cartesian set of all r-tuples that can be formed as combinations using values from the sets. We call such a Cartesian 4 T-Shirt Example tuple a c-tuple10 . As a tuple we denote it by C = hC1 , C2 , . . . , Ck i, where Cj ⊂ Dj and Dj is the domain of the product property vj ∈ 4.1 Classic Finite T-Shirt Variants {v1 , . . . , vk }. As a Cartesian set it would be written as C = C1 × C2 × . . . × Ck . In [10] the concepts of representing a variant table using a VDD are In the context of the above definition, we don’t care whether an illustrated using an example of a simple T-Shirt. We use this exam- element Cj of a c-tuple is finite or infinite. Note that the set of r- ple here both to illustrate the concepts discussed so far, and also to tuples represented by a c-tuple is uncountable if one of the sets in the illustrate the proposed extension to infinite sets. c-tuple refers to a real-valued interval. The simple T-shirt has the three properties Imprint (v1 ), Size (v2 ), The WHERE clause with the k IN operators in (1) and (2) itself and Color (v3 ) with the finite domains: describes a c-tuple hR1 , . . . , Rk i, which expresses the set of values the user (the problem solving agent) believes in. We will refer to this • {M IB(Men in Black), ST W (Save the Whales)} c-tuple as the query condition. We will allow a query condition to be • {L(Large), M (Medium), S(Small)} any c-tuple from our variant domain. • {Black, Blue, Red, W hite} Only 11 variants are valid due to constraints that state that M IB C-tuples offer a way to compress tables. For example, if the set of implies Black and ST W implies ¬S(Small). Table 1 is the exten- all variants is totally unconstrained, this can be represented by a sin- sional form of the variant table, which is small enough to be used as gle c-tuple, which is the Cartesian product of the product domains. the only and definitive representation of the variants for the purposes With constraints, there will be more c-tuples, but often a c-tuple rep- of both business and configuration. It encodes the underlying CSP as resentation is much more compact than the extensional form [12]. a single tabular constraint. For this reason, c-tuples are already used both formally and infor- The query (1) can be used to filter the variants to the set match- mally in configuration practice. ing any given selection criteria (query condition) hR! , . . . , Rk i. For In the case of finite domains, a set of c-tuples can be further com- example, if the user needs a small (S) sized T-shirt, there is only one pressed to a decision diagram (DD). We use the form of a Variant solution (the first row in Table 1). Alternatively, if a Red T-shirt is de- Decomposition Diagram (VDD). As introduced in [10, 11], a VDD is sired, there are two variants that satisfy this (eighth and ninth rows), a binary rooted Directed Acyclic Graph (DAG), where each node has and the domains are restricted as follows: Imprint ∈ {ST W }, a label denoting the assignment of a property to a value (r-symbol). Size ∈ {M edium, Large}, and Color ∈ {Red} by applying the Here we will allow QF-symbols in node labels as well. Each node has query (2) for each property in turn. two emanating links, HI and LO, which we characterize as follows given a fixed ordering of the product properties: v1 , . . . , vk :11 Table 1. Simple T-shirt • the HI-link of a node points to a node for the next product property vj+1 or to the terminal sink > (true) from last column nodes. Imprint Size Color • the LO-link points to an alternate value assignment for the same MIB S Black MIB M Black product property vj or to the terminal sink ⊥ (false). MIB L Black STW M Black We will call a chain of nodes linked via LO-links an l-chain. If STW L Black more than one QF-symbol appears in an l-chain, the QF-symbols STW M White STW L White 10 We adapt the term from [14], which investigates direct compression to STW M Red c-tuples. STW L Red 11 Under these assumptions, a multi-valued decision diagram (MDD), a more STW M Blue widely known form of a DD [3, 1], can be mapped to a VDD. This is further STW L Blue detailed in [11] Figure 1 depicts a VDD with set-labeled nodes for Table 1. The Table 4. Split c-tuples for simple T-shirt with infinite domains HI-links in each path from the root to the sink > in the VDD in Figure 1 define a c-tuple. The set of all c-tuples that can be formed is disjoint and is a way to represent Table 1 in compressed form. Table Scale Imprint Size Color 1.0 MIB S;M;L Black 2 lists the two c-tuples needed to represent the 11 variants. 1.0 STW M;L Black;White;Red;Blue [1.0] himg-filenamei S;M;L White [0.5, 1.0) himg-filenamei S;M;L White 1:(3, MIB)|7 We now discuss how to construct a VDD with set-labeled nodes for these c-tuples. Figure 2 shows the result12 . 2:(3, STW)|6 1. We construct a root node ν1 labeled hv1 , [1.0]i starting with the first c-tuple. This node will be used for both the first and second 12:(2, [Large, Medium])|5 10:(2, [Large, Medium, Small])|3 c-tuples in Table 3. 11:(1, [Black, Blue, Red, White])|4 6:(1, Black)|2 2. We construct the l-chain (chain of LO-links) for the root node: • We pointed out in Section 3 that nodes linked in an l-chain need F T to have disjoint set labels. This is illustrated here. It will be a problem if we label ν2 with [0.5, 1.0] (third c-tuple), because then the value Scale = 1.0 does not allow deciding uniquely for either ν1 or ν2 . So we split [0.5, 1.0] into the two disjoint Figure 1. VDD of T-shirt with set-labeled nodes c-tuples. [1.0] himg-filenamei {S, M, L} W hite [0.5, 1.0) himg-filenamei {S, M, L} W hite Table 2. C-tuples for simple T-shirt • Table 4 shows all the c-tuples to be handled in constructing the VDD. The new third c-tuple is covered by ν1 . We label the Imprint Size Color second node ν2 , linked from ν1 via its LO-link, with the half- MIB S;M;L Black open interval [0.5, 1.0) from the fourth c-tuple. This handles STW M;L Black;White;Red;Blue the first column. 3. We next process the rest of the three tuples in Table 4 that start with C1 := [1.0]. We create: • ν3 , the target for the HI-link of ν1 , labeled with hv2 , M IBi 4.2 T-Shirt with Infinite Domains • ν4 , the target for the LO-link of ν3 , labeled with hv2 , ST W i To illustrate the use of infinite sets, we modify the example to allow • ν5 , the target for the LO-link of ν4 , labeled with an arbitrary user-provided image as an imprint on a white T-shirt. An hv2 , himg-filenameii image is identified at runtime via a file name. The file name must refer to a processable graphic, which is taken to mean that only a 4. It is straightforward to handle the third column for the above three jpg or a tiff format can be accepted. Hence, the domain is the in- c-tuples: nodes ν3 , ν4 , and ν5 have their HI-links pointing to finite set of all legal file names that match the regular expression nodes ν6 , ν7 , and ν8 , respectively with the labels depicted in Fig- himg-filenamei = ∗.jpg| ∗ .tiff. ure 2: We also add a property Scale to capture a factor to be used to scale • The first of the c-tuples allows only the color Black (node ν9 ). the image printed on the T-shirt. For the vintage prints MIB and STW • The second allows all colors (node ν10 ), and we require Scale = 1. For the user-provided images, the scale can be arbitrarily chosen by the user as a floating point number in the • the third allows only the color White (node ν11 ). range 0.5 to 1 (the interval [0.5, 1.0]). This completely handles the first three c-tuples. Table 3 lists the c-tuples needed to describe this setting. The prod- uct property Scale has here been placed as the first property v1 . The 5. It is now trivial to finish the VDD. Node ν2 still needs to be pro- other properties are now v2 (Imprint), v3 (Size), and v4 (Color). cessed with respect to the last (fourth) c-tuple. But the columns two to four are identical to those in the third c-tuple. Node ν5 was Table 3. C-tuples for simple T-shirt with infinite domains already constructed for this. We note that we skirted the issue that the product property Im- Scale Imprint Size Color print (v2 ) allows both values from a finite list, e.g. {M IB, ST W }, 1.0 MIB S;M;L Black 1.0 STW M;L Black;White;Red;Blue as well as arbitrary “additional” values (himg-filenamei). This is not [0.5, 1.0] himg-filenamei S;M;L White 12 To reduce the size needed to display the graph, the terminal sink ⊥ has been omitted. Conceptually, it terminates all chains of LO-links. Also, the nodes ν1 , ν2 , . . . νn are identified by “n1”, “n2”, . . . “nn” The result set intersected with the external condition is then: h[0.5, 1.0], {my-img.jpg}, {S, M, L}, W hitei n1:(1, [1.0]) If the query formulates the additional restriction Scale ∈ [0.25, 0.75], then the result set intersected with the external condi- n2:(1, [0.5, 1.0)) n3:(2, MIB) tion is: h[0.5, 0.75], {my-img.jpg}, {S, M, L}, W hitei n4:(2, STW) n6:(3, [L, M, S]) 5 Excursion on the Construction of VDDs from C-Tuples n5:(2,) n7:(3, [L, M]) A c-tuple C can be decomposed into its head (the first element C1 ) and its tail T, which is also a c-tuple. We denote this by C := C1 |T: n8:(3, [L, M, S]) n10:(4, [Black, Blue, Red, White]) C := hC1 , . . . , Ck i = C1 |hC2 , . . . , Ck i = C1 |T (3) When constructing a (partial) VDD from a list of c-tuples n11:(4, White) n9:(4, Black) C1 , . . . Cm an l-chain for the head (root) node is constructed us- ing the first elements C11 , C21 , . . . , Cm1 . As discussed in Section 3 and evident from the example in Section 4.2 these elements must be T disjoint. If there are two c-tuples Ci , Ci0 with the same tail, i.e. Ci = Ci1 |T and Ci0 1 = Ci0 1 |T Figure 2. VDD of T-shirt with non-finite set-labeled nodes then their first elements must be merged to yield one c-tuple C0 = (Ci1 ∪ Ci0 1 )|T uncommon in practice where a “standard” solution is modeled with a predefined finite domain, but additional values are allowed (see We can ensure disjointness of any other pair of c-tuples Ci , Ci0 [6]). The sets {M IB}, {ST W } and himg-filenamei are effectively with differing tails treated as disjoint by the VDD due to the constructed l-chain of nodes Ci = Ci1 |Ti and Ci0 1 = Ci0 1 |Ti0 ν3 , ν4 , and ν5 ). This could be formally ensured by augmenting the QF element himg-filenamei to himg-filenamei ∩ ¬{M IB, ST W } by replacing them with the three c-tuples Ca , Cb , Cc in (4) (a c-tuple (see Section 6). with an empty element is considered empty and can be disregarded): Lastly, we informally discuss some exemplary queries. QF queries are the subject of Section 7. The query to Table 1 for small (S) T- Ca = (Ci1 \ Ci0 1 )|Ti Shirts yielded a result set consisting of one r-tuple (the first row). The Cb = (Ci0 1 \ Ci1 )|Ti0 domains for the three product properties were restricted to {M IB}, Cc = (Ci1 ∩ Ci0 1 )|(Ti ∪ Ti0 ) (4) {S}, and {Black}. The same query condition against Table 4 yields three c-tuples (the first, third and fourth c-tuple). Each of these c- As the example in Section 4.2 shows, the c-tuple heads show up tuples in the result set must be intersected with the query condi- directly as labels of set-labeled nodes. We already stated that a set- tion to eliminate the sizes medium (M) and large (L) that are in labeled node labeled with a finite set of symbols (r-symbols or QF- the c-tuples but excluded by the query condition. Consequently, the symbols) can be expanded to an l-chain of regular VDD nodes. domains for the four product properties are restricted to [0.5, 1.0] {M IB, himg-filenamei}, {S}, and {Black, W hite}13 . 6 Operations with Non-Finite Elements in Similarly, the query to Table 1 for Red T-Shirts yielded a result set C-Tuples consisting of two r-tuples (the eighth and ninth row). Against Table 4 the result set consists of one (the second) c-tuple. After intersection As the discussion in Section 5 and the example in Section 4 make with the query condition the four product properties are restricted to clear, it will be necessary to both split and combine c-tuples when [1.0] {ST W }, {M, L}, and {Red}. constructing a VDD and result sets. Therefore, we need the follow- Instead, if the query condition simply specifies a file name for a ing operations on c-tuple elements Cij , Ci0 j pertaining to the same particular image, my-img.jpg, then the last two c-tuples would be the product property vj : result set. They agree completely except in the first column. As there • set intersection: Cij ∩ Ci0 j is no need for the split here (as there was when constructing the orig- • set union: Cij ∪ Ci0 j inal VDD), the two tuples could be combined into one14 : • negation with respect to the overall domain: ¬Cij := Dj \ Cij • set difference: Cij \ Ci0 j = Cij ∩ ¬Ci0 j h[0.5, 1.0], himg-filenamei, {S, M, L}, W hitei For finite sets this is a given. For QF-symbols that are used in the 13 Formed by collecting all symbols occurring for each column and calculat- labels of VDD nodes, we must ensure that these operations are well ing the the union. The result of the union of the two intervals is normalized (see Section 6) defined and fit in our QF framework. 14 This reduction is actually required where we want the tuples in a result set In the following subsections we look at this in detail for the infinite to be distinct, i.e. to have been normalized. elements we propose to add: • Real-valued intervals A unary predicate can be represented by its name (a symbol), • Unconstrained countably infinite sets which serves as the QF-symbol identifying it. In the example in Sec- • Sets of exclusions, particularly finite exclusion sets. tion 4.2, we used the notation himg-filenamei to refer to a regular expression for legal file names. Where the domain underlying negation needs to be made clear we The set operations translate into logical operations for predicates. will denote negation as: The union of two infinite sets qualified by predicates π1 and π2 is just D ¬C := C =D\C a set qualified with the disjunction π1 ∨ π2 . Similarly, intersection translates to π1 ∧ π2 , and negation to ¬π1 . Again, we require normalization to reduce a complex logical 6.1 Real-Valued Intervals and the Xnumeric expression by removing any redundant elements. It has yet to be Datatype determined what works best in practice here. From a theoretical We denote a real-valued interval using conventional mathematical view, we might require a disjunctive normal form (DNF). The notation, e.g. [a, b) for a half-open interval with a closed lower bound overall predicate could then be represented as a list (finite set) of a and an open upper bound b. This is the set of all real numbers x conjunctions. A set-labeled node for such a list can be expanded to such that x >= a ∧ x < b. We allow lower and upper infinity, an l-chain of nodes, as for xnumerics. Each such node would be denoted by − inf and + inf, with open bounds. A single real number labeled by a conjunction of predicates, which would be treated as an x can be encoded as a singleton interval [x]. All other interval bounds indivisible QF-symbol. can be open or closed We define an xnumeric to be a finite list of real-valued intervals We must also deal with set unions between finite sets and count- representing the union of its elements in a normalized form. Normal- ably infinite sets. In the example in Section 4, the standard imprints ized means that the intervals in the list are disjoint, separable, and in for the T-shirt formed a finite set {M IB, ST W }, but “additional ascending order, e.g. the set of intervals {[0.5, 1.0), [1.0]} is disjoint values” were then allowed, which were specified by the QF-symbol and ascending, but it is not separable. In normalized form it is just himg-filenamei. The domain of the property imprint is just the union [0.5, 1.0]. (Remark: The interval {[0.5, 1.0), (1.0, 2.0]} is separable of these sets. Generally, the domain D for a product property with a and thus normalized, because its two intervals are separated by the non-xnumeric datatype is D = {F, π} := F ∪ π, where F is a finite “gap” of the singleton interval [1.0].) set of values, π a predicate representing an countable infinite set, and For xnumerics it is straightforward to ensure normalization. First, both F and π respect the datatype assigned to the product property.16 any intervals that are non-disjoint or not separable can be merged The set-operations then become: into one interval. Since the remaining intervals are disjoint, they can be ordered. Hence the union of two xnumeric is just the set union fol- • set intersection: {F, hπi} ∩ {F 0 , hπ 0 i} = {F ∩ F 0 , hπ ∧ π 0 i} lowed by normalization. The intersection is just the list of pairwise • set union: {F, hπi} ∪ {F 0 , hπ 0 i} = {F ∪ F 0 , hπ ∨ π 0 i} hπi intersections. Because the xnumeric is ordered due to normalization, • negation: ¬{F, hπi} := F ∩ ¬hπi this operation is efficient in the sense that it is not necessary to actu- • set difference: {F, hπi} \ {F 0 , hπ 0 i} = {F 00 , ¬F 0 , hπi \ hπ 0 i} ally intersect all pairs. – where F 00 is the finite set F \ F 0 ∪ F \ hπ 0 i, and The set union of two intervals is not necessarily again an interval, hπi hence we need the concept of an xnumeric. – the finite set ¬F 0 = F 0 is an exclusion set of all values in The unconstrained xnumeric is the interval (− inf, + inf). The F 0 that lie in hπi (see Section 6.3). negation of an xnumeric is the set difference to this unconstrained set. It is formed by inverting the finite number of “gaps” between the 6.3 Exclusions and Exclusion Sets intervals in the xnumeric. For example An exclusion of a value x from a property domain D is a way of stat- ¬{[0.5, 1.0), (1.0, 2.0]} = {(− inf, 0.5), [1.0], (2.0, + inf)} ing D\{x}. It means that x is considered to be invalid, which we will Remark: a finite set of real number values can be represented as denote by ¬x. For real-valued domains, exclusions can be directly an xnumeric using singleton intervals. All interaction with finite sets formulated as xnumerics, e.g., {(− inf, x)(x, + inf)} would exclude is covered by the above operations defined for xnumerics. the real number x. For a finite domain or an xnumeric domain, we can An xnumeric is a list of QF-symbols (intervals) representing their simply positively represent the set D \ {x}. For a countably infinite set union. A set-labeled node for an xnumeric can be expanded to an domain, we need further expressiveness. Given a countably infinite l-chain of nodes with interval labels. domain D for a product property and a finite set of values E ⊂ D, D we introduce an exclusion set ¬E := E := D \ E. An exclusion set ¬E can be merged with a unary predicate π by removing any val- 6.2 Countably Infinite Sets and Domains ues from E that do not satisfy the predicate π, i.e. ¬E ∩ hπi ⊂ ¬E Examples of countably infinite domains are the list of all integers or is a reduced exclusion set. In order to keep the exposition simple, we all strings. This requires that each product property is associated with will ignore this reduction and denote ¬E ∩ hπi also by ¬E. an immutable datatype. We consider a domain D or any C ⊂ D to be For two exclusion sets ¬E, ¬E 0 , the required set operations are qualified by a unary predicate (condition) that filters out disallowed inverted: values at run-time (e.g. a regular expression for a string). Any value fulfilling the predicate (e.g. any string matching the regular expres- • set intersection: ¬E ∩ ¬E 0 = ¬(E ∪ E 0 ) sion) is an acceptable value. Examples for qualifying predicates for • set union: ¬E ∪ ¬E 0 = ¬(E ∩ E 0 ) an integer datatype are: positive p, even p or odd p.15 • negation: ¬¬E = D \ (D \ ¬E) = E 15 In the absence of more specialized predicates, htruei is taken as the default 16 Ideally, F and π will be disjoint. Either F or π can be empty. We define predicate. the predicate hfalsei to represent the empty set. • set difference: ¬E \ ¬E 0 = E 0 \ E • There is a unique symbol ⊥ (false) that is a special of all other symbols. This is a QF-symbol for the empty set. Finite exclusion sets are needed in order to meet our require- • For any two symbols in the PO, there exists a unique symbol for a ments of negation of finite sets against infinite domains. The greatest common successor/special (the “intersection”). concept can also be extended to infinite exclusion sets. Indeed, a • For any two symbols in the PO, there exists a unique symbol for a negated unary predicate corresponds to such a set. For example, if least common predecessor/general (the “normalized union”)17 . the set of all prime numbers is represented by the unary predicate • There is a unique top-level symbol Ω that represents the entire hprime pi, the ¬hprime pi represents exclusion of all prime integers. domain. For any symbol φ in the PO, there exists a symbol ¬φ in the PO, such that the least common predecessor of φ and ¬φ is Ω In either case, a reference to an exclusion set is treated as a QF- and the greatest common successor is ⊥. ¬φ denotes the negation symbol. For example, for a predicate π, ¬π is the symbol represent- of φ.18 ing the exclusion of all values in π. For example, we could arrange images in a PO and declare it a spe- cialization relation, paying some attention to fulfill the requirements 7 Queries on Quasi-Finite VDDs in the spirit of the intended PS. Specialization relations provide some conceptual and practical In the classic finite case, the result set R of the query (1) is a finite set benefits: of r-tuples. In the QF framework, it is a finite set of QF-tuples that may contain both value symbols and QF-symbols. A QF-symbol in • They can be pre-calculated and stored in a graph. This may be the result set must be specialized to conform to the the query condi- more performant than calculating intersections and unions on the tion, e.g. by set intersection with the query condition. Problem solv- fly. ing (PS) must expect the remaining degree of non-determinism. • They provide a general concept to adapt PS to QF-symbols: PS The query (2) contains the keyword DISTINCT. This means any must simply be prepared to specialize the assignment of a QF- duplicates must be removed from the result set for the particular col- symbol to a CSP variable. umn (property). We see replacing QF-symbols by their normalized • They generalize to other objects, e.g. shapes, images, taxonomies, union akin to removing duplicates. Therefore, the symbols in the re- etc. sult set, both QF-symbols and r-symbols, must be replaced by their normalized union, which ensures also that remaining symbols are pairwise disjoint. 8.2 Local Propagation with QF-Symbols If a product model contains multiple constraints, local propagation 8 Constraint Processing with Quasi-Finite Symbols [4] can be used to restrict the domains of the product properties to a state of arc consistency. Any domain restriction of a product prop- 8.1 Specialization Relations erty is propagated to all constraints that reference the same product property. The process continues until no further restrictions are pos- Given two QF-symbols φ1 , φ2 for the same CSP variable, we regard sible. For a tabular constraint, the query (2) can be used to determine φ2 to be more special than φ1 if φ2 denotes a subset of φ1 . This the domain restrictions, which are then propagated (see Section 7). leads to a partial ordering (PO) of the symbols that occur in the vari- The QF framework fits nicely in this scheme. A c-tuple com- ant tables, which we call a specialization relation, introduced and prised of finite sets of symbols that may include the normalized union motivated for another context in [9]. Generally, a specialization rela- of QF-symbols may be used as a query condition. The domain re- tion on a set of facts expresses specificity of meaning, characterized strictions that result from the query (2) with this query condition by the following three properties: may again contain the normalized union of QF-symbols. The c-tuple • Problem solving (PS) need not consider an otherwise valid fact formed from the resulting domain restriction for each column can be in the presence of a more special one (procedural-subsumption smoothly propagated to other constraints. property). This property requires the acquiescence of PS. If the product model is entirely made-up of tabular constraints, the • A fact is logically implied by any of its specializations (semantic- local propagation of QF-symbols is covered by our approach. It is compatibility property). also straightforward to include constraints representing numeric lin- • Negation inverts specialization (symmetry-under-negation prop- ear (in)equalities when propagating real-valued intervals.19 Extend- erty). ing the propagation of QF-symbols yet further is a topic of future work. The facts we deal with in this paper are assignments of r-symbols and QF-symbols to a CSP variable. The PS we consider consists of 8.3 General Constraint Solving queries to the table and constraint processing, particularly local prop- agation of constraints. From the perspective of queries we addition- Extending existing problem solvers to deal with QF-symbols, will ally need to be able to aggregate the result sets into a normalized require an analysis of the particular methods employed. However, a form, e.g. delete duplicate r-symbols, replace two QF-symbols by a 17 We had noted that the union of two QF-symbols need not itself be a QF- more general one representing their union, etc. We have shown in symbol. Here, however, it is an advantage to be able to have a least com- Section 6 that the QF-symbols we primarily envision meet these re- mon predecessor/general representing the union as part of the PO. A more quirements. detailed treatment of specialization relations is deferred to a discussion us- We can also turn the reasoning around and define a PO of symbols ing practical examples when they arise. 18 We need “negation” primarily to ensure a “set-difference” operation to pertaining to the same property domain as a specialization relation if be able to split two symbols into a disjoint triple of symbols as in (4) in we can show that it has the above properties and if we also guarantee Section 5. 19 This is implemented in the SAP product configurators ([6]). the following: main common idea is that constraint problem solving will make use of MC business. This also includes evaluating the need of integrating of the concept of specialization relations. Instead of exploring the with other types of constraints, such as linear (in)equality constraints, validity of a simple value assignment, an assignment of a variable and the business value of indeterminism in product variants. to a QF-symbol can be specialized. When considering such an as- signment and a constraint on the same variable, the greatest common ACKNOWLEDGEMENTS special must be substituted in the assignment. A forced specializa- tion to the empty set would invalidate an assignment. The solutions I would like to thank my daughter Laura and the reviewers for their found by constraint solving may contain (specialized) QF-symbols, comments, which helped improve this paper considerably. i.e. exhibit a degree of non-determinism that cannot be avoided and is to be expected. REFERENCES [1] Jérôme Amilhastre, Hélène Fargier, Alexandre Niveau, and Cédric 9 Indeterminism in Variants and Sub-Variants Pralet, ‘Compiling csps: A complexity map of (non-deterministic) mul- tivalued decision diagrams’, International Journal on Artificial Intelli- A product variant is classically defined as an r-tuple, a value assign- gence Tools, 23(4), (2014). ment to each of its product properties, but this is neither ideal nor [2] Henrik Reif Andersen, Tarik Hadzic, John N. Hooker, and Peter Tiede- sufficient in practice. Some degree of indeterminism in a variant is mann, ‘A constraint store based on multivalued decision diagrams’, In Bessiere [5], pp. 118–132. needed when a variant is to be further specialized in a later business [3] Rüdiger Berndt, Peter Bazan, Kai-Steffen Jens Hielscher, Reinhard process (e.g., at the customer’s site). For example, a pump may be German, and Martin Lukasiewycz, ‘Multi-valued decision diagrams for sold with a connection that fits several different sizes of hoses. The the verification of consistency in automotive product data’, in 2012 12th end customer may have to make a manual adjustment for the partic- International Conference on Quality Software, Xi’an, Shaanxi, China, ular hose they want to attach by cutting off a part of the provided August 27-29, 2012, eds., Antony Tang and Henry Muccini, pp. 189– 192. IEEE, (2012). connector. The pump being sold to the customer by the business is [4] C. Bessiere, ‘Constraint propagation’, in Handbook of Constraint Pro- a variant of their MC “pump” product that allows further individual- gramming, eds., F. Rossi, P. van Beek, and T. Walsh, chapter 3, Elsevier, ization at the customer’s site. (2006). The number of sub-variants can be infinite, e.g. for the frequency [5] Christian Bessiere, ed. Principles and Practice of Constraint Program- ming - CP 2007, 13th International Conference, CP 2007, Providence, a radio receiver may be tuned to. As built, the property Frequency RI, USA, September 23-27, 2007, Proceedings, volume 4741 of Lecture would be described by a list of real-valued intervals for possible re- Notes in Computer Science. Springer, 2007. ception bands, e.g. {[7.2, 7.45], [9.4, 9.9], [11.6, 12.1]} (MHz). [6] U. Blumöhr, M. Münch, and M. Ukalovic, Variant Configuration with Allowing a variant to be defined by a c-tuple solves the above SAP, second edition, SAP Press, Galileo Press, 2012. problem. However, c-tuples that define variants must be distin- [7] A. Haag, ‘Chapter 27 - Product Configuration in SAP: A Retrospec- tive’, in Knowledge-Based Configuration, eds., Alexander Felfernig, guished from those that are merely the by-product of compression. Lothar Hotz, Claire Bagley, and Juha Tiihonen, 319 – 337, Morgan This would be an open MC business topic. Kaufmann, Boston, (2014). [8] Albert Haag, ‘Konzepte zur praktischen handhabbarkeit einer atms- basierten problemlösung’, in Das PLAKON-Buch, Ein Expertensys- 10 Summary and Outlook temkern für Planungs- und Konfigurierungsaufgaben in technischen Domänen, eds., Roman Cunis, Andreas Günter, and Helmut Strecker, This work extends a common theme: that data in tabular form is not volume 266 of Informatik-Fachberichte, 212–237, Springer, (1991). only natural for modeling variants, but also a natural, non-proprietary [9] Albert Haag, The ATMS - an assumption based problem solving ar- medium for communicating between interrelated business processes chitecture utilizing specialization relations, Ph.D. dissertation, Kaiser- slautern University of Technology, Germany, 1995. within an enterprise, as well as between enterprises. Compression of [10] Albert Haag, ‘Column oriented compilation of variant tables’, in Pro- tables is essential in MC for letting variant tables scale with a grow- ceedings of the 17th International Configuration Workshop, Vienna, ing number of choices, which production technology now enables Austria, September 10-11, 2015., eds., Juha Tiihonen, Andreas A. [13]. C-tuples are a transparent yet powerful form of compression Falkner, and Tomas Axling, volume 1453 of CEUR Workshop Proceed- ings, pp. 89–96. CEUR-WS.org, (2015). that is transparent and upon which non-proprietary exchange formats [11] Albert Haag, ‘Managing variants of a personalized product’, Journal of can be based. This is addressed in other work, e.g. [13]. Here, we Intelligent Information Systems, 1–28, (2016). propose to add to the expressivity of variant tables by presenting a [12] Albert Haag, ‘Assessing the complexity expressed in a variant table’, quasi-finite (QF) framework that allows dealing with infinite sets of in Proceedings of the 19th International Configuration Workshop, La choices within the tabular paradigm. Defense, France, September 14-15, 2017., pp. 20–27, (2017). [13] Albert Haag and Laura Haag, ‘Empowering the use of variant tables in We believe the QF framework presented here meets these expec- mass customization’, in Proceedings of the MCP-CE 2018 conference, tations. The main idea is that problem solving will deal with the Novi Sad, Serbia, September 19-21, 2018., p. (Submitted), (2018). QF-symbols representing infinite sets via specialization relations. In- [14] G. Katsirelos and T. Walsh, ‘A compression algorithm for large arity stead of exploring the validity of a value assignment of a value (fea- extensional constraints’, In Bessiere [5], pp. 379–393. ture) to a variable (product property), the assignment to a QF-symbol can be specialized. A forced specialization to the empty set would in- validate an assignment. The discussed techniques involving c-tuples and VDDs using QF- symbols has not yet been deployed in practice. The individual ingre- dients: c-tuples, VDDs, and the management of partial orders (POs) for specialization relations have all been applied with positive re- sults. As already mentioned, the question of how to define practical normalization of predicates is open. However, we believe that the primary open issue is to verify that it actually meets the expectations