Constraint-Aware Recommendation of Complex Items

Mathias Uta 1,2, Alexander Felfernig 2 and Denis Helic 2
1 Siemens Energy AG, Freyeslebenstraße 1, 91058 Erlangen, Germany
2 Graz University of Technology, Rechbauerstraße 12, 8010 Graz, Austria

3rd Edition of Knowledge-aware and Conversational Recommender Systems (KaRS) & 5th Edition of Recommendation in Complex Environments (ComplexRec) Joint Workshop @ RecSys 2021, September 27 – October 1, 2021, Amsterdam, Netherlands.
mathias.uta@siemens-energy.com (M. Uta); aflefern@ist.tugraz.at (A. Felfernig); dhelic@tugraz.at (D. Helic)
https://felfernig.ist.tugraz.at/ (A. Felfernig)
ORCID: 0000-0002-1670-7508 (M. Uta); 0000-0003-0108-3146 (A. Felfernig); 0000-0003-0725-7450 (D. Helic)
Β© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

Abstract
In contrast to basic items such as movies, books, and songs, configurable items consist of individual subcomponents that can be combined following a predefined set of constraints. Due to the increasing size and complexity of configurable items (e.g., cars and software), a simple enumeration of all possible configurations in terms of a product catalog is not possible. Configuration systems try to identify a solution (configuration) that takes into account both the preferences of the user and a set of constraints that defines in which way individual subcomponents are allowed to be combined. Due to time limitations, cognitive overload, and missing domain knowledge, configurator users are in many cases not able to completely specify their preferences with regard to all relevant component properties. As a consequence, recommendation technologies need to be integrated into configurators that are able to predict the relevance of individual components for the current user. In this paper, we show how the determination of configurations can be supported by neural network based recommendation. This approach helps to predict user-relevant item properties using historical interaction data. In this context, we introduce a semantic regularization approach that helps to take into account configuration constraints within the scope of neural network learning. Furthermore, we demonstrate the applicability of our approach on the basis of an evaluation in an industrial configuration scenario (high-voltage switchgear configuration).

Keywords
Recommender systems, Knowledge representation and reasoning, Neural networks

1. Introduction

In contrast to basic items such as books, movies, and songs, configurable items are composed of subcomponents which must be combined in conformance with a set of predefined constraints [1]. For reasons of combinatorial explosion, it is in many cases impossible to enumerate the individual items (configurations) in terms of a product catalog. Related example domains are automotive [2], software (e.g., the configuration of operating systems) [3], and telecommunication infrastructures [4]. Due to the increasing size and complexity of configurable items, it becomes important to integrate recommendation algorithms into configuration processes to support users in component and/or parameter selection.

Informally, configuration can be regarded as a product design activity where the resulting item (also denoted as product or configuration) is composed of elements of a pre-defined set of basic components/parameters [1]. In this context, the chosen components must be consistent with a given set of constraints that define restrictions regarding the possible component combinations. On the knowledge representation level, configuration problems can be defined, for example, as a constraint satisfaction problem (CSP) [5] or in terms of a rule-based representation [6]. Using CSP representations, possible combinations of individual components are defined in terms of constraints with a strict separation of domain knowledge and problem solving knowledge [1]. In contrast, in rule-based approaches product domain knowledge and problem solving knowledge are intermingled. In this paper, we use a rule-based knowledge representation which is applied in the reported application domain of high-voltage switchgear configuration.

Due to the increasing size and complexity of configurable items, recommendation technologies are needed that proactively support the underlying choice processes. There exist a couple of approaches that already support the recommendation of complex items. First, knowledge-based recommender systems [7] support recommendation processes on the basis of a product catalog and determine recommendations either on the basis of a set of strict selection criteria (constraints) [8] or similarity metrics [9]. The ranking of items is often implemented on the basis of a utility analysis [10] or further evaluation criteria that measure to which extent the preferences of the user are satisfied by individual decision alternatives [11, 9, 12]. Importantly, with a few exceptions [13, 14, 15], the approaches to the handling of user preferences in configuration-related scenarios do not take into account the preferences of other users but focus more on different types of decision-theoretic optimizations. An overview of existing approaches to integrating recommendation technologies into configuration systems is provided, among others, in Falkner et al. [13]. Existing integrations focus on a 2-phase process where recommendations of feature settings are predetermined and then recommended to the user. In the case of inconsistent recommendations, alternative recommendations are calculated repeatedly until a consistent recommendation can be presented.
Compared to existing approaches to the integration of recommendation algorithms with configuration, we show how to take configuration constraints into account already in the learning phase and thus minimize the probability of inconsistent recommendations being detected in the subsequent configuration phase. In this paper, we follow the idea of case-based recommendation [16], where historical configurations with parameter settings similar to those already specified by the current user are used as a basis for identifying nearest-neighbor configurations. In our work, we use such a case-based approach as a baseline version. This version is then compared with two different versions of a feed-forward neural network based configurator integration. The first version focuses on the prediction of configuration parameter settings relevant for the user. The second version follows the same goal but also takes into account the fact that recommendations should be consistent with the underlying constraint set. To support this goal, we propose a semantic regularization of a feed-forward (multi-class and multi-branch) neural network that is used as a configuration parameter prediction model.

The major contributions of this paper are the following: (1) we introduce a semantic regularization approach specifically useful for integrating case-based recommendation with rule-based configuration environments, (2) we compare the predictive quality of the developed approach on the basis of a real-world dataset from a complex industrial configuration task (high-voltage switchgear configuration) with regard to the evaluation criteria of prediction quality and recommendation consistency, and (3) we show how the presented results can be generalized to configuration scenarios beyond rule-based configuration.

The remainder of this paper is organized as follows. In Section 2, we introduce a working example in terms of a simplified configuration knowledge base from the automotive domain. In this context, we also introduce the concepts of a configuration task and a corresponding configuration. Thereafter, in Section 3, we introduce our neural network based approach to the recommendation of configuration parameter settings. In Section 4, we summarize our evaluation approach and report the results of an evaluation conducted on the basis of a real-world dataset from the domain of high-voltage switchgear configuration. The paper is concluded with an overview of future research issues (Section 5).

2. Working Example

As a basis for the following discussions on integrating neural network based predictions of user preferences, we first introduce the definition of a configuration task (see Definition 1).

Definition 1. A configuration task can be defined by a tuple (𝑉, 𝐷, 𝑅, 𝑅𝐸𝑄) where 𝑉 = {𝑣1, 𝑣2, .., 𝑣𝑛} is a set of finite domain variables, 𝐷 = {π‘‘π‘œπ‘š(𝑣1), π‘‘π‘œπ‘š(𝑣2), .., π‘‘π‘œπ‘š(𝑣𝑛)} is a set of corresponding domain definitions, and 𝑅 = {π‘Ÿ1, π‘Ÿ2, .., π‘Ÿπ‘š} is a set of rules that define how a configuration can be derived from a given set of customer requirements 𝑅𝐸𝑄 = {𝑣𝛼 = π‘£π‘Žπ‘™π›Ό, .., 𝑣𝛾 = π‘£π‘Žπ‘™π›Ύ} where elements of 𝑅𝐸𝑄 are regarded as variable value assignments.

A simple example of a configuration task definition is the following (see Example 1), where 𝑝𝑑𝑐 represents a park distance control feature and 𝑓𝑒𝑒𝑙 represents the fuel consumption in gallons/100 miles.

Example 1: Configuration Task.

β€’ 𝑉 = {𝑑𝑦𝑝𝑒, 𝑝𝑑𝑐, 𝑓𝑒𝑒𝑙, π‘ π‘˜π‘–π‘π‘Žπ‘”, 4-π‘€β„Žπ‘’π‘’π‘™, π‘π‘œπ‘™π‘œπ‘Ÿ}
β€’ 𝐷 = {π‘‘π‘œπ‘š(𝑑𝑦𝑝𝑒) = {𝑐𝑖𝑑𝑦, π‘™π‘–π‘šπ‘œ, π‘π‘œπ‘šπ‘π‘–, π‘₯π‘‘π‘Ÿπ‘–π‘£π‘’}, π‘‘π‘œπ‘š(𝑝𝑑𝑐) = {𝑦𝑒𝑠, π‘›π‘œ}, π‘‘π‘œπ‘š(𝑓𝑒𝑒𝑙) = {1.7, 2.6, 4.2}, π‘‘π‘œπ‘š(π‘ π‘˜π‘–π‘π‘Žπ‘”) = {𝑦𝑒𝑠, π‘›π‘œ}, π‘‘π‘œπ‘š(4-π‘€β„Žπ‘’π‘’π‘™) = {𝑦𝑒𝑠, π‘›π‘œ}, π‘‘π‘œπ‘š(π‘π‘œπ‘™π‘œπ‘Ÿ) = {π‘Ÿπ‘’π‘‘, 𝑏𝑙𝑒𝑒}}
β€’ 𝑅 = {π‘Ÿ1: 4-π‘€β„Žπ‘’π‘’π‘™ = 𝑦𝑒𝑠 β†’ 𝑑𝑦𝑝𝑒 = π‘₯π‘‘π‘Ÿπ‘–π‘£π‘’, π‘Ÿ2: π‘ π‘˜π‘–π‘π‘Žπ‘” = 𝑦𝑒𝑠 β†’ 𝑑𝑦𝑝𝑒 β‰  𝑐𝑖𝑑𝑦, π‘Ÿ3: 𝑓𝑒𝑒𝑙 = 1.7 β†’ 𝑑𝑦𝑝𝑒 = 𝑐𝑖𝑑𝑦, π‘Ÿ4: 𝑓𝑒𝑒𝑙 = 2.6 ∧ 𝑑𝑦𝑝𝑒 = π‘₯π‘‘π‘Ÿπ‘–π‘£π‘’ β†’ π‘“π‘Žπ‘™π‘ π‘’, π‘Ÿ5: 𝑑𝑦𝑝𝑒 = π‘π‘œπ‘šπ‘π‘– β†’ π‘ π‘˜π‘–π‘π‘Žπ‘” = 𝑦𝑒𝑠, π‘Ÿ6: 𝑑𝑦𝑝𝑒 = π‘™π‘–π‘šπ‘œ β†’ 𝑝𝑑𝑐 = 𝑦𝑒𝑠}
β€’ 𝑅𝐸𝑄 = {𝑑𝑦𝑝𝑒 = 𝑐𝑖𝑑𝑦, 𝑝𝑑𝑐 = 𝑦𝑒𝑠, 𝑓𝑒𝑒𝑙 = 1.7}

Given the definition of a configuration task (𝑉, 𝐷, 𝑅, 𝑅𝐸𝑄), we are able to introduce the definition of a corresponding configuration (a solution for a configuration task) – see Definition 2.

Definition 2. A configuration for a given configuration task definition (𝑉, 𝐷, 𝑅, 𝑅𝐸𝑄) is a set of variable assignments 𝐢𝑂𝑁𝐹 = {𝑣1 = π‘£π‘Žπ‘™1, .., 𝑣𝑛 = π‘£π‘Žπ‘™π‘›} where βˆ€{𝑣𝑖 = π‘£π‘Žπ‘™π‘–} βŠ† 𝐢𝑂𝑁𝐹: π‘£π‘Žπ‘™π‘– ∈ π‘‘π‘œπ‘š(𝑣𝑖) and π‘π‘œπ‘›π‘ π‘–π‘ π‘‘π‘’π‘›π‘‘(𝐢𝑂𝑁𝐹 βˆͺ 𝑅 βˆͺ 𝑅𝐸𝑄). A configuration is complete if each variable in 𝑉 has an assignment in 𝐢𝑂𝑁𝐹.
An example configuration 𝐢𝑂𝑁𝐹 for the configuration task of Example 1 is the following (see Example 2).

Example 2. 𝐢𝑂𝑁𝐹 = {𝑑𝑦𝑝𝑒 = 𝑐𝑖𝑑𝑦, 𝑝𝑑𝑐 = 𝑦𝑒𝑠, 𝑓𝑒𝑒𝑙 = 1.7, π‘ π‘˜π‘–π‘π‘Žπ‘” = π‘›π‘œ, 4-π‘€β„Žπ‘’π‘’π‘™ = π‘›π‘œ, π‘π‘œπ‘™π‘œπ‘Ÿ = π‘Ÿπ‘’π‘‘}

We regard a configuration as complete if each of the variables in 𝑉 is associated with a corresponding value assignment and these assignments are consistent with the rules in 𝑅. As already mentioned, in many configuration scenarios users are not able or do not want to specify values for all the defined variables in 𝑉 but are interested in recommendations that help to more easily complete a configuration session [13]. We now introduce a definition of a recommendation in the context of a configuration task (see Definition 3).

Table 1
A simple example of a collection of already completed configuration sessions (one-hot encoding). The abbreviation pdc denotes a park distance control feature. Furthermore, fuel denotes the fuel consumption in gallons/100 miles. Finally, session current is an ongoing session where variable settings for {π‘ π‘˜π‘–π‘π‘Žπ‘”, 4-π‘€β„Žπ‘’π‘’π‘™, π‘π‘œπ‘™π‘œπ‘Ÿ} should be recommended.

session | type (city, limo, combi, xdrive) | pdc (yes, no) | fuel (1.7, 2.6, 4.2) | skibag (yes, no) | 4-wheel (yes, no) | color (red, blue)
1       | 1 0 0 0 | 1 0 | 1 0 0 | 0 1 | 0 1 | 1 0
2       | 0 0 0 1 | 1 0 | 0 0 1 | 0 1 | 1 0 | 0 1
3       | 0 1 0 0 | 1 0 | 0 1 0 | 0 1 | 0 1 | 0 1
current | 0 0 1 0 | 0 1 | 0 1 0 | ? ? | ? ? | ? ?

Definition 3. Given the definition of a configuration task (𝑉, 𝐷, 𝑅, 𝑅𝐸𝑄), a corresponding recommendation 𝑅𝐸𝐢 = {𝑣𝛽 = π‘£π‘Žπ‘™π›½, .., 𝑣𝛿 = π‘£π‘Žπ‘™π›Ώ} is a set of variable value assignments of 𝑣𝑖 ∈ 𝑉. A recommendation 𝑅𝐸𝐢 is consistent if 𝑅𝐸𝐢 βˆͺ 𝑅𝐸𝑄 βˆͺ 𝑅 is consistent, i.e., a solution can be found.

Example 3. 𝑅𝐸𝐢 = {π‘ π‘˜π‘–π‘π‘Žπ‘” = π‘›π‘œ, 4-π‘€β„Žπ‘’π‘’π‘™ = π‘›π‘œ, π‘π‘œπ‘™π‘œπ‘Ÿ = π‘Ÿπ‘’π‘‘}
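To make the semantics of Definitions 1–3 concrete, the following minimal Python sketch (our illustration, not part of the original paper; the rule encoding and helper names are assumptions) represents the rules π‘Ÿ1–π‘Ÿ6 of Example 1 as predicates over a (partial) variable assignment and checks whether a recommendation together with 𝑅𝐸𝑄 violates any rule in 𝑅.

```python
# Minimal sketch (not from the paper): the rules r1..r6 of Example 1 encoded as
# predicates over a partial assignment; consistent() checks REC ∪ REQ ∪ R in the
# sense of Definitions 2 and 3.

RULES = [
    lambda a: not (a.get("4-wheel") == "yes") or a.get("type") == "xdrive",    # r1
    lambda a: not (a.get("skibag") == "yes") or a.get("type") != "city",       # r2
    lambda a: not (a.get("fuel") == 1.7) or a.get("type") == "city",           # r3
    lambda a: not (a.get("fuel") == 2.6 and a.get("type") == "xdrive"),        # r4
    lambda a: not (a.get("type") == "combi") or a.get("skibag") == "yes",      # r5
    lambda a: not (a.get("type") == "limo") or a.get("pdc") == "yes",          # r6
]

def consistent(assignment: dict) -> bool:
    """True if the variable assignment violates none of the rules in R."""
    return all(rule(assignment) for rule in RULES)

# REQ of Example 1 combined with REC of Example 3: no rule is violated.
REQ = {"type": "city", "pdc": "yes", "fuel": 1.7}
REC = {"skibag": "no", "4-wheel": "no", "color": "red"}
print(consistent({**REQ, **REC}))  # True -> REC is a consistent recommendation

# A combi without a skibag violates r5:
print(consistent({"type": "combi", "pdc": "no", "fuel": 2.6, **REC}))  # False
```

Applied to the requirements of Example 1 and the recommendation of Example 3, the check succeeds; applied to a combi configuration without a skibag, it fails, which is exactly the situation discussed for the case-based recommender in Section 3.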
Following the approach of case-based reasoning [16], it can be the case that recommended variable value assignments are inconsistent with the already defined user requirements and the rules (constraints) defined in the knowledge base. This is the case if recommendations are determined from already completed configuration sessions without taking into account configuration constraints (rules in 𝑅). In the following, we provide a simple example of a case-based reasoning approach and then focus on how to take into account configuration rules in terms of a semantic regularization when optimizing a neural network responsible for recommending variable settings.

3. Recommending Configurable Items

As already sketched in the previous section, recommendations in the context of configuration scenarios are represented by a set of attribute assignments, i.e., a recommendation could include a single attribute setting but also numerous settings recommended at the same time. In this section, we discuss different approaches to recommend variable value settings in the context of knowledge-based configuration scenarios.

Case-based Recommendation. Table 1 represents a simple example of a set of already completed configurations that can be used as a basis for determining recommendations. In this example, configuration sessions 1–3 have already been completed. The current session is ongoing and we are interested in a recommendation for the variables skibag, 4-wheel, and color. For the purposes of this example and also for discussing the neural network based recommendation approach, we apply a one-hot encoding of the configuration variables; for example, in session 1, the configured car is of type city. In the scenario shown in Table 1, a simple case-based reasoning recommender would search for one or more nearest neighbors (NN) and recommend the variable settings that were chosen most often by the nearest neighbors. In our example, the nearest neighbor (session) of the current session is session 3 (in terms of the number of equivalent variable values). If we assume |NN| = 1, we would recommend 𝑅𝐸𝐢 = {π‘ π‘˜π‘–π‘π‘Žπ‘” = π‘›π‘œ, 4-π‘€β„Žπ‘’π‘’π‘™ = π‘›π‘œ, π‘π‘œπ‘™π‘œπ‘Ÿ = 𝑏𝑙𝑒𝑒} to the user in the current session (if we intend to predict all unspecified variable values at the same time).
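This nearest-neighbor step can be sketched as follows (a minimal illustration, not the implementation used in the paper; the data structures and function names are our own): sessions 1–3 of Table 1 serve as the case base, and the completed session sharing the largest number of variable values with the current requirements supplies the recommended values.

```python
# Minimal sketch (not the authors' implementation): a |NN| = 1 case-based recommender
# over the sessions of Table 1. Variable and value names follow Example 1.

SESSIONS = [  # already completed configurations (sessions 1-3 of Table 1)
    {"type": "city",   "pdc": "yes", "fuel": 1.7, "skibag": "no", "4-wheel": "no",  "color": "red"},
    {"type": "xdrive", "pdc": "yes", "fuel": 4.2, "skibag": "no", "4-wheel": "yes", "color": "blue"},
    {"type": "limo",   "pdc": "yes", "fuel": 2.6, "skibag": "no", "4-wheel": "no",  "color": "blue"},
]

def recommend(req: dict, sessions: list, open_vars: list) -> dict:
    """Recommend values for open_vars from the nearest neighbor, i.e., the completed
    session that shares the largest number of variable values with req."""
    nearest = max(sessions, key=lambda s: sum(s[v] == val for v, val in req.items()))
    return {v: nearest[v] for v in open_vars}

REQ = {"type": "combi", "pdc": "no", "fuel": 2.6}  # session "current" of Table 1
print(recommend(REQ, SESSIONS, ["skibag", "4-wheel", "color"]))
# -> {'skibag': 'no', '4-wheel': 'no', 'color': 'blue'}  (session 3 is the nearest neighbor)
```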
Importantly, since the current user is interested in a car of type combi, which requires the inclusion of a skibag (see Example 1), such a recommendation (𝑅𝐸𝐢) induces an inconsistency between the user requirements and the configuration knowledge base (the set of rules 𝑅). A traditional approach to deal with such a situation is to test the next recommendation for consistency and to repeat this until a consistent recommendation is found [13]. Our approach (that will be introduced in the following) to deal with such a situation is to introduce a semantic regularization into the neural network learning phase which helps to avoid inconsistent recommendations as far as possible.

Neural Network based Recommendation. Our basic approach to recommend variable values in the context of rule-based configuration is based on the feed-forward neural network structure depicted in Figure 1. In such networks, the input layer consists of possible values (represented in terms of a one-hot encoding) that have already been specified by a user. For example, the variable values (preferences) that have already been specified by the user in session π‘π‘’π‘Ÿπ‘Ÿπ‘’π‘›π‘‘ are {𝑑𝑦𝑝𝑒 = π‘π‘œπ‘šπ‘π‘–, 𝑝𝑑𝑐 = π‘›π‘œ, 𝑓𝑒𝑒𝑙 = 2.6}.

[Figure 1: A simple neural network architecture with an input layer representing specified (e.g., type and fuel) and unspecified (e.g., skibag and color) variables, one hidden layer, and an output layer that helps to estimate variable values of relevance. Input nodes correspond to one-hot encoded values such as type = city, type = limo, fuel = 1.7, and fuel = 2.6; output nodes carry relevance estimates such as skibag = yes (0.3), skibag = no (0.7), color = red (0.8), and color = blue (0.2).]

Networks such as the one depicted in Figure 1 can be trained in a domain-dependent fashion on the basis of a dataset comprised of already completed configuration sessions (see sessions 1–3 in Table 1). Furthermore, the hidden layer is used for learning dependencies between input values selected by the user and corresponding variable values of potential relevance for the user. The number of nodes in the hidden layer is regarded as a hyper-parameter to be optimized in an item domain dependent fashion (in [17], an equal amount of neurons in the input layer and the hidden layer showed the best performance). Finally, the output layer supports a multi-branch approach (one branch per variable) where each branch 𝑏 is split into π‘π‘œ output nodes representing the different domain values of variable 𝑣𝑏. In contrast to the input and hidden layer, which use a ReLU activation function, classification is implemented using softmax. The choice of the training hyper-parameters has been made based on several test executions. The optimizer β€œAdam” [18] has shown the best performance compared to other gradient descent optimizers like β€œADAGRAD”, β€œRMSProp” and β€œSGD” [19]. β€œAdam” uses adaptive estimation of first-order and second-order moments, which slows down the adjustment of neuron weights the more steps have been performed. The selected parameters for the β€œAdam” optimizer were an initial learning rate of 0.001, 𝛽1 = 0.9 and 𝛽2 = 0.999. Please note that this network architecture assumes categorical variables (e.g., similar to our Example 1) – other variable types require preprocessing such as binning or alternative architectures. Our neural network derived from the knowledge base introduced in Section 2 consists of 6 output nodes and 9 input nodes (assuming the example from above).

In the basic version of our approach, neural networks are trained on the basis of a training dataset (see, e.g., sessions 1–3 in Table 1). The usage scenario of this basic neural network approach is the following: if a user interacts with a configurator and has already specified a set of initial requirements (𝑅𝐸𝑄), the neural network can be exploited for the recommendation of variable values. Since the basic version of the neural network can only learn constraints/rules from the available set of completed configurations, it can be the case that predictions induce an inconsistency with the underlying rule set.
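A sketch of such a multi-branch network in Keras (the library also used for the evaluation in Section 4) could look as follows, assuming the automotive example with 9 one-hot input nodes (type, pdc, fuel) and three softmax output branches (skibag, 4-wheel, color); the layer names and the per-branch cross-entropy loss are our assumptions, not details reported in the paper.

```python
# Sketch of the described multi-branch feed-forward network for the automotive example.
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(9,), name="specified_values")           # one-hot encoded REQ
hidden = layers.Dense(9, activation="relu", name="hidden")(inputs)  # as many neurons as inputs [17]

# one output branch per open variable, softmax over its domain values
skibag = layers.Dense(2, activation="softmax", name="skibag")(hidden)
four_wheel = layers.Dense(2, activation="softmax", name="four_wheel")(hidden)
color = layers.Dense(2, activation="softmax", name="color")(hidden)

model = keras.Model(inputs=inputs, outputs=[skibag, four_wheel, color])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999),
    loss="categorical_crossentropy",  # prediction loss per branch (L_train)
)
model.summary()
```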
Constraint-Aware Recommendation. For reasons of potentially inconsistent recommendations, we have introduced an enhanced neural network learning phase including a semantic regularization where inconsistent recommendations are taken into account as a regularization term. In other words, although parts of the domain-specific rules/constraints can be learned from the underlying training dataset, it can be the case that some or even many constraints are neglected and the resulting variable value recommendations induce an inconsistency. We denote this approach as constraint-aware neural networks, which are extremely relevant in recommendation scenarios where domain-specific constraints/rules have to be taken into account by the recommender. To reduce the probability of inconsistent variable value recommendations, knowledge base rules/constraints are taken into account in the learning process. This goal is achieved by integrating the results of a consistency check of the proposed recommendation 𝑅𝐸𝐢 (more precisely, 𝑅𝐸𝑄 βˆͺ 𝑅 βˆͺ 𝑅𝐸𝐢) into a corresponding loss function as shown in Formula 1.

𝐿(πœƒ) ← πΏπ‘‘π‘Ÿπ‘Žπ‘–π‘›(πœƒ) + Ξ©(πœƒ) + πœ‡ Γ— Ξ (πœƒ)   (1)

In this context, 𝐿(πœƒ) denotes a loss function on the vector πœƒ of weights in the neural network, πΏπ‘‘π‘Ÿπ‘Žπ‘–π‘›(πœƒ) denotes the prediction loss, Ξ©(πœƒ) represents a corresponding 𝐿2 regularization term, πœ‡ represents a hyper-parameter that controls the impact of an inconsistency on the overall loss, and Ξ (πœƒ) indicates whether the recommendation resulting from πœƒ is consistent (0 is returned) or inconsistent (1 is returned). As Ξ (πœƒ) is a discrete non-differentiable function, the optimization of our loss function has to resort to an approximation of gradients via the computation of finite differences.
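Conceptually, the regularized loss of Formula 1 can be sketched as follows (an illustration under our own assumptions, not the authors' implementation): the indicator Ξ  is obtained by decoding the currently recommended variable values and checking 𝑅𝐸𝑄 βˆͺ 𝑅 βˆͺ 𝑅𝐸𝐢 with a rule checker such as the consistent() sketch from Section 2; the finite-difference gradient approximation mentioned above is not shown.

```python
# Conceptual sketch of Formula 1 (not the authors' code).
import numpy as np

def semantic_loss(prediction_loss: float, weights: list, req: dict, rec: dict,
                  consistent, mu: float = 1.0, l2: float = 1e-4) -> float:
    omega = l2 * sum(float(np.sum(w ** 2)) for w in weights)  # Ω(θ): L2 regularization
    pi = 0.0 if consistent({**req, **rec}) else 1.0           # Π(θ): consistency indicator
    return prediction_loss + omega + mu * pi                  # L(θ) = L_train + Ω + μ·Π

# e.g., for the inconsistent recommendation of Section 3 (Π = 1), μ is added to the loss:
# semantic_loss(0.42, weights, {"type": "combi", "pdc": "no", "fuel": 2.6},
#               {"skibag": "no", "4-wheel": "no", "color": "blue"}, consistent, mu=1.0)
```

The hyper-parameter ΞΌ then trades off prediction accuracy against the penalty for recommendations that violate the knowledge base.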
User Interaction and Knowledge Representation. Our approach to neural network based variable value recommendation for knowledge-based configuration helps to reduce the probability of inconsistency-inducing recommendations (see Section 4) and thus can also help to make configuration processes less time-consuming for users. The proposed recommendation approach is flexible in the sense that recommendations for single-variable assignments as well as combined variable assignment recommendations can be supported. In our working example, we did not take into account settings where a configuration task is organized in phases and in each phase a specific subcomponent of the product is configured (e.g., software configuration as part of the configuration of a whole computer). On the user interface level, recommendations are mostly related to variables within a specific phase. However, recommendations can also be determined on the basis of existing variable settings from different phases.

Recommendation Consistency. The achievable degree of recommendation consistency also depends on the used knowledge representation. In the case of a rule-based knowledge representation [6], it is not always feasible to correctly predict whether it is possible to complete a partial configuration, i.e., whether, given a (consistent) set of customer requirements, it is possible to find assignments for the remaining variables in such a way that a consistent and complete configuration can be achieved. If a more compact representation of all satisfiable variable assignments is available [20], our approach can be applied to recommend the most relevant option among the consistent ones. In a similar fashion, constraint-based approaches [5] can be applied to infer remaining variable assignments that are still consistent with 𝑅𝐸𝑄 and 𝑅. In the following, we present the results of an empirical analysis of our constraint-aware recommendation approach using a real-world dataset from the domain of high-voltage switchgear configuration.

4. Evaluation

The configurator application for the high-voltage switchgear domain has been developed with the goal to reduce engineering effort during the offering stage of these highly complex systems. The underlying dataset includes 𝑁 = 720 complete configurations developed by skilled sales employees from Siemens Energy AG. Each entry of the dataset consists of 𝑀 = 60 attribute settings (assignments), i.e., each configuration is described by 60 variables (representing features and subcomponents). In this context, 10 out of the 60 variables have been defined as basic switchgear features which are assumed to be selected by the user before a recommendation can be triggered (e.g., the basic switchgear category to be installed). The focus of recommendation is on the remaining 50 more specific features with a sometimes lower degree of understandability, where it is often an issue for users to find good or even optimal settings (e.g., AC supply voltage). The dataset is composed of consistent configurations that have been built on the basis of a rule-based configuration system.ΒΉ As a baseline in our evaluation, we have developed a case-based reasoning approach (see Section 3) that recommends variable value settings on the basis of the preferences of the 𝑁 nearest neighbors (in the given setting, 𝑁 = 1 achieved the highest prediction quality). Furthermore, the two versions of the neural network based approach have been implemented on the basis of the Keras API [21].Β² The learning of the neural network model is based on 32 iterations during the learning phase of the model, where 80% of the data is used for training purposes and 20% for testing. The first version of the neural network model has been trained without taking into account the rules in the configuration knowledge base, whereas the learning of the model for the second version is based on the loss function included in Formula 1. For validation, the models have been applied separately to 20 configurations which were not part of the training or testing data.

ΒΉ www.camos.de
Β² https://github.com/MaUt89/ConLearn

Prediction Quality. Our first goal was to analyze the prediction quality of the three variable value recommendation approaches discussed in this paper: (1) case-based recommendation, (2) neural network based recommendation, and (3) neural network based recommendation with semantic regularization. To measure the prediction quality, precision has been chosen as the key performance indicator. The precision of the recommendation of a configuration 𝐴𝑅𝑒𝑐 has been measured in terms of the share of predictions that are part of the configuration 𝐴 accepted by the user in relation to the total number of predictions contained in 𝐴𝑅𝑒𝑐, see Equation 2.

π‘π‘œπ‘›π‘“π‘–π‘”π‘’π‘Ÿπ‘Žπ‘‘π‘–π‘œπ‘› π‘Ÿπ‘’π‘π‘œπ‘šπ‘šπ‘’π‘›π‘‘π‘Žπ‘‘π‘–π‘œπ‘› π‘π‘Ÿπ‘’π‘π‘–π‘ π‘–π‘œπ‘› = |𝐴 ∩ 𝐴𝑅𝑒𝑐| / |𝐴𝑅𝑒𝑐|   (2)

In the context of our evaluation, we were specifically interested in the predictive performance depending on the number of already known attribute values. The prediction task was specified in such a way that, given a chosen set of input attributes (the known settings representing 𝑅𝐸𝑄), the task was to predict all other missing attributes to complete the configuration. Since our configurator is organized in 5 configuration phases, the phase number of a variable to be predicted had to be equal to the number of the current configuration phase, and the phase number of the known variables (𝑅𝐸𝑄) had to be lower than or equal to the phase number of the variable to be predicted. Consequently, the missing attributes were iteratively predicted by selecting in each iteration the values of those attributes that are part of the configuration phase with the lowest phase number. In the following iteration, the previously predicted variables have been utilized as known variables for predicting the variables of the next configuration phase. This has been repeated until the configuration was complete.
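The precision measure of Equation 2 can be sketched as follows (variable names are illustrative; this is not the evaluation code of the paper):

```python
# Minimal sketch of Equation 2: the share of predicted variable values that also occur
# in the configuration accepted by the user.
def recommendation_precision(a_rec: dict, a_user: dict) -> float:
    """|A ∩ A_Rec| / |A_Rec| for two sets of variable value assignments."""
    if not a_rec:
        return 0.0
    hits = sum(1 for var, val in a_rec.items() if a_user.get(var) == val)
    return hits / len(a_rec)

# e.g., predicting {skibag: no, 4-wheel: no, color: blue} against an accepted
# configuration with color = red yields a precision of 2/3.
print(recommendation_precision(
    {"skibag": "no", "4-wheel": "no", "color": "blue"},
    {"type": "city", "pdc": "yes", "fuel": 1.7, "skibag": "no", "4-wheel": "no", "color": "red"},
))
```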
[Figure 2: Precision of high-voltage switchgear related predictions (SR = semantic regularization, CBR = case-based reasoning); average precision (%) plotted over the number of already known variable values for the three approaches.]

Figure 2 provides an overview of the outcome of our evaluation by averaging the precision achieved for each of the 20 validation configurations applied to the models. The results clearly indicate the potential of prediction quality improvements that can be achieved by the inclusion of semantic regularization concepts. Whereas (1) case-based recommendation achieves only a precision of 60.5% when starting with ten initially specified variable values, the neural network based recommendation approaches predict the variable values with 74% (2) and 76.33% (3) precision. It is noticeable that the neural network based recommendation with semantic regularization (3) outperforms the approach without semantic regularization (2) in every validation scenario. Finally, with 50 out of 60 variable values given as an input for the configuration, both neural network based approaches reach a precision of 100%.

Consistency. We were also interested in the degree of consistency of the determined recommendations 𝑅𝐸𝐢, i.e., consistent(𝑅𝐸𝐢 βˆͺ 𝑅𝐸𝑄 βˆͺ 𝑅). The consistency of the recommendation of a configuration 𝐴𝑅𝑒𝑐 has been measured in terms of the share of knowledge base consistent predictions π΄βˆ— that are part of the recommendation in relation to the total number of predictions contained in 𝐴𝑅𝑒𝑐, see Equation 3.

π‘π‘œπ‘›π‘“π‘–π‘”π‘’π‘Ÿπ‘Žπ‘‘π‘–π‘œπ‘› π‘Ÿπ‘’π‘π‘œπ‘šπ‘šπ‘’π‘›π‘‘π‘Žπ‘‘π‘–π‘œπ‘› π‘π‘œπ‘›π‘ π‘–π‘ π‘‘π‘’π‘›π‘π‘¦ = |π΄βˆ— ∩ 𝐴𝑅𝑒𝑐| / |𝐴𝑅𝑒𝑐|   (3)

[Figure 3: Consistency of high-voltage switchgear related predictions (SR = semantic regularization, CBR = case-based reasoning); average consistency (%) plotted over the number of already known variable values for the three approaches.]

As can be seen in Figure 3, the semantic regularization helps to decrease the inconsistency degree of recommendations (compared to the CBR and the basic neural network based approach). Starting with a consistency of 95.27% (ten variable values already known), the neural network based approach with semantic regularization (3) reaches a consistency of 100% already with 30 initially known variable values. Both other approaches achieve poorer results with ten initially given variable values, 93.59% (1) and 94.88% (2), and do not reach a consistency of 100% until 40 variables are initially specified. All in all, the higher consistency of approach (3) including the semantic regularization has been expected, since this approach penalizes inconsistent predictions during the learning phase of the model. Nevertheless, the impact could have been higher, and the consistency, especially with a low number of initially known variable values, can still be improved. To achieve this, an optimization of the hyper-parameter πœ‡ is desirable and part of future work.
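Analogously to the precision sketch above, the consistency measure of Equation 3 can be sketched as follows (again an illustration; in particular, counting each predicted value as knowledge-base consistent if it does not violate any rule together with 𝑅𝐸𝑄 is our own operationalization of π΄βˆ—):

```python
# Minimal sketch of Equation 3, reusing a rule checker such as consistent() from the
# Section 2 sketch; names are illustrative.
def recommendation_consistency(a_rec: dict, req: dict, consistent) -> float:
    """|A* ∩ A_Rec| / |A_Rec|, where A* are the predictions consistent with REQ ∪ R."""
    if not a_rec:
        return 0.0
    ok = sum(1 for var, val in a_rec.items() if consistent({**req, var: val}))
    return ok / len(a_rec)

# e.g., for REQ = {type: combi, pdc: no, fuel: 2.6}, the prediction skibag = no violates
# rule r5, so the consistency of {skibag: no, 4-wheel: no, color: blue} is 2/3.
```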
5. Conclusions and Future Work

We have introduced an approach to the integration of recommendation features into knowledge-based configurators and compared it with case-based and basic neural network based recommendation. To improve the prediction quality and consistency of recommendations, we have introduced a semantic regularization approach that helps to further increase the consistency degree of recommendations, especially in the context of rule-based configuration scenarios. The presented approach has been integrated into an industrial rule-based configuration environment that focuses on the configuration of high-voltage switchgears. The presented approach can increase both prediction quality and consistency of the determined recommendations. Furthermore, the approach is generalizable to other types of configuration knowledge representations such as constraint satisfaction problems (CSPs).

Future work will focus on the integration of the developed concepts into model-based configuration knowledge representations such as CSPs [5]. Furthermore, we will extend the scope of considered machine learning approaches, among others, with an integration of matrix factorization based variable value prediction. The dataset size used for the evaluation presented in this paper can be considered as a limitation of this work, in particular w.r.t. neural network models. A major focus of future work will be the evaluation of our approach with larger industrial configuration datasets. The neural network based prediction models will also be evaluated for their applicability in the context of diagnosis scenarios, i.e., scenarios where users receive recommendations regarding requirements changes that help to get out of an inconsistent situation. Also, we plan to investigate alternative formulations of the optimization problem, for example, with consistency conditions being (partially) defined as optimization constraints. Finally, although already integrated into the configuration environment of Siemens Energy, the evaluation of the proposed recommendation approach will be further extended, especially with regard to the quality of the user interface and the need for additional explanations for the proposed recommendations.

References

[1] M. Stumptner, An overview of knowledge-based configuration, AI Communications 10 (1997) 111–125.
[2] J. Landahl, D. BergsjΓΆ, H. Johannesson, Future Alternatives for Automotive Configuration Management, Procedia Computer Science 28 (2014) 103–110.
[3] J. Sincero, W. SchrΓΆder-Preikschat, The Linux kernel configurator as a feature modeling tool, in: Workshop on Analyses of Software Product Lines, ASPL, Limerick, Ireland, 2008, pp. 257–260.
[4] G. Fleischanderl, G. Friedrich, A. HaselbΓΆck, H. Schreiner, M. Stumptner, Configuring large systems using generative constraint satisfaction, IEEE Intelligent Systems 13 (1998) 59–68.
[5] E. Tsang, Foundations of Constraint Satisfaction, Computation in Cognitive Science, Academic Press, London, UK, 1993.
[6] D. Dhungana, C. Tang, C. Weidenbach, P. Wischnewski, Automated Verification of Interactive Rule-based Configuration Systems, in: 28th IEEE/ACM Intl. Conference on Automated Software Engineering, IEEE, Silicon Valley, CA, USA, 2013, pp. 551–561.
[7] R. Burke, Knowledge-based recommender systems, Encyclopedia of Library and Information Systems 69 (2000) 180–200.
[8] A. Felfernig, G. Friedrich, D. Jannach, M. Zanker, Constraint-based Recommender Systems, in: Recommender Systems Handbook, 2nd ed., Springer, Boston, MA, 2015, pp. 161–190.
[9] L. Chen, P. Pu, Critiquing-based Recommenders: Survey and Emerging Trends, UMUAI 22 (2012) 125–150.
[10] D. Winterfeldt, W. Edwards, Decision Analysis and Behavioral Research, Cambridge University Press, Cambridge, England, 1986.
[11] C. Boutilier, R. Patrascu, P. Poupart, D. Schuurmans, Constraint-based optimization and utility elicitation using the minimax decision criterion, Artificial Intelligence 170 (2006) 686–713.
[12] J. Goldsmith, U. Junker, Preference handling for artificial intelligence, AI Magazine 29 (2008) 9–12.
[13] A. Falkner, A. Felfernig, A. Haag, Recommendation technologies for configurable products, AI Magazine 32 (2011) 99–108.
[14] M. Zanker, A Collaborative Constraint-based Meta-level Recommender, in: ACM RecSys, ACM, Lausanne, Switzerland, 2008, pp. 139–146.
[15] D. Jannach, L. Kalabis, Incremental prediction of configurator input values based on association rules – a case study, in: Proceedings Workshop on Configuration, 2011, pp. 32–35.
[16] B. Smyth, Case-Based Recommendation, in: The Adaptive Web, volume 4321 of LNCS, Springer, Berlin, Heidelberg, 2007, pp. 342–376.
[17] M. Uta, A. Felfernig, Towards machine learning based configuration, in: C. Forza, L. Hvam, A. Felfernig (Eds.), 22nd International Configuration Workshop, 2020, pp. 25–28.
[18] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, 2017. arXiv:1412.6980.
[19] S. Ruder, An overview of gradient descent optimization algorithms, CoRR abs/1609.04747 (2016). URL: http://arxiv.org/abs/1609.04747.
[20] H. Andersen, H. Hulgaard, Boolean Expression Diagrams, in: 12th IEEE Symp. on Logic in Computer Science, IEEE, Warsaw, Poland, 1997, pp. 88–98.
[21] F. Chollet, Keras, https://github.com/fchollet/keras, 2015.