Constraint-Aware Recommendation of Complex Items

Mathias Uta 1,2, Alexander Felfernig 2 and Denis Helic 2
1 Siemens Energy AG, Freyeslebenstraße 1, 91058 Erlangen, Germany
2 Graz University of Technology, Rechbauerstraße 12, 8010 Graz, Austria

3rd Edition of Knowledge-aware and Conversational Recommender Systems (KaRS) & 5th Edition of Recommendation in Complex Environments (ComplexRec) Joint Workshop @ RecSys 2021, September 27 – October 1, 2021, Amsterdam, Netherlands.
mathias.uta@siemens-energy.com (M. Uta); aflefern@ist.tugraz.at (A. Felfernig); dhelic@tugraz.at (D. Helic)
https://felfernig.ist.tugraz.at/ (A. Felfernig)
ORCID: 0000-0002-1670-7508 (M. Uta); 0000-0003-0108-3146 (A. Felfernig); 0000-0003-0725-7450 (D. Helic)
Β© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

Abstract
In contrast to basic items such as movies, books, and songs, configurable items consist of individual subcomponents that can be combined following a predefined set of constraints. Due to the increasing size and complexity of configurable items (e.g., cars and software), a simple enumeration of all possible configurations in terms of a product catalog is not possible. Configuration systems try to identify a solution (configuration) that takes into account both the preferences of the user and a set of constraints that defines in which way individual subcomponents are allowed to be combined. Due to time limitations, cognitive overload, and missing domain knowledge, configurator users are in many cases not able to completely specify their preferences with regard to all relevant component properties. As a consequence, recommendation technologies need to be integrated into configurators that are able to predict the relevance of individual components for the current user. In this paper, we show how the determination of configurations can be supported by neural network based recommendation. This approach helps to predict user-relevant item properties using historical interaction data. In this context, we introduce a semantic regularization approach that helps to take into account configuration constraints within the scope of neural network learning. Furthermore, we demonstrate the applicability of our approach on the basis of an evaluation in an industrial configuration scenario (high-voltage switchgear configuration).

Keywords
Recommender systems, Knowledge representation and reasoning, Neural networks

1. Introduction

In contrast to basic items such as books, movies, and songs, configurable items are composed of subcomponents which must be combined in conformance with a set of predefined constraints [1]. For reasons of combinatorial explosion, it is in many cases impossible to enumerate the individual items (configurations) in terms of a product catalog. Related example domains are automotive [2], software (e.g., the configuration of operating systems) [3], and telecommunication infrastructures [4]. Due to the increasing size and complexity of configurable items, it becomes important to integrate recommendation algorithms into configuration processes to support users in component and/or parameter selection.

Informally, configuration can be regarded as a product design activity where the resulting item (also denoted as product or configuration) is composed of elements of a pre-defined set of basic components/parameters [1]. In this context, the chosen components must be consistent with a given set of constraints that define restrictions regarding the possible component combinations. On the knowledge representation level, configuration problems can be defined, for example, as a constraint satisfaction problem (CSP) [5] or in terms of a rule-based representation [6]. Using CSP representations, possible combinations of individual components are defined in terms of constraints with a strict separation of domain knowledge and problem solving knowledge [1]. In contrast, in rule-based approaches product domain knowledge and problem solving knowledge are intermingled. In this paper, we use a rule-based knowledge representation which is applied in the reported application domain of high-voltage switchgear configuration.

Due to the increasing size and complexity of configurable items, recommendation technologies are needed that proactively support the underlying choice processes. There exist a couple of approaches that already support the recommendation of complex items. First, knowledge-based recommender systems [7] support recommendation processes on the basis of a product catalog and determine recommendations either on the basis of a set of strict selection criteria (constraints) [8] or similarity metrics [9]. The ranking of items is often implemented on the basis of a utility analysis [10] or further evaluation criteria that measure to which extent the preferences of the user are satisfied by individual decision alternatives [11, 9, 12]. Importantly, with a few exceptions [13, 14, 15], the approaches to the handling of user preferences in configuration-related scenarios do not take into account the preferences of other users but focus more on different types of decision-theoretic optimizations. An overview of existing approaches to integrating recommendation technologies into configuration systems is provided, among others, in Falkner et al. [13]. Existing integrations focus on a 2-phase process where recommendations of feature settings are predetermined and then recommended to the user. In the case of inconsistent recommendations, alternative recommendations are calculated repeatedly until a consistent recommendation can be presented.
Compared to existing approaches to the integration of recommendation algorithms with configuration, we show how to take configuration constraints into account already in the learning phase and thus minimize the probability of inconsistent recommendations being detected in the subsequent configuration phase. In this paper, we follow the idea of case-based recommendation [16], where historical configurations with parameter settings similar to those already specified by the current user are used as a basis for identifying nearest-neighbor configurations. In our work, we use such a case-based approach as a baseline version. This version is then compared with two different versions of a feed-forward neural network based configurator integration. The first version focuses on the prediction of configuration parameter settings relevant for the user. The second version follows the same goal but also takes into account the fact that recommendations should be consistent with the underlying constraint set. To support this goal, we propose a semantic regularization of a feed-forward (multi-class and multi-branch) neural network that is used as a configuration parameter prediction model.

The major contributions of this paper are the following: (1) we introduce a semantic regularization approach specifically useful for integrating case-based recommendation with rule-based configuration environments, (2) we compare the predictive quality of the developed approach on the basis of a real-world dataset from a complex industrial configuration task (high-voltage switchgear configuration) with regard to the evaluation criteria of prediction quality and recommendation consistency, and (3) we show how the presented results can be generalized to configuration scenarios beyond rule-based configuration.

The remainder of this paper is organized as follows. In Section 2, we introduce a working example in terms of a simplified configuration knowledge base from the automotive domain. In this context, we also introduce the concepts of a configuration task and a corresponding configuration. Thereafter, in Section 3, we introduce our neural network based approach to the recommendation of configuration parameter settings. In Section 4, we summarize our evaluation approach and report the results of an evaluation conducted on the basis of a real-world dataset from the domain of high-voltage switchgear configuration. The paper is concluded with an overview of future research issues (Section 5).

2. Working Example

As a basis for the following discussions on integrating neural network based predictions of user preferences, we first introduce the definition of a configuration task (see Definition 1).

Definition 1. A configuration task can be defined by a tuple (𝑉, 𝐷, 𝑅, 𝑅𝐸𝑄) where 𝑉 = {𝑣1, 𝑣2, .., 𝑣𝑛} is a set of finite domain variables, 𝐷 = {π‘‘π‘œπ‘š(𝑣1), π‘‘π‘œπ‘š(𝑣2), .., π‘‘π‘œπ‘š(𝑣𝑛)} is a set of corresponding domain definitions, and 𝑅 = {π‘Ÿ1, π‘Ÿ2, .., π‘Ÿπ‘š} is a set of rules that define how a configuration can be derived from a given set of customer requirements 𝑅𝐸𝑄 = {𝑣𝛼 = π‘£π‘Žπ‘™π›Ό, .., 𝑣𝛾 = π‘£π‘Žπ‘™π›Ύ} where elements of 𝑅𝐸𝑄 are regarded as variable value assignments.

A simple example of a configuration task definition is the following (see Example 1), where 𝑝𝑑𝑐 represents a park distance control feature and 𝑓𝑒𝑒𝑙 represents the fuel consumption in gallons/100 miles.

Example 1: Configuration Task.

β€’ 𝑉 = {𝑑𝑦𝑝𝑒, 𝑝𝑑𝑐, 𝑓𝑒𝑒𝑙, π‘ π‘˜π‘–π‘π‘Žπ‘”, 4-π‘€β„Žπ‘’π‘’π‘™, π‘π‘œπ‘™π‘œπ‘Ÿ}
β€’ 𝐷 = {π‘‘π‘œπ‘š(𝑑𝑦𝑝𝑒) = {𝑐𝑖𝑑𝑦, π‘™π‘–π‘šπ‘œ, π‘π‘œπ‘šπ‘π‘–, π‘₯π‘‘π‘Ÿπ‘–π‘£π‘’}, π‘‘π‘œπ‘š(𝑝𝑑𝑐) = {𝑦𝑒𝑠, π‘›π‘œ}, π‘‘π‘œπ‘š(𝑓𝑒𝑒𝑙) = {1.7, 2.6, 4.2}, π‘‘π‘œπ‘š(π‘ π‘˜π‘–π‘π‘Žπ‘”) = {𝑦𝑒𝑠, π‘›π‘œ}, π‘‘π‘œπ‘š(4-π‘€β„Žπ‘’π‘’π‘™) = {𝑦𝑒𝑠, π‘›π‘œ}, π‘‘π‘œπ‘š(π‘π‘œπ‘™π‘œπ‘Ÿ) = {π‘Ÿπ‘’π‘‘, 𝑏𝑙𝑒𝑒}}
β€’ 𝑅 = {π‘Ÿ1: 4-π‘€β„Žπ‘’π‘’π‘™ = 𝑦𝑒𝑠 β†’ 𝑑𝑦𝑝𝑒 = π‘₯π‘‘π‘Ÿπ‘–π‘£π‘’, π‘Ÿ2: π‘ π‘˜π‘–π‘π‘Žπ‘” = 𝑦𝑒𝑠 β†’ 𝑑𝑦𝑝𝑒 β‰  𝑐𝑖𝑑𝑦, π‘Ÿ3: 𝑓𝑒𝑒𝑙 = 1.7 β†’ 𝑑𝑦𝑝𝑒 = 𝑐𝑖𝑑𝑦, π‘Ÿ4: 𝑓𝑒𝑒𝑙 = 2.6 ∧ 𝑑𝑦𝑝𝑒 = π‘₯π‘‘π‘Ÿπ‘–π‘£π‘’ β†’ π‘“π‘Žπ‘™π‘ π‘’, π‘Ÿ5: 𝑑𝑦𝑝𝑒 = π‘π‘œπ‘šπ‘π‘– β†’ π‘ π‘˜π‘–π‘π‘Žπ‘” = 𝑦𝑒𝑠, π‘Ÿ6: 𝑑𝑦𝑝𝑒 = π‘™π‘–π‘šπ‘œ β†’ 𝑝𝑑𝑐 = 𝑦𝑒𝑠}
β€’ 𝑅𝐸𝑄 = {𝑑𝑦𝑝𝑒 = 𝑐𝑖𝑑𝑦, 𝑝𝑑𝑐 = 𝑦𝑒𝑠, 𝑓𝑒𝑒𝑙 = 1.7}

Given the definition of a configuration task (𝑉, 𝐷, 𝑅, 𝑅𝐸𝑄), we are able to introduce the definition of a corresponding configuration (a solution for a configuration task) – see Definition 2.

Definition 2. A configuration for a given configuration task definition (𝑉, 𝐷, 𝑅, 𝑅𝐸𝑄) is a set of variable assignments 𝐢𝑂𝑁𝐹 = {𝑣1 = π‘£π‘Žπ‘™1, .., 𝑣𝑛 = π‘£π‘Žπ‘™π‘›} where βˆ€{𝑣𝑖 = π‘£π‘Žπ‘™π‘–} βŠ† 𝐢𝑂𝑁𝐹: π‘£π‘Žπ‘™π‘– ∈ π‘‘π‘œπ‘š(𝑣𝑖) and π‘π‘œπ‘›π‘ π‘–π‘ π‘‘π‘’π‘›π‘‘(𝐢𝑂𝑁𝐹 βˆͺ 𝑅 βˆͺ 𝑅𝐸𝑄). A configuration is complete if each variable in 𝑉 has an assignment in 𝐢𝑂𝑁𝐹.
An example configuration 𝐢𝑂𝑁𝐹 for the configuration task of Example 1 is the following (see Example 2).

Example 2. 𝐢𝑂𝑁𝐹 = {𝑑𝑦𝑝𝑒 = 𝑐𝑖𝑑𝑦, 𝑝𝑑𝑐 = 𝑦𝑒𝑠, 𝑓𝑒𝑒𝑙 = 1.7, π‘ π‘˜π‘–π‘π‘Žπ‘” = π‘›π‘œ, 4-π‘€β„Žπ‘’π‘’π‘™ = π‘›π‘œ, π‘π‘œπ‘™π‘œπ‘Ÿ = π‘Ÿπ‘’π‘‘}

We regard a configuration as complete if each of the variables in 𝑉 is associated with a corresponding value assignment and these assignments are consistent with the rules in 𝑅. As already mentioned, in many configuration scenarios users are not able or do not want to specify values for all the defined variables in 𝑉 but are interested in recommendations that help to more easily complete a configuration session [13]. We now introduce a definition of a recommendation in the context of a configuration task (see Definition 3).

Table 1
A simple example of a collection of already completed configuration sessions (one-hot encoding). The abbreviation pdc denotes a park distance control feature. Furthermore, fuel denotes the fuel consumption in gallons/100 miles. Finally, session current is an ongoing session where variable settings for {π‘ π‘˜π‘–π‘π‘Žπ‘”, 4-π‘€β„Žπ‘’π‘’π‘™, π‘π‘œπ‘™π‘œπ‘Ÿ} should be recommended.

session | type (city, limo, combi, xdrive) | pdc (yes, no) | fuel (1.7, 2.6, 4.2) | skibag (yes, no) | 4-wheel (yes, no) | color (red, blue)
1       | 1 0 0 0 | 1 0 | 1 0 0 | 0 1 | 0 1 | 1 0
2       | 0 0 0 1 | 1 0 | 0 0 1 | 0 1 | 1 0 | 0 1
3       | 0 1 0 0 | 1 0 | 0 1 0 | 0 1 | 0 1 | 0 1
current | 0 0 1 0 | 0 1 | 0 1 0 | ? ? | ? ? | ? ?

Definition 3. Given the definition of a configuration task (𝑉, 𝐷, 𝑅, 𝑅𝐸𝑄), a corresponding recommendation 𝑅𝐸𝐢 = {𝑣𝛽 = π‘£π‘Žπ‘™π›½, .., 𝑣𝛿 = π‘£π‘Žπ‘™π›Ώ} is a set of variable value assignments of 𝑣𝑖 ∈ 𝑉. A recommendation 𝑅𝐸𝐢 is consistent if 𝑅𝐸𝐢 βˆͺ 𝑅𝐸𝑄 βˆͺ 𝑅 is consistent, i.e., a solution can be found.

Example 3. 𝑅𝐸𝐢 = {π‘ π‘˜π‘–π‘π‘Žπ‘” = π‘›π‘œ, 4-π‘€β„Žπ‘’π‘’π‘™ = π‘›π‘œ, π‘π‘œπ‘™π‘œπ‘Ÿ = π‘Ÿπ‘’π‘‘}
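To make the semantics of Definitions 1–3 concrete, the following minimal Python sketch (our illustration, not part of the original paper; the rule encoding and helper names are assumptions) represents the rules π‘Ÿ1–π‘Ÿ6 of Example 1 as predicates over a (partial) variable assignment and checks whether a recommendation together with 𝑅𝐸𝑄 violates any rule in 𝑅.

```python
# Minimal sketch (not from the paper): the rules r1..r6 of Example 1 encoded as
# predicates over a partial assignment; consistent() checks REC ∪ REQ ∪ R in the
# sense of Definitions 2 and 3.

RULES = [
    lambda a: not (a.get("4-wheel") == "yes") or a.get("type") == "xdrive",    # r1
    lambda a: not (a.get("skibag") == "yes") or a.get("type") != "city",       # r2
    lambda a: not (a.get("fuel") == 1.7) or a.get("type") == "city",           # r3
    lambda a: not (a.get("fuel") == 2.6 and a.get("type") == "xdrive"),        # r4
    lambda a: not (a.get("type") == "combi") or a.get("skibag") == "yes",      # r5
    lambda a: not (a.get("type") == "limo") or a.get("pdc") == "yes",          # r6
]

def consistent(assignment: dict) -> bool:
    """True if the variable assignment violates none of the rules in R."""
    return all(rule(assignment) for rule in RULES)

# REQ of Example 1 combined with REC of Example 3: no rule is violated.
REQ = {"type": "city", "pdc": "yes", "fuel": 1.7}
REC = {"skibag": "no", "4-wheel": "no", "color": "red"}
print(consistent({**REQ, **REC}))  # True -> REC is a consistent recommendation

# A combi without a skibag violates r5:
print(consistent({"type": "combi", "pdc": "no", "fuel": 2.6, **REC}))  # False
```

Applied to the requirements of Example 1 and the recommendation of Example 3, the check succeeds; applied to a combi configuration without a skibag, it fails, which is exactly the situation discussed for the case-based recommender in Section 3.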
Following the approach of case-based reasoning [16], it can be the case that recommended variable value assignments are inconsistent with the already defined user requirements and the rules (constraints) defined in the knowledge base. This is the case if recommendations are determined from already completed configuration sessions without taking into account configuration constraints (rules in 𝑅). In the following, we provide a simple example of a case-based reasoning approach and then focus on how to take into account configuration rules in terms of a semantic regularization when optimizing a neural network responsible for recommending variable settings.

3. Recommending Configurable Items

As already sketched in the previous section, recommendations in the context of configuration scenarios are represented by a set of attribute assignments, i.e., a recommendation could include a single attribute setting but also numerous settings recommended at the same time. In this section, we discuss different approaches to recommend variable value settings in the context of knowledge-based configuration scenarios.

Case-based Recommendation. Table 1 represents a simple example of a set of already completed configurations that can be used as a basis for determining recommendations. In this example, configuration sessions 1–3 have already been completed. The current session is ongoing and we are interested in a recommendation for the variables skibag, 4-wheel, and color. For the purposes of this example and also for discussing the neural network based recommendation approach, we apply a one-hot encoding of the configuration variables; for example, in session 1, the configured car is of type city. In the scenario shown in Table 1, a simple case-based reasoning recommender would search for one or more nearest neighbors (NN) and recommend the variable settings that were chosen most often by the nearest neighbors. In our example, the nearest neighbor (session) of the current session is session 3 (in terms of the number of equivalent variable values). If we assume |NN| = 1, we would recommend 𝑅𝐸𝐢 = {π‘ π‘˜π‘–π‘π‘Žπ‘” = π‘›π‘œ, 4-π‘€β„Žπ‘’π‘’π‘™ = π‘›π‘œ, π‘π‘œπ‘™π‘œπ‘Ÿ = 𝑏𝑙𝑒𝑒} to the user in the current session (if we intend to predict all unspecified variable values at the same time).
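This nearest-neighbor step can be sketched as follows (a minimal illustration, not the implementation used in the paper; the data structures and function names are our own): sessions 1–3 of Table 1 serve as the case base, and the completed session sharing the largest number of variable values with the current requirements supplies the recommended values.

```python
# Minimal sketch (not the authors' implementation): a |NN| = 1 case-based recommender
# over the sessions of Table 1. Variable and value names follow Example 1.

SESSIONS = [  # already completed configurations (sessions 1-3 of Table 1)
    {"type": "city",   "pdc": "yes", "fuel": 1.7, "skibag": "no", "4-wheel": "no",  "color": "red"},
    {"type": "xdrive", "pdc": "yes", "fuel": 4.2, "skibag": "no", "4-wheel": "yes", "color": "blue"},
    {"type": "limo",   "pdc": "yes", "fuel": 2.6, "skibag": "no", "4-wheel": "no",  "color": "blue"},
]

def recommend(req: dict, sessions: list, open_vars: list) -> dict:
    """Recommend values for open_vars from the nearest neighbor, i.e., the completed
    session that shares the largest number of variable values with req."""
    nearest = max(sessions, key=lambda s: sum(s[v] == val for v, val in req.items()))
    return {v: nearest[v] for v in open_vars}

REQ = {"type": "combi", "pdc": "no", "fuel": 2.6}  # session "current" of Table 1
print(recommend(REQ, SESSIONS, ["skibag", "4-wheel", "color"]))
# -> {'skibag': 'no', '4-wheel': 'no', 'color': 'blue'}  (session 3 is the nearest neighbor)
```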
Importantly, since the current user is interested in a car of type combi, which requires the inclusion of a skibag (see Example 1), such a recommendation (𝑅𝐸𝐢) induces an inconsistency between the user requirements and the configuration knowledge base (the set of rules 𝑅). A traditional approach to deal with such a situation is to test the next recommendation for consistency and to repeat this until a consistent recommendation is found [13]. Our approach (that will be introduced in the following) to deal with such a situation is to introduce a semantic regularization into the neural network learning phase which helps to avoid inconsistent recommendations as far as possible.

Neural Network based Recommendation. Our basic approach to recommend variable values in the context of rule-based configuration is based on the feed-forward neural network structure depicted in Figure 1. In such networks, the input layer consists of possible values (represented in terms of a one-hot encoding) that have already been specified by a user. For example, the variable values (preferences) that have already been specified by the user in session π‘π‘’π‘Ÿπ‘Ÿπ‘’π‘›π‘‘ are {𝑑𝑦𝑝𝑒 = π‘π‘œπ‘šπ‘π‘–, 𝑝𝑑𝑐 = π‘›π‘œ, 𝑓𝑒𝑒𝑙 = 2.6}.

[Figure 1: A simple neural network architecture with an input layer representing specified (e.g., type and fuel) and unspecified (e.g., skibag and color) variables, one hidden layer, and an output layer that helps to estimate variable values of relevance. Input nodes correspond to one-hot encoded values such as type = city, type = limo, fuel = 1.7, and fuel = 2.6; output nodes carry relevance estimates such as skibag = yes (0.3), skibag = no (0.7), color = red (0.8), and color = blue (0.2).]

Networks such as the one depicted in Figure 1 can be trained in a domain-dependent fashion on the basis of a dataset comprised of already completed configuration sessions (see sessions 1–3 in Table 1). Furthermore, the hidden layer is used for learning dependencies between input values selected by the user and corresponding variable values of potential relevance for the user. The number of nodes in the hidden layer is regarded as a hyper-parameter to be optimized in an item domain dependent fashion (in [17], an equal amount of neurons in the input layer and the hidden layer showed the best performance). Finally, the output layer supports a multi-branch approach (one branch per variable) where each branch 𝑏 is split into π‘π‘œ output nodes representing the different domain values of variable 𝑣𝑏. In contrast to the input and hidden layer, which use a ReLU activation function, classification is implemented using softmax. The choice of the training hyper-parameters has been made based on several test executions. The optimizer β€œAdam” [18] has shown the best performance compared to other gradient descent optimizers like β€œADAGRAD”, β€œRMSProp” and β€œSGD” [19]. β€œAdam” uses adaptive estimation of first-order and second-order moments, which slows down the adjustment of neuron weights the more steps have been performed. The selected parameters for the β€œAdam” optimizer were an initial learning rate of 0.001, 𝛽1 = 0.9 and 𝛽2 = 0.999. Please note that this network architecture assumes categorical variables (e.g., similar to our Example 1) – other variable types require preprocessing such as binning or alternative architectures. Our neural network derived from the knowledge base introduced in Section 2 consists of 6 output nodes and 9 input nodes (assuming the example from above).

In the basic version of our approach, neural networks are trained on the basis of a training dataset (see, e.g., sessions 1–3 in Table 1). The usage scenario of this basic neural network approach is the following: if a user interacts with a configurator and has already specified a set of initial requirements (𝑅𝐸𝑄), the neural network can be exploited for the recommendation of variable values. Since the basic version of the neural network can only learn constraints/rules from the available set of completed configurations, it can be the case that predictions induce an inconsistency with the underlying rule set.
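A sketch of such a multi-branch network in Keras (the library also used for the evaluation in Section 4) could look as follows, assuming the automotive example with 9 one-hot input nodes (type, pdc, fuel) and three softmax output branches (skibag, 4-wheel, color); the layer names and the per-branch cross-entropy loss are our assumptions, not details reported in the paper.

```python
# Sketch of the described multi-branch feed-forward network for the automotive example.
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(9,), name="specified_values")           # one-hot encoded REQ
hidden = layers.Dense(9, activation="relu", name="hidden")(inputs)  # as many neurons as inputs [17]

# one output branch per open variable, softmax over its domain values
skibag = layers.Dense(2, activation="softmax", name="skibag")(hidden)
four_wheel = layers.Dense(2, activation="softmax", name="four_wheel")(hidden)
color = layers.Dense(2, activation="softmax", name="color")(hidden)

model = keras.Model(inputs=inputs, outputs=[skibag, four_wheel, color])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999),
    loss="categorical_crossentropy",  # prediction loss per branch (L_train)
)
model.summary()
```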
Constraint-Aware Recommendation. For reasons of potentially inconsistent recommendations, we have introduced an enhanced neural network learning phase including a semantic regularization where inconsistent recommendations are taken into account as a regularization term. In other words, although parts of the domain-specific rules/constraints can be learned from the underlying training dataset, it can be the case that some or even many constraints are neglected and the resulting variable value recommendations induce an inconsistency. We denote this approach as constraint-aware neural networks, which are extremely relevant in recommendation scenarios where domain-specific constraints/rules have to be taken into account by the recommender. To reduce the probability of inconsistent variable value recommendations, knowledge base rules/constraints are taken into account in the learning process. This goal is achieved by integrating the results of a consistency check of the proposed recommendation 𝑅𝐸𝐢 (more precisely, 𝑅𝐸𝑄 βˆͺ 𝑅 βˆͺ 𝑅𝐸𝐢) into a corresponding loss function as shown in Formula 1.

𝐿(πœƒ) ← πΏπ‘‘π‘Ÿπ‘Žπ‘–π‘›(πœƒ) + Ξ©(πœƒ) + πœ‡ Γ— Ξ (πœƒ)   (1)

In this context, 𝐿(πœƒ) denotes a loss function on the vector πœƒ of weights in the neural network, πΏπ‘‘π‘Ÿπ‘Žπ‘–π‘›(πœƒ) denotes the prediction loss, Ξ©(πœƒ) represents a corresponding 𝐿2 regularization term, πœ‡ represents a hyper-parameter that controls the impact of an inconsistency on the overall loss, and Ξ (πœƒ) indicates whether the recommendation resulting from πœƒ is consistent (0 is returned) or inconsistent (1 is returned). As Ξ (πœƒ) is a discrete non-differentiable function, the optimization of our loss function has to resort to an approximation of gradients via the computation of finite differences.
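Conceptually, the regularized loss of Formula 1 can be sketched as follows (an illustration under our own assumptions, not the authors' implementation): the indicator Ξ  is obtained by decoding the currently recommended variable values and checking 𝑅𝐸𝑄 βˆͺ 𝑅 βˆͺ 𝑅𝐸𝐢 with a rule checker such as the consistent() sketch from Section 2; the finite-difference gradient approximation mentioned above is not shown.

```python
# Conceptual sketch of Formula 1 (not the authors' code).
import numpy as np

def semantic_loss(prediction_loss: float, weights: list, req: dict, rec: dict,
                  consistent, mu: float = 1.0, l2: float = 1e-4) -> float:
    omega = l2 * sum(float(np.sum(w ** 2)) for w in weights)  # Ω(θ): L2 regularization
    pi = 0.0 if consistent({**req, **rec}) else 1.0           # Π(θ): consistency indicator
    return prediction_loss + omega + mu * pi                  # L(θ) = L_train + Ω + μ·Π

# e.g., for the inconsistent recommendation of Section 3 (Π = 1), μ is added to the loss:
# semantic_loss(0.42, weights, {"type": "combi", "pdc": "no", "fuel": 2.6},
#               {"skibag": "no", "4-wheel": "no", "color": "blue"}, consistent, mu=1.0)
```

The hyper-parameter ΞΌ then trades off prediction accuracy against the penalty for recommendations that violate the knowledge base.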
User Interaction and Knowledge Representation. Our approach to neural network based variable value recommendation for knowledge-based configuration helps to reduce the probability of inconsistency-inducing recommendations (see Section 4) and thus can also help to make configuration processes less time-consuming for users. The proposed recommendation approach is flexible in the sense that recommendations for single-variable assignments as well as combined variable assignment recommendations can be supported. In our working example, we did not take into account settings where a configuration task is organized in phases and in each phase a specific subcomponent of the product is configured (e.g., software configuration as part of the configuration of a whole computer). On the user interface level, recommendations are mostly related to variables within a specific phase. However, recommendations can also be determined on the basis of existing variable settings from different phases.

Recommendation Consistency. The achievable degree of recommendation consistency also depends on the used knowledge representation. In the case of a rule-based knowledge representation [6], it is not always feasible to correctly predict whether it is possible to complete a partial configuration, i.e., whether, given a (consistent) set of customer requirements, it is possible to find assignments for the remaining variables in such a way that a consistent and complete configuration can be achieved. If a more compact representation of all satisfiable variable assignments is available [20], our approach can be applied to recommend the most relevant option among the consistent ones. In a similar fashion, constraint-based approaches [5] can be applied to infer remaining variable assignments that are still consistent with 𝑅𝐸𝑄 and 𝑅. In the following, we present the results of an empirical analysis of our constraint-aware recommendation approach using a real-world dataset from the domain of high-voltage switchgear configuration.

4. Evaluation

The configurator application for the high-voltage switchgear domain has been developed with the goal to reduce engineering effort during the offering stage of these highly complex systems. The underlying dataset includes 𝑁 = 720 complete configurations developed by skilled sales employees from Siemens Energy AG. Each entry of the dataset consists of 𝑀 = 60 attribute settings (assignments), i.e., each configuration is described by 60 variables (representing features and subcomponents). In this context, 10 out of the 60 variables have been defined as basic switchgear features which are assumed to be selected by the user before a recommendation can be triggered (e.g., the basic switchgear category to be installed). The focus of recommendation is on the remaining 50 more specific features with a sometimes lower degree of understandability, where it is often an issue for users to find good or even optimal settings (e.g., AC supply voltage). The dataset is composed of consistent configurations that have been built on the basis of a rule-based configuration system.ΒΉ As a baseline in our evaluation, we have developed a case-based reasoning approach (see Section 3) that recommends variable value settings on the basis of the preferences of the 𝑁 nearest neighbors (in the given setting, 𝑁 = 1 achieved the highest prediction quality). Furthermore, the two versions of the neural network based approach have been implemented on the basis of the Keras API [21].Β² The learning of the neural network model is based on 32 iterations during the learning phase of the model, where 80% of the data is used for training purposes and 20% for testing. The first version of the neural network model has been trained without taking into account the rules in the configuration knowledge base, whereas the learning of the model for the second version is based on the loss function included in Formula 1. For validation, the models have been applied separately to 20 configurations which were not part of the training or testing data.

ΒΉ www.camos.de
Β² https://github.com/MaUt89/ConLearn

Prediction Quality. Our first goal was to analyze the prediction quality of the three variable value recommendation approaches discussed in this paper: (1) case-based recommendation, (2) neural network based recommendation, and (3) neural network based recommendation with semantic regularization. To measure the prediction quality, precision has been chosen as the key performance indicator. The precision of the recommendation of a configuration 𝐴𝑅𝑒𝑐 has been measured in terms of the share of predictions that are part of the configuration 𝐴 accepted by the user in relation to the total number of predictions contained in 𝐴𝑅𝑒𝑐, see Equation 2.

π‘π‘œπ‘›π‘“π‘–π‘”π‘’π‘Ÿπ‘Žπ‘‘π‘–π‘œπ‘› π‘Ÿπ‘’π‘π‘œπ‘šπ‘šπ‘’π‘›π‘‘π‘Žπ‘‘π‘–π‘œπ‘› π‘π‘Ÿπ‘’π‘π‘–π‘ π‘–π‘œπ‘› = |𝐴 ∩ 𝐴𝑅𝑒𝑐| / |𝐴𝑅𝑒𝑐|   (2)

In the context of our evaluation, we were specifically interested in the predictive performance depending on the number of already known attribute values. The prediction task was specified in such a way that, given a chosen set of input attributes (the known settings representing 𝑅𝐸𝑄), the task was to predict all other missing attributes to complete the configuration. Since our configurator is organized in 5 configuration phases, the phase number of a variable to be predicted had to be equal to the number of the current configuration phase, and the phase number of the known variables (𝑅𝐸𝑄) had to be lower than or equal to the phase number of the variable to be predicted. Consequently, the missing attributes were iteratively predicted by selecting in each iteration the values of those attributes that are part of the configuration phase with the lowest phase number. In the following iteration, the previously predicted variables have been utilized as known variables for predicting the variables of the next configuration phase. This has been repeated until the configuration was complete.
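The precision measure of Equation 2 can be sketched as follows (variable names are illustrative; this is not the evaluation code of the paper):

```python
# Minimal sketch of Equation 2: the share of predicted variable values that also occur
# in the configuration accepted by the user.
def recommendation_precision(a_rec: dict, a_user: dict) -> float:
    """|A ∩ A_Rec| / |A_Rec| for two sets of variable value assignments."""
    if not a_rec:
        return 0.0
    hits = sum(1 for var, val in a_rec.items() if a_user.get(var) == val)
    return hits / len(a_rec)

# e.g., predicting {skibag: no, 4-wheel: no, color: blue} against an accepted
# configuration with color = red yields a precision of 2/3.
print(recommendation_precision(
    {"skibag": "no", "4-wheel": "no", "color": "blue"},
    {"type": "city", "pdc": "yes", "fuel": 1.7, "skibag": "no", "4-wheel": "no", "color": "red"},
))
```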
[Figure 2: Precision of high-voltage switchgear related predictions (SR = semantic regularization, CBR = case-based reasoning); average precision (%) plotted over the number of already known variable values for the three approaches.]

Figure 2 provides an overview of the outcome of our evaluation by averaging the precision achieved for each of the 20 validation configurations applied to the models. The results clearly indicate the potential of prediction quality improvements that can be achieved by the inclusion of semantic regularization concepts. Whereas (1) case-based recommendation achieves only a precision of 60.5% when starting with ten initially specified variable values, the neural network based recommendation approaches predict the variable values with 74% (2) and 76.33% (3) precision. It is noticeable that the neural network based recommendation with semantic regularization (3) outperforms the approach without semantic regularization (2) in every validation scenario. Finally, with 50 out of 60 variable values given as an input for the configuration, both neural network based approaches reach a precision of 100%.

Consistency. We were also interested in the degree of consistency of the determined recommendations 𝑅𝐸𝐢, i.e., consistent(𝑅𝐸𝐢 βˆͺ 𝑅𝐸𝑄 βˆͺ 𝑅). The consistency of the recommendation of a configuration 𝐴𝑅𝑒𝑐 has been measured in terms of the share of knowledge base consistent predictions π΄βˆ— that are part of the recommendation in relation to the total number of predictions contained in 𝐴𝑅𝑒𝑐, see Equation 3.

π‘π‘œπ‘›π‘“π‘–π‘”π‘’π‘Ÿπ‘Žπ‘‘π‘–π‘œπ‘› π‘Ÿπ‘’π‘π‘œπ‘šπ‘šπ‘’π‘›π‘‘π‘Žπ‘‘π‘–π‘œπ‘› π‘π‘œπ‘›π‘ π‘–π‘ π‘‘π‘’π‘›π‘π‘¦ = |π΄βˆ— ∩ 𝐴𝑅𝑒𝑐| / |𝐴𝑅𝑒𝑐|   (3)

[Figure 3: Consistency of high-voltage switchgear related predictions (SR = semantic regularization, CBR = case-based reasoning); average consistency (%) plotted over the number of already known variable values for the three approaches.]

As can be seen in Figure 3, the semantic regularization helps to decrease the inconsistency degree of recommendations (compared to the CBR and the basic neural network based approach). Starting with a consistency of 95.27% (ten variable values already known), the neural network based approach with semantic regularization (3) reaches a consistency of 100% already with 30 initially known variable values. Both other approaches achieve poorer results with ten initially given variable values, 93.59% (1) and 94.88% (2), and do not reach a consistency of 100% until 40 variables are initially specified. All in all, the higher consistency of approach (3) including the semantic regularization has been expected, since this approach penalizes inconsistent predictions during the learning phase of the model. Nevertheless, the impact could have been higher, and the consistency, especially with a low number of initially known variable values, can still be improved. To achieve this, an optimization of the hyper-parameter πœ‡ is desirable and part of future work.
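Analogously to the precision sketch above, the consistency measure of Equation 3 can be sketched as follows (again an illustration; in particular, counting each predicted value as knowledge-base consistent if it does not violate any rule together with 𝑅𝐸𝑄 is our own operationalization of π΄βˆ—):

```python
# Minimal sketch of Equation 3, reusing a rule checker such as consistent() from the
# Section 2 sketch; names are illustrative.
def recommendation_consistency(a_rec: dict, req: dict, consistent) -> float:
    """|A* ∩ A_Rec| / |A_Rec|, where A* are the predictions consistent with REQ ∪ R."""
    if not a_rec:
        return 0.0
    ok = sum(1 for var, val in a_rec.items() if consistent({**req, var: val}))
    return ok / len(a_rec)

# e.g., for REQ = {type: combi, pdc: no, fuel: 2.6}, the prediction skibag = no violates
# rule r5, so the consistency of {skibag: no, 4-wheel: no, color: blue} is 2/3.
```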
5. Conclusions and Future Work

We have introduced an approach to the integration of recommendation features into knowledge-based configurators and compared it with case-based and basic neural network based recommendation. To improve the prediction quality and consistency of recommendations, we have introduced a semantic regularization approach that helps to further increase the consistency degree of recommendations, especially in the context of rule-based configuration scenarios. The presented approach has been integrated into an industrial rule-based configuration environment that focuses on the configuration of high-voltage switchgears. The presented approach can increase both prediction quality and consistency of the determined recommendations. Furthermore, the approach is generalizable to other types of configuration knowledge representations such as constraint satisfaction problems (CSPs).

Future work will focus on the integration of the developed concepts into model-based configuration knowledge representations such as CSPs [5]. Furthermore, we will extend the scope of considered machine learning approaches, among others, with an integration of matrix factorization based variable value prediction. The dataset size used for the evaluation presented in this paper can be considered as a limitation of this work, in particular w.r.t. neural network models. A major focus of future work will be the evaluation of our approach with larger industrial configuration datasets. The neural network based prediction models will also be evaluated for their applicability in the context of diagnosis scenarios, i.e., scenarios where users receive recommendations regarding requirements changes that help to get out of an inconsistent situation. Also, we plan to investigate alternative formulations of the optimization problem, for example, with consistency conditions being (partially) defined as optimization constraints. Finally, although already integrated into the configuration environment of Siemens Energy, the evaluation of the proposed recommendation approach will be further extended, especially with regard to the quality of the user interface and the need for additional explanations for the proposed recommendations.

References

[1] M. Stumptner, An overview of knowledge-based configuration, AI Communications 10 (1997) 111–125.
[2] J. Landahl, D. BergsjΓΆ, H. Johannesson, Future Alternatives for Automotive Configuration Management, Procedia Computer Science 28 (2014) 103–110.
[3] J. Sincero, W. SchrΓΆder-Preikschat, The Linux kernel configurator as a feature modeling tool, in: Workshop on Analyses of Software Product Lines, ASPL, Limerick, Ireland, 2008, pp. 257–260.
[4] G. Fleischanderl, G. Friedrich, A. HaselbΓΆck, H. Schreiner, M. Stumptner, Configuring large systems using generative constraint satisfaction, IEEE Intelligent Systems 13 (1998) 59–68.
[5] E. Tsang, Foundations of Constraint Satisfaction, Computation in Cognitive Science, Academic Press, London, UK, 1993.
[6] D. Dhungana, C. Tang, C. Weidenbach, P. Wischnewski, Automated Verification of Interactive Rule-based Configuration Systems, in: 28th IEEE/ACM Intl. Conference on Automated Software Engineering, IEEE, Silicon Valley, CA, USA, 2013, pp. 551–561.
[7] R. Burke, Knowledge-based recommender systems, Encyclopedia of Library and Information Systems 69 (2000) 180–200.
[8] A. Felfernig, G. Friedrich, D. Jannach, M. Zanker, Constraint-based Recommender Systems, in: Recommender Systems Handbook, 2nd ed., Springer, Boston, MA, 2015, pp. 161–190.
[9] L. Chen, P. Pu, Critiquing-based Recommenders: Survey and Emerging Trends, UMUAI 22 (2012) 125–150.
[10] D. Winterfeldt, W. Edwards, Decision Analysis and Behavioral Research, Cambridge University Press, Cambridge, England, 1986.
[11] C. Boutilier, R. Patrascu, P. Poupart, D. Schuurmans, Constraint-based optimization and utility elicitation using the minimax decision criterion, Artificial Intelligence 170 (2006) 686–713.
[12] J. Goldsmith, U. Junker, Preference handling for artificial intelligence, AI Magazine 29 (2008) 9–12.
[13] A. Falkner, A. Felfernig, A. Haag, Recommendation technologies for configurable products, AI Magazine 32 (2011) 99–108.
[14] M. Zanker, A Collaborative Constraint-based Meta-level Recommender, in: ACM RecSys, ACM, Lausanne, Switzerland, 2008, pp. 139–146.
[15] D. Jannach, L. Kalabis, Incremental prediction of configurator input values based on association rules – a case study, in: Proceedings Workshop on Configuration, 2011, pp. 32–35.
[16] B. Smyth, Case-Based Recommendation, in: The Adaptive Web, volume 4321 of LNCS, Springer, Berlin, Heidelberg, 2007, pp. 342–376.
[17] M. Uta, A. Felfernig, Towards machine learning based configuration, in: C. Forza, L. Hvam, A. Felfernig (Eds.), 22nd International Configuration Workshop, 2020, pp. 25–28.
[18] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, 2017. arXiv:1412.6980.
[19] S. Ruder, An overview of gradient descent optimization algorithms, CoRR abs/1609.04747 (2016). URL: http://arxiv.org/abs/1609.04747.
[20] H. Andersen, H. Hulgaard, Boolean Expression Diagrams, in: 12th IEEE Symp. on Logic in Computer Science, IEEE, Warsaw, Poland, 1997, pp. 88–98.
[21] F. Chollet, Keras, https://github.com/fchollet/keras, 2015.