<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Predicting Answers in Preference Elicitation Dialogs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ralph Samer</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>The elicitation of user preferences represents an effective means to identify the relevant user preference requirements for a configuration task. One common method to determine the preferences of users is to survey users using dialogs that consist of (multiple-choice) questions with selectable answers. A major drawback of dialog-based preference elicitation is that many new users are not willing to answer a fairly large number of questions which is a prerequisite for the identification of user constraints. To that end, we have developed a novel similarity-based approach which aims to solve such ramp-up (cold-start) scenarios by automatically completing a set of remaining questions in a preference elicitation dialog given a small set of pre-answered questions. Our approach has been evaluated with two small real-world datasets. Initial evaluation results reveal that our approach is able to find the most probable answers a respondent is likely to give to a set of remaining questions. First insights of an evaluation also show that our approach can keep the number of initial questions at a very low level, meaning that only between 35% and 50% of questions have to be asked in most cases in order to predict the complete set of user requirements. The results also indicate that this level can be further reduced with increasing amounts of training data.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        Configuration systems are tools that help users to find matching
solutions / objects (e.g., items or services) in large object spaces
or to generate relevant solutions which satisfy all pre-defined
constraints [
        <xref ref-type="bibr" rid="ref1 ref13 ref14 ref8">1, 8, 13, 14</xref>
        ]. The set of constraints usually represents the
combination of a user’s preferences (user constraints) and the given
knowledge base (background knowledge constraints). In this context,
the active querying of user preferences within the scope of online
sessions [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] represents a proper method to elicit the preferences of a user
in an interactive and iterative fashion.
      </p>
      <p>The identification of matching solutions often presupposes an
extensive description of a user’s preferences as a major precondition
to gather a representative and complete set of user constraints. It
is important to mention that the complete set of user constraints
must meticulously reflect the imprint of the individual needs the
user has. This requires diligent questioning of the user’s individual
preferences in the form of preference elicitation dialogs. These
dialogs typically consist of a number of multiple-choice questions
specially geared to the respective configuration domain and task. The
main purpose of preference elicitation dialogs is to collect the user
preferences in order to create all relevant user constraints for the
configuration environment.</p>
      <p>
        A common challenge that often arises in this context is the
timeintensive preference elicitation process which involves many
questions and, hence, lengthy preference elicitation dialogs. Existing
research [
        <xref ref-type="bibr" rid="ref10 ref11 ref12">10, 11, 12</xref>
        ] shows that the explicit querying of all user
preferences is known to be very challenging which can result in situations
where users tend to leave the active querying process and discontinue
the dialog before all relevant user preferences can be captured.
      </p>
      <p>Therefore, smart mechanisms are needed to facilitate the
elicitation process of user preferences. To that end, we devised a novel
approach to reduce the length of preference elicitation dialogs. A major
advantage of our approach is to address cold-start issues which
occur whenever a configuration system has to capture the preferences of
a new user in a dialog-based preference elicitation process.
Preliminary evaluation results indicate that the developed approach is able to
effectively reduce lengthy preference elicitation dialogs which
typically consist of more than 20 multiple-choice questions. According
to our results, our approach achieves a reduction of up to 65% of
questions for most users (i.e., only 35% of all questions need to be
answered by these users in order to predict the missing answers and
to derive the complete set of user constraints).</p>
      <p>The remainder of this paper is structured as follows. In Section 2,
we introduce and explain our similarity-based prediction approach to
support new users in preference elicitation dialogs. Section 3 focuses
on the evaluation of our approach and a brief discussion of the
evaluation results. In Section 4, we give an overview of existing research
that is related to our work. Finally, Section 5 concludes this paper
and gives an outlook on ideas regarding future work.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Methodology</title>
      <p>In this section, we discuss an approach which predicts the (most
probable) answers of a user to the remaining questions in the
context of multiple-choice preference elicitation dialogs. The
purpose of preference elicitation dialogs is to support the
collection of user preferences necessary to define user constraints for a
configuration system.
2.1</p>
    </sec>
    <sec id="sec-3">
      <title>Preliminaries</title>
      <p>
        In order to better understand the approach discussed in this paper,
we first introduce configuration systems and provide some formal
definitions. In general, a configuration system represents a
sophisticated search-based tool that finds solutions for problems by taking
into account a variety of variables and constraints. A configuration
problem can be regarded as a constraint satisfaction problem which
exploits search heuristics (i.e., search strategies) to find or generate
relevant solutions. In this particular context, a constraint satisfaction
problem defines a task (V, C) where V is a set of variables and C
represents a set of constraints. An assignment of a variable in V can be
regarded as consistent for the task (V, C) if it satisfies all constraints
in C (i.e., does not violate any of the constraints in C). The major
objective of a constraint solver is to ensure that all variable
assignments are consistent with C such that suitable solutions can be found
and presented – in case variable assignments are inconsistent with C,
diagnose and repair mechanisms can be applied (for further details,
we refer to [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]).
      </p>
      <p>The set of constraints C typically includes all system constraints
(SC ⊂ C) as well as all user constraints (U C ⊂ C). The system
constraints (SC) reflect the relevant background knowledge and the
user constraints (UC) represent unary constraints which refer to
concrete customer requirements. The system constraints have to be
predefined – they are typically specified by the system engineers during
the construction of the system. However, the user constraints need to
be collected by the users / customers. One common way that is often
used in practical scenarios is to precisely ask a customer a number
of questions in a dialog-based session. In these dialogs, many
questions are asked to elicit the user preferences and to derive the user
constraints (UC) for the underlying constraint satisfaction problem
(V, C). This usually represents a high effort for most users and
often results in undesired outcomes where users often leave the dialog
session before having answered all questions and seen any solution
presented by the configuration system. However, in order to
determine the relevant user constraints for a new user, the preferences of
the user have to be captured through a dialog-based preference
elicitation process. This situation is referred to as the cold-start
problem and constitutes a common well-known problem in the context
of recommendation and configuration systems. In the remainder of
this section, we describe a new approach to address this cold-start
problem for configuration systems.
2.2</p>
    </sec>
    <sec id="sec-4">
      <title>Technical Approach</title>
      <p>In order to reduce the overall effort for a user in preference
elicitation dialogs, we introduce a new approach to facilitate survey-based
preference elicitation. Our approach supports preference elicitation
dialogs with multiple-choice questions. It utilizes a
neighborhoodbased search strategy to determine the most suitable multiple-choice
questions which are needed to guide the user through a shortened
dialog and to infer remaining preferences from historic dialogs of
likeminded users. This shortened dialog allows to determine the
preferences of a user based on the answers declared by the user. The set of
preferences can then be completed by inferring missing preferences
from historical dialogs of other like-minded users. The answers of
the remaining questions given by the like-minded users define the
basis to complete the set of a user’s preferences.</p>
      <p>In order to make suitable predictions of missing answers in a
dialog session, the answering behavior of the respondent needs to
be captured. This requires our approach to pursue two major steps
which are (1) determine an initial set of questions which are most
appropriate to capture a respondent’s answering behavior in an
accurate fashion, and (2) collect the answers to these questions given
by the active respondent. In the first step, our approach seeks to
minimalize the number of initial questions, whereas in the second
phase our approach strives to recognize the answering behavior of
the active respondent.</p>
      <p>Table 1 depicts an example setting of 3 users who have already
answered all relevant questions. Each row corresponds to the answer
results of a user given in a preference elicitation dialog. Each
column refers to a multiple-choice question and presents the
answercombinations (i.e., choice combinations) given by the users /
respondents. For example, the value ”01” in Table 1 expresses that the user
has selected the second answer but not the first answer. We use this
table as a small working example to briefly explain the technical
details of our approach in the remainder of this section.</p>
      <p>In our approach, the initial questions represent those questions for
which all previous respondents have provided the most diverging
answers (i.e., strong variety of multiple-choice answer-combinations).
These questions represent the best entry point in a preference
elicitation dialog to be presented to a new respondent. The initial
questions are determined based on the answers given to these questions
in previous sessions by other respondents. For each question, we first
determine the most common / frequent multiple-choice answer and
then count how many respondents have provided an answer that
differs from the most frequent answer.</p>
      <p>Since each question can have a different number of
multiplechoice options (e.g., 2, 3, or 4 answer-fields can be selected), a
special peculiarity needs to be considered to determine the initial
questions. Regardless of the respondents’ answering-behavior, the
answers of questions with less choice-options (e.g., only 2
answerfields) typically tend to diverge more than the answers of questions
with many choice options (e.g., 4 answer-fields). For this reason, we
multiply our counted number (see above) with the binary logarithm
of the choice options the question has. This simple mathematical
operation is shown in Formula 1 where qx refers to a specific
question, counted value represents the number answers that differ from
the majority answer of qx, and answer options corresponds to the
number of choices that can be selected for qx. For example, the most
frequent answer of question 3 presented in Table 1 is 110 and, hence,
the divergence of question q3 can be estimated by applying
For1
mula 1 (div(q3) = ld(3) = 0.63). The divergences of the remaining
questions are div(q1) = ld3(2) = 3.0, div(q2) = ld0(2) = 0.0,
1
div(q4) = ld(4) = 0.5. Assuming that a new user ua starts a dialog,
we would thus choose q3 and q4 as initial questions (due to their high
divergence) to determine the first two (j = 2) initial questions.
div(qx) =</p>
      <p>counted value
ld(answer options(qx))
(1)</p>
      <p>Based on the answers a respondent gives to the initial questions,
a neighborhood-based mechanism attempts to find k (neighbor)
respondents in the second step, who have provided answers that are
either identical or closely similar (i.e., below a pre-defined threshold
parameter α) to the answers given by the current respondent. The
k neighbor respondents are determined by making pairwise
comparisons between the active respondent and the other users. We use the
cosine similarity to estimate how closely related the active
respondent is to another respondent by taking into account their individual
answers of the initial questions. For example, if we want to find the
k = 2 nearest neighbors of respondent ua, we need to investigate the
answers from Table 1 which were given to our initial questions q3
and q4. As can be seen in the table, the most similar users of ua are
u3 and u1. These closely similar users can be regarded as the top two
nearest neighbors or ua. In addition to the pairwise comparison and
the calculation of the cosine similarities, we also want to ensure that
the algorithm only takes into account closely similar nearest
neighbors. For this reason, we use a pre-defined parameter (denoted as α)
which defines a threshold for the minimum similarity that must exist
between the active respondent ua and another respondent ux such
that ux can be considered as nearest neighbor.</p>
      <p>In case (at least or exactly) k nearest-neighbor respondents can be
identified, the answers to the dialog’s remaining questions of these
neighbors can be compared. If these answers do not strongly diverge
(in terms of the average hamming distance3), a consensus-based
aggregation function (majority of the answers) is applied to identify
the most probable answers the active respondent is likely to give to
the remaining questions and the session can be closed without
asking any further questions. In case no or less than k neighbors could
be found or the answers of the neighbors diverge significantly, a
new iteration starts and the whole process is repeated. Our approach
then attempts to find the next best question for the current
respondent (i.e., the next most diverging question). After the respondent has
answered the next question, the k nearest neighbors are determined
once again. The whole process is repeated until k nearest neighbors
are found for whom the remaining questions have been answered
in a quite similar fashion (the answers do not differ so much from
each other and the standard deviation of the differences lies below
the aforementioned threshold).</p>
      <p>Due to the strict nature of selecting k nearest neighbors and our
threshold parameter α, there exist some rare cases where our
approach always fails to predict answers for new respondents even if
they have already answered almost all questions. The number of
respondents for whom this happens can be considered as quite low (see
Section 3) and mostly depends on the strictness of finding nearest
neighbors which can be controlled via the parameters α and k.
3</p>
    </sec>
    <sec id="sec-5">
      <title>Evaluation</title>
      <p>In this section, we present some preliminary evaluation results of our
approach (see Section 2). In order to carry out an early evaluation,
we used two real-world datasets to estimate the prediction quality of
our approach. The main purpose of the first dataset is to investigate
the scenario in which there are fewer questions overall, but enough
historical data is available to train our algorithm. Using the second
dataset, we want to evaluate our algorithm for certain situations in
which exactly the opposite is the case (i.e., fewer transactions and
longer dialogs with more questions). The first dataset (denoted as
DS-A) consists of 20 multiple-choice questions, 86 answers / options
(i.e., questions typically have 4.3 possible answers on average), and
125 historic sessions of different users. In this dataset the number of
user sessions outweighs the number of questions. In contrast to that,
the second dataset (denoted as DS-B) is more sparse in terms of the
number of historic user sessions but contains a significantly higher
number of questions. It consists of 78 multiple-choice questions, 301
answers / options (i.e., questions typically have 3.85 possible
answers on average), and only 32 historic sessions of different users.
3 The decision criteria was that the average hamming distance of the
neighbor’s answers given to all remaining questions must be below a pre-defined
threshold. The threshold value is learned via grid-search (see evaluation in
Section 3).</p>
      <p>To evaluate our approach, we split each dataset into a training set
(80%) and a test set (20%) and applied grid-search as well as k-fold
cross validation (k=10) for the purpose of selecting the best
hyperparameters (see parameters mentioned in Section 2). Furthermore,
we compared the performance results of our approach with a
basic neighborhood-based collaborative filtering approach which
represents our evaluation baseline. The prediction quality of our approach
was measured in terms of accuracy and the prediction error (mean
and standard deviation). Each training and test sample referred to a
single dialog session of a different user. During the training phase, we
took into account the entire content of each training sample. To test
and evaluate our approach and baseline approach, we only presented
the answers the users from the test set have provided for the initial
questions (the pre-answered questions) to the algorithm in order to
predict the answers of the remaining questions. These predicted
answers were then compared with the actual answers these users from
the test set have given for the remaining questions, in order to
compute the prediction accuracy (see below).</p>
      <p>A major difference that exists between our approach and the
baseline approach is the chronological order of the questions. While the
baseline approach (basic collaborative filtering) allows users to
answer questions in an arbitrary order before predictions are made in
a dialog session, the approach presented in this paper predetermines
the sequence of the initial questions that have to be answered by the
respondent (see description in Section 2). The main reason for
choosing such a baseline is that our approach and this baseline approach
are able to predict multiple answers per question. This stands in stark
contrast to existing approaches which are mentioned in Section 4.
Moreover, existing preference elicitation approaches do not support
multiple-choice responses and are thus inappropriate for drawing
reasonable comparisons between these approaches and our approach
(see Section 4 for further details).</p>
      <p>The two tables Table 2 as well as Table 3 show a comparison of the
evaluation results between our approach and the baseline after each
iteration for dataset DS-A and DS-B, respectively. In these tables, the
column pre-answered questions indicates how many pre-answered
questions of a user (from the test set) were provided as input to the
approach (or baseline approach, respectively) in order to predict all
answers of the remaining questions. Both tables present the average
prediction accuracy achieved for all users from the test set. Thereby,
we once measured the average accuracy in terms of the correctness
of the predictions from the viewpoint of the whole answer (see
column Avg. Accuracy (answer combinations)). This means that in this
case partly-correct answers are not considered as correct and only a
completely correct answer is regarded as a correct answer (i.e., all
multiple-choice options must be correct). Moreover, we also
compared the accuracy results between our approach and the baseline
approach from the perspective of individual answers (see column
Avg. Accuracy (individual answers)). In this case, the accuracy is
measured on the level of individual answers (i.e., selected
multiplechoice options / answers of a question) given for a question which
means that partly-correct answers are considered as partly-correct
rather than completely wrong. The user coverage rates are also stated
in a separate column in both tables (Table 2 and 3). In our evaluation,
the user coverage rate refers to the fraction of users from the test set
for whom our approach (or the baseline approach, respectively) could
predict the remaining answers in a dialog session. As mentioned in
Section 2 depending on some parameters and the answers provided
by a user, there exist cases in which our approach fails to make
predictions for the user. These cases get significantly smaller with an
increasing number of pre-answered questions. For this reason,
Table 2 and Table 3 present a list of results measured with a varying
number of pre-answered questions.</p>
      <p>In order to ensure comparability, we have used the same
hyperparameters (determined via grid-search and 10-fold CV) for both
approaches. The results presented in Table 2 and Table 3 (evaluation
results achieved on DS-A) indicate that our approach achieves high
accuracy scores on average. In almost all cases, these scores are
significantly better than the baseline results. Table 4 summarizes the
corresponding error rates of the predictions as well as the (standard)
deviation of the prediction error for both datasets. These values lie
in a reasonable small value range and can also be considered as
stable regardless of the number of pre-answered questions. This further
supports our observations regarding the prediction accuracy.</p>
      <p>Apart from these results, the most promising outcome of our
evaluation is the high effectiveness of our approach in terms of the user
coverage. It is obvious that the standard collaborative filtering
baseline approach requires a lot of user input in terms of pre-answered
questions (i.e., initial questions) in order to predict all remaining
answers for most users. In sharp contrast to that, our approach is able to
strongly increase the user coverage rate which allows the algorithm
to produce correct predictions for most users quite early in a session.
For example, accurate predictions can be made for half of the users
after 35% of the questions (in case of DS-A; 50% in case of DS-B)
have been answered by the users.</p>
      <p>A comparison between Table 2 and Table 3 reveals that our
approach heavily depends on the amount of training samples (historic
dialog sessions from different users). As can be seen in Table 3
(DSB), the first predictions could be made after around half of the
questions were asked. The main reason for this phenomenon is directly
related to the structure of dataset DS-B. As mentioned at the
beginning of this section, DS-B consists of 78 questions and only 32 dialog
sessions. Given such a large number of predicted answers and such a
small number of training samples (80% of 32 dialog sessions results
in 25 or 26 samples), the user coverage as well as the accuracy results
can still be regarded as promising. In particular, this stands out when
the achieved user coverage rates of our approach are compared with
those of the baseline approach.</p>
      <p>In general, the evaluation results of our approach reveal that our
approach represents a proper means to efficiently decrease the
number of pre-answered (initial) questions for most respondents / users
without having to lose much accuracy. This means that in many
practical situations, the developed prediction system allows these
respondents to leave the dialog unfinished (having less questions answered).
The more training data (i.e., historic session dialogs) are made
available to the prediction system, the higher the chances are that very
early predictions can be made for most users (see results achieved
on DS-A which contains many session dialogs of different users).
This is particularly essential for very complex configuration systems
which require many user constraints. In such cases, long dialogs with
many questions would be needed to gather the necessary user
preferences. By applying our prediction approach, these dialogs could
be enormously reduced once a sufficient amount of complete session
dialogs has been collected.</p>
      <p>Hence, the results also clearly show that our approach reaches its
full potential once a sufficient amount of data is available. First
empirical insights and observations (also with further datasets) indicate
that a sufficiently high number of complete dialogs (at least more
than 50 complete dialogs of different users) and a sufficiently high
number of questions (at least more than 15 questions) is required
in order to sharply reduce the number of questions by keeping the
uncertainty as well as the error rates at a low level. Moreover, the
used (i.e., learned via grid-search) hyperparameters (thresholds) have
proven to be a good compromise between achieving a good
prediction quality and reducing the length of a questionnaire for the
majority of respondents.</p>
      <p>Finally, another important observation that needs to be mentioned
is related to the user coverage rates. By taking a look at the
evaluation results achieved on both datasets, it can be observed that the user
coverage rate initially increases with the number of pre-answered
questions, but stops to grow fast after a certain level is reached
(approximately after 65% of pre-answered questions have been asked in
case of both datasets). This relates back to the phenomenon of rare
cases described in Section 2 where our prediction approach fails to
generate predictions due to our strict threshold parameters. An
effective way to counteract this phenomenon is to collect more training
data. Moreover, the threshold parameter could be increased (relaxed
constraint) which would, however, deteriorate the average prediction
accuracy. Another approach to tackle this phenomenon is explained
in the remainder of this section.</p>
      <p>
        Our empirical observations show that starting from this level of
pre-answered questions (in case of both datasets, around 65% of
preanswered questions) model-based collaborative filtering algorithms
(e.g., matrix factorization [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]) can also be successfully applied.
Typically, such algorithms are known to work well on very large and
sparse datasets [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Unlike our approach, further evaluations with
matrix factorization (MF) indicate that MF is less applicable to tackle
the cold-start scenario (i.e., make very early and accurate predictions
after only a few pre-answered questions are available) since it would
require large amounts of training data. Although these evaluations
are still ongoing and, hence, part of future work, we want to provide
some first insights. Our preliminary results show that the accuracy of
MF was even worse when compared to our baseline approach which
was the reason why we did not choose MF as a baseline in the
evaluation presented in this paper. However, our still ongoing evaluation
shows that in a later scenario (i.e., after more pre-answered questions
are available) MF seems to attain comparable accuracy results also
for candidates where our approach (presented in this paper) as well
as the baseline approach fail to predict all answers of the
remaining questions. This is due to the reason that even though we operate
on quite small datasets, these datasets can be regarded as complete
datasets (without missing data) where only the remaining part of the
active dialog session is missing (approx. 35% of questions that have
not yet been answered) and needs to be estimated by MF. In contrast
to the approach presented in this paper, MF could thus be used as a
fallback solution for those rare cases as it can deliver predictions for
all users immediately without asking any further questions once they
have already pre-answered more than half of the questions. Thereby,
the prediction error of MF can be expected to reach a similarly low
level when compared to our approach. Hence, predictions can be
generated for all active dialog sessions and a maximum user coverage
rate of 100% can be achieved which means that predictions can be
made for all users once more than half of the questions have been
answered by these users. Moreover, model-based algorithms based on
collaborative filtering (such as matrix factorization) are also suitable
for large datasets to address scalability issues in an efficient manner.
4
      </p>
    </sec>
    <sec id="sec-6">
      <title>Related Work</title>
      <p>
        To the best of our knowledge, there exists a relatively limited number
of solutions to facilitate preference elicitation dialogs in the context
of configuration systems. In particular, some research has been
conducted which investigates the use of recommendation techniques to
Pre-answered questions
4 (20%)
5 (25%)
6 (30%)
7 (35%)
8 (40%)
9 (45%)
10 (50%)
11 (55%)
12 (60%)
13 (65%)
14 (70%)
15 (75%)
16 (80%)
17 (85%)
18 (90%)
19 (95%)
39 (50%)
43 (55%)
47 (60%)
51 (65%)
55 (70%)
58 (75%)
63 (80%)
66 (85%)
70 (90%)
74 (95%)
Pre-answered questions
proactively support users in configuration scenarios for the sake /
purpose of preference elicitation [
        <xref ref-type="bibr" rid="ref16 ref2 ref4">2, 4, 16</xref>
        ]. Examples thereof include
the recommendation of features as well as feature values [
        <xref ref-type="bibr" rid="ref16 ref4">4, 16</xref>
        ], the
proposal of reasonable relaxations to repair inconsistencies of user
requirements in situations where no solution can be identified [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
and the reuse of previous cases (that resemble the current problem)
to determine new configurations [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Tiihonen and Felfernig [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]
discuss recommendation techniques which help to complete
configurations in ongoing configuration sessions. Moreover, Falkner et al.
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] propose recommendation techniques that interpret the user
preference elicitation process as a dialog session where users are asked
to specify feature values (i.e., the user requirements / constraints) for
features presented in a pre-ranked order. The proposed approaches
exploit different techniques to select and rank the features that are
presented to the user and based on a user’s selected values,
explanations are shown to mark potential inconsistencies.
      </p>
      <p>
        Further research is related to the automated support of active
questionnaire dialogs. Stettinger et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] introduce an e-learning
environment which exploits different recommendation techniques
following the purpose of intelligent question answering. The used
techniques identify the most relevant learning contents and
multiplechoice questions for a user that have to be repeated in future learning
sessions. The presented approach is particularly useful to counteract
forgetting processes in the context of online learning.
      </p>
      <p>A major strength of the approach presented in this paper is the
predetermination of a reduced set of textual questions with
multiple responses (i.e., multiple answers can be selected for each
question) whereas existing approaches focus on the assignment of single
values to given variables. This stands in stark contrast to all
existing approaches of related research discussed above. Another major
difference that exists between our approach and most existing
approaches, is related to the nature how the chronological order of the
proposed initial questions is determined. Apart from that, available
solutions focus on directly asking users to assign preferred values to
variables / features instead of indirectly eliciting the preferences by
asking more detailed verbose questions. Moreover, most existing
approaches do not go beyond the level of ranking or feature selection
and thus are not able to complete an ongoing session dialog. Due
to the very simple nature of most approaches, these approaches are
less suitable for drawing reasonable comparisons between these
approaches and our approach. In particular, this applies above all to the
measurement of the prediction accuracy (see Section 3), which needs
to be assessed on the level of multiple-choice questions (instead of
single-choice questions as presented in previous work).
5</p>
    </sec>
    <sec id="sec-7">
      <title>Conclusion &amp; Future Work</title>
      <p>Conclusion. Configuration systems follow a constraint satisfaction
problem (CSP) oriented approach to find or generate suitable
solutions for a user. Due to the nature of CSP-based approaches,
constraints have to be defined which should precisely reflect the domain
knowledge as well as the individual preferences of the user. In
particular, the elicitation of user preferences typically requires a lot of
input of the users which can be collected by asking specific questions
in a dialog-based style. In this paper, we introduced a new concept
to facilitate the preference elicitation process. The major objective
of our approach is to address user-related cold-start problems by
reducing the length and duration of preference elicitation dialogs. This
helps to significantly raise the efficiency of the overall
configuration process. The evaluation results show that our approach has the
potential to significantly reduce the number of pre-answered
questions in most dialog sessions. According to our results, only between
35% and 50% of questions have to be explicitly answered by most
users before our approach can predict the answers of all remaining
questions. This makes our approach well suited to guide new users
through dialog-based preference elicitation sessions for most
complex configuration scenarios which involve an extensive variety of
variables. One threat of validity is related to the limited explanatory
power of our evaluation results. The evaluation of the presented
approach was conducted with two smaller datasets. A reevaluation with
different and larger datasets may lead to small deviations of the
results presented in this paper. Moreover, it is important to point out
that the applicability of our approach can also be considered to be
somehow limited to more complex configuration scenarios and
domains where users want to complete a configuration session as fast
as possible.</p>
      <p>Future Work. The currently applied methodology for the
selection of the next relevant question is based on dissimilarity. In future
evaluations it is planned to evaluate additional methodologies based
on further statistical metrics with regard to the prediction quality of
the remaining answers in preference elicitation dialogs. Moreover,
the support of system constraints (background knowledge) represents
another important issue for future work. The strict consideration of
system constraints requires to rule out some answer-combinations
in a preference elicitation dialog, since such answers would
immediately introduce inconsistencies and conflicts. Within the scope of
future work, we want to extend our existing approach to
automatically extract rules from the system constraints and include these rules
to disallow certain answers in a dialog-based session. Furthermore,
further ideas for future work are related to apply automatic
diagnoses and present explanations on how to resolve a conflict
whenever users try to select a disallowed answer combination (e.g., an
answer combination that would lead to the variable assignment
ElectricalDrive=YES ∧ gearbox=MANUAL would violate the existing
knowledge base of a car configuration system since manual
transmission is not feasible for electrically powered cars). Besides of the
benefits our approach has for configuration-based systems, the presented
approach can also be applied in questionnaire- or exam-related
scenarios. Due to the fact that a significantly smaller amount of
questions has to be answered by users in a multiple-choice dialog, the
given answers are most likely of higher quality. This results from the
fact that the respondents tend to lose concentration or provide sloppy
feedback after having answered a large number of questions.
Additionally, the reduction of the amount of asked questions can lead to a
significant increase of the numbers of respondents in real-world
surveys or questionnaires, since less time is needed and thus more users
can be motivated to participate. In the scope of future work, we plan
to conduct further investigations and studies to ground the mentioned
hypothesis. Other application domains of the mentioned techniques
could be a precise forecast of personal opinions related to societal
/ political / social / economic topics even though the person is not
forced to answer questions which focus on these aspects.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C. C.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          ,
          <source>Knowledge-Based Recommender Systems</source>
          ,
          <volume>167</volume>
          -
          <fpage>197</fpage>
          , Springer International Publishing,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ardissono</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Friedrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Goy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          , G. Petrone,
          <string-name>
            <given-names>R.</given-names>
            <surname>Schafer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Zanker</surname>
          </string-name>
          , '
          <article-title>A framework for the development of personalized, distributed web-based configuration systems'</article-title>
          ,
          <source>AI Magazine</source>
          ,
          <volume>24</volume>
          (
          <issue>3</issue>
          ),
          <fpage>93</fpage>
          , (Sep.
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bokde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Girase</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Mukhopadhyay</surname>
          </string-name>
          , '
          <article-title>Role of matrix factorization model in collaborative filtering algorithm: A survey'</article-title>
          ,
          <source>Journal of Advance Foundation and Research in Computer (IJAFRC)</source>
          ,
          <volume>1</volume>
          , (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Co</surname>
          </string-name>
          <article-title>¨ ster, A</article-title>
          . Gustavsson,
          <string-name>
            <given-names>T.</given-names>
            <surname>Olsson</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <surname>A</surname>
          </string-name>
          . Rudstro¨ m, '
          <article-title>Enhancing web-based configuration with recommendations and cluster-based help'</article-title>
          ,
          <source>in In Proceedings of the AH'2002 Workshop on Recommendation and Personalization in eCommerce,</source>
          (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S. P.</given-names>
            <surname>Erdeniz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Samer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Atas</surname>
          </string-name>
          , '
          <article-title>Matrix factorization based heuristics for constraint-based recommenders'</article-title>
          ,
          <source>in Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, SAC '19</source>
          , p.
          <fpage>16551662</fpage>
          , New York, NY, USA, (
          <year>2019</year>
          ).
          <article-title>Association for Computing Machinery</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Falkner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Haag</surname>
          </string-name>
          , '
          <article-title>Recommendation technologies for configurable products'</article-title>
          ,
          <source>AI Magazine</source>
          ,
          <volume>32</volume>
          (
          <issue>3</issue>
          ),
          <fpage>99</fpage>
          -
          <lpage>108</lpage>
          , (Oct.
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          , G. Friedrich,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schubert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mairitsch</surname>
          </string-name>
          , and E. Teppan, '
          <article-title>Plausible repairs for inconsistent requirements'</article-title>
          ,
          <source>in Proceedings of the 21st International Jont Conference on Artifical Intelligence, IJCAI'09</source>
          , p.
          <fpage>791796</fpage>
          , San Francisco, CA, USA, (
          <year>2009</year>
          ). Morgan Kaufmann Publishers Inc.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          , J. Spo¨ cklberger, R. Samer,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stettinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Atas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tiihonen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Raatikainen</surname>
          </string-name>
          , '
          <article-title>Configuring release plans'</article-title>
          ,
          <source>in Proceedings of the 20th Configuration Workshop</source>
          , Graz, Austria,
          <source>September 27-28</source>
          ,
          <year>2018</year>
          ., pp.
          <fpage>9</fpage>
          -
          <lpage>14</lpage>
          , (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Koren</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Volinsky</surname>
          </string-name>
          , '
          <article-title>Collaborative filtering for implicit feedback datasets'</article-title>
          , in 2008 Eighth IEEE International Conference on Data Mining, pp.
          <fpage>263</fpage>
          -
          <lpage>272</lpage>
          , (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lerche</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Zanker</surname>
          </string-name>
          ,
          <source>Recommending Based on Implicit Feedback</source>
          ,
          <fpage>510</fpage>
          -
          <lpage>569</lpage>
          , Springer International Publishing, Cham,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Jawaheer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Szomszor</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Kostkova</surname>
          </string-name>
          , '
          <article-title>Comparison of implicit and explicit feedback from an online music recommendation service'</article-title>
          ,
          <source>in Proceedings of the 1st International Workshop on Information Heterogeneity and Fusion in Recommender Systems, HetRec '10</source>
          , p.
          <fpage>4751</fpage>
          , New York, NY, USA, (
          <year>2010</year>
          ).
          <article-title>Association for Computing Machinery</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>D.</given-names>
            <surname>Oard</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Kim</surname>
          </string-name>
          , '
          <article-title>Implicit feedback for recommender system'</article-title>
          ,
          <source>Proceedings of the AAAI Workshop on Recommender Systems, (07</source>
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Sabin</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Weigel</surname>
          </string-name>
          , '
          <article-title>Product configuration frameworks-a survey'</article-title>
          ,
          <source>IEEE Intelligent Systems</source>
          ,
          <volume>13</volume>
          (
          <issue>4</issue>
          ),
          <fpage>4249</fpage>
          , (
          <year>July 1998</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Salem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hong</surname>
          </string-name>
          , and W. Liu, '
          <article-title>History-guided conversational recommendation'</article-title>
          ,
          <source>in Proceedings of the 23rd International Conference on World Wide Web, WWW '14 Companion</source>
          , p.
          <fpage>9991004</fpage>
          , New York, NY, USA, (
          <year>2014</year>
          ).
          <article-title>Association for Computing Machinery</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Stettinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Tran</surname>
          </string-name>
          , I. Pribik,
          <string-name>
            <given-names>G.</given-names>
            <surname>Leitner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Samer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Atas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Wundara</surname>
          </string-name>
          , 'Knowledgecheckr:
          <article-title>Intelligent techniques for counteracting forgetting'</article-title>
          ,
          <source>in ECAI 2020 - 24th European Conference on Artificial Intelligence, including 10th Conference on Prestigious Applications of Artificial Intelligence, PAIS 2020 - Proceedings, Frontiers in Artificial Intelligence and Applications</source>
          , pp.
          <fpage>3034</fpage>
          -
          <lpage>3039</lpage>
          . IOS Press, (
          <year>August 2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>Tiihonen</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          .
          <source>Towards recommending configurable offerings</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>H.-E. Tseng</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.-C. Chang</surname>
            , and
            <given-names>S.-H.</given-names>
          </string-name>
          <string-name>
            <surname>Chang</surname>
          </string-name>
          , '
          <article-title>Applying casebased reasoning for product configuration in mass customization environments'</article-title>
          ,
          <source>Expert Systems with Applications</source>
          ,
          <volume>29</volume>
          (
          <issue>4</issue>
          ),
          <fpage>913</fpage>
          -
          <lpage>925</lpage>
          , (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>