                               Conversational Recommendations
                                Using Model-based Reasoning
Oliver A. Tazl and Alexander Perko and Franz Wotawa¹


¹ Graz University of Technology, Austria, email: {oliver.tazl, perko, wotawa}@ist.tugraz.at

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract. Chatbots as conversational recommenders have gained increasing importance over the years. The chatbot market offers a variety of applications for research and industry alike. In this paper, we discuss an implementation that supports the use of our recommendation algorithm during chatbot communication. The program eases communication and improves the underlying recommendation flow. In particular, the implementation makes use of our model-based reasoning approach for improving the user experience during a chat, i.e., in cases where user configurations cause inconsistencies. The approach deals with such issues by removing inconsistencies in order to generate a valid recommendation. In addition to the underlying definitions, we demonstrate our implementation along use cases from the tourism domain.

1 INTRODUCTION

Recommender systems aim to lead users in a helpful and individualized way to interesting or useful items drawn from a large space of possible options. Recommender systems may utilize knowledge bases for guiding users through the whole process of finding the right recommendation, i.e., a recommendation that satisfies the user's requirements, needs, or expectations. Most recently, conversational agents like chatbots have gained importance because they offer, at least in principle, a well-known and ideally more intuitive interface for human users, i.e., either textual or speech interaction.

In previous work [17] we introduced the basic foundations and principles behind a chatbot-based recommender system that interacts with users in a smart way, being capable of finding contradictions during the conversation and of efficiently pruning the overall recommendation process by selecting the right questions to ask the user. The basic principles behind our approach rely on classical model-based reasoning. In case the conversation leads to an inconsistent state, e.g., caused by contradictions between user requirements and the recommendation knowledge base, the chatbot is able to react and to resolve the issue. For this purpose, the chatbot asks the user which requirements to retract in order to eliminate the inconsistencies. In the case where the chatbot has far too many solutions to present effectively to the user, the system makes use of an entropy-based approach for selecting those requirements or attributes that have to be fixed in order to reduce the number of possible solutions. When using entropy, the number of steps necessary to reach a solution can be substantially reduced.

This paper is a direct successor of our previous work, in which we report on an implementation of our chatbot approach. In particular, we discuss the implementation details, present experiences gained, and finally introduce the results of an evaluation of the implementation. The evaluation is based on a case study from the tourism domain, i.e., a scenario where a user wants to book a hotel in a certain city. The obtained results show that the proposed chatbot approach is applicable and beneficial for the intended purpose. Furthermore, we gained experience with the limitations of the approach. For example, it seems that entropy is not always the best measure for selecting the questions to be answered, and further research is needed.

The main contributions of this paper can be summarized as follows:

1. An implementation of an algorithm that is based on model-based diagnosis and Shannon's information entropy to solve recommendation problems, and
2. the evaluation of the system with synthetic and real-world data sets.

The remainder of this paper is organized as follows: In the next section we give an overview of our algorithmic approach. Afterwards, we present the implementation of the algorithms and show evaluation results in greater detail. Finally, we discuss related research and conclude the paper.

2 FOUNDATIONS AND ALGORITHM

In our previous work [17], we introduced the algorithm EntRecom, which utilizes model-based diagnosis, in particular the ConDiag algorithm [18], and a method that applies Shannon's information entropy [23]. To be self-contained, we briefly recapitulate the underlying definitions and EntRecom. We first formalize the inconsistent requirements problem by exploiting the concepts of Model-Based Diagnosis (MBD) [1, 20] and constraint solving [2].

The inconsistent requirements problem requires information on the item catalog (i.e., the knowledge base of the recommendation system) and the current customer's requirements. Note that the knowledge base of the recommender may be consistent with the customer's requirements (i.e., the customer's query), in which case an appropriate number of recommendations can be offered. The recommendation system then shows the recommendations to the customer and no further algorithms have to be applied. Otherwise, if no solutions to the recommendation problem are available, the minimal set of requirements that caused the inconsistency with the knowledge base has to be identified and offered to the user as an explanation for not finding any recommendation. The user can in this case adapt, i.e., relax, the requirement(s). Here we borrow an idea from MBD and introduce abnormal modes for the given requirements, i.e., we use Ab predicates stating whether a requirement i should be assumed valid (¬Ab_i) or not (Ab_i) in a particular context. The Ab values for the requirements are set by the model-based diagnosis algorithm so that the assumptions together with the requirements and the knowledge base are consistent. In the following, we define the inconsistent requirements problem and its solutions.
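Concretely, this Ab-guarded encoding can be sketched using Z3's Python bindings, which the implementation described in Section 3 builds on. The hotel attributes, values, and variable names below are purely illustrative assumptions:

```python
from z3 import And, Bool, Implies, Not, Or, Solver, String, StringVal

# Toy item catalog: two hypothetical hotel entries as OR-connected clauses.
stars, city = String("stars"), String("addr:city")
kb = Or(And(city == StringVal("Graz"), stars == StringVal("3")),
        And(city == StringVal("Graz"), stars == StringVal("4")))

# Customer requirements, each guarded by an Ab selector: if Ab_i is false,
# requirement i must hold; if Ab_i is true, the requirement is retracted.
reqs = [city == StringVal("Graz"), stars == StringVal("5")]
ab = [Bool(f"ab_{i}") for i in range(len(reqs))]

s = Solver()
s.add(kb)
for ab_i, req in zip(ab, reqs):
    s.add(Implies(Not(ab_i), req))

print(s.check([Not(a) for a in ab]))   # unsat: all requirements assumed valid
print(s.check([Not(ab[0]), ab[1]]))    # sat: retracting the stars requirement
```

In this toy example, Γ = {stars = "5"} is a cardinality-minimal set of inconsistent requirements in the sense of the definitions that follow.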
More formally, we state the inconsistent requirements problem as follows:

Definition 1 (Inconsistent Requirements Problem). Given a tuple (KB, REQ) where KB denotes the knowledge base of the recommender system, i.e., the item catalog, and REQ denotes the customer requirements. The Inconsistent Requirements Problem arises when KB together with REQ is inconsistent. In this case we are interested in identifying those requirements that are responsible for the inconsistency.

A solution or explanation to the inconsistent requirements problem can be easily formalized using the analogy with the definition of diagnosis from Reiter [20]. We first introduce a modified representation of (KB, REQ) comprising (KB_D, REQ), where KB_D comprises KB together with rules of the form Ab_R for each requirement R in REQ. The solution to the Inconsistent Requirements Problem can now be defined using the modified representation as follows:

Definition 2 (Inconsistent Requirements). Given a modified recommendation model (KB_D, REQ). A subset Γ ⊆ REQ is a valid set of inconsistent requirements iff KB_D ∪ {¬Ab_R | R ∈ REQ \ Γ} ∪ {Ab_R | R ∈ Γ} is satisfiable.

A set of inconsistent requirements Γ is minimal iff no other set of inconsistent requirements Γ′ ⊂ Γ exists. A set of inconsistent requirements Γ is minimal with respect to cardinality iff no other set of inconsistent requirements Γ′ with |Γ′| < |Γ| exists. From here on we assume minimal cardinality sets when using the term minimal sets.

The second problem occurring during a recommendation session is the availability of a too large number of recommendations, which has to be narrowed down to a reasonable size. This too-many-recommendations problem can again be solved using ideas borrowed from model-based diagnosis. In diagnosis, we face the similar problem of obtaining too many diagnoses because too few observations are known. The corresponding problem in the case of recommendation is that we have far too few requirements from the user. Hence, we have to ask the user to add more information in order to reduce the number of available solutions. In model-based diagnosis, Shannon's information entropy [23] is used to come up with the observations, and in our case the requirements, that should be known in order to reduce the recommendations as fast as possible.

Algorithm 1 provides recommendations in the context of chatbots, making use of diagnosis and Shannon's information entropy computation. EntRecom converts the available knowledge into a corresponding constraint model and checks its consistency. If the knowledge is inconsistent, the algorithm tries to find requirements that can be retracted by the user in order to get rid of the inconsistency. Afterwards, EntRecom searches for the best requirement to be set by the user in order to reduce the number of solutions, if necessary. The algorithm stops when reaching a set of recommendations that has a cardinality of less than n.

Algorithm 1 EntRecom(KB_D, REQ, n)
Input: A modified knowledge base KB_D, a set of customer requirements REQ, and the maximum number of recommendations n
Output: All recommendations S
 1: Generate the constraint model CM from KB_D and REQ
 2: Call CSolver(CM) to check consistency and store the result in S
 3: if S = ∅ then
 4:   Call MiREQ(CM, |REQ|) and store the inconsistent requirements in IncReqs
 5:   Call askUser(IncReqs) and store the answer in AdaptedReqs
 6:   CM = KB ∪ (REQ \ IncReqs ∪ AdaptedReqs)
 7:   go to Step 2
 8: end if
 9: while |S| > n do
10:   Call GetBestEntrAttr(A_S) and store the result in a
11:   A_S = A_S \ a
12:   Call askUser(a) and store the answer in v_a
13:   S = R(S, v_a)
14: end while
15: return S

With the provided algorithms, a chatbot for recommendations can be built that is able to deal with inconsistent as well as missing requirements in a more or less straightforward way, making use of previously invented algorithms.
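For illustration, the following is a minimal Python rendering of the Algorithm 1 control flow. All helper functions (build_constraint_model, csolver, mi_req, ask_user_adapt, ask_user_value, attribute_map, get_best_entr_attr) are placeholders for the components described above, not the actual implementation:

```python
def ent_recom(kb_d, reqs, n):
    """Sketch of Algorithm 1: return at most n recommendations."""
    cm = build_constraint_model(kb_d, reqs)    # Step 1
    s = csolver(cm)                            # Step 2: consistent solutions
    while not s:                               # Steps 3-8: repair inconsistency
        inc_reqs = mi_req(cm, len(reqs))       # minimal inconsistent requirements
        adapted = ask_user_adapt(inc_reqs)     # user relaxes or replaces them
        reqs = (reqs - inc_reqs) | adapted
        cm = build_constraint_model(kb_d, reqs)
        s = csolver(cm)
    attrs = attribute_map(s)                   # A_S: attribute -> values in S
    while len(s) > n:                          # Steps 9-14: narrow down S
        a = get_best_entr_attr(attrs)          # attribute with highest entropy
        attrs.pop(a)
        v = ask_user_value(a)                  # user fixes a value for a
        s = [r for r in s if r.get(a) == v]    # restrict S, i.e., R(S, v_a)
    return s                                   # Step 15
```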
3 IMPLEMENTATION AND EVALUATION

For the user, the interaction with the recommender system within the chatbot starts when he or she formulates a query for a search within the domain. The natural language processing framework Rasa (see https://rasa.com/) parses this query and passes it on to the EntRecom algorithm. The recommender algorithm then searches a previously constructed and preprocessed knowledge base. Depending on the satisfiability of the generated constraint model, results are presented to the user. If necessary, the user is asked follow-up questions regarding his or her requirements until we have results to return to the user.

Figure 1: A Chatbot-Based Recommender as Bridge between User Input and Results from a Database

When the query can be answered, EntRecom exits and the user can continue interacting with the Rasa-based chatbot.

As mentioned before, we chose the tourism domain, more specifically searching for a hotel. The process of searching for a hotel in an iterative conversation helps us to see the capabilities and shortcomings of the algorithm.

3.1 Framework and Data Preprocessing

For our tests, we chose the community-curated data from Open Street Maps (OSM). We use the Python bindings for Microsoft's Z3 framework as a constraint solver. For language processing and basic interaction with the user, we utilize the Rasa framework. Rasa is used for natural language understanding by extracting the user's intent and requirements, but it can also respond immediately with an answer (without calling EntRecom) when it detects the user's intent to chitchat.

In a first step, we query the OSM API for hotels in a predefined region. The size of the region is a major factor for performance. Because of this, it is reasonable to let the user select a region on initialization.

Figure 2: Data from Open Street Maps

To construct our internal knowledge base, we process the exported data set in the following way: As some attributes do not add human-readable information or tend to mislead users, we have to filter them out. This is especially necessary because some classes of attributes have a severe impact on later steps in our recommender system. For example, categories like "fixme" (an annotation from an OSM user) would give the user no advantage in a real-world scenario and, in this case, would lead to unwanted recommendations. We use a whitelist filter for the attributes of every entry in the data set. Furthermore, not every entry in the OSM data has the same fields, which is why we maintain a list of all attributes in a data set. If an entry does not include a certain attribute, we add it with a value of False. This leaves us with a "normalized" data set.
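A minimal sketch of this normalization step; the whitelist entries below are hypothetical:

```python
# Hypothetical whitelist; the real filter lists the attributes considered useful.
WHITELIST = {"name", "addr:city", "stars", "cuisine", "smoking",
             "wheelchair", "swimming-pool", "tourism", "amenity"}

def normalize(entries):
    """Whitelist-filter OSM entries and pad missing attributes with False."""
    filtered = [{k: v for k, v in e.items() if k in WHITELIST} for e in entries]
    all_attrs = set().union(*(e.keys() for e in filtered))  # attributes in this set
    for e in filtered:
        for attr in all_attrs:
            e.setdefault(attr, False)  # the entry lacks this attribute
    return filtered
```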
Figure 3: Preprocessed Data Set

After our preprocessing step, every entry in the dictionary has the same number of attributes, and we add uniform clauses to our constraint model. An exemplary normalized clause looks like this: (amenity = "none" ∧ addr:city = "Graz" ∧ name = "Parkhotel Graz" ∧ cuisine = "none" ∧ ... ∧ smoking = "isolated" ∧ wheelchair = "limited" ∧ swimming-pool = "none" ∧ stars = "4" ∧ tourism = "hotel"). Regarding data types, we choose between two different approaches when creating the Z3 constraint model: the first is data type selection for every attribute in the domain; the second is string translation for every value, regardless of the specific class of an attribute. In our implementation, every attribute and value is translated to a Z3 string. While this may result in slower runtimes, we ensure flexibility, as data types may vary across attributes. Usually, the user is asked to select a region of interest; for our tests, we use a data set of a specific size. This is a very critical and, depending on the size of the test set, time-consuming step. Therefore, we try to do it only once, when the user initially specifies an area. This Z3 constraint model, consisting of OR-connected uniform clauses of Z3 strings, is our main knowledge base within EntRecom and shall be referred to as kb, with subsets thereof denoted S, in the remainder of this paper.
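A sketch of how such an OR-connected model over Z3 strings can be generated from the normalized entries; this is a simplified illustration under our own naming, not the verbatim implementation:

```python
from z3 import And, Or, String, StringVal

def build_kb(entries):
    """One Z3 string constant per attribute, one AND clause per entry;
    the knowledge base kb is the OR over all entry clauses."""
    attrs = sorted(entries[0].keys())  # uniform after normalization
    consts = {a: String(a) for a in attrs}
    clauses = [And([consts[a] == StringVal(str(e[a])) for a in attrs])
               for e in entries]
    return Or(clauses), consts
```

A user requirement then becomes an additional equality constraint on top of the returned formula, such as consts["stars"] == StringVal("4").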
The first interaction the user has with our chatbot implementation is handled via the trained natural language understanding (NLU) model within Rasa. If the user's intent is to search within the domain "hotels", our recommender is called via a webhook, and the parameters are passed over. This call to our internal API is the only interaction between EntRecom and the chatbot framework (Rasa, in our case), which leads to very low coupling and a high degree of flexibility. The parameters represent the NLU's interpretation of the user's query and give us our initial set of requirements.
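On the Rasa side, such a hand-over is typically realized as a custom action running on Rasa's action server. The following sketch is our own illustration; the action name, the entity handling, and the ent_recom_api call are assumptions, not the paper's code:

```python
from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher

class ActionSearchHotels(Action):
    def name(self) -> Text:
        return "action_search_hotels"  # hypothetical action name

    def run(self, dispatcher: CollectingDispatcher, tracker: Tracker,
            domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:
        # The NLU's interpretation of the query becomes the initial requirements.
        reqs = {e["entity"]: e["value"]
                for e in tracker.latest_message.get("entities", [])}
        results = ent_recom_api(reqs)  # hypothetical call to the internal API
        dispatcher.utter_message(text=f"Found {len(results)} hotels.")
        return []
```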
3.2 Recommender Algorithm

In this section, we describe the implementation of the previously introduced algorithm EntRecom.

Before the algorithm is ready to use, we have to prepare our data set, generate the constraint model, and interpret the user's intent. Then we enter the EntRecom implementation. As described before, if the query is satisfiable for our knowledge base and the given maximum number of results, we return our recommendations and exit EntRecom: a maximum of n hotels retrieved from the knowledge base is presented to the user, with n representing a preselected maximum number of results.

However, as stated in [17], we are confronted with two potential problems at this point. Given a knowledge base kb, a maximum number of results n, and a user-defined set of requirements REQ:

• We could get too many results to present in a meaningful way. In this case, the function GetBestEntrAttr is called.
• We could get no results at all. In this scenario, we call the function MiREQ.

Figure 4: User Interface. (a) SAT, (b) MiREQ, (c) GetBestEntrAttr

3.2.1 GetBestEntrAttr

This part of the algorithm is called when the query is satisfiable but the number of results does not lie within n. Therefore, we have to add further constraints to our model. Because we want to occupy as little of the user's time as possible, we have to select the additional constraints efficiently. This is done by choosing the category from the domain that best splits the current subset S of the data set. The criterion for our selection is Shannon's information entropy [22]:

H(X) = −Σ_i P(x_i) log(P(x_i))

To apply entropy as a measure, we have to restructure the data slightly. A_S represents this restructured version of S, which maps every attribute in the domain to all values of its occurrences in S. This is realized with Python dictionaries of the form {"attribute1": ["value1", "value2"], "attribute2": ["value1"], ...}. After computing the number of occurrences of every value for a specific attribute, we calculate the entropy for every attribute. The attribute with the highest entropy splits the data set S most effectively. As mentioned above, some attributes may lead to unwanted recommendations. One reason for this is the appearance of seemingly random values when they are viewed out of context. These could be attributes used internally within OSM, for instance. Fields like "source" often have seemingly random values like "survey" or "Kleine Zeitung". The same pseudo-randomness can be observed with attributes like "housenumber", which can take an arbitrary integer value. Such values appear in many data points and, when observed in isolation, do not contribute any information with regard to splitting the data set. When computing the information entropy based on attributes lacking spatial ordering or clustering properties, we are not dividing the data set strategically but randomly. This is why we chose to exclude them from our knowledge base beforehand. If several attributes occur with equal entropies in the data set, we can randomly select one of these categories, as all of them split the data equally well. In the last step, the user is asked to select one value for this category. The user's selection is added to the set of requirements and EntRecom is called again.
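A small sketch of this selection over the restructured dictionary A_S; the helper names are ours:

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy H(X) = -sum_i P(x_i) * log(P(x_i)) of a list of values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())

def get_best_entr_attr(a_s):
    """Return the attribute whose value distribution has maximal entropy,
    i.e., the attribute that best splits the current subset S."""
    return max(a_s, key=lambda attr: entropy(a_s[attr]))

# Example: "stars" is selected, since "tourism" is constant (entropy 0).
print(get_best_entr_attr({"stars": ["3", "4", "4", "5"],
                          "tourism": ["hotel", "hotel", "hotel", "hotel"]}))
```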
3.2.2 MiREQ

This part of the algorithm is called when the query is not satisfiable for our knowledge base KB with the user-defined set of requirements REQ. Following [17], we state the Inconsistent Requirements Problem. As a countermeasure, we have to soften the query to get results. To achieve this, we have to find the inconsistencies in REQ, i.e., a subset Γ thereof. For all sets of inconsistent requirements Γ, KB ∪ {¬Ab_R | R ∈ REQ \ Γ} ∪ {Ab_R | R ∈ Γ} is satisfiable, with Ab being a Boolean variable for selecting and deselecting a requirement R to be considered [18]. We implemented this by checking models with combinatoric variations of subsets of REQ. This means that we evaluate a constraint model consisting of our knowledge base and requirements with varying values for Ab. Given the cardinality of Γ, we iterate over all possible distributions of Ab with |REQ| − |Γ| considered requirements. Because we want to preserve as much of the user's initial query as possible, we want to find a minimal set of inconsistent constraints. As proposed in [17], to find such a minimal set of inconsistencies, we have to obtain a constraint model that is satisfiable for the smallest possible cardinality of Γ. For this, we repeat the process of checking combinatoric variations of subsets with an increasing number of assumed inconsistencies, starting with one. We do this until we reach |REQ|, in which case there are no consistent requirements. When a satisfiable constraint model is found, we are able to retrieve all inconsistent requirements in the form of the unconsidered requirements. With this subset of REQ, we return from MiREQ. After the MiREQ function returns, we ask the user for his or her preference for dropping one of the previously defined requirements within the minimal set of inconsistencies. While the current implementation assures that we find minimal sets of inconsistent requirements, in future implementations we hope to be able to make use of the unsat-core functionality of Z3 for this task. This is expected to significantly improve performance.
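A sketch of this cardinality-increasing search, reusing the Ab-guarded encoding from Section 2; function and variable names are ours:

```python
from itertools import combinations
from z3 import Solver, Not, sat

def mi_req(kb, guarded_reqs, ab):
    """Return the indices of a cardinality-minimal set of inconsistent
    requirements; guarded_reqs[i] is Implies(Not(ab[i]), req_i)."""
    n = len(ab)
    for k in range(1, n + 1):                  # assume k requirements inconsistent
        for gamma in combinations(range(n), k):
            s = Solver()
            s.add(kb, *guarded_reqs)
            # Requirements outside gamma must hold; those in gamma are retracted.
            assumptions = [ab[i] if i in gamma else Not(ab[i]) for i in range(n)]
            if s.check(assumptions) == sat:
                return list(gamma)
    return list(range(n))                      # no consistent subset exists
```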
3.3 Evaluation

We developed our tests concentrating on the algorithm itself and the data structures needed for its execution. The data preprocessing was not the focus of these tests. For our tests with synthetic test data this is especially true, as they use an adapted version of EntRecom without user interaction, as depicted in Figure 5. The goal of our experiments is to show potential for optimizations of the implementation, as well as to prove the versatility of the algorithm proposed in [17].

In our tests, we used both synthetic and real-world data. For benchmarking and basic experiments, we mainly used synthesized data, while real-world data, specifically from Open Street Maps, is important to ensure the flexibility and robustness of the implementation in real-world scenarios.

Figure 5: Focus of Tests with Synthetic Data
3.3.1 Real-World Test Data

As described before, we use data from Open Street Maps for our tests. First, we want to observe how the algorithm copes with real-world data and which degree of data preprocessing and filtering is necessary. Furthermore, we developed the rules for generating our synthetic test data based on the exported data from OSM.

For this part of our evaluation, we tested the algorithm empirically for usable results and also ran automated tests for exhaustive testing on the data.

3.3.2 Synthetic Test Data

Our modus operandi for generating and conducting tests with synthetic data can be described as follows:

1.) Definition of attribute classes and generation of data sets
2.) Definition of requirement sets and classification of them within a testing matrix
3.) Performing speed tests following the testing matrix

To evaluate our results for the performance test, we order tests within a three-dimensional testing matrix. The first axis of this matrix represents the size of the test set, the second axis represents the number of attributes, and the last axis represents the number of allowed results n. The value at every position in the matrix represents the sub-function called within our recommender algorithm.
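The matrix can be pictured as a simple mapping; the axis values and the helper functions below are our own illustrative assumptions:

```python
# Sketch of the three-dimensional testing matrix: (test-set size, number of
# attributes, allowed results n) -> sub-function a classification run ends in.
matrix = {}
for size in (100, 1_000, 10_000):      # axis 1: size of the test set (assumed)
    for n_attrs in (5, 10, 20):        # axis 2: number of attributes (assumed)
        for n in (1, 5, 10):           # axis 3: allowed results n (assumed)
            reqs = sample_requirements(n_attrs)  # hypothetical helper
            # classify returns 'SAT', 'GetBestEntrAttr', or 'MiREQ'
            matrix[(size, n_attrs, n)] = classify(reqs, size, n)
```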
We randomly generated data sets of different sizes and with different numbers of attributes for our performance tests. Because data within the domains "hotels" and "cars" have several distinct properties, we set up our test data following certain rules. To achieve this, we categorized the attributes depending on the data type of their corresponding values. Furthermore, we added complementary classes based on certain characteristics of the domain.

This results in the following basic classes of attributes based on their data type:

a) Boolean values
b) Integer values
c) Float values
d) String values

Additionally, we added the following supplementary classes:

e) Often reappearing parts in string values (e.g. names)
f) Frequently recurring string values (e.g. city, manufacturer)
g) Restricted numbers (e.g. star-rating, number of seats)
h) Dates (e.g. registration date)
i) Ranges (e.g. distance-to-the-center, price-category)

ad a) Boolean values appear often in real-world data and are easy to process. This class represents attributes like internet-access or payment-credit-card in the tourism domain and esp or abs in the automotive domain.
ad b) Arbitrary integer values often appear as counter variables like visitors, likes, or 5-star-reviews.
ad c) Floating-point numbers occur in both domains, tourism and cars, in various forms. This class covers all attributes in the context of distance, like distance-to-the-center for a hotel or mileage for a car. It also stands for location attributes given in coordinates (longitude, latitude) and mean values like user-rating.
ad d) Strings are very versatile and therefore used in many ways in different sources. In many cases, strings contain informational value beyond what a number may cover. But strings are often used in places where the informative content is no higher than that of number-valued attributes and could be represented with Boolean values or numbers instead. This is true for several classes of attributes, like stars with values such as "4-star".
ad e) The name attribute is always a string and does not have to, but is likely to, include a domain-specific term in its value. For hotels, this would be "Hotel" or a synonym thereof. Because of these recurring terms and their omnipresence across all domain-specific data sources, we treat this attribute separately from the more general string class. When generating data sets, we introduce these terms into our samples regularly.
ad f) Fields like city or manufacturer contain frequently reappearing values, which makes them special regarding information entropy and therefore interesting to us.
ad g) Stars of a hotel can be represented with 1 to 5, which makes a reappearance in this category very probable.
ad h) As we want to simplify the selection process, we reduce dates to the year. This results in an integer value often constrained by an upper boundary, the current year, and a case-dependent lower boundary. Depending on the specific attribute, these values may recur frequently.
ad i) Ranges are special because they consist not of one but of two boundary values. For simplification reasons, we categorize ranges within the relative classes low, med, and high. Of course, these range classes are also very likely to recur.

Our synthetic data is randomly generated with respect to these classes of attributes and with varying occurrences of the different types.
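For illustration, a compact generator sketch that samples entries with respect to these attribute classes; all value pools and distributions here are our own assumptions:

```python
import random

CITIES = ["Graz", "Vienna", "Linz"]  # class f): frequently recurring strings
RANGES = ["low", "med", "high"]      # class i): relative range classes

def synth_entry(i):
    """One synthetic hotel entry covering the attribute classes a)-i)."""
    return {
        "internet-access": random.choice([True, False]),    # a) Boolean
        "visitors": random.randint(0, 10_000),               # b) integer counter
        "user-rating": round(random.uniform(1.0, 5.0), 1),   # c) float
        "description": f"text-{random.randint(0, 999)}",     # d) generic string
        "name": f"Hotel Sample {i}",                          # e) recurring term
        "addr:city": random.choice(CITIES),                   # f) recurring value
        "stars": random.randint(1, 5),                        # g) restricted number
        "built": random.randint(1900, 2019),                  # h) date as year
        "price-category": random.choice(RANGES),              # i) range class
    }

data_set = [synth_entry(i) for i in range(1000)]
```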
3.4 Results

The results are split into two parts. First, we discuss the results on the synthetic data, followed by the proof-of-concept results with real-world data.

Using the synthetic data, we define sets of requirements to test the algorithm on. These sets belong to one of the following classes, relative to the knowledge base (i.e., the test data) and the given n they are tested with:

i) Satisfiable with n or fewer results. This leads to a direct return from the EntRecom algorithm, presenting us the results S the constraint solver found.
ii) Satisfiable with more than n results. In this case, we have to call GetBestEntrAttr, which calculates the entropies for all attributes. Another interaction with the algorithm is needed, as we have to select a value for the attribute with the highest informational content.
iii) Not satisfiable. In this case, we have to call MiREQ, which iteratively chooses subsets of REQ until it finds the largest satisfiable subset. Its complement is the set of inconsistent requirements δs. Again, another interaction through AskUser is necessary: now we have to select a requirement out of δs that we want to keep in REQ; the other constraints are dropped.

To classify the requirement sets for every test set according to the three classes from above, we use an adapted version of EntRecom, which does not return results or perform any recursive calls. The purpose of this classification is to be able to identify test results with regard to the sub-functions called within EntRecom. The class of REQ (either i, ii, or iii) for a certain test configuration represents the value of the three-dimensional testing matrix.

After classification, we perform our tests on the generated data. A typical test scenario starts with the user's first interaction with the chatbot environment, which, in turn, results in setting up the knowledge base. Regularly, the user is asked to select a region of interest; for our tests, we use a data set of a specific size. Then a set of requirements and a maximum for the expected results are chosen. We now perform a test for every class in the testing matrix and obtain results in the form of execution times.

For a fixed n of 5 and tests performed with one to five requirements in the sets, we plot the function calls of MiREQ and GetBestEntrAttr on increasingly sized test sets. In case we obtain a result for our query that lies within n, our constraint solver CSolver is the only function called.

Figure 6: Classification Based on Function Calls within EntRecom

As can be seen, the function calls of MiREQ decrease, while the calls of GetBestEntrAttr increase, for larger test sets. While this basic trend also holds for real data, the impact is not as drastic, as users tend to perform queries with fewer than five requirements initially.

The algorithm was also used with real-world data. For this, we inserted the data from OSM into the knowledge base. We interacted with the textual, chat-like web interface in the intended way and got the correct results from the algorithm. The algorithm returned a result set within a few iterations, and these results fit the user's specification.

These experiments also show that there is a problem of attribute relevance. The attributes that have a high entropy, and thus reduce the result set, are not necessarily relevant for users. Our tests show, e.g., that the attribute wheelchair, which is part of accessibility, has a high entropy but will not be interesting for a large share of users. This issue has to be addressed in an upcoming version of the algorithm.

4 RELATED WORK

The application of model-based reasoning, and especially of model-based diagnosis, in the field of recommender systems is not novel. For example, papers like [4, 11, 19] compute the minimal sets of faulty requirements, i.e., the requirements that should be changed in order to find a solution. In these papers, the authors rely on the existence of minimal conflict sets for computing the diagnoses of inconsistent requirements. Felfernig et al. [4] present an algorithm that calculates personalized repairs for inconsistent requirements. The algorithm combines concepts of MBD with a collaborative problem-solving approach to improve the quality of repairs in terms of prediction accuracy. In [19], the concept of representative explanations is introduced. This concept follows the idea of generating diversity in alternative diagnoses; informally, constraints that occur in conflicts should also be included in the diagnoses presented to the user. Jannach [11] proposes to determine preferred conflicts "on demand" and to use a general-purpose, fast conflict detection algorithm for this task, instead of computing all minimal conflicts within the user requirements in advance.

Papers that deal with the integration of diagnosis and constraint solving are [3] and [24, 25], which propose a diagnosis algorithm for tree-structured models. The approach is generally applicable due to the fact that all general constraint models can be converted into an equivalent tree-structured model using decomposition methods, e.g., hypertree decomposition [7, 8]. [26] provides more details regarding the coupling of decomposition methods and the diagnosis algorithms for tree-structured models. In addition, [21] generalized the algorithms of [3] and [24]. In [15] the authors also propose the use of constraints for diagnosis, where conflicts are used to drive the computation. In [6], which is maybe the earliest work describing the use of constraints for diagnosis, the authors introduce the use of constraints for computing conflicts under the correctness assumptions. For this purpose they developed the concept of constraint propagation. Despite the fact that all of these algorithms use constraints for modeling, they mainly focus on the integration of constraint solving for conflict generation, which is different from our approach. For the presentation of recommendation tasks as constraint satisfaction problems, we refer to [12].

Human-chatbot communication represents a broad domain. It covers technical aspects as well as psychological and human perspectives. Contributions like [9, 30] show several ways of implementing chatbots in different domains. Wallace [30] demonstrates an artificial intelligence robot based on a natural language interface (A.L.I.C.E.) that extends ELIZA [31], which is based on an experiment of Alan M. Turing in 1950 [29]. This work describes how to create a robot personality using AIML, an artificial intelligence modelling language, to pretend intelligence and self-awareness.

Sun et al. [27] introduced a conversational recommendation system based on unsupervised learning techniques. The bot was trained on successful order conversations between users and real human agents.

Papers like [5, 10, 13, 32] address the topics of user acceptance and experience. In [32] a pre-study shows that users infer the authenticity of a chat agent from two different categories of cues: agent-related cues and conversation-related cues. To get an optimal conversational result, the bot should provide a human-like interaction. Questions of conversational UX design raised by [5] and [16] demonstrate the need to rethink user interaction altogether.

The topic of recommender systems with conversational interfaces is addressed in [14], where an adaptive recommendation strategy was
learned using reinforcement learning methods. In [28], the authors proposed a deep reinforcement learning framework to build personalized conversational recommendation agents. In this work, a recommendation model trained from conversational sessions and rankings is also presented.

5 CONCLUSION AND FUTURE WORK

In this paper, we presented an implementation and an evaluation of EntRecom, an algorithm using model-based diagnosis and Shannon's information entropy. In our tests, we used both synthetic and real-world data. For benchmarking and basic experiments, we mainly used synthesized data, while real-world data, specifically from Open Street Maps, is important to ensure the flexibility and robustness of the implementation in real-world scenarios. We also showed the performance for different data sets and revealed open issues, like the relevance of the chosen attributes, which is already a starting point for future work. Another important step will be a user study to evaluate the acceptance of the algorithm.

ACKNOWLEDGEMENTS

Research presented in this paper was carried out as part of the AS-IT-IC project that is co-financed by the Cooperation Programme Interreg V-A Slovenia-Austria 2014-2020, European Union, European Regional Development Fund.

REFERENCES

[1] Johan de Kleer and Brian C. Williams, 'Diagnosing multiple faults', Artificial Intelligence, 32(1), 97–130, (1987).
[2] Rina Dechter, Constraint Processing, Morgan Kaufmann, 2003.
[3] Yousri El Fattah and Rina Dechter, 'Diagnosing tree-decomposable circuits', in Proceedings 14th International Joint Conf. on Artificial Intelligence, pp. 1742–1748, (1995).
[4] Alexander Felfernig, Gerhard Friedrich, Monika Schubert, Monika Mandl, Markus Mairitsch, and Erich Teppan, 'Plausible repairs for inconsistent requirements', in IJCAI International Joint Conference on Artificial Intelligence, pp. 791–796, (January 2009).
[5] Asbjørn Følstad and Petter Bae Brandtzæg, 'Chatbots and the new world of HCI', interactions, 24(4), 38–42, (June 2017).
[6] Hector Geffner and Judea Pearl, 'An Improved Constraint-Propagation Algorithm for Diagnosis', in Proceedings 10th International Joint Conf. on Artificial Intelligence, pp. 1105–1111, (1987).
[7] Georg Gottlob, Nicola Leone, and Francesco Scarcello, 'Hypertree Decomposition and Tractable Queries', in Proc. 18th ACM SIGACT SIGMOD SIGART Symposium on Principles of Database Systems (PODS-99), pp. 21–32, Philadelphia, PA, (1999).
[8] Georg Gottlob, Nicola Leone, and Francesco Scarcello, 'A comparison of structural CSP decomposition methods', Artificial Intelligence, 124(2), 243–282, (December 2000).
[9] B. Graf, M. Krüger, F. Müller, A. Ruhland, and A. Zech, 'Nombot - simplify food tracking', volume 30-November-2015, pp. 360–363, (2015).
[10] Jennifer Hill, W. Randolph Ford, and Ingrid G. Farreras, 'Real conversations with artificial intelligence: A comparison between human-human online conversations and human-chatbot conversations', Computers in Human Behavior, 49, 245–250, (2015).
[11] Dietmar Jannach, 'Finding preferred query relaxations in content-based recommenders', IEEE Intelligent Systems, 109, 81–97, (April 2008).
[12] Dietmar Jannach, Markus Zanker, and Matthias Fuchs, 'Constraint-based recommendation in tourism: A multiperspective case study', Journal of IT and Tourism, 11, 139–155, (2009).
[13] A. Khanna, M. Jain, T. Kumar, D. Singh, B. Pandey, and V. Jha, 'Anatomy and utilities of an artificial intelligence conversational entity', pp. 594–597, (2016).
[14] Tariq Mahmood, Francesco Ricci, and Adriano Venturini, 'Learning adaptive recommendation strategies for online travel planning', Information and Communication Technologies in Tourism 2009, 149–160, (2009).
[15] Jakob Mauss and Martin Sachenbacher, 'Conflict-driven diagnosis using relational aggregations', in Working Papers of the 10th International Workshop on Principles of Diagnosis (DX-99), Loch Awe, Scotland, (1999).
[16] R.J. Moore, R. Arar, G.-J. Ren, and M.H. Szymanski, 'Conversational UX design', volume Part F127655, pp. 492–497, (2017).
[17] Iulia Nica, Oliver A. Tazl, and Franz Wotawa, 'Chatbot-based tourist recommendations using model-based reasoning', in ConfWS, (2018).
[18] Iulia Nica and Franz Wotawa, 'ConDiag - computing minimal diagnoses using a constraint solver', (2012). International Workshop on Principles of Diagnosis; conference date: 31-07-2012 through 03-08-2012.
[19] Barry O'Sullivan, Alexandre Papadopoulos, Boi Faltings, and Pearl Pu, 'Representative explanations for over-constrained problems', 1, (July 2007).
[20] Raymond Reiter, 'A theory of diagnosis from first principles', Artificial Intelligence, 32(1), 57–95, (1987).
[21] Martin Sachenbacher and Brian C. Williams, 'Diagnosis as semiring-based constraint optimization', in European Conference on Artificial Intelligence, pp. 873–877, (2004).
[22] C. E. Shannon, 'A Mathematical Theory of Communication', The Bell System Technical Journal, 27(3), 379–423, (1948).
[23] C. E. Shannon, 'A mathematical theory of communication', Bell System Technical Journal, 27, 379–623, (1948).
[24] Markus Stumptner and Franz Wotawa, 'Diagnosing Tree-Structured Systems', in Proceedings 15th International Joint Conf. on Artificial Intelligence, Nagoya, Japan, (1997).
[25] Markus Stumptner and Franz Wotawa, 'Diagnosing tree-structured systems', Artificial Intelligence, 127(1), 1–29, (2001).
[26] Markus Stumptner and Franz Wotawa, 'Coupling CSP decomposition methods and diagnosis algorithms for tree-structured systems', in Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI-03), pp. 388–393, Acapulco, Mexico, (2003).
[27] Y. Sun, Y. Zhang, Y. Chen, and R. Jin, 'Conversational recommendation system with unsupervised learning', pp. 397–398, (2016).
[28] Yueming Sun and Yi Zhang, 'Conversational recommender system', in The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR '18, pp. 235–244, New York, NY, USA, (2018). ACM.
[29] Alan M. Turing, Computing Machinery and Intelligence, 23–65, Springer Netherlands, Dordrecht, 2009.
[30] R.S. Wallace, The anatomy of A.L.I.C.E., 2009.
[31] J. Weizenbaum, 'ELIZA - a computer program for the study of natural language communication between man and machine', Communications of the ACM, 9(1), 36–45, (1966).
[32] N.V. Wünderlich and S. Paluch, 'A nice and friendly chat with a bot: User perceptions of AI-based service agents', (2018).