=Paper=
{{Paper
|id=Vol-2467/paper-03
|storemode=property
|title=Conversational Recommendations Utilizing Model-based Reasoning
|pdfUrl=https://ceur-ws.org/Vol-2467/paper-03.pdf
|volume=Vol-2467
|authors=Oliver Tazl,Alexander Perko,Franz Wotawa
|dblpUrl=https://dblp.org/rec/conf/confws/TazlPW19
}}
==Conversational Recommendations Utilizing Model-based Reasoning==
Conversational Recommendations Using Model-based Reasoning

Oliver A. Tazl, Alexander Perko and Franz Wotawa (Graz University of Technology, Austria, email: {oliver.tazl, perko, wotawa}@ist.tugraz.at)

Abstract. Chatbots as conversational recommenders have gained increasing importance over the years. The chatbot market offers a variety of applications for research and industry alike. In this paper, we discuss an implementation that supports the use of our recommendation algorithm during chatbot communication. The program eases communication and improves the underlying recommendation flow. In particular, the implementation makes use of our model-based reasoning approach for improving user experience during a chat, i.e., in cases where user configurations cause inconsistencies. The approach deals with such issues by removing inconsistencies in order to generate a valid recommendation. In addition to the underlying definitions, we demonstrate our implementation along use cases from the tourism domain.

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 INTRODUCTION

Recommender systems aim to lead users in a helpful and individualized way to interesting or useful items drawn from a large space of possible options. Recommender systems may utilize knowledge-bases for guiding the users through the whole process of finding the right recommendation, i.e., a recommendation that satisfies the user's requirements, needs, or expectations. Most recently, conversational agents like chatbots have gained importance because of the fact that they – in principle – offer a well-known and ideally more intuitive interface for human users, i.e., either textual or speech interaction.

In previous work [17] we introduced the basic foundations and principles behind a chatbot-based recommender system that allows to interact with users in a smart way, being capable of finding contradictions during observation and efficiently pruning the overall recommendation process via selecting the right questions to be asked to the user. The basic principles behind our approach rely on classical model-based reasoning. In case the conversation leads to an inconsistent state, e.g., caused by contradictions between user requirements and the recommendation knowledge-base, the chatbot is able to react and to resolve this issue. For this purpose, the chatbot asks the user which requirements to retract in order to eliminate inconsistencies. In the case where the chatbot has far too many solutions to be presented effectively to the user, the system makes use of an entropy-based approach for selecting those requirements or attributes that have to be fixed in order to reduce the number of possible solutions. When using entropy, the number of steps necessary to reach a solution can be substantially reduced.

This paper is a direct successor of our previous work: we report on an implementation of our chatbot approach. In particular, we discuss the implementation details, present experiences gained, and finally introduce the results of an evaluation of the implementation. The evaluation is based on a case study from the tourism domain, i.e., a scenario where a user wants to book a hotel in a certain city. The obtained results show that the proposed chatbot approach is applicable and beneficial for the intended purpose. Furthermore, we gained experience about the limitations of the approach. For example, it seems that entropy is not always the best measure for selecting questions to be answered, and further research is needed.

The main contributions of this paper can be summarized as follows:

1. An implementation of an algorithm that is based on model-based diagnosis and Shannon's information entropy to solve recommendation problems and
2. the evaluation of the system with synthetic and real-world data sets.

The remainder of this paper is organized as follows: In the next section we give an overview of our algorithmic approach. Afterwards, we present the implementation of the algorithms and show the evaluation results in greater detail. Finally, we discuss related research and conclude the paper.

2 FOUNDATIONS AND ALGORITHM

In our previous work [17], we introduced the algorithm EntRecom, which builds on model-based diagnosis, in particular the ConDiag algorithm [18], and on a method that applies Shannon's information entropy [23]. To be self-contained, we briefly recapitulate the underlying definitions and EntRecom. We first formalize the inconsistent requirements problem by exploiting the concepts of Model-Based Diagnosis (MBD) [1, 20] and constraint solving [2].

The inconsistent requirements problem requires information on the item catalog (i.e., the knowledge-base of the recommendation system) and the current customer's requirements. Note that the knowledge-base of the recommender may be consistent with the customer's requirements (i.e., the customer's query), and an appropriate number of recommendations can be offered. In this case, the recommendation system shows the recommendations to the customer and no further algorithms have to be applied. Otherwise, if no solutions to the recommendation problem are available, then the minimal set of requirements which determined the inconsistency with the knowledge base has to be identified and consequently offered to the user as an explanation for not finding any recommendation. The user can in this case adapt the requirement(s) (relax it/them). Here we borrow the idea from MBD and introduce abnormal modes for the given requirements, i.e., we use Ab predicates stating whether a requirement i should be assumed valid (¬Ab_i) or not (Ab_i) in a particular context. The Ab values for the requirements are set by the model-based diagnosis algorithm so that the assumptions together with the requirements and the knowledge-base are consistent. In the following, we define the inconsistent requirements problem and its solutions.

More formally, we state the inconsistent requirements problem as follows:

Definition 1 (Inconsistent Requirements Problem). Given a tuple (KB, REQ) where KB denotes the knowledge base of the recommender system, i.e., the item catalog, and REQ denotes the customer requirements. The Inconsistent Requirements Problem arises when KB together with REQ is inconsistent. In this case we are interested in identifying those requirements that are responsible for the inconsistency.

A solution or explanation to the inconsistent requirements problem can be easily formalized using the analogy with the definition of diagnosis from Reiter [20]. We first introduce a modified representation of (KB, REQ) comprising (KB_D, REQ) where KB_D comprises KB together with rules of the form Ab_R for each requirement R in REQ. The solution to the Inconsistent Requirements Problem can now be defined using the modified representation as follows:

Definition 2 (Inconsistent Requirements). Given a modified recommendation model (KB_D, REQ). A subset Γ ⊆ REQ is a valid set of inconsistent requirements iff KB_D ∪ {¬Ab_R | R ∈ REQ \ Γ} ∪ {Ab_R | R ∈ Γ} is satisfiable.

A set of inconsistent requirements Γ is minimal iff no other set of inconsistent requirements Γ′ ⊂ Γ exists. A set of inconsistent requirements Γ is minimal with respect to cardinality iff no other set of inconsistent requirements Γ′ with |Γ′| < |Γ| exists. From here on we assume minimal cardinality sets when using the term minimal sets.

The second problem occurring during a recommendation session is the availability of a too large number of recommendations, which has to be narrowed down to a reasonable number. The too-many-recommendations problem can be solved again using ideas borrowed from model-based diagnosis. In diagnosis, we face the similar problem of coming up with too many diagnoses because too few observations are known. The corresponding problem in the case of recommendation is that we have far too few requirements from the user. Hence, we have to ask the user to add more information in order to reduce the number of available solutions. In model-based diagnosis, Shannon's information entropy [23] is used to come up with observations, and in our case requirements, that should be known in order to reduce the recommendations as fast as possible.

Algorithm 1 provides recommendations in the context of chatbots, making use of diagnosis and Shannon's information entropy computation. EntRecom converts the available knowledge into a corresponding constraint model and checks its consistency. If the knowledge is inconsistent, the algorithm tries to find requirements that can be retracted by the user in order to get rid of the inconsistency. Afterwards, EntRecom searches for the best requirement to be set by the user in order to reduce the number of solutions if necessary. The algorithm stops when reaching a set of recommendations that has a cardinality of less than n.

Algorithm 1 EntRecom(KB_D, REQ, n)
Input: A modified knowledge base KB_D, a set of customer requirements REQ and the maximum number of recommendations n
Output: All recommendations S
1: Generate the constraint model CM from KB_D and REQ
2: Call CSolver(CM) to check consistency and store the result in S
3: if S = ∅ then
4:   Call MiREQ(CM, |REQ|) and store the inconsistent requirements in IncReqs
5:   Call askUser(IncReqs) and store the answer in AdaptedReqs
6:   CM = KB ∪ (REQ \ IncReqs ∪ AdaptedReqs)
7:   go to Step 2
8: end if
9: while |S| > n do
10:   Call GetBestEntrAttr(AS) and store the result in a
11:   AS = AS \ a
12:   Call askUser(a) and store the answer in v_a
13:   S = R(S, v_a)
14: end while
15: return S

With the provided algorithms, a chatbot for recommendations can be built that is able to deal with inconsistent requirements as well as missing requirements in a more or less straightforward way, making use of previously invented algorithms.

3 IMPLEMENTATION AND EVALUATION

For the user, the interaction with the recommender system within the chatbot starts when he or she formulates a query for a search within the domain. The natural language processing framework Rasa (see https://rasa.com/) parses this query and passes it on to the EntRecom algorithm. The recommender algorithm then searches a previously constructed and preprocessed knowledge base. Depending on the satisfiability of the generated constraint model, results are presented to the user. If necessary, the user is asked follow-up questions regarding his or her requirements until we have results to return to the user.

Figure 1: A Chatbot-Based Recommender as Bridge between User Input and Results from a Database

When the query can be answered, EntRecom exits and the user can continue interacting with the Rasa-based chatbot.

As mentioned before, we chose the tourism domain, searching for a hotel to be more specific. The process of searching for a hotel in an iterative conversation helps us to see the capabilities and shortcomings of the algorithm.

3.1 Framework and Data Preprocessing

For our tests, we chose the community-curated data from Open Street Maps (OSM). We use the Python bindings for Microsoft's Z3 framework as a constraint solver. For language processing and basic interaction with the user, we utilize the Rasa framework. Rasa is used for natural language understanding by extracting the users' intent and requirements, but can also respond immediately with an answer (without calling EntRecom) when it detects the user's intent to chitchat.

In a first step, we query the OSM API for hotels in a predefined region. The size of the region is a major factor for performance. Because of this, it is reasonable to let the user select a region on initialization.

The first interaction the user has with our chatbot implementation is handled via the trained natural language understanding (NLU) model within Rasa. If the user's intent is to search within the domain "hotels", our recommender is called via a webhook, and the parameters are passed over. This call to our internal API is the only interaction between EntRecom and the chatbot framework, Rasa in our case, which leads to very low coupling and a high degree of flexibility. The parameters represent the NLU's interpretation of the user's query and give us our initial set of requirements.

Figure 2: Data from Open Street Maps

To construct our internal knowledge base, we process the exported data set in the following way: As some attributes do not add human-readable information or tend to mislead users, we have to filter them out. This is especially necessary because some classes of attributes have a severe impact on later steps in our recommender system. For example, categories like "fixme" (an annotation from an OSM user) would give the user no advantage in a real-world scenario and, in this case, would lead to unwanted recommendations. We use a whitelist filter for the attributes of every entry in the data set. Furthermore, not every entry in the OSM data has the same fields, which is why we maintain a list of all attributes in a data set. If an entry does not include a certain attribute, we add it with a value of False. This leaves us with a "normalized" data set.

Figure 3: Preprocessed Data Set

After our preprocessing step, every entry in the dictionary has the same number of attributes, and we add uniform clauses to our constraint model. An exemplary normalized clause looks like this: (amenity = "none" ∧ addr:city = "Graz" ∧ name = "Parkhotel Graz" ∧ cuisine = "none" ∧ ... ∧ smoking = "isolated" ∧ wheelchair = "limited" ∧ swimming-pool = "none" ∧ stars = "4" ∧ tourism = "hotel").

Regarding data types, we choose between two different approaches when creating the Z3 constraint model: the first one being data type selection for every attribute in the domain and the second one being string translation for every value, regardless of the specific class of an attribute. In our implementation, every attribute and value is translated to a Z3 string. While this may result in slower runtimes, we ensure flexibility, as data types may vary across attributes. Usually, the user is asked to select a region of interest. For our tests, we use a data set of a specific size. This is a very critical and, depending on the size of the test set, time-consuming step. Therefore, we try to do it only once, when the user initially specifies an area. This Z3 constraint model, consisting of OR-connected uniform clauses of Z3 strings, is our main knowledge base within EntRecom and shall be referred to as kb, and subsets thereof as S, in the remainder of this paper.

3.2 Recommender Algorithm

In this section, we describe the implementation of the previously introduced algorithm EntRecom. Before the algorithm is ready to use, we have to prepare our data set, generate the constraint model, and interpret the user's intent. Then we enter the EntRecom implementation. As described before, if the query is satisfiable for our knowledge base and the given maximum number of results, we return our recommendations and exit EntRecom. A maximum of n hotels, retrieved from the knowledge base, is presented to the user, with n representing a preselected maximum number of results.

Though, as stated in [17], we are confronted with two potential problems at this point. Given a knowledge base kb, a maximum number of results n and a user-defined set of requirements REQ:

• We could get too many results to present in a meaningful way. In this case, the function GetBestEntrAttr is called.
• We could get no results at all. In this scenario, we call the function MiREQ.

Figure 4: User Interface ((a) SAT, (b) MiREQ, (c) GetBestEntrAttr)

3.2.1 GetBestEntrAttr

This part of the algorithm is called when the query is satisfiable, but the result does not lie within [n]. Therefore, we have to add further constraints to our model. Because we want to occupy as little of the user's time as possible, we have to efficiently select additional constraints. This is done by choosing a category out of the domain which best splits the current subset S of the data set. The criterion for our selection is Shannon's information entropy [22]:

H(X) = − Σ_i P(x_i) log(P(x_i))

To apply entropy as a measure, we have to restructure the data slightly. AS represents this restructured version of S, which maps every attribute in the domain to all values of its occurrences in S. This is realized with Python dictionaries of the form: {"attribute1": ["value1", "value2"], "attribute2": ["value1"], ...}. After computing the number of occurrences of every value for a specific attribute, we now calculate the entropy for every attribute. The attribute with the highest entropy splits the data set S most effectively.

As mentioned above, some attributes may lead to unwanted recommendations. One reason for this is the appearance of seemingly random values when they are viewed without context. These could be attributes internally used within OSM, for instance. Fields like "source" often have seemingly random values like "survey" or "Kleine Zeitung". The same pseudo-randomness can be observed with attributes like "housenumber", which can take an arbitrary integer value. Those values appear in many data points and, when observed in isolation, do not contribute any information with regard to splitting the data set. When computing the information entropy based on attributes lacking spatial ordering or clustering properties, we are not dividing the data set strategically but randomly. This is why we chose to exclude them from our knowledge base beforehand.

If several attributes occur with equal entropies in the data set, we can randomly select one of these categories, as all of them split the data equally well. In the last step, the user is asked to select one value for this category. The user's selection is added to the set of requirements and EntRecom gets called again.

3.2.2 MiREQ

This part of the algorithm is called when the query is not satisfiable for our knowledge base KB with the user-defined set of requirements REQ. Following [17], we state the Inconsistent Requirements Problem. As a countermeasure to the Inconsistent Requirements Problem, we have to soften the query to get results. To achieve this, we have to find the inconsistencies in REQ, being a subset Γ thereof. For all inconsistent subsets Γ, KB ∪ {¬Ab_R | R ∈ REQ \ Γ} ∪ {Ab_R | R ∈ Γ} is satisfiable, with Ab being a Boolean variable for selecting and deselecting a requirement R to be considered [18]. We implemented this by checking models with combinatoric variations of subsets of REQ. This means that we evaluate a constraint model consisting of our knowledge base and requirements with varying values for Ab. Given the cardinality of Γ, we iterate over all possible distributions of Ab with |REQ| − |Γ| considered requirements. Because we want to preserve as much of the user's initial query as possible, we want to find a minimal set of inconsistent constraints. As proposed in [17], to find such a minimal set of inconsistencies, we have to obtain a constraint model that is satisfiable for the smallest cardinality of Γ possible. For this, we repeat the process of checking combinatoric variations of subsets, with increasing numbers of assumed inconsistencies, starting with one. This we do until we reach |REQ|, in which case there are no consistent requirements. When a satisfiable constraint model is found, we are able to retrieve all inconsistent requirements in the form of all unconsidered requirements. With this subset of REQ, we return from MiREQ. After the MiREQ function returns, we ask the user for his or her preference for dropping one of his or her previously defined requirements within the minimal set of inconsistencies.

While the current implementation assures us to find minimal sets of inconsistent requirements, in future implementations we hope to be able to make use of the unsat-core functionality of Z3 for this task. This is expected to significantly improve performance.

3.3 Evaluation

We developed our tests concentrating on the algorithm itself and the data structures needed for execution. The data preprocessing was not in the focus of this test. For our tests with synthetic test data this is especially true, as they use an adapted version of EntRecom without user interaction, as depicted in Figure 5. The goal of our experiments is to show potentials for optimizations of the implementation, as well as proving the versatility of the algorithm proposed in [17].

Figure 5: Focus of Tests with Synthetic Data

In our tests, we used both synthetic and real-world data. For benchmarking and basic experiments, we mainly used synthesized data, while real-world data, specifically from Open Street Maps, is important to ensure the flexibility and robustness of the implementation in real-world scenarios.

3.3.1 Real-World Test Data

As described before, we use data from Open Street Maps for our tests. First, we want to observe how the algorithm copes with real-world data and which degree of data preprocessing and filtering of the data is necessary. Furthermore, we developed the rules for generating our synthetic test data based on the exported data from OSM. For this part of our evaluation, we empirically tested the algorithm for usable results and also ran automated tests for exhaustive testing on the data.

3.3.2 Synthetic Test Data

Our modus operandi for generating and conducting tests with synthetic data can be described as follows:

1.) Definition of attribute classes and generation of data sets
2.) Definition of requirement sets and classifying them within a testing matrix
3.) Performing speed tests following the testing matrix

To evaluate our results for the performance test, we order tests within a three-dimensional testing matrix. The first axis of this matrix represents the size of the test set. The second axis represents the number of attributes and the last axis represents the number of allowed results n. The value at every position in the matrix represents the called subfunction within our recommender algorithm.

We randomly generated data sets of different sizes and with different numbers of attributes for our performance tests. Because data within the domains "hotels" and "cars" have several distinct properties, we set up our test data following certain rules. To achieve this, we categorized the attributes depending on the data type of their corresponding values. Furthermore, we added complementary classes based on certain characteristics of the domain.

This results in the following basic classes of attributes based on their data type:

a) Boolean values
b) Integer values
c) Float values
d) String values

Additionally, we added the following supplementary classes:

e) Often reappearing parts in string values (e.g. names)
f) Frequently recurring string values (e.g. city, manufacturer)
g) Restricted numbers (e.g. star-rating, number of seats)
h) Dates (e.g. registration date)
i) Ranges (e.g. distance-to-the-center, price-category)

ad a) Boolean values appear often in real-world data and are easy to process. This class represents attributes like internet-access or payment-credit-card in the tourism domain and esp or abs in the automotive domain.
ad b) Arbitrary integer values appear often as counter variables like visitors, likes or 5-star-reviews.
ad c) Floating-point numbers occur within both domains, tourism and cars, in various forms. This class covers all attributes in the context of distance, like distance-to-the-center for a hotel or mileage for a car. It also stands for location attributes given in coordinates (longitude, latitude) and mean values like user-rating.
ad d) Strings are very versatile and therefore used in many ways in different sources. In many cases, strings contain informational value beyond what a number may cover. But often strings are used in places where the informative content is not higher than that of number-valued attributes and could be represented with Boolean values or numbers instead. This is true for several classes of attributes, like stars with values of "4-star".
ad e) The name attribute is always a string and does not have to, but is likely to, include a domain-specific term in its value. For hotels, this would be "Hotel" or a synonym thereof. Because of these recurring terms and their omnipresence over all domain-specific data sources, we treat this attribute separately from the more general string class. When generating data sets, we introduce these terms into our samples regularly.
ad f) Fields like city or manufacturer contain frequently reappearing values, which makes them special regarding information entropy and therefore interesting to us.
ad g) Stars of a hotel can be represented with 1 to 5, which makes a reappearance in this category very probable.
ad h) As we want to simplify the selection process, we reduce the date to the year. This results in an integer value often constrained by an upper boundary, being the current year, and a case-dependent lower boundary. Depending on the specific attribute, these values may recur frequently.
ad i) Ranges are special because they consist not only of one, but two boundary values. For simplification reasons, we categorize ranges within the relative classes low, med and high. Of course, these range classes are also very likely to reoccur.

Our synthetic data is randomly generated with respect to those classes of attributes and with varying occurrences of the different types.

3.4 Results

The results are split into two parts. First, we discuss the results of the synthetic data, followed by the proof-of-concept results with real-world data.

Using the synthetic data, we define sets of requirements to test the algorithm on. These sets belong to one of the following classes, relative to the knowledge base (i.e. the test data) and the given n they are tested with:

i) Satisfiable with n or less results. This leads to a direct return from the EntRecom algorithm, presenting us the results S the constraint solver found.
ii) Satisfiable with more than n results. In this case, we have to call GetBestEntrAttr, which calculates entropies for all attributes. Another interaction with the algorithm is needed, as we have to select a value for the attribute with the highest informational content.
iii) Not satisfiable. In this case, we have to call MiREQ, which iteratively chooses subsets of REQ until it finds the largest satisfiable subset. Its complement is the set of inconsistent requirements δs. Again, another interaction through AskUser is necessary. Now we have to select a requirement out of δs we want to keep in REQ; the other constraints are dropped.

To classify the requirement sets for every test set according to the three classes from above, we use an adapted version of EntRecom, which does not return results or perform any recursive calls. The purpose of this classification is to be able to identify test results with regard to the sub-functions called within EntRecom. The class of REQ – either i, ii, or iii – for a certain test configuration represents the value of the three-dimensional testing matrix. After classification, we perform our tests on the generated data.

A typical test scenario starts with the user's first interaction with the chatbot environment, which, in turn, results in setting up the knowledge base. Regularly, the user is asked to select a region of interest. For our tests, we use a data set of a specific size. Then a set of requirements and a maximum for the expected results are chosen. We now perform a test for every class in the testing matrix and get results in the form of execution times.

For a fixed n of 5 and tests performed with one to five requirements in the sets, we plot the function calls of MiREQ and GetBestEntrAttr on increasingly sized test sets. In case we obtain a result for our query which lies within n, our constraint solver CSolver is the only function call made.

Figure 6: Classification Based on Function Calls within EntRecom

As can be seen, the function calls of MiREQ decrease, while the calls of GetBestEntrAttr increase for larger test sets. While this basic assumption holds for real data, the impact is not as drastic, as users tend to perform queries with less than five requirements initially.

The algorithm was also used with real-world data. Therefore, we inserted the data from OSM into the knowledge base. We interact with the textual chat-like web interface in the intended way and get the correct results from the algorithm. The algorithm returned a result set within a few iterations and these results fit the user specification.

These experiments also show that there is a problem of relevance of attributes. The attributes which have a high entropy to reduce the result set are not necessarily relevant for users. Our tests show that, e.g., the attribute wheelchair, which is part of accessibility, has a high entropy but will not be interesting for a large share of users. This issue has to be addressed in an upcoming version of the algorithm.

4 RELATED WORK

The application of model-based reasoning, and especially model-based diagnosis, in the field of recommender systems is not novel. For example, papers like [4, 11, 19] compute the minimal sets of faulty requirements. These requirements should be changed in order to find a solution. In these papers, the authors rely on the existence of minimal conflict sets computing the diagnosis for inconsistent requirements. Felfernig et al. [4] present an algorithm that calculates personalized repairs for inconsistent requirements. The algorithm combines concepts of MBD with a collaborative problem solving approach to improve the quality of repairs in terms of prediction accuracy. In [19], the concept of representative explanations is introduced. This concept follows the idea of generating diversity in alternative diagnoses; informally, constraints that occur in conflicts should as well be included in diagnoses presented to the user. Jannach [11] proposes to determine preferred conflicts "on demand" and use a general-purpose and fast conflict detection algorithm for this task, instead of computing all minimal conflicts within the user requirements in advance.

Papers that deal with the integration of diagnosis and constraint solving are [3] and [24, 25], which proposed a diagnosis algorithm for tree-structured models. The approach is generally applicable due to the fact that all general constraint models can be converted into an equivalent tree-structured model using decomposition methods, e.g., hypertree decomposition [7, 8]. [26] provides more details regarding the coupling of decomposition methods and the diagnosis algorithms for tree-structured models. In addition to that, [21] generalized the algorithms of [3] and [24]. In [15] the authors also propose the use of constraints for diagnosis, where conflicts are used to drive the computation. In [6], which is maybe the earliest work that describes the use of constraints for diagnosis, the authors introduce the use of constraints for computing conflicts under the correctness assumptions. For this purpose they developed the concept of constraint propagation. Despite the fact that all of these algorithms use constraints for modeling, they mainly focus on the integration of constraint solving for conflict generation, which is different from our approach. For presenting recommendation tasks as a constraint satisfaction problem, we refer to [12].

Human-chatbot communication represents a broad domain. It covers technical aspects as well as psychological and human perspectives. Contributions like [9, 30] show several ways of implementing chatbots in different domains. Wallace [30] demonstrates an artificial intelligence robot based on a natural language interface (A.L.I.C.E.) that extends ELIZA [31], which is based on an experiment of Alan M. Turing in 1950 [29]. This work describes how to create a robot personality using AIML, an artificial intelligence modelling language, to pretend intelligence and self-awareness.

Sun et al. [27] introduced a conversational recommendation system based on unsupervised learning techniques. The bot was trained by successful order conversations between users and real human agents.

Papers like [5, 10, 13, 32] address the topics of user acceptance and experience. In [32] a pre-study shows that users infer the authenticity of a chat agent by two different categories of cues: agent-related cues and conversation-related cues. To get an optimal conversational result, the bot should provide a human-like interaction. Questions of conversational UX design raised by [5] and [16] demonstrate the need to rethink user interaction at all.

The topic of recommender systems with conversational interfaces is shown in [14], where an adaptive recommendation strategy based on reinforcement learning methods was presented. In the paper [28], the authors proposed a deep reinforcement learning framework to build personalized conversational recommendation agents. In this work, a recommendation model trained from conversational sessions and rankings is also presented.

5 CONCLUSION AND FUTURE WORK

In this paper, we showed an implementation and its evaluation of EntRecom, an algorithm using model-based diagnosis and Shannon's information entropy. In our tests, we used both synthetic and real-world data. For benchmarking and basic experiments, we mainly used synthesized data, while real-world data, specifically from Open Street Maps, is important to ensure the flexibility and robustness of the implementation in real-world scenarios. We also showed the performance for different data sets and revealed open issues, like the relevance of chosen attributes, which is already a starting point for future work. Another important step will be a user study to evaluate the acceptance of the algorithm.

ACKNOWLEDGEMENTS

Research presented in this paper was carried out as part of the AS-

[15] Jakob Mauss and Martin Sachenbacher, 'Conflict-driven diagnosis using relational aggregations', in Working Papers of the 10th International Workshop on Principles of Diagnosis (DX-99), Loch Awe, Scotland, (1999).
[16] R.J. Moore, R. Arar, G.-J. Ren, and M.H. Szymanski, 'Conversational UX design', volume Part F127655, pp. 492–497, (2017).
[17] Iulia Nica, Oliver A. Tazl, and Franz Wotawa, 'Chatbot-based tourist recommendations using model-based reasoning', in ConfWS, (2018).
[18] Iulia Nica and Franz Wotawa, 'ConDiag – computing minimal diagnoses using a constraint solver', (2012). International Workshop on Principles of Diagnosis; conference date: 31-07-2012 through 03-08-2012.
[19] Barry O'Sullivan, Alexandre Papadopoulos, Boi Faltings, and Pearl Pu, 'Representative explanations for over-constrained problems', 1, (07 2007).
[20] Raymond Reiter, 'A theory of diagnosis from first principles', Artificial Intelligence, 32(1), 57–95, (1987).
[21] Martin Sachenbacher and Brian C. Williams, 'Diagnosis as semiring-based constraint optimization', in European Conference on Artificial Intelligence, pp. 873–877, (2004).
[22] C. E. Shannon, 'A Mathematical Theory of Communication', The Bell System Technical Journal, 27(3), 379–423, (1948).
[23] C. E. Shannon, 'A mathematical theory of communication', Bell System Technical Journal, 27, 379–623, (1948).
[24] Markus Stumptner and Franz Wotawa, 'Diagnosing Tree-Structured Systems', in Proceedings 15th International Joint Conf. on Artificial Intelligence, Nagoya, Japan, (1997).
[25] Markus Stumptner and Franz Wotawa, 'Diagnosing tree-structured systems', Artificial Intelligence, 127(1), 1–29, (2001).
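Definition 2 can be read operationally: a candidate set Γ is valid exactly when the requirements kept under ¬Ab remain jointly satisfiable against the catalog. The following toy sketch illustrates this check in plain Python; the list-of-dicts catalog stands in for KB_D, and all names and values are illustrative assumptions, not the authors' code.

```python
def is_valid_inconsistent_set(catalog, req, gamma):
    """Check Definition 2 under toy catalog semantics: gamma (a subset of
    req's keys, the requirements with Ab_R set) is valid iff at least one
    catalog item satisfies all kept (not-Ab) requirements."""
    kept = {a: v for a, v in req.items() if a not in gamma}
    return any(all(item.get(a) == v for a, v in kept.items()) for item in catalog)

catalog = [
    {"city": "Graz", "stars": 4, "pool": False},
    {"city": "Graz", "stars": 3, "pool": True},
]
req = {"city": "Graz", "stars": 4, "pool": True}   # jointly inconsistent

print(is_valid_inconsistent_set(catalog, req, set()))      # False: keep all
print(is_valid_inconsistent_set(catalog, req, {"pool"}))   # True: retract pool
print(is_valid_inconsistent_set(catalog, req, {"stars"}))  # True: retract stars
```

Both {"pool"} and {"stars"} are minimal here, illustrating why the paper distinguishes minimality from minimal cardinality.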
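The control flow of Algorithm 1 (consistency check, MiREQ repair loop, entropy-driven narrowing) can be sketched end-to-end in plain Python. This is a toy reconstruction under assumed semantics: a list-of-dicts catalog replaces the Z3 constraint model, and an `answer` callback stands in for askUser; none of these names come from the authors' implementation.

```python
from itertools import combinations
from math import log2

def csolver(catalog, req):
    # Stand-in for the constraint solver: items matching every requirement.
    return [it for it in catalog if all(it.get(a) == v for a, v in req.items())]

def mi_req(catalog, req):
    # Minimal-cardinality inconsistent requirements (Definition 2):
    # retract ever larger subsets until the remainder becomes satisfiable.
    for k in range(1, len(req) + 1):
        for gamma in combinations(req, k):
            if csolver(catalog, {a: v for a, v in req.items() if a not in gamma}):
                return set(gamma)
    return set(req)

def entropy(items, attr):
    # Shannon entropy (base 2) of attr's value distribution over items.
    vals = [it.get(attr) for it in items]
    return -sum((vals.count(v) / len(vals)) * log2(vals.count(v) / len(vals))
                for v in set(vals))

def ent_recom(catalog, req, n, answer):
    req = dict(req)
    s = csolver(catalog, req)
    while not s:                                   # steps 3-8: repair
        for a in mi_req(catalog, req):
            new_val = answer(("relax", a))
            if new_val is None:
                del req[a]                         # user drops the requirement
            else:
                req[a] = new_val                   # user adapts it
        s = csolver(catalog, req)
    while len(s) > n:                              # steps 9-14: narrow down
        cands = {a for it in s for a in it} - set(req)
        if not cands:
            break                                  # nothing left to ask about
        a = max(cands, key=lambda attr: entropy(s, attr))
        req[a] = answer(("value", a))
        s = csolver(catalog, req)
    return s

catalog = [
    {"city": "Graz", "stars": 3}, {"city": "Graz", "stars": 4},
    {"city": "Graz", "stars": 5}, {"city": "Wien", "stars": 4},
]
answers = {("value", "stars"): 4, ("relax", "city"): "Graz"}
print(ent_recom(catalog, {"city": "Graz"}, 1, answers.get))        # too many results
print(ent_recom(catalog, {"city": "Linz", "stars": 4}, 1, answers.get))  # inconsistent
```

In the first call the entropy step asks for a star rating; in the second, the inconsistent "city" requirement is relaxed; both end at the same single recommendation.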
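The preprocessing described above (whitelist-filter the OSM attributes, then pad missing fields with False so every entry carries the same attribute set) can be sketched briefly; the whitelist contents and entry values here are assumptions for illustration.

```python
# Assumed whitelist; the paper does not list its actual filter entries.
WHITELIST = {"name", "addr:city", "stars", "wheelchair", "tourism"}

def normalize(entries, whitelist=WHITELIST):
    """Drop non-whitelisted attributes (e.g. 'fixme'), then fill every
    missing attribute with False to obtain a 'normalized' data set."""
    filtered = [{a: v for a, v in e.items() if a in whitelist} for e in entries]
    all_attrs = sorted({a for e in filtered for a in e})  # list of all attributes
    return [{a: e.get(a, False) for a in all_attrs} for e in filtered]

raw = [
    {"name": "Parkhotel Graz", "stars": "4", "fixme": "check phone number"},
    {"name": "Hotel Daniel", "addr:city": "Graz", "tourism": "hotel"},
]
normalized = normalize(raw)
print(normalized[0])
print(normalized[1])
```

After this step every entry exposes the same attributes, which is what makes the uniform AND-clauses of the constraint model possible.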
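A generator for one synthetic entry following the attribute classes a) to i) above might look like this; the concrete attribute names, value pools, and bounds are assumptions chosen to match the class descriptions, not the paper's actual generator.

```python
import random

def generate_entry(rng):
    """One synthetic 'hotel' covering attribute classes a)-i)."""
    return {
        "internet_access": rng.choice([True, False]),             # a) Boolean
        "visitors": rng.randrange(0, 10_000),                     # b) integer counter
        "distance_to_center": round(rng.uniform(0.1, 15.0), 2),   # c) float
        "description": rng.choice(["cosy", "modern", "central"]), # d) string
        "name": f"Hotel {rng.choice(['Daniel', 'Park', 'Post'])}",# e) recurring name part
        "city": rng.choice(["Graz", "Wien", "Linz"]),             # f) recurring value
        "stars": rng.randint(1, 5),                               # g) restricted number
        "registration": rng.randint(1990, 2019),                  # h) date reduced to year
        "price_category": rng.choice(["low", "med", "high"]),     # i) range class
    }

rng = random.Random(42)           # fixed seed for reproducible test sets
data = [generate_entry(rng) for _ in range(100)]
```

Varying the entry count and the number of attributes per entry then populates the first two axes of the testing matrix.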
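The classification-only variant of EntRecom can be approximated by a small function that merely reports which sub-function a requirement set would trigger, without returning results or recursing; the list-of-dicts catalog semantics is an assumption standing in for the constraint model.

```python
def classify(catalog, req, n):
    """Map a requirement set to class i), ii) or iii) from Section 3.4."""
    s = [it for it in catalog if all(it.get(a) == v for a, v in req.items())]
    if not s:
        return "iii"                    # not satisfiable: MiREQ would run
    return "ii" if len(s) > n else "i"  # narrow down vs. direct return

catalog = [{"city": "Graz", "stars": st} for st in (3, 3, 4, 5)]
print(classify(catalog, {"city": "Graz", "stars": 4}, 5))  # i
print(classify(catalog, {"city": "Graz"}, 2))              # ii
print(classify(catalog, {"city": "Linz"}, 5))              # iii
```

The class label is exactly the value stored at the corresponding position of the three-dimensional testing matrix.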
IT-IC project that is co-financed by the Cooperation Programme In- [26] Markus Stumptner and Franz Wotawa, ‘Coupling CSP decomposition terreg V-A Slovenia-Austria 2014-2020, European Union, European methods and diagnosis algorithms for tree-structured systems’, in Pro- Regional Development Fund. ceedings of the 18th International Joint Conference on Artificial Intel- ligence (IJCAI-03), pp. 388–393, Acapulco, Mexico, (2003). [27] Y. Sun, Y. Zhang, Y. Chen, and R. Jin, ‘Conversational recommendation REFERENCES system with unsupervised learning’, pp. 397–398, (2016). [28] Yueming Sun and Yi Zhang, ‘Conversational recommender system’, [1] Johan de Kleer and Brian C. Williams, ‘Diagnosing multiple faults’, in The 41st International ACM SIGIR Conference on Research & Artificial Intelligence, 32(1), 97–130, (1987). Development in Information Retrieval, SIGIR ’18, pp. 235–244, New [2] Rina Dechter, Constraint Processing, Morgan Kaufmann, 2003. York, NY, USA, (2018). ACM. [3] Yousri El Fattah and Rina Dechter, ‘Diagnosing tree-decomposable cir- [29] Alan M. Turing, Computing Machinery and Intelligence, 23–65, cuits’, in Proceedings 14th International Joint Conf. on Artificial In- Springer Netherlands, Dordrecht, 2009. telligence, pp. 1742 – 1748, (1995). [30] R.S. Wallace, The anatomy of A.L.I.C.E., 2009. [4] Alexander Felfernig, Gerhard Friedrich, Monika Schubert, Monika [31] J. Weizenbaum, ‘Eliza-a computer program for the study of natural lan- Mandl, Markus Mairitsch, and Erich Teppan, ‘Plausible repairs for in- guage communication between man and machine’, Communications of consistent requirements.’, in IJCAI International Joint Conference on the ACM, 9(1), 36–45, (1966). Artificial Intelligence, pp. 791–796, (01 2009). [32] N.V. Wünderlich and S. Paluch, ‘A nice and friendly chat with a bot: [5] Asbjørn Følstad and Petter Bae Brandtzæg, ‘Chatbots and the new User perceptions of ai-based service agents’, (2018). world of hci’, interactions, 24(4), 38–42, (June 2017). 
[6] Hector Geffner and Judea Pearl, ‘An Improved Constraint-Propagation Algorithm for Diagnosis’, in Proceedings 10th International Joint Conf. on Artificial Intelligence, pp. 1105–1111, (1987). [7] Georg Gottlob, Nicola Leone, and Francesco Scarcello, ‘Hypertree De- composition and Tractable Queries’, in Proc. 18th ACM SIGACT SIG- MOD SIGART Symposium on Principles of Database Systems (PODS- 99), pp. 21–32, Philadelphia, PA, (1999). [8] Georg Gottlob, Nicola Leone, and Francesco Scarcello, ‘A compari- son of structural CSP decomposition methods’, Artificial Intelligence, 124(2), 243–282, (December 2000). [9] B. Graf, M. Krüger, F. Müller, A. Ruhland, and A. Zech, ‘Nombot - sim- plify food tracking’, volume 30-November-2015, pp. 360–363, (2015). [10] Jennifer Hill, W. Randolph Ford, and Ingrid G. Farreras, ‘Real con- versations with artificial intelligence: A comparison between human- human online conversations and human-chatbot conversations’, Com- puters in Human Behavior, 49, 245 – 250, (2015). [11] Dietmar Jannach, ‘Finding preferred query relaxations in content-based recommenders’, IEEE Intelligent Systems, 109, 81–97, (04 2008). [12] Dietmar Jannach, Markus Zanker, and Matthias Fuchs, ‘Constraint- based recommendation in tourism: A multiperspective case study’, Journal of IT and Tourism, 11, 139–155, (2009). [13] A. Khanna, M. Jain, T. Kumar, D. Singh, B. Pandey, and V. Jha, ‘Anatomy and utilities of an artificial intelligence conversational en- tity’, pp. 594–597, (2016). [14] Tariq Mahmood, Francesco Ricci, and Adriano Venturini, ‘Learning adaptive recommendation strategies for online travel planning’, Infor- mation and Communication Technologies in Tourism 2009, 149–160, (2009).