=Paper= {{Paper |id=Vol-2290/kars2018_paper6 |storemode=property |title=A Domain-independent Framework for building Conversational Recommender Systems |pdfUrl=https://ceur-ws.org/Vol-2290/kars2018_paper6.pdf |volume=Vol-2290 |authors=Fedelucio Narducci,Pierpaolo Basile,Andrea Iovine,Marco de Gemmis,Pasquale Lops,Giovanni Semeraro |dblpUrl=https://dblp.org/rec/conf/recsys/NarducciBIGLS18 }} ==A Domain-independent Framework for building Conversational Recommender Systems== https://ceur-ws.org/Vol-2290/kars2018_paper6.pdf
                  A domain-independent Framework for building
                     Conversational Recommender Systems
          Fedelucio Narducci                            Pierpaolo Basile                             Andrea Iovine
    Department of Computer Science             Department of Computer Science           Department of Computer Science
    University of Bari Aldo Moro, Italy        University of Bari Aldo Moro, Italy      University of Bari Aldo Moro, Italy
       fedelucio.narducci@uniba.it                 pierpaolo.basile@uniba.it                  andrea.iovine@uniba.it

           Marco de Gemmis                               Pasquale Lops                          Giovanni Semeraro
    Department of Computer Science             Department of Computer Science           Department of Computer Science
    University of Bari Aldo Moro, Italy        University of Bari Aldo Moro, Italy      University of Bari Aldo Moro, Italy
        marco.degemmis@uniba.it                      pasquale.lops@uniba.it                giovanni.semeraro@uniba.it
ABSTRACT
Conversational Recommender Systems (CoRSs) implement a paradigm in which users interact with the system to define their preferences and discover the items that best fit their needs. A CoRS can be straightforwardly implemented as a chatbot. Chatbots are becoming more and more popular for several applications, such as customer care, health care, and medical diagnosis. In its most complex form, the implementation of a chatbot is a challenging task, since it requires knowledge about natural language processing, human-computer interaction, and so on. In this paper, we propose a general framework that eases the generation of conversational recommender systems. The framework, based on a content-based recommendation algorithm, is independent of the domain. Indeed, it allows building a conversational recommender system with different interaction modes (natural language, buttons, hybrid) for any domain. The framework has been evaluated on two state-of-the-art datasets with the aim of identifying the components that most influence the final recommendation accuracy.

ACM Reference Format:
Fedelucio Narducci, Pierpaolo Basile, Andrea Iovine, Marco de Gemmis, Pasquale Lops, and Giovanni Semeraro. 2019. A domain-independent Framework for building Conversational Recommender Systems. In Proceedings of Knowledge-aware and Conversational Recommender Systems (KaRS) Workshop 2018 (co-located with RecSys 2018). ACM, New York, NY, USA, 6 pages.

Knowledge-aware and Conversational Recommender Systems (KaRS) Workshop 2018 (co-located with RecSys 2018), October 7, 2018, Vancouver, Canada.
2018. Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.

1    INTRODUCTION
Conversational Recommender Systems (CoRSs) are characterized by the capability of interacting with the user during the recommendation process [11]. Instead of asking users to provide all requirements in one step, CoRSs guide the users through an interactive dialog [8].
   Users can provide functional requirements or technical constraints that the recommender uses to find the items that best fit their needs. Accordingly, the acquisition of preferences is an incremental process that is not necessarily finalized in a single step. CoRSs can provide several interaction modes and can offer explanation mechanisms. Hence, the goal of these systems is not only to improve the accuracy of the recommendations, but also to provide an effective user-recommender interaction.
   In this paper we propose a domain-independent framework for generating conversational recommender systems. Our framework implements most of the capabilities that a recommender should offer, such as preference acquisition, profile exploration, critiquing strategies, and explanation capability. Furthermore, it offers three different interaction modes: natural language, buttons, and a combination of the two.
   The most complex interaction mode is certainly the one based on natural language. Indeed, a conversational recommender based on natural language needs at least four components: an intent recognizer, an entity recognizer, a sentiment analyzer, and a recommendation algorithm. The next sections explain the tasks that each component carries out. Usually, the first three components are also used for purposes other than the recommendation task. For example, an entity recognizer is useful in several applications where the identification of named entities in a given text is needed, such as news classification, search algorithms, and customer support. In this work we first generalized, combined, and integrated the aforementioned components in order to ease the development of a new conversational recommender system; then we investigated the impact of each component on the recommendation process.
   By exploiting our framework, we implemented instances of a conversational recommender system in three different domains: movies, music, and books¹.
   The rest of the paper is organized as follows: the relevant literature is analyzed in Section 2; Section 3 describes the architecture of our framework; the experimental evaluation and the discussion of results are reported in Section 4; Section 5 draws the conclusions and outlines future work.

¹ On Telegram, search for: @MovieRecSysBot, @MusicRecSys, @BookRecSys.

2    RELATED WORK
Conversational Recommender Systems fall in the area of Goal-Oriented Dialog Systems. A Goal-Oriented Dialog System, also known as a chatbot, is designed to help users achieve a given goal (e.g., to book a restaurant). These systems are generally closed-domain; thus they can be exploited in scenarios like recommendation [13, 14] and retrieval [15], and can be integrated in larger systems, such as Amazon Alexa², to give the impression of general coverage [7]. In the literature there is a distinction between modular and end-to-end dialog systems. The former are composed of at least two components: a dialog-state tracking component and a response generator; the latter do not rely on explicit internal states and learn a dialog model based on past conversations [2]. In this work we define a modular framework for generating goal-oriented dialog systems for the recommendation task in any domain. These systems are widely used on social networks since they can acquire information on the user by analyzing their activities on the platform [10].
   Several works in the literature have tried to improve various aspects of the conversational recommendation process [8]. In [4] the authors demonstrated that a speech-based interaction model produces higher user satisfaction and needs fewer interaction cycles. In [16] the authors propose a chat-based group recommender system that iteratively allows users to express and revise their preferences during the decision making process. In [6] the authors present an interactive visualization framework that combines recommendation techniques with visualization ones to support human-recommender interaction. Several researchers developed integrated frameworks for conversational recommender systems [3, 19] by combining conversational functionalities with adaptive and recovery functions. To the best of our knowledge, the framework proposed in this paper is the first solution that allows generating a CoRS for any domain by offering a complete suite of functions for a multi-modal interaction.
   A commercial solution is proposed by Microsoft with the Bot Framework³, which provides tools for building, connecting, testing, and deploying intelligent bots. Even though the framework is not designed for generating conversational recommender systems, it allows integrating services from Microsoft Azure like the recommendation engine⁴. However, the effort of integrating and connecting the different components is borne by the user of the framework. Furthermore, the framework does not offer features like critiquing strategies or explanation functions. Also, the different interaction modes (e.g., buttons) have to be implemented by the user. Last but not least, in this work we studied the accuracy of each component integrated in our framework. Conversely, other solutions can be used only as black box models.

3    THE FRAMEWORK ARCHITECTURE
The architecture of our framework is depicted in Figure 1. The main goal of this framework is to ease the building of a new CoRS. Therefore, the components have been generalized, making them independent of a specific domain. When the user wants to build a new CoRS for a new domain, she should update the configuration file and provide the list of entities and properties in the Wikidata⁵ format. This requirement will be better explained in the next sections and depends on the Entity Recognizer. The interaction follows a slot-filling model, where a set of features needs to be filled in order to accomplish the user goal. As an example, the recommendation step requires that the user preferences have been filled. In the following we analyze each component in detail.

Figure 1: The Framework Architecture

   Dialog Manager This is the core component of the framework, whose responsibility is to supervise the whole recommendation process. The Dialog Manager (DM) is the component that keeps track of the dialog state. DM receives the user message, invokes the components needed for answering the user request, and returns the message to be shown to the user. When all the information for fulfilling the user request is available, the Dialog Manager returns the message to the client. DM is completely independent of the client: it receives a text message and returns a JSON message as a response. In this way the client can be any messaging platform, like Facebook Messenger, Telegram, Web apps, and so on.
   Intent Recognizer
   This component identifies the intent that the user expresses in natural language. The Intent Recognizer (IR) is based on DialogFlow⁶, developed by Google. Our framework uses the DialogFlow APIs to send the user message and to receive the recognized intent. DialogFlow requires a set of example sentences for each intent. There are four main intents to be recognized:
   - preference: the user is providing a preference. The preference can be expressed on a new item, or on a recommended item. In the latter case, the preference is also considered as a critique (e.g., I like this movie, but I don’t like its director).

² https://developer.amazon.com/alexa
³ https://dev.botframework.com/
⁴ https://azure.microsoft.com/en-us/resources/videos/building-a-recommender-system-in-azure-ml-studio/
⁵ https://www.wikidata.org/
⁶ https://dialogflow.com/


   - recommendation: the user asks to receive a recommendation. This intent is the condition for other sub-intents, such as explanation, where the user asks the motivation for a given recommendation; critiquing, where the user expresses a critique on one or more features of the recommended item; and more info, where the user asks for more details on the recommended item (e.g., the plot, the trailer).
   - show profile: the user asks to visualize (and modify) her list of preferences.
   - help: the user asks for help from the system to complete a given task.
   Each intent can be composed of a set of sub-intents that activate specific functions. For example, the intent profile has delete preference, update preference, and reset profile as sub-intents. The motivation behind this hierarchical organization is that, generally, a sub-intent can be activated only when the parent intent is activated too. The activation of a parent intent is managed by the Dialog Manager.
   Sentiment Analyzer The Sentiment Analyzer (SA) is based on the Sentiment Tagger of Stanford CoreNLP⁷. The Sentiment Tagger takes as input the user sentence and returns the sentiment tags identified. Afterwards, SA assigns the sentiment tags to the right entity identified in the sentence. For example, given the sentence I like The Matrix, but I hate Keanu Reeves, the Sentiment Tagger identifies a positive sentiment (i.e., like) and a negative one (i.e., hate). SA associates the positive sentiment with the entity The Matrix and the negative sentiment with the entity Keanu Reeves. The sentiment-entity association is performed by computing the distance between the sentiment tag and the entity in the sentence. The distance is the number of tokens that separate the sentiment tag from the entity. In the aforementioned example, the distance between like and The Matrix is zero, as is the distance between hate and Keanu Reeves. The sentiment tag identified by CoreNLP is thus associated with the closest entity in the sentence. Furthermore, SA implements a Property Type Recognizer that is able to identify property mentions in the sentence. Given the sentence I like The Matrix, but I hate the director, SA identifies the property director and assigns the negative sentiment to it. Afterwards, SA retrieves the entity associated with that property, Larry and Andy Wachowski in the given example. The list of properties the framework is able to recognize is provided in the configuration file.
   Entity Recognizer
   The aim of the Entity Recognizer (ER) module is to find relevant entities mentioned in the user sentence and then to link them to the correct concept in the Knowledge Base (KB). The KB chosen for building our framework is Wikidata, since it is a free and open knowledge base and acts as a hub of structured data coming from Wikimedia sister projects⁸. Moreover, Wikidata covers several domains, and this is a key feature for developing a domain-independent framework. We chose to develop a custom entity recognizer for two reasons: 1) existing entity recognizer/entity linking algorithms are hard to customize to a specific domain; 2) entity recognizers included in existing dialog manager toolkits generally require annotated data to build a new model for a specific domain, while we implemented a knowledge-based approach that does not need any annotated data. The task is challenging since more than one surface form (Spielberg, Steven Spielberg) can refer to Steven Spielberg:director, and the same surface form (Spielberg) can refer to more than one concept (Steven Spielberg:director and Sasha Spielberg:actor) in case of ambiguous entities. Moreover, we need to limit the entities’ type according to the domain of the CoRS and the list of concepts and properties provided during the configuration of the framework.
   In order to recognize the entities in the user request, we build a search engine based on a classical Vector Space Model in which, for each entity, we store all the possible aliases provided by Wikidata. For example, for the concept Q8877 (Steven Spielberg), we store the aliases Steven Allan Spielberg, Spielberg, and Steven Spielberg. The index is exploited for retrieving a list of candidate concepts according to the input text. In particular, given a text $T$ as input, the ER module performs a chunking operation in order to identify nominal chunks by using the Apache OpenNLP library. Each nominal chunk is sent as a query to the search engine in order to retrieve a list of candidate concepts. The output of this first recognition step is a list of candidate concepts assigned to each nominal chunk. The list is sorted according to the score assigned by the search engine. We use Apache Lucene as the library for implementing our search engine.
   The last step consists in selecting the correct concept for each chunk. The idea is to choose the concept that is most similar to the other concepts occurring in the text, following the hypothesis of one topic per discourse. The motivation behind this approach is that the user tends to cite in the same text entities that are in some way related. The score $s(c_j)$ assigned to each candidate concept $c_j$ for the chunk $e_j$ is computed according to Equation 1, where $E$ is the set of the other nominal chunks in the text. The score $s(c_j)$ is the sum, over each chunk $e_k$ in $E$, of the maximum similarity score between $c_j$ and the candidate concepts $c_i$ of $e_k$, whose set is denoted $C_{e_k}$.

$$s(c_j) = \sum_{e_k \in E} \max_{c_i \in C_{e_k}} sim(c_j, c_i) \qquad (1)$$

   In order to compute the score $s(c_j)$, we need to define a similarity function between concepts. In our approach we rely on graph embeddings, which have recently gained considerable attention [17]. These approaches represent entities and relations through an embedding, which is a continuous vector representation able to capture the semantics of an entity or a relation. We investigate holographic embeddings (HolE) [18], which exploit the circular correlation of entity embeddings to create compositional representations of binary relational data coming from Wikidata. By exploiting HolE, each entity is represented by an embedding and we

⁷ https://stanfordnlp.github.io/CoreNLP/
⁸ Wikipedia, Wikivoyage, Wikisource, and others
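The token-distance rule used by the Sentiment Analyzer can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tokenization, and the hand-written spans and polarity labels are assumptions made for illustration, and in the framework the sentiment tags would come from Stanford CoreNLP rather than being listed by hand.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets of an entity, end exclusive


def token_gap(tag_index: int, span: Span) -> int:
    """Number of tokens separating a sentiment tag from an entity span."""
    start, end = span
    if tag_index < start:
        return start - tag_index - 1
    if tag_index >= end:
        return tag_index - end
    return 0  # the tag lies inside the entity span


def associate(sentiments: List[Tuple[int, str]], entities: Dict[str, Span]) -> Dict[str, str]:
    """Attach each sentiment tag to the closest entity in the sentence."""
    result = {}
    for tag_index, polarity in sentiments:
        closest = min(entities, key=lambda name: token_gap(tag_index, entities[name]))
        result[closest] = polarity
    return result


# Hypothetical pre-tokenized input for "I like The Matrix, but I hate Keanu Reeves".
tokens = "I like The Matrix , but I hate Keanu Reeves".split()
entities = {"The Matrix": (2, 4), "Keanu Reeves": (8, 10)}
sentiments = [(1, "positive"), (7, "negative")]  # token positions of "like" and "hate"

assignment = associate(sentiments, entities)
```

With these inputs, like is adjacent to The Matrix and hate is adjacent to Keanu Reeves (gap zero in both cases), so each tag goes to the entity named in the running example.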
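The concept-selection score of Equation 1 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the similarity function is taken to be cosine similarity, the two-dimensional vectors are invented toy values rather than HolE embeddings, and the function names and the Jaws:film label are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coherence_score(candidate: str, other_chunks: List[List[str]], emb: Dict[str, Vector]) -> float:
    """s(c_j): for every other chunk, take the best similarity among its candidates, then sum."""
    return sum(
        max(cosine(emb[candidate], emb[c]) for c in candidates)
        for candidates in other_chunks
        if candidates
    )


def disambiguate(chunk_candidates: Dict[str, List[str]], emb: Dict[str, Vector]) -> Dict[str, str]:
    """Pick, for each chunk, the candidate concept with the highest coherence score."""
    resolved = {}
    for chunk, candidates in chunk_candidates.items():
        others = [c for e, c in chunk_candidates.items() if e != chunk]
        resolved[chunk] = max(candidates, key=lambda c: coherence_score(c, others, emb))
    return resolved


# Toy vectors, invented for illustration only.
EMBEDDINGS = {
    "Steven Spielberg:director": [1.0, 0.1],
    "Sasha Spielberg:actor": [0.1, 1.0],
    "Jaws:film": [0.9, 0.2],
}
CHUNKS = {
    "Spielberg": ["Steven Spielberg:director", "Sasha Spielberg:actor"],
    "Jaws": ["Jaws:film"],
}

resolved = disambiguate(CHUNKS, EMBEDDINGS)
```

In this toy setting the ambiguous chunk Spielberg resolves to the director, because that candidate is closer to the only candidate of the other chunk. The sketch ranks by the coherence score alone and leaves out any combination with the search-engine score.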


can compute the similarity between two entities as the cosine similarity between the corresponding embeddings. The exploited graph is built by querying Wikidata. Finally, the score $s(c_j)$ is averaged with the score returned by the search engine, and the list of candidate concepts is re-sorted in descending order. The first element of each candidate list is the concept assigned to the corresponding nominal chunk in the text. The ER module can be adapted to exploit a custom KB for particular domains not covered by Wikidata. The only requirements are: 1) the knowledge must be modeled through triples; 2) each concept must have one or more aliases.
   Recommendation Services This component collects the services strictly related to the recommendation process. The recommendation algorithm implemented is PageRank with Priors [5], also known as Personalized PageRank. It works on a graph where the nodes are the entities the recommender deals with (e.g., for the movie domain, actors, movies, directors, genres, etc.). These entities are extracted from Wikidata, and their connections (edges in the graph) are extracted from DBpedia⁹. Hence, for example, the movie The Matrix is connected to the director node Larry and Andy Wachowski and to the genre node science fiction. The algorithm has been effectively used in other recommendation environments [1]. Another recommendation service offered by the framework is the explanation feature. The framework implements an explanation algorithm inspired by [12]. The idea is to use the connections between the user preferences and the recommended items to explain why a given item has been recommended. An example of a natural-language explanation provided by the system is: “I suggest you the movie Duplex because you like movies where: the actor is Ben Stiller as in Meet the Fockers, the genre is Comedy as in American Reunion.” In this case the system used the connections between the recommended movie Duplex and the user preferences (Meet the Fockers, American Reunion, and Ben Stiller). The last service implemented is critiquing. This service acquires a critique on a recommended item (e.g., I like the movie Titanic, but I don’t like the actor Bill Paxton), and this feedback is used in the next recommendation cycle by properly setting the weights of the nodes in the PageRank graph.
   As stated before, all these components are independent of the domain. The only requirement is that the entities have to be available in Wikidata.

4    EXPERIMENTAL EVALUATION
The goal of the experimental evaluation was to assess the accuracy of each component involved in our framework. For this experiment we used the bAbI dataset developed by Facebook Research¹⁰. The dataset collects a list of utterances, and each utterance contains a list of preferences, followed by the recommendation request and the recommended item. An example of an utterance is:
“Beauty and the Beast, Aladdin, Schindler’s List, The Shawshank Redemption, and The Silence of the Lambs are movies I loved. Would you recommend something I might like? Ghost”.
We used this dataset for testing the impact of the Entity Recognizer, the Sentiment Analyzer, and the Intent Recognizer on the recommendation process. Since the dataset has the goal of evaluating end-to-end dialog systems, it is split into three subsets: training, test, and dev set. Given the goal of our experiment, we excluded the training set due to its huge size, and we used the test and dev sets, composed respectively of 6,667 and 6,733 examples (each example contains the preference elicitation, the recommendation request, and the recommendation). We defined four different configurations of the framework:
   - Upper bound (UB): this configuration tests the accuracy of our recommendation algorithm. The preferences and the recommendation requests are filled programmatically, so, except for the recommendation algorithm, the other components of the framework are bypassed.
   - Intent Recognizer Test (IR): in this configuration the only component that works is the Intent Recognizer. The component detects the intention of the user to express a preference and to receive a recommendation. If both intents are correctly recognized, the recommendation is performed by setting the entities and their sentiments programmatically.
   - Entity Recognizer Test (ER): in this configuration the only component that works is the Entity Recognizer. It detects the entities on which the user expressed a preference. The sentiments on the correctly identified entities are set programmatically.
   - Sentiment Recognizer Test (SR): in this configuration the only component that works is the Sentiment Recognizer. The component detects the sentiments in the sentence. The entities on which the sentiments are expressed are set programmatically.
   These configurations allow testing one component at a time by excluding the influence of the other ones in the process. For each configuration we computed the HitRate@n as the ratio of the hits in the recommendation list, with n = 5, 10, 20. In Table 1, the first row reports the upper bound in terms of HitRate@n: this is the best result that our recommendation algorithm can achieve on this dataset in the ideal situation where the other components work with 100% accuracy. The other rows report the loss in terms of HitRate@n of each configuration compared to the upper bound. Due to the space limit, we report only the results on the test set of bAbI, since the dev set follows the same trend.
   It is worth noting that the very low upper bound is not surprising. Indeed, in the bAbI dataset, even though the top predictions contain items that might be good for the user, if the actual single true label is not recommended, the recommendation fails.
   The analysis of the loss shows that the entity recognizer (ER) is the component with the highest negative impact on the recommendation process. Even though the entity recognizer was able to recognize ∼85% of the entities on the bAbI dataset, the error (∼15% of entities not recognized) determined a strong loss in terms of accuracy. The second component in

⁹ http://wiki.dbpedia.org/
¹⁰ https://research.fb.com/downloads/babi/
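The PageRank with Priors step used by the Recommendation Services can be pictured with a small power-iteration sketch in Python. It is only an illustration under assumptions: the graph is a hand-made toy (in the framework, nodes come from Wikidata and edges from DBpedia), and the function name, the damping factor of 0.85, and the iteration count are choices made here, not values reported in the paper.

```python
from typing import Dict, List


def personalized_pagerank(
    graph: Dict[str, List[str]],
    priors: Dict[str, float],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Power iteration where the teleport step jumps to nodes according to the priors."""
    nodes = list(graph)
    total = sum(priors.values())
    prior = {n: priors.get(n, 0.0) / total for n in nodes}
    rank = dict(prior)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * prior[n] for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if not neighbours:
                continue  # the toy graph has no dangling nodes, so no rank mass is lost
            share = damping * rank[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        rank = nxt
    return rank


# Toy undirected graph, written as symmetric adjacency lists (illustrative data only).
GRAPH = {
    "The Matrix": ["Keanu Reeves", "science fiction"],
    "John Wick": ["Keanu Reeves", "action"],
    "Titanic": ["romance"],
    "Keanu Reeves": ["The Matrix", "John Wick"],
    "science fiction": ["The Matrix"],
    "action": ["John Wick"],
    "romance": ["Titanic"],
}
MOVIES = ["The Matrix", "John Wick", "Titanic"]
PRIORS = {"The Matrix": 1.0}  # the user profile: one liked movie

scores = personalized_pagerank(GRAPH, PRIORS)
ranked = sorted((m for m in MOVIES if m not in PRIORS), key=scores.get, reverse=True)
```

Here the liked movie carries all the prior weight, so rank flows through the shared actor node and John Wick ends up ahead of the unconnected Titanic. The `priors` dictionary is also a natural place where per-node weights from a critique could plug in, although how the framework sets those weights is not detailed in the text.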
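One way to read the HitRate@n measure is sketched below in Python. The sketch assumes that the ratio is taken over test examples and that each example has a single true item, as in bAbI; the function name and the toy recommendation lists are invented for illustration and are not results from the paper.

```python
from typing import List


def hit_rate_at_n(ranked_lists: List[List[str]], true_items: List[str], n: int) -> float:
    """Fraction of examples whose single true item appears in the top-n recommendations."""
    if not true_items:
        return 0.0
    hits = sum(1 for ranked, truth in zip(ranked_lists, true_items) if truth in ranked[:n])
    return hits / len(true_items)


# Toy data: three examples, each with a ranked list and one true item.
RECOMMENDATIONS = [
    ["Ghost", "Heat", "Alien"],
    ["Up", "Big", "Jaws"],
    ["Rocky", "Fargo", "Speed"],
]
TRUE_ITEMS = ["Ghost", "Jaws", "Seven"]

hr_at_1 = hit_rate_at_n(RECOMMENDATIONS, TRUE_ITEMS, 1)
hr_at_3 = hit_rate_at_n(RECOMMENDATIONS, TRUE_ITEMS, 3)
```

Under this reading, an example counts as a miss whenever the one true item is absent from the top n, even if the list contains other items the user might like.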


terms of negative impact on the recommendation accuracy            release the source code of the framework and make it available
is the Intent Recognizer (IR). In this case, the component         for the community. The preliminary experimental evaluation
was able to correctly recognize ∼ 77% of the intents (both         on two state-of-the-art datasets shows the impact of each
preference elicitation, and recommendation request). The           component in a classical movie recommendation scenario. In
component with the lowest impact on the recommendation             this way, the user is aware of the limitations of the single
accuracy is the Sentiment Recognizer (SR); in this case the        modules implemented. In the near future, we plan to run an
component was able to correctly recognize ∼ 83% of the             experimental evaluation on another synthetic dataset and
sentiments. By analyzing these results, it emerges that the        to perform an in-vivo evaluation with real users on three
Entity Recognizer plays a crucial role in the recommenda-          different domains (movie, book, music) with the aim of in-
tion process of a conversational recommender. This aspect          vestigating the impact of other capabilities (e.g. critiquing,
is particularly crucial when a rich user profile is not avail-     explanation) on the recommendation process.
able, and the recommender has to work on a small number
of preferences as in the bAbI dataset. The second compo-           ACKNOWLEDGMENT
nent that negatively influences the recommendation is the IR. Also in
this case, if the CoRS is not able to correctly identify the user
intention, it will not be able to activate the correct process to
satisfy the request. Finally, the SR is the component with the lowest
negative impact. However, in the bAbI dataset all the sentences have a
positive sentiment, and this facilitates the work of the component. We
also tested our framework on a dataset recently released by GroupLens
[9]. This dataset has been collected with the aim of analyzing the
recommendation requests of real users to a conversational recommender.
In this dataset we could only analyze the accuracy of the entity
recognizer and the intent recognizer for the request-recommendation
intent. The dataset is composed of 694 sentences. The framework
correctly recognized 7.64% of the request-recommendation intents and
64.39% of the entities. The very low performance of the IR is due to
the fact that in this dataset users request recommendations in a very
varied and concise form (e.g., “action movies”, “exploitations films”,
“film with sharks”, “i’m looking for a hard sci-fi movie”), so a
specific training of the IR component is required. However, this
limitation could be overcome in a real-world scenario, since the system
can ask the user to reformulate the sentence when it is not able to
understand it. Moreover, the ER accuracy is lower than the one measured
on the bAbI dataset, probably because the entities are written directly
by the users and might contain errors, or might not correspond exactly
to the entities in our database (e.g., “call work orange”). This
requires a disambiguation step that cannot be performed in an in-vitro
experiment.

                  HR@5       HR@10        HR@20
           UB     0.75       1.21         1.93

                  Loss@5     Loss@10      Loss@20
           IR     -34.00%    -30.86%      -24.03%
           ER     -46.00%    -35.80%      -27.13%
           SR     -20.00%    -16.05%      -14.73%

Table 1: Loss for each configuration in terms of HitRate

5    CONCLUSION AND FUTURE WORK

In this paper, we proposed a framework for building conversational
content-based recommender systems in any domain. The only requirement
to be satisfied is to provide a list of entities and properties in the
Wikidata format. We will soon

This work has been funded by the projects UNIFIED WEALTH MANAGEMENT
PLATFORM - OBJECTWAY SpA - Via Giovanni Da Procida nr. 24, 20149
MILANO - c.f., P. IVA 07114250967, and PON01 00850 ASK-Health (Advanced
system for the interpretations and sharing of knowledge in health
care).

REFERENCES
 [1] Pierpaolo Basile, Cataldo Musto, Marco de Gemmis, Pasquale Lops,
     Fedelucio Narducci, and Giovanni Semeraro. 2014. Content-based
     recommender systems + DBpedia knowledge = semantics-aware
     recommender systems. In Semantic Web Evaluation Challenge.
     Springer, 163–169.
 [2] Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit
     Chopra, Alexander Miller, Arthur Szlam, and Jason Weston. 2015.
     Evaluating prerequisite qualities for learning end-to-end dialog
     systems. arXiv preprint arXiv:1511.06931 (2015).
 [3] M Goker and Cynthia Thompson. 2000. The adaptive place advisor: A
     conversational recommendation system. In Proceedings of the 8th
     German Workshop on Case Based Reasoning. Citeseer, 187–198.
 [4] Peter Grasch, Alexander Felfernig, and Florian Reinfrank. 2013.
     ReComment: Towards Critiquing-based Recommendation with Speech
     Interaction. In Proceedings of the 7th ACM Conference on
     Recommender Systems (RecSys ’13). ACM, New York, NY, USA, 157–164.
     DOI:https://doi.org/10.1145/2507157.2507161
 [5] Taher H Haveliwala. 2003. Topic-sensitive pagerank: A
     context-sensitive ranking algorithm for web search. IEEE
     transactions on knowledge and data engineering 15, 4 (2003),
     784–796.
 [6] Chen He, Denis Parra, and Katrien Verbert. 2016. Interactive
     recommender systems: A survey of the state of the art and future
     research challenges and opportunities. Expert Systems with
     Applications 56 (2016), 9–27.
     DOI:https://doi.org/10.1016/j.eswa.2016.02.013
 [7] Vladimir Ilievski, Claudiu Musat, Andreea Hossmann, and Michael
     Baeriswyl. 2018. Goal-Oriented Chatbot Dialog Management
     Bootstrapping with Transfer Learning. CoRR abs/1802.00500 (2018).
     arXiv:1802.00500 http://arxiv.org/abs/1802.00500
 [8] Michael Jugovac and Dietmar Jannach. 2017. Interacting with
     Recommenders—Overview and Research Directions. ACM Trans.
     Interact. Intell. Syst. 7, 3, Article 10 (Sept. 2017), 46 pages.
     DOI:https://doi.org/10.1145/3001837
 [9] Jie Kang, Kyle Condiff, Shuo Chang, Joseph A. Konstan, Loren G.
     Terveen, and F. Maxwell Harper. 2017. Understanding How People Use
     Natural Language to Ask for Recommendations. In Proceedings of the
     Eleventh ACM Conference on Recommender Systems, RecSys 2017, Como,
     Italy, August 27-31, 2017, Paolo Cremonesi, Francesco Ricci,
     Shlomo Berkovsky, and Alexander Tuzhilin (Eds.). ACM, 229–237.
     DOI:https://doi.org/10.1145/3109859.3109873
[10] P. Lops, M. De Gemmis, G. Semeraro, F. Narducci, and C. Musto.
     2011. Leveraging the LinkedIn social network data for extracting
     content-based user profiles. RecSys’11 - Proc. of the 5th ACM
     Conf. on Recommender Systems (2011), 293–296.
     DOI:https://doi.org/10.1145/2043932.2043986
Knowledge-aware and Conversational Recommender Systems (KaRS) Workshop 2018 (co-located with RecSys 2018), October 7,
2018, Vancouver, Canada.                                                                              F. Narducci et al.


[11] Tariq Mahmood and Francesco Ricci. 2009. Improving recommender
     systems with adaptive conversational strategies. In Proceedings of
     the 20th ACM conference on Hypertext and hypermedia. ACM, 73–82.
[12] Cataldo Musto, Fedelucio Narducci, Pasquale Lops, Marco De Gemmis,
     and Giovanni Semeraro. 2016. ExpLOD: A Framework for Explaining
     Recommendations based on the Linked Open Data Cloud. In
     Proceedings of the 10th ACM Conference on Recommender Systems.
     ACM, 151–154.
[13] C. Musto, F. Narducci, P. Lops, G. Semeraro, M. De Gemmis, M.
     Barbieri, J. Korst, V. Pronk, and R. Clout. 2012. Enhanced
     semantic TV-show representation for personalized electronic
     program guides. In Int. Conf. on User Modeling, Adaptation, and
     Personalization, Vol. 7379 LNCS. 188–199.
     DOI:https://doi.org/10.1007/978-3-642-31454-4_16
[14] F. Narducci, P. Basile, C. Musto, P. Lops, A. Caputo, M. de
     Gemmis, L. Iaquinta, and G. Semeraro. 2016. Concept-based item
     representations for a cross-lingual content-based recommendation
     process. Information Sciences 374 (2016), 1339–1351.
     DOI:https://doi.org/10.1016/j.ins.2016.09.022
[15] F. Narducci, M. Palmonari, and G. Semeraro. 2013. Cross-language
     semantic retrieval and linking of e-gov services. Lecture Notes in
     Computer Science (including subseries Lecture Notes in Artificial
     Intelligence and Lecture Notes in Bioinformatics) 8219 LNCS, PART
     2 (2013), 130–145.
     DOI:https://doi.org/10.1007/978-3-642-41338-4_9
[16] Thuy Ngoc Nguyen and Francesco Ricci. 2017. Dynamic Elicitation of
     User Preferences in a Chat-based Group Recommender System. In
     Proceedings of the Symposium on Applied Computing (SAC ’17). ACM,
     New York, NY, USA, 1685–1692.
     DOI:https://doi.org/10.1145/3019612.3019764
[17] Maximilian Nickel, Kevin Murphy, Volker Tresp, and Evgeniy
     Gabrilovich. 2016. A review of relational machine learning for
     knowledge graphs. Proc. IEEE 104, 1 (2016), 11–33.
[18] Maximilian Nickel, Lorenzo Rosasco, Tomaso A Poggio, and others.
     2016. Holographic Embeddings of Knowledge Graphs. In The Thirtieth
     AAAI Conference on Artificial Intelligence (AAAI-16). 1955–1961.
[19] Francesco Ricci and Fabio Del Missier. 2004. Supporting travel
     decision making through personalized recommendation. In Designing
     personalized user experiences in eCommerce. Springer, 231–251.