<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Bari, Italy. andrea.iovine@uniba.it (A. Iovine); fedelucio.narducci@poliba.it (F. Narducci); marco.degemmis@uniba.it (M. de Gemmis); giovanni.semeraro@uniba.it (G. Semeraro)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>An Investigation on the Impact of Natural Language on Conversational Recommendations</article-title>
        <subtitle>Discussion Paper</subtitle>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andrea Iovine</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fedelucio Narducci</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco de Gemmis</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Semeraro</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Politecnico di Bari</institution>
          ,
          <addr-line>Via E. Orabona 4, 70125, Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Bari Aldo Moro</institution>
          ,
          <addr-line>Via E. Orabona 4, 70125, Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>In this paper, we investigate the combination of Virtual Assistants and Conversational Recommender Systems (CoRSs) by designing and implementing a framework named ConveRSE for building chatbots that can recommend items from different domains and interact with the user through natural language. A user experiment was carried out to understand how natural language influences both the cost of interaction and the recommendation accuracy of a CoRS. Experimental results show that natural language can indeed improve user experience, but some critical aspects of the interaction should be mitigated appropriately.</p>
      </abstract>
      <kwd-group>
        <kwd>Conversational Recommender Systems</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>Conversational Agents</kwd>
        <kwd>Information Retrieval</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        This paper presents work that was previously published in Decision Support Systems [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        A Conversational Recommender System (CoRS) is defined as a system that provides
recommendations to users via a multi-turn dialogue [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. CoRSs acquire the user profile iteratively: the system can ask users to rate some
items, and users in turn can influence the outcome of the recommendation by
providing feedback on the suggested items. Traditional recommender systems, on the other
hand, require all user information to be provided before generating a recommendation [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        The idea of combining Virtual Assistants and recommender systems was
introduced in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which highlights the technological gap between the two kinds of systems. The authors
argue that a VA can improve the recommendation process because it can learn the users’
evolving, diverse and multi-aspect preferences. We investigate this claim by proposing the
integration of a chat-based interface into a CoRS.
      </p>
      <p>
        CoRSs have been developed using many different input and output modalities, such as forms
[
        <xref ref-type="bibr" rid="ref10 ref6 ref8 ref9">8, 9, 6, 10</xref>
        ] and voice/text [
        <xref ref-type="bibr" rid="ref11 ref12 ref13 ref14">11, 12, 13, 14</xref>
        ]. CoRSs also differ based on the preference elicitation
strategy, such as constraint-based [
        <xref ref-type="bibr" rid="ref11 ref8">8, 11</xref>
        ], critiquing-based [
        <xref ref-type="bibr" rid="ref15 ref6">6, 15, 16</xref>
        ], or strategies that rely on
acquiring pairwise preferences [17]. In this paper, we propose a system that features a
user-driven preference elicitation strategy, which can build the profile via natural language messages
that are directly provided by users. We also compare its effectiveness against a system-driven,
question-answer elicitation strategy.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Workflow and System Architecture</title>
      <p>
        Interaction with ConveRSE is divided into three main steps, which are also found in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]: (i)
preference elicitation; (ii) generation of recommendations; (iii) acquisition of user feedback.
These steps are repeated until a satisfactory recommendation is generated. Preference elicitation
is the most important step in any CoRS [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and ConveRSE is no exception. In ConveRSE,
this step is largely driven by users, who interact with the CoRS by talking about the items
and the properties that they like or dislike (e.g. "I like The Matrix", "I love Sylvester Stallone, but
I hate Rocky"), as shown in Figure 2. Compared to a system-driven interface that proactively
proposes the items to evaluate, this increases flexibility and allows the recommender system
to focus on the features that matter to users. After a recommendation has been generated,
users can provide additional feedback on it (e.g. "I don’t like this movie", "I like it but I don’t like
the genre"). This feedback is integrated into the user profile and exploited to improve further
recommendations.
      </p>
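      <p>The three-step loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ConveRSE implementation: the names <monospace>run_session</monospace> and <monospace>toy_recommend</monospace>, the toy catalogue, the round limit, and the +1/-1 encoding of preferences are all hypothetical.</p>
      <preformat><![CDATA[
```python
from typing import Callable, Dict, List, Optional

# Hypothetical encoding: item or property -> +1 (like) / -1 (dislike).
Profile = Dict[str, int]


def run_session(
    preferences: Profile,
    recommend: Callable[[Profile], List[str]],
    accepts: Callable[[str], bool],
    max_rounds: int = 5,
) -> Optional[str]:
    """Repeat elicitation -> recommendation -> feedback until an item is accepted."""
    profile: Profile = dict(preferences)      # (i) preference elicitation
    for _ in range(max_rounds):
        candidates = recommend(profile)       # (ii) generation of recommendations
        if not candidates:
            return None
        item = candidates[0]
        if accepts(item):                     # (iii) acquisition of user feedback
            return item
        profile[item] = -1                    # negative feedback enters the profile
    return None


# Toy recommender (assumption): propose catalogue items not yet in the profile.
CATALOGUE = ["Rocky", "The Matrix", "Inception"]


def toy_recommend(profile: Profile) -> List[str]:
    return [item for item in CATALOGUE if item not in profile]


result = run_session(
    {"Sylvester Stallone": 1},
    toy_recommend,
    accepts=lambda item: item == "Inception",
)
print(result)
```
]]></preformat>
      <p>In this sketch every rejected item is stored as a negative preference, so it is never proposed again; a real system would also handle richer feedback, such as a critique of a single property.</p>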
      <p>The architecture of ConveRSE is shown in Figure 1. In particular, ConveRSE implements the
following components:</p>
      <p>[Figure 1: architecture of ConveRSE. Labelled components: NLU (Intent Recognition, Entity Recognition, Sentiment Analysis); Dialogue Manager (Dialogue State Tracking, Dialogue Policy, Response Generation); Recommender System; Response.]</p>
      <p>Natural Language Understanding (NLU): It is in charge of understanding the user’s
utterance. It performs three tasks: (i) intent recognition, which classifies the action or request
expressed in the message (e.g. providing a preference, requesting a recommendation), (ii) entity
recognition, which extracts items (e.g. movies, actors, directors) that are mentioned in the
message, and (iii) sentiment analysis, which then assigns a sentiment score to each item. Intent
recognition is implemented using Google Dialogflow<sup>1</sup>, Sentiment Analysis is performed using
the Stanford CoreNLP<sup>2</sup> Sentiment Tagger, while Entity Recognition is developed in-house.</p>
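      <p>To make the three NLU tasks concrete, the sketch below uses simple rules in place of the actual components. It does not call Dialogflow or Stanford CoreNLP: the entity dictionary, the sentiment lexicon, the intent labels, and the <monospace>understand</monospace> function are assumptions introduced only for illustration.</p>
      <preformat><![CDATA[
```python
import re
from dataclasses import dataclass
from typing import Dict

# Hypothetical toy resources; a real system would rely on trained models.
KNOWN_ENTITIES = ["The Matrix", "Sylvester Stallone", "Rocky"]
SENTIMENT_LEXICON = {"love": 2, "like": 1, "dislike": -1, "hate": -2}


@dataclass
class NLUResult:
    intent: str
    sentiment: Dict[str, int]  # recognized entity -> sentiment score


def understand(utterance: str) -> NLUResult:
    text = utterance.lower()
    # (i) Intent recognition: a keyword rule stands in for a trained classifier.
    intent = "request_recommendation" if "recommend" in text else "provide_preference"
    sentiment: Dict[str, int] = {}
    # Split into clauses so that each entity is paired with the opinion word near it.
    for clause in re.split(r",|\bbut\b|\band\b", text):
        # (ii) Entity recognition: plain dictionary lookup.
        entities = [e for e in KNOWN_ENTITIES if e.lower() in clause]
        # (iii) Sentiment analysis: sum the lexicon scores; a negation flips the sign.
        words = re.findall(r"[a-z']+", clause)
        score = sum(SENTIMENT_LEXICON.get(word, 0) for word in words)
        if "don't" in words or "not" in words:
            score = -score
        for entity in entities:
            sentiment[entity] = score
    return NLUResult(intent=intent, sentiment=sentiment)


result = understand("I love Sylvester Stallone, but I hate Rocky")
print(result)
```
]]></preformat>
      <p>Splitting the utterance into clauses is what lets the two entities in the example receive opposite scores; a single score for the whole sentence would lose that distinction.</p>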
      <p>Dialogue Manager (DM): It supervises the interaction process by coordinating the activity
of all other components. It performs three tasks: (i) Dialogue State Tracking, which keeps track
of all information exchanged with the user and updates it accordingly; (ii) Dialogue Policy, which
selects the best action to perform based on the current intent and the dialogue state (e.g.
generate a recommendation, ask for clarification); and (iii) Response Generation, which generates a
textual response by filling a template with contextual information. This component is developed
in-house.</p>
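      <p>The three Dialogue Manager tasks can be illustrated with a small sketch. The actual component is developed in-house and its logic is not described here, so the state fields, action names, templates, and the three-preference threshold used by the policy are assumptions.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass, field
from typing import Dict

MIN_PREFERENCES = 3  # assumed threshold before a recommendation is generated


@dataclass
class DialogueState:
    profile: Dict[str, int] = field(default_factory=dict)  # entity -> sentiment score

    def update(self, sentiment: Dict[str, int]) -> None:
        """Dialogue State Tracking: merge newly expressed preferences."""
        self.profile.update(sentiment)


def select_action(intent: str, state: DialogueState) -> str:
    """Dialogue Policy: choose an action from the intent and the current state."""
    if intent == "request_recommendation":
        if len(state.profile) >= MIN_PREFERENCES:
            return "recommend"
        return "ask_more_preferences"
    return "acknowledge_preference"


# Hypothetical response templates, one per action.
TEMPLATES = {
    "recommend": "Based on your {count} preferences, here is a suggestion.",
    "ask_more_preferences": "Please tell me {missing} more thing(s) you like or dislike.",
    "acknowledge_preference": "Got it, I have noted {count} preference(s) so far.",
}


def generate_response(action: str, state: DialogueState) -> str:
    """Response Generation: fill a template with contextual information."""
    count = len(state.profile)
    missing = max(MIN_PREFERENCES - count, 0)
    return TEMPLATES[action].format(count=count, missing=missing)


state = DialogueState()
state.update({"The Matrix": 1, "Rocky": -2})
action = select_action("request_recommendation", state)
reply = generate_response(action, state)
print(action, "|", reply)
```
]]></preformat>
      <p>Keeping the policy separate from response generation means that the wording of the replies can change without touching the decision logic.</p>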
      <p>Recommender System: It handles all functions related to profile building and generation
of suggestions. In particular, ConveRSE uses a graph-based recommendation algorithm based
on PageRank with Priors [18]. The algorithm exploits a knowledge graph extracted from Wikidata
[19], in which both items and their properties are represented as nodes. This
component is also responsible for generating explanations, which exploit the connections between
the recommended item and the items in the user profile.</p>
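      <p>The idea behind PageRank with Priors can be sketched with a plain power iteration in which the random walk restarts on the nodes of the user profile. The graph below is a hand-written toy example rather than data extracted from Wikidata, and the damping factor, the iteration count, and the function name <monospace>pagerank_with_priors</monospace> are assumptions.</p>
      <preformat><![CDATA[
```python
from typing import Dict, List


def pagerank_with_priors(
    graph: Dict[str, List[str]],
    priors: Dict[str, float],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Power iteration in which the walk restarts according to `priors`.

    Assumes every node has at least one neighbour and all prior keys are in the graph.
    """
    total = sum(priors.values())
    prior = {node: priors.get(node, 0.0) / total for node in graph}
    scores = dict(prior)
    for _ in range(iterations):
        # Restart mass goes to the prior nodes (the user profile).
        updated = {node: (1.0 - damping) * prior[node] for node in graph}
        # The remaining mass flows evenly along the edges of the knowledge graph.
        for node, neighbours in graph.items():
            share = damping * scores[node] / len(neighbours)
            for neighbour in neighbours:
                updated[neighbour] += share
        scores = updated
    return scores


# Toy undirected knowledge graph: items and their properties are both nodes.
GRAPH = {
    "The Matrix": ["Keanu Reeves", "science fiction"],
    "John Wick": ["Keanu Reeves", "action"],
    "Inception": ["science fiction"],
    "Die Hard": ["action"],
    "Keanu Reeves": ["The Matrix", "John Wick"],
    "science fiction": ["The Matrix", "Inception"],
    "action": ["John Wick", "Die Hard"],
}

scores = pagerank_with_priors(GRAPH, priors={"The Matrix": 1.0})
candidates = ["John Wick", "Inception", "Die Hard"]
ranking = sorted(candidates, key=lambda item: scores[item], reverse=True)
print(ranking)
```
]]></preformat>
      <p>In this toy graph, items that share a property with the liked item rank above the item that is connected to it only through a longer path. The same paths could serve as the basis for an explanation.</p>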
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Evaluation</title>
      <p>We created three instances of ConveRSE that generate recommendations for different
domains: movies, books and music, with 15,954, 7,592 and 12,926 recommendable
items, respectively. We also implemented three different interaction modes for the CoRS:
(i) Natural Language (NL), in which users express their preferences in the form of short text
messages, and all the steps described in Section 3 are performed via text; (ii) Buttons, which uses
a traditional system-driven approach for profile elicitation, in which the CoRS proposes a set of
popular items, and users express a rating by pressing buttons; (iii) Mixed, which is an extension
of the NL interface, in which users can answer certain system questions using buttons (e.g.
when multiple entities in the knowledge base match an item mentioned in the user utterance).
Therefore, there are nine configurations in total (three domains times three interaction modes).</p>
      <p><sup>1</sup> https://dialogflow.cloud.google.com/</p>
      <p><sup>2</sup> https://stanfordnlp.github.io/CoreNLP/</p>
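      <p>The experimental configurations are the Cartesian product of the domains and the interaction modes, which the following short sketch enumerates. The string labels are shorthand chosen for this example.</p>
      <preformat><![CDATA[
```python
from itertools import product

DOMAINS = ["movies", "books", "music"]
MODES = ["NL", "Buttons", "Mixed"]

# Every domain is paired with every interaction mode.
configurations = list(product(DOMAINS, MODES))
for domain, mode in configurations:
    print(f"{domain}: {mode}")
```
]]></preformat>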
      <p>We performed three within-subjects experiments (one for each domain), which involved
50 people for the movie domain, 55 for the book domain, and 54 for the music domain. The
experiments address the following Research Questions: RQ1: Can natural
language improve a Conversational Recommender System in terms of cost of interaction? RQ2:
Can natural language improve a Conversational Recommender System in terms of quality of the
recommendations?</p>
      <p>During the experiment, participants were briefly instructed on how to use the system. Then,
they performed all steps described in Section 3. After providing at least three preferences,
they received a set of five recommended items, each of which could be accepted, rejected,
or given more complex feedback. During the experiment, we collected several
metrics related to the interaction cost and recommendation accuracy. For the interaction cost,
we recorded the number of questions (NQ) asked by the system, the time needed to answer
those questions (TPQ), the total Interaction Time (IT), and the Query Density (QD) [20], which
measures the average number of new concepts (i.e. entities) introduced in each utterance. For
the recommendation quality, we measured the Accuracy and the Mean Average Precision (MAP).</p>
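      <p>Two of these metrics can be written down compactly. The sketch below follows the textual definition of Query Density given above and a common formulation of Mean Average Precision over ranked lists; the exact formulas used in the study may differ, and the function names and sample data are hypothetical.</p>
      <preformat><![CDATA[
```python
from typing import Sequence, Set


def query_density(utterance_entities: Sequence[Set[str]]) -> float:
    """Average number of entities per utterance that were not mentioned before."""
    seen: Set[str] = set()
    new_total = 0
    for entities in utterance_entities:
        new_total += len(entities - seen)
        seen |= entities
    return new_total / len(utterance_entities)


def average_precision(accepted: Sequence[bool]) -> float:
    """AP of one ranked list; accepted[k] says whether the item at rank k+1 was accepted."""
    hits = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(accepted, start=1):
        if is_hit:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(sessions: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-session average precision values."""
    return sum(average_precision(s) for s in sessions) / len(sessions)


# Made-up sample data: three utterances and two sessions of five recommendations each.
qd = query_density([{"The Matrix"}, {"The Matrix", "Rocky"}, {"Sylvester Stallone"}])
session_a = [True, False, True, False, False]
session_b = [False, True, False, False, False]
map_score = mean_average_precision([session_a, session_b])
print(qd, map_score)
```
]]></preformat>
      <p>In the sample data, the second utterance repeats an entity that was already mentioned, so only one of its two entities counts as new.</p>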
      <p>Results are shown in Table 1. We can observe that the NL and Mixed configurations ask a lower
number of questions compared to the button-based one. Also, the NL configuration requires
longer IT and TPQ, while the Mixed configuration obtained the lowest values. This suggests
that an interaction based entirely on natural language can become inefficient in specific cases,
for example, when a disambiguation of the user input is required. Integrating a button-based
interface for these cases leads to reduced typing time and fewer mistakes, which substantially
reduces the interaction cost. We also observe that the Mixed interaction mode obtains the best
recommendation accuracy results in all domains, which suggests that it allows users to express
their preferences more effectively, thus improving the quality of the suggestions.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, we presented an experimental study on the effect of introducing natural language
interaction into a CoRS. Although a dialogue in natural language has the potential to improve
interaction cost and quality of recommendations of a CoRS, a purely NL-based interface poses
some issues that need to be addressed. Specifically, when the user has to choose among a set of
possible options, the integration of buttons drastically improves user experience.</p>
      <ref-list>
        <ref>
          <mixed-citation>[15] L. Chen, P. Pu, Critiquing-based recommenders: survey and emerging trends, User Modeling and User-Adapted Interaction 22 (2012) 125–150. URL: http://link.springer.com/10.1007/s11257-011-9108-6. doi:10.1007/s11257-011-9108-6.</mixed-citation>
        </ref>
        <ref id="ref16">
          <mixed-citation>[16] G. Wu, K. Luo, S. Sanner, H. Soh, Deep Language-based Critiquing for Recommender Systems, in: Proceedings of the 13th ACM Conference on Recommender Systems, RecSys ’19, Copenhagen, Denmark, ACM, New York, NY, USA, 2019, pp. 137–145. URL: http://doi.acm.org/10.1145/3298689.3347009. doi:10.1145/3298689.3347009.</mixed-citation>
        </ref>
        <ref id="ref17">
          <mixed-citation>[17] K. Christakopoulou, F. Radlinski, K. Hofmann, Towards Conversational Recommender Systems, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’16, ACM Press, San Francisco, California, USA, 2016, pp. 815–824. URL: http://dl.acm.org/citation.cfm?doid=2939672.2939746. doi:10.1145/2939672.2939746.</mixed-citation>
        </ref>
        <ref id="ref18">
          <mixed-citation>[18] T. H. Haveliwala, Topic-sensitive PageRank: A context-sensitive ranking algorithm for web search, IEEE Transactions on Knowledge and Data Engineering 15 (2003) 784–796.</mixed-citation>
        </ref>
        <ref id="ref19">
          <mixed-citation>[19] D. Vrandečić, M. Krötzsch, Wikidata: a free collaborative knowledgebase, Communications of the ACM 57 (2014) 78–85. ACM, New York, NY, USA.</mixed-citation>
        </ref>
        <ref id="ref20">
          <mixed-citation>[20] J. Glass, J. Polifroni, S. Seneff, V. Zue, Data collection and performance evaluation of spoken dialogue systems: The MIT experience, in: Sixth International Conference on Spoken Language Processing, 2000.</mixed-citation>
        </ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Rafailidis</surname>
          </string-name>
          ,
          <article-title>The technological gap between virtual assistants and recommendation systems</article-title>
          , arXiv preprint arXiv:1901.00431 (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Jugovac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          ,
          <article-title>Interacting with recommenders: Overview and research directions</article-title>
          ,
          <source>ACM Trans. Interact. Intell. Syst</source>
          .
          <volume>7</volume>
          (
          <year>2017</year>
          )
          <fpage>10:1</fpage>
          -
          <lpage>10:46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Moller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-P.</given-names>
            <surname>Engelbrecht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Kuhnel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Wechsung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <article-title>A taxonomy of quality of service and quality of experience of multimodal human-machine interaction</article-title>
          ,
          <source>in: Quality of Multimedia Experience</source>
          ,
          <year>2009</year>
          . QoMEx 2009. International Workshop on, IEEE,
          <year>2009</year>
          , pp.
          <fpage>7</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Iovine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Narducci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Semeraro</surname>
          </string-name>
          ,
          <article-title>Conversational recommender systems and natural language: A study through the ConveRSE framework</article-title>
          ,
          <source>Decision Support Systems</source>
          (
          <year>2020</year>
          ) 113250. doi:10.1016/j.dss.2020.113250.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Manzoor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>A Survey on Conversational Recommender Systems</article-title>
          , arXiv preprint arXiv:2004.00646 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mahmood</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ricci</surname>
          </string-name>
          ,
          <article-title>Improving recommender systems with adaptive conversational strategies</article-title>
          ,
          <source>in: Proceedings of the 20th ACM conference on Hypertext and hypermedia, ACM</source>
          ,
          <year>2009</year>
          , pp.
          <fpage>73</fpage>
          -
          <lpage>82</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Rafailidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Manolopoulos</surname>
          </string-name>
          , Can Virtual Assistants Produce Recommendations?,
          <source>in: Proceedings of the 9th International Conference on Web Intelligence, Mining and Semantics</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Goker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <article-title>The adaptive place advisor: A conversational recommendation system</article-title>
          ,
          <source>in: Proceedings of the 8th German Workshop on Case Based Reasoning</source>
          , Citeseer,
          <year>2000</year>
          , pp.
          <fpage>187</fpage>
          -
          <lpage>198</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pu</surname>
          </string-name>
          ,
          <article-title>A Comparative Study of Compound Critique Generation in Conversational Recommender Systems</article-title>
          , in: V. P. Wade, H. Ashman, B. Smyth (Eds.),
          <source>Adaptive Hypermedia and Adaptive Web-Based Systems, Lecture Notes in Computer Science</source>
          , Springer, Berlin, Heidelberg,
          <year>2006</year>
          , pp.
          <fpage>234</fpage>
          -
          <lpage>243</lpage>
          . doi:10.1007/11768012_25.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L. W.</given-names>
            <surname>Dietz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Myftija</surname>
          </string-name>
          , W. Wörndl,
          <article-title>Designing a conversational travel recommender system based on data-driven destination characterization</article-title>
          , in: ACM RecSys workshop on recommenders in tourism,
          <year>2019</year>
          , pp.
          <fpage>17</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>C. A.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Goker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Langley</surname>
          </string-name>
          ,
          <article-title>A Personalized System for Conversational Recommendations</article-title>
          ,
          <source>Journal of Artificial Intelligence Research</source>
          <volume>21</volume>
          (
          <year>2004</year>
          )
          <fpage>393</fpage>
          -
          <lpage>428</lpage>
          . URL: https://jair.org/index.php/jair/article/view/10374. doi:10.1613/jair.1318.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Conversational Recommender System</article-title>
          , arXiv:1806.03277 [cs] (
          <year>2018</year>
          ). URL: http://arxiv.org/abs/1806.03277.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>H.</given-names>
            <surname>Mori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chiba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Nose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ito</surname>
          </string-name>
          ,
          <article-title>Dialog-Based Interactive Movie Recommendation: Comparison of Dialog Strategies</article-title>
          , volume
          <volume>82</volume>
          , Springer International Publishing,
          <year>2018</year>
          , pp.
          <fpage>77</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Habib</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , K. Balog,
          <article-title>IAI MovieBot: A Conversational Movie Recommender System</article-title>
          ,
          <source>Proceedings of the 29th ACM International Conference on Information &amp; Knowledge Management</source>
          (
          <year>2020</year>
          )
          <fpage>3405</fpage>
          -
          <lpage>3408</lpage>
          . URL: http://arxiv.org/abs/2009.03668. doi:10.1145/3340531.3417433.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>L.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pu</surname>
          </string-name>
          ,
          <article-title>Critiquing-based recommenders: survey and emerging trends</article-title>
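          ,
          <source>User Modeling and User-Adapted Interaction</source>
          <volume>22</volume>
          (
          <year>2012</year>
          )
          <fpage>125</fpage>
          -
          <lpage>150</lpage>
          . URL: http://link.springer.com/10.1007/s11257-011-9108-6. doi:10.1007/s11257-011-9108-6.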
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>