<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Tangible Decision-making in Sensors Augmented Spaces</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>David Massimo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Ricci</string-name>
          <email>friccig@unibz.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Free University of Bolzano</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>53</fpage>
      <lpage>62</lpage>
      <abstract>
        <p>Recommender Systems (RSs) are web tools aimed at easing users' online decision-making. Here we propose a complementary scenario: supporting (tangible) decision-making in the physical space. In particular, we propose a novel RS technology that harnesses data coming from a sensor augmented environment, e.g., a Smart City. In such a setting, users' movements can be tracked, and the knowledge of their choices (visits to points of interest, POIs) can be used to generate recommendations for not yet visited POIs. The proposed technique overcomes the inability of current RSs to generalise the preferences directly derived from the user's observed behaviour by decoupling the learning of the user behaviour (predicted choices) from the recommender model (recommended choices). In our approach we apply clustering to users' observed sequences of choices (i.e., POI visit trajectories) in order to identify like-behaving users, and then we learn the optimal user behaviour model for each cluster. Then, by harnessing the learned optimal behaviour model, we generate novel and relevant recommendations, which provide useful information in addition to the choices that the user would make without any recommendation (predicted choices). In this paper we summarise the proposed RS technology; we describe its performance across different dimensions in an offline test and a user study by comparing the proposed technique with session-aware nearest neighbour based baselines (SKNN). The offline analysis results show that our approach suggests items that are novel and increase the user's satisfaction (high reward), whereas the SKNN approaches are good at predicting the exact user behaviour. Interestingly, the online results show that the proposed approach excels in what a (tourism) RS should do: suggesting items that the user is unaware of and that are also relevant.</p>
      </abstract>
      <kwd-group>
        <kwd>Recommender Systems</kwd>
        <kwd>Inverse Reinforcement Learning</kwd>
        <kwd>Clustering</kwd>
        <kwd>User Study</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The research described in this paper was developed in the project Suggesto Market Space, which is funded by the Autonomous Province of Trento, in collaboration with Ectrl Solutions and Fondazione Bruno Kessler.</p>
      <p>Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        Finding relevant information in an online catalogue is not an easy task. Users may be exposed to a large variety of content, incurring information overload
        [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], and therefore may make poorly informed decisions. In order to ease human
decision-making, Recommender Systems (RSs) have been proposed. An RS is a
web tool that identifies for a user items that are (potentially) appropriate for
her current need. Since users' preferences and behaviour may also be influenced
by contextual factors, such as the weather conditions at the time of the item
consumption, context-aware RSs have been introduced [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Moreover, in order
to leverage the knowledge derived from the order in which users consume items,
pattern-discovery [
        <xref ref-type="bibr" rid="ref10 ref14 ref4">10, 4, 14</xref>
        ] and reinforcement learning [
        <xref ref-type="bibr" rid="ref11 ref16">16, 11</xref>
        ] approaches have
also been proposed. The first approach extracts common patterns from users'
behaviour logs and learns a predictive model of the next user action. The latter
generates recommendations by using the optimal choice model (policy) of the
user. In both models the recommendation generation process is strictly tied to
the learnt user behaviour, i.e., they suggest the user's predicted next choice.
      </p>
      <p>Moreover, the first approach can only suggest items that have already been
observed, while the second assumes that the utility the user gets from her choices
is known in advance. This contrasts with the tendency of users to rarely
provide explicit feedback (e.g., ratings).</p>
      <p>RS technology has mostly been applied to the web scenario, where users
interact with online content. With the advent of sensor augmented spaces, like
Smart Cities, where sensors collect data that is leveraged to manage assets and
resources efficiently, RS technology could be applied to ease users' (tangible)
decision-making while they interact with the physical space. An RS can leverage
the observations of users' choices recorded by the sensors. In fact, our application
domain is tourism, where a user acts in different contexts and makes decisions
about what to visit in a sequential fashion. E.g., a tourist decides which point
of interest (POI) she would like to visit next, given her past visit choices and
contextual conditions. In a sensor augmented space the tourist's sequences of
choices (i.e., trajectories) may be recorded by sensors and can be leveraged to
identify relevant and useful next-POI visits for tourists.</p>
      <p>We propose a novel context-aware RS technique that not only eases the
(tangible) decision-making of users while they interact with a sensor augmented
space, but also overcomes the main problem of the aforementioned RS solutions:
the inability to generalize from the observed data and, consequently, the poor
novelty of the recommendations. Hence, we have devised a RS model that can
explain and generalize from observed behaviours in order to generate non-trivial
and relevant recommendations for a user.</p>
      <p>
        Our RS approach models with a reward function the "satisfaction" that a
POI, with some given features, gives to a user. The reward function is learnt by
using the observation of the users' sequences of visited POIs and is estimated by
Inverse Reinforcement Learning (IRL) [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], a behaviour learning approach that
is widely used in automation and behavioural economics [
        <xref ref-type="bibr" rid="ref18 ref3 ref5">5, 3, 18</xref>
        ]. Moreover,
since it is difficult to have at one's disposal a consistent part of a
new visitor's travel-related choices, which would be needed to learn the reward
function of a single individual, IRL is instead applied to clusters of users, and
the learned reward function is therefore shared by all the users in a cluster. For
this reason we say that the system has learned a generalised, one per cluster,
tourist behaviour model, which identifies the action (POI visit) that a user in a
cluster should try next.
      </p>
      <p>In this paper we present the two main components of the proposed RS
technology: clustering of users into different tourist types in order to learn generalised
user behaviour models via IRL; and recommendation strategies that harness the
learnt behaviour models in order to generate novel and relevant suggestions for
the user. Moreover, we discuss the performance of the proposed method across
several dimensions in an offline study. The results indicate that the proposed
IRL-based solution excels in suggesting novel and rewarding items, whereas a
(SKNN-based) pattern-discovery baseline has a higher precision. We conjecture
that the lower precision of the proposed solution is due to the fact that
SKNN-based methods favour items that appear frequently in the data, i.e., items that
are popular. To further study this aspect we hybridise the proposed RS
technique with an item popularity scoring technique and show that our conjecture
holds: biasing the IRL-based model with item popularity allows the model to
practically equal the precision of the SKNN-based baselines.</p>
      <p>
        The remainder of the paper is structured as follows. In Section 2 we describe
the formalisation of the recommendation problem, how users are clustered in
tourist types and how the user's action-selection policy (i.e., behaviour) is learned
via Inverse Reinforcement Learning. Then we detail two recommendation
strategies: the strategies presented in [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ] and an additional model that combines the
proposed IRL-based approach with item popularity. In Section 3 we present the
baselines, the metrics and the evaluation procedure of the offline study and the
user study. In Section 4 we report the experimental results. Finally, in Section
5 we draw the conclusions.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Method</title>
      <sec id="sec-2-1">
        <title>User Behaviour Modelling</title>
        <p>User (tourist) behaviour modelling is here based on Markov Decision Processes
(MDPs). An MDP is defined by a tuple (S, A, T, r, γ). S is the state space, and
in our scenario a state models the visit to a POI in a specific context. The
contextual dimensions are: the weather (visiting a POI during a sunny, rainy
or windy time); the daytime (morning, afternoon or evening); and the visit
temperature conditions (warm or cold). A is the action space; in our case it
represents the decisions to move to a POI. A user that is situated in a specific
POI and context can reach all the other POIs in a new context. T is a finite
set of probabilities T(s'|s, a): the probability to make a transition from state s
to s' when action a is performed. For example, a user that visits Museo di San
Marco in a sunny morning (state s1) and wants to visit Palazzo Pitti (action a1)
in the afternoon can arrive at the desired POI with either rainy weather (state
s2) or a clear sky (state s3), with transition probabilities T(s2|s1, a1) = 0.2 and
T(s3|s1, a1) = 0.8. The function r : S → ℝ models the reward a user obtains
from visiting a state. This function is unknown and must be learnt. We take the
restrictive assumption that we do not know the reward the user receives from
visiting a POI (the user is not supposed to reveal it). But we assume that if the
user visited a POI and not another (nearby) one, it is because she believes that the
first POI gives her a larger reward than the second. Finally, γ ∈ [0, 1] is used to
measure how future rewards are discounted with respect to immediate ones.</p>
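        <p>The components above can be sketched as plain data structures. The following is a minimal illustration that mirrors the Museo di San Marco example (the encoding is hypothetical, not the system's implementation):</p>
        <preformat>
```python
from collections import namedtuple

# A state couples a POI with the visit context (weather, daytime, temperature).
State = namedtuple("State", ["poi", "weather", "daytime", "temperature"])

s1 = State("Museo di San Marco", "sunny", "morning", "warm")
s2 = State("Palazzo Pitti", "rainy", "afternoon", "warm")
s3 = State("Palazzo Pitti", "sunny", "afternoon", "warm")

a1 = "move to Palazzo Pitti"  # an action is the decision to move to a POI

# T[(s, a)] maps each reachable next state to its probability, as in the
# example T(s2|s1, a1) = 0.2 and T(s3|s1, a1) = 0.8 above.
T = {(s1, a1): {s2: 0.2, s3: 0.8}}
gamma = 0.9  # discount factor in [0, 1]

# The probabilities over next states must sum to one.
assert round(sum(T[(s1, a1)].values()), 9) == 1.0
```
        </preformat>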
      </sec>
      <sec id="sec-2-2">
        <title>User Behavior Learning</title>
        <p>
          Given an MDP, our goal is to find a policy π : S → A that maximises the
cumulative reward that the decision maker obtains by acting according to π
(optimal policy). The value of taking a specific action a in state s under the
policy π is computed as Q^π(s, a) = E_{s,a,π}[Σ_{k=0}^{∞} γ^k r(s_k)], i.e., it is the expected
discounted cumulative reward obtained from taking a in state s and then following the
policy π. The optimal policy dictates to a user in state s to perform the action
that maximises Q. The problem of computing the optimal policy for an MDP is
solved by Reinforcement Learning algorithms [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
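        <p>For illustration, the computation of Q and of the optimal policy can be sketched with value iteration on a toy MDP in which, unlike in our setting, the reward function is known (all states, transitions and rewards below are hypothetical):</p>
        <preformat>
```python
# Toy value iteration: with a known reward function, compute state values V,
# action values Q, and read off the greedy (optimal) policy.
gamma = 0.9
states = ["s1", "s2", "s3"]
actions = ["a1", "a2"]
# T[s][a] = list of (next_state, probability) pairs.
T = {
    "s1": {"a1": [("s2", 1.0)], "a2": [("s3", 1.0)]},
    "s2": {"a1": [("s2", 1.0)], "a2": [("s2", 1.0)]},
    "s3": {"a1": [("s3", 1.0)], "a2": [("s3", 1.0)]},
}
r = {"s1": 0.0, "s2": 1.0, "s3": 0.1}  # reward of visiting each state

V = {s: 0.0 for s in states}
for _ in range(200):  # value iteration converges geometrically in gamma
    V = {s: max(sum(p * (r[nxt] + gamma * V[nxt]) for nxt, p in T[s][a])
                for a in actions)
         for s in states}

def Q(s, a):
    # Expected discounted return of taking a in s, then acting optimally.
    return sum(p * (r[nxt] + gamma * V[nxt]) for nxt, p in T[s][a])

# The optimal policy picks, in each state, the action with the highest Q value.
policy = {s: max(actions, key=lambda a, s=s: Q(s, a)) for s in states}
assert policy["s1"] == "a1"  # moving to the high-reward state s2 is optimal
```
        </preformat>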
        <p>We denote with τ_u a user u's trajectory, which is a temporally ordered list
of states (POI visits). For instance, τ_{u1} = (s10, s5, s15) represents a user u1's
trajectory starting from state s10, moving to s5 and ending at s15. With Z we
represent the set of all the observed users' trajectories, which can be used to
estimate the probabilities T(s'|s, a).</p>
        <p>
          Since users of a recommender system typically provide scarce feedback on
the consumed items (i.e., visited POIs), the reward a user gets by consuming
an item is not known. Therefore, the MDP cannot be solved by using standard
Reinforcement Learning techniques. Instead, having at one's disposal only a set of
POI-visit observations of a user (i.e., the user's trajectories), an MDP can be
solved via Inverse Reinforcement Learning (IRL) methods [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. In particular, IRL
enables learning a reward function whose optimal policy (the learning objective)
dictates actions close to the demonstrated behaviour (the user trajectory). In this
work we have used Maximum Likelihood IRL [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
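        <p>The maximum-likelihood principle can be illustrated as follows: each candidate reward function induces Q values, a softmax (Boltzmann) policy over those Q values assigns a likelihood to the observed choices, and the reward that makes the observations most likely is preferred. The following deliberately simplified sketch omits the gradient-based search over reward parameters used by the actual algorithm (states, rewards and observations are hypothetical):</p>
        <preformat>
```python
import math

# Simplified maximum-likelihood IRL intuition: score candidate rewards by the
# likelihood that a softmax policy over their Q values generates the
# demonstrated choices.
gamma, states, actions = 0.9, ["s1", "s2", "s3"], ["go_s2", "go_s3"]
T = {s: {"go_s2": "s2", "go_s3": "s3"} for s in states}  # deterministic moves

def q_values(r, iters=100):
    # Value iteration under the candidate reward r, then Q from V.
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        V = {s: max(r[T[s][a]] + gamma * V[T[s][a]] for a in actions)
             for s in states}
    return {(s, a): r[T[s][a]] + gamma * V[T[s][a]]
            for s in states for a in actions}

def log_lik(r, observed):
    # observed: list of (state, action) choices demonstrated by the user.
    Q = q_values(r)
    ll = 0.0
    for s, a in observed:
        z = sum(math.exp(Q[(s, b)]) for b in actions)
        ll += math.log(math.exp(Q[(s, a)]) / z)
    return ll

observed = [("s1", "go_s2"), ("s3", "go_s2")]  # the user keeps choosing s2
r_good = {"s1": 0.0, "s2": 1.0, "s3": 0.0}     # explains the behaviour
r_bad = {"s1": 0.0, "s2": 0.0, "s3": 1.0}      # does not
assert log_lik(r_good, observed) > log_lik(r_bad, observed)
```
        </preformat>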
        <p>
          The full user history of travel-related choices, which
would be needed to learn the reward function of a single individual, is generally
hard to obtain. Therefore, IRL is here applied to clusters of users (trajectories)
[
          <xref ref-type="bibr" rid="ref8 ref9">9, 8</xref>
          ]. This allows learning a reward function that is shared by all the users in a
cluster. Hence, we say that the system has learned a generalised tourist behaviour
model, which identifies the action (POI visit) that a user in a cluster should try
next. Clustering the users' trajectories is done by grouping them according to a
common semantic structure that can explain the resulting clusters. This is done
by employing Non-negative Matrix Factorization (NMF) [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] on document-like representations of the trajectories (features are treated as keywords).
        </p>
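        <p>The document-like encoding can be sketched as follows: each trajectory becomes a bag of feature keywords, yielding the non-negative count matrix that NMF then factorises (the feature vocabulary below is hypothetical, and the NMF step itself is omitted):</p>
        <preformat>
```python
# Encode each trajectory as a "document": the features of the visited states
# become keyword counts, one row per trajectory, non-negative by construction.
trajectories = [
    [("museum", "sunny"), ("museum", "rainy")],  # hypothetical feature pairs
    [("park", "sunny"), ("church", "sunny")],
]

vocab = sorted({f for traj in trajectories for visit in traj for f in visit})

def to_counts(traj):
    row = [0] * len(vocab)
    for visit in traj:
        for f in visit:
            row[vocab.index(f)] += 1
    return row

matrix = [to_counts(t) for t in trajectories]  # input for NMF clustering
```
        </preformat>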
      </sec>
      <sec id="sec-2-3">
        <title>Recommending Next-POI visits</title>
        <p>
          Here we present the above-mentioned IRL-based recommendation techniques:
Q-BASE, presented in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] as well as a novel method that hybridizes Q-BASE with
the popularity of an item.
        </p>
        <p>
          Q-BASE. The behaviour model of the cluster the user belongs to is used to
suggest the optimal action this user should take next, after the last visited POI. The
optimal action is the action with the highest Q value in the user's current state [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
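        <p>The selection step of Q-BASE thus reduces to an argmax over the Q values of the user's current state; a minimal sketch with hypothetical Q values:</p>
        <preformat>
```python
# Q-BASE: in the user's current state, recommend the actions (next-POI visits)
# with the highest Q values under the cluster's learned behaviour model.
Q = {("s1", "Palazzo Pitti"): 0.8,   # hypothetical learned Q values
     ("s1", "Uffizi"): 0.6,
     ("s1", "Duomo"): 0.3}

def q_base_topk(state, k=2):
    acts = [(a, q) for (s, a), q in Q.items() if s == state]
    acts.sort(key=lambda x: x[1], reverse=True)
    return [a for a, _ in acts[:k]]

assert q_base_topk("s1", k=1) == ["Palazzo Pitti"]
```
        </preformat>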
        <p>Q-POP PUSH. In order to recommend more popular items, we propose to
hybridise the Q-BASE model with the recommended item's popularity, i.e., a score
proportional to the probability that a user visits the item.</p>
        <p>Q-POP PUSH scores the (potential) visit action a in state s as follows:</p>
        <p>score(s, a) = (1 + β²) · Q(s, a) · pop(a) / (β² · Q(s, a) + pop(a))</p>
        <p>This is the (β-weighted) harmonic mean of Q(s, a) and pop(a), where pop(a) is the
scaled (i.e., min-max scaling) count c_Z(p) (in the data set Z) of the occurrences of the POI visit p
selected by the action a (an action corresponds to the visit of a single POI).</p>
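        <p>The scoring rule can be sketched as follows (the visit counts and the β weight below are hypothetical illustrations):</p>
        <preformat>
```python
# Q-POP PUSH: blend the behaviour model's Q value with item popularity via an
# F-beta-style weighted harmonic mean; pop(a) is the min-max-scaled visit count.
def minmax(counts):
    lo, hi = min(counts.values()), max(counts.values())
    return {k: (v - lo) / (hi - lo) for k, v in counts.items()}

def score(q, pop, beta=1.0):
    b2 = beta ** 2
    return (1 + b2) * q * pop / (b2 * q + pop)

counts = {"Duomo": 900, "Uffizi": 500, "Museo Novecento": 100}  # toy counts
pop = minmax(counts)
assert pop["Duomo"] == 1.0 and pop["Museo Novecento"] == 0.0
assert round(score(0.5, 0.5), 9) == 0.5  # harmonic mean of equal values
```
        </preformat>
        <p>With β = 1 the score is the plain harmonic mean; larger β values push the score further toward popularity.</p>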
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Evaluation</title>
      <sec id="sec-3-1">
        <title>Dataset</title>
        <p>
          In this study we used an extended version of the POI-visit dataset presented
in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. It consists of tourist trajectories reconstructed from the public photo
albums of users of the Flickr (www.flickr.com) platform. The trajectory extraction process is as
follows: from the information about the GPS position and time of each single
photo in an album, the corresponding Wikipedia page is queried (geo-query)
in order to identify the name of the represented POI. The time information is
used to order the POI sequence derived from an album. In [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] the dataset has
been extended by adding information about the context of the visit (weather
summary, temperature and part of the day), as well as POI content information
(POI historic period, POI type and related public figure). In this paper we use
an extended version of the dataset that contains 1668 trajectories and 793 POIs.
Trajectory clustering identified 5 different clusters, as in the previous study.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Baselines</title>
        <p>We compare here the performance of the recommendations generated by the
proposed IRL-based methods with two nearest neighbor baselines: session-based
KNN (SKNN) and sequence-aware SKNN (s-SKNN).</p>
        <p>
          SKNN [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] searches for similar users in the system's stored logs (trajectories) and
identifies the next item (POI) to be recommended, given the current user log
(user trajectory), by using a classical collaborative filtering scoring rule.
        </p>
        <p>
          s-SKNN [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] uses again the classical collaborative filtering rule but weighs the
neighbours' importance, giving more weight to those containing the most recent
items (recent POIs in the user trajectory). These methods have been applied to
different next-item recommendation tasks, showing good performance.
        </p>
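        <p>The nearest-neighbour scoring idea can be sketched as follows: stored trajectories similar to the current one (e.g., by cosine similarity on their visited-POI sets) vote for their POIs with their similarity as weight. This is a simplified illustration with hypothetical logs, not the cited implementations:</p>
        <preformat>
```python
import math

# Session-based KNN sketch: neighbours are stored trajectories similar to the
# current one; each neighbour "votes" for its unvisited POIs with its weight.
logs = [["duomo", "uffizi", "pitti"],
        ["duomo", "uffizi", "boboli"],
        ["park", "zoo"]]

def cosine(a, b):
    inter = len(set(a).intersection(b))
    return inter / math.sqrt(len(set(a)) * len(set(b)))

def sknn_scores(current, k=2):
    sims = sorted(((cosine(current, s), s) for s in logs), reverse=True)[:k]
    scores = {}
    for w, session in sims:
        for poi in session:
            if poi not in current:  # only candidates not yet visited
                scores[poi] = scores.get(poi, 0.0) + w
    return scores

scores = sknn_scores(["duomo", "uffizi"])
```
        </preformat>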
      </sec>
      <sec id="sec-3-3">
        <title>Metrics</title>
        <p>
          The proposed recommendation strategies were benchmarked by using several
metrics. The reward metric measures the average increase of reward of the
recommended actions compared to the observed ones (in the test part of the
trajectory). It is the aggregated difference between the recommended POI visits' Q values
and the Q value of the observed (test) visit. Dissimilarity measures how much
the recommendations are dissimilar from the observed visit and ranges in [0, 1].
Novelty estimates how unpopular the recommended visit actions are and ranges
in [0, 1]. A POI is assumed to be unpopular if its visit count is lower than the
median of this variable in the training set. Detailed definitions of these metrics
can be found in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Precision is the percentage of recommended visits that match
the observed one; hence it shows to what extent the system suggests the actions
actually performed by the user.
        </p>
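        <p>Under the definitions above, precision and novelty can be sketched as follows (the training-set visit counts are hypothetical):</p>
        <preformat>
```python
import statistics

# Sketches of two of the metrics described above: a recommendation counts
# toward precision when it matches an observed test visit; a POI is "novel"
# when its training-set visit count falls below the median count.
train_counts = {"duomo": 900, "uffizi": 500, "pitti": 100, "zoo": 20}
median = statistics.median(train_counts.values())

def precision(recommended, observed):
    hits = sum(1 for r in recommended if r in observed)
    return hits / len(recommended)

def novelty(recommended):
    novel = sum(1 for r in recommended if median > train_counts[r])
    return novel / len(recommended)

assert precision(["pitti", "zoo"], ["zoo"]) == 0.5
assert novelty(["pitti", "zoo"]) == 1.0  # both counts are below the median
```
        </preformat>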
      </sec>
      <sec id="sec-3-4">
        <title>Offline Study</title>
        <p>Initially, for each cluster, 80% of the trajectories were assigned to the train set
and the remaining 20% to the test set. Then, for each cluster, the train set
data was used to learn the generalised user behaviour model for that cluster.
Afterwards, in order to compute and evaluate recommendations, the
trajectories in the test set were split in two parts: the initial 70% of each trajectory
was considered as observed by the system and used to generate next-action
recommendations, while the remaining part (30%) was used as the test part
in order to assess the evaluation metrics. The SKNN-based baselines do not use
clustering, hence they were trained on all the trajectories in the train set, and
the test set trajectories were split in observed and test parts as before. All the
models' hyper-parameters have been selected via 5-fold cross-validation.</p>
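        <p>The splitting protocol can be sketched as follows (trajectory contents and the random seed are hypothetical):</p>
        <preformat>
```python
import random

# Offline protocol sketch: per cluster, 80% of the trajectories train the
# model; each test trajectory is split into an observed prefix (70%) used to
# produce recommendations and a held-out suffix (30%) used to score them.
def split_trajectories(trajs, seed=0):
    trajs = list(trajs)
    random.Random(seed).shuffle(trajs)
    cut = int(0.8 * len(trajs))
    return trajs[:cut], trajs[cut:]

def observed_test(traj):
    cut = max(1, int(0.7 * len(traj)))
    return traj[:cut], traj[cut:]

trajs = [[f"s{i}", f"s{i+1}", f"s{i+2}", f"s{i+3}"] for i in range(10)]
train, test = split_trajectories(trajs)
obs, held_out = observed_test(test[0])
```
        </preformat>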
      </sec>
      <sec id="sec-3-5">
        <title>Online Study</title>
        <p>We also conducted an observational study with real users. They interacted with
an online system that we developed to assess the novelty of and user satisfaction
with the recommendations generated by the Q-BASE model, the Q-POP PUSH
model and the same SKNN baseline used in the offline study. In the online
system the users can enter the set of POIs that they have already visited in the
city of Florence and can receive suggestions for the next POIs to visit. In particular,
the user can mark the suggestions with the labels "visited", "novel" and "liked".
To avoid biases in the recommendation evaluation we do not reveal to the user
which recommendation algorithm produces which POI recommendation. The
suggestions that the user evaluates form a list that aggregates the top-3 suggestions
of each algorithm without giving any algorithm a particular priority.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <sec id="sec-4-1">
        <title>Offline Study Results</title>
        <p>
          The compared recommenders' performance for top-1 and top-5 next-POI visit
recommendations is shown in Table 1. One can observe that Q-BASE allows
users to obtain a larger reward, compared to SKNN and s-SKNN, while, as
expected, the SKNN-based baselines have the best precision, as they tend to suggest
next-POIs that the user would visit anyway. Interestingly, SKNN and s-SKNN
perform very similarly. Hence, in this dataset, the sequence-aware extension of
SKNN does not provide any performance improvement. These results confirm a
previous analysis [
          <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
          ].
        </p>
        <p>By looking at the performance of Q-POP PUSH we see that a stronger
popularity bias enables the algorithm to generate recommendations that are more
precise. In fact, the precision of Q-POP PUSH is equal to that of SKNN and
s-SKNN. But, as expected, reward and novelty are penalised.</p>
        <p>The results of the online study are shown in Table 2. This table shows the
probabilities that a user marks an item recommended by an algorithm as "visited",
"novel", "liked", or both "liked" and "novel". They are computed by
dividing the total number of items marked as visited, liked, novel, and both
liked and novel, for each algorithm, by the total number of items shown by that
algorithm. By construction, each algorithm contributes 3 recommendations
to the aggregated list shown to each user. The number of recommended
next-POI visits shown to the users is 1119 (approximately three by each of the three
methods per user, excluding the items recommended by two or more methods
simultaneously). Hence on average a user has seen 7.1 recommendations.</p>
        <p>We note that the POIs recommended by SKNN and Q-POP PUSH have
the highest probability (24%) of having already been visited by the user, and the
lowest probability of being considered novel. Conversely, Q-BASE scores a lower
probability that the recommended item has already been visited (16%) and the highest
probability that the recommended item is novel (52%). This is in line with the
offline study, where Q-BASE excels in recommending novel items.</p>
        <p>Considering now the user satisfaction with the recommendations (liked), we
conjectured that a high reward of an algorithm measured offline corresponds to a
high perceived satisfaction (likes) measured online. But, by looking at the results
in Table 2, we observe a different outcome. Q-BASE, which has the highest offline
reward, recommends items that an online user likes with the lowest probability
(36%). Q-POP PUSH and SKNN recommend items that are more likely to be
liked by the user (46%).</p>
        <p>Another measure of system precision that we computed is the probability
that a user likes a novel recommended POI, i.e., a POI that the recommender
presented for the first time to the user ("Liked &amp; Novel" in Table 2). We argue
that this is the primary goal of a recommender system: to enable users to discover
novel items that are interesting for them. Q-BASE (highest reward and lowest
precision offline) recommends items that a user will find novel and also like with
the highest probability (0.09), whereas SKNN and Q-POP PUSH recommend
items that the user will find novel and will like with a lower probability (0.08).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>In this paper we presented a new next-POI RS technique that harnesses a
generalised tourist behaviour model. The tourist behaviour model is learnt by first
clustering users' POI-visit trajectories and then by solving an Inverse
Reinforcement Learning problem, which determines, for each cluster, the reward function
and the optimal POI selection policy. The proposed recommendation strategies
(Q-BASE and Q-POP PUSH) adapt the next visit-action recommendations to
the learned model. We show with an offline experiment that the proposed
Q-BASE model maximises the reward the user gains while discovering relevant,
novel and non-popular POIs. Moreover, the two SKNN-based baselines show
better offline accuracy. We hypothesised that users like the recommendations
produced by Q-BASE more, and that the poor offline accuracy of these models,
compared to SKNN-based approaches, is due to the absence of a popularity bias
in the recommendation generation. Therefore, we hybridised Q-BASE with POI
popularity and showed that the hybrid model (Q-POP PUSH) substantially equals
the SKNN baselines. With an online test we showed that the Q-BASE model excels
in suggesting novel items that are also liked ("liked and novel") by the users.</p>
      <p>We plan to extend the presented analysis by conducting an evaluation with
tourists interacting with real systems while on the move
(www.wondervalley.unibz.it and https://beacon.bz.it/wp-6/beaconrecommender/).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Adomavicius</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tuzhilin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Context-aware recommender systems</article-title>
          . In: Ricci,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Rokach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Shapira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Kantor</surname>
          </string-name>
          , P.B. (eds.)
          <source>Recommender Systems Handbook</source>
          , pp.
          <volume>217</volume>
          –
          <issue>253</issue>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Babes-Vroman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marivate</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Subramanian</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Littman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Apprenticeship learning about multiple intentions</article-title>
          .
          <source>In: Proceedings of the 28th International Conference on Machine Learning - ICML'11</source>
          . pp.
          <volume>897</volume>
          –
          <issue>904</issue>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Ermon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xue</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toth</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dilkina</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bernstein</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Damoulas</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>DeGloria</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mude</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barrett</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gomes</surname>
            ,
            <given-names>C.P.</given-names>
          </string-name>
          :
          <article-title>Learning Large Scale Dynamic Discrete Choice Models of Spatio-Temporal Preferences with Application to Migratory Pastoralism in East Africa</article-title>
          . pp.
          <volume>644</volume>
          –
          <issue>650</issue>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Jannach</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lerche</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Leveraging Multi-Dimensional User Models for Personalized Next-Track Music Recommendation</article-title>
          .
          <source>In: Proceedings of the Symposium on Applied Computing - SAC'17</source>
          . pp.
          <volume>1635</volume>
          –
          <issue>1642</issue>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kennan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walker</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          :
          <source>The Effect of Expected Income on Individual Migration Decisions. Econometrica</source>
          <volume>79</volume>
          (
          <issue>1</issue>
          ),
          <volume>211</volume>
          –
          <fpage>251</fpage>
          (
          <year>2011</year>
          ). https://doi.org/10.3982/ECTA4657
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>D.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seung</surname>
            ,
            <given-names>H.S.</given-names>
          </string-name>
          :
          <article-title>Learning the parts of objects by non-negative matrix factorization</article-title>
          .
          <source>Nature</source>
          <volume>401</volume>
          (
          <issue>6755</issue>
          ),
          <fpage>788</fpage>
          –
          <lpage>791</lpage>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Ludewig</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jannach</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Evaluation of session-based recommendation algorithms</article-title>
          .
          <source>User Model. User-Adapt. Interact</source>
          .
          <volume>28</volume>
          (
          <issue>4-5</issue>
          ),
          <fpage>331</fpage>
          –
          <lpage>390</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Massimo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ricci</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Harnessing a generalised user behaviour model for next-poi recommendation</article-title>
          .
          <source>In: Proceedings of the 12th ACM Conference on Recommender Systems, RecSys 2018, Vancouver, BC, Canada, October 2-7, 2018</source>
          . pp.
          <fpage>402</fpage>
          –
          <lpage>406</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Massimo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ricci</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Clustering users' pois visit trajectories for next-poi recommendation</article-title>
          .
          <source>In: Information and Communication Technologies in Tourism 2019, ENTER 2019, Proceedings of the International Conference in Nicosia, Cyprus, January 30-February 1, 2019</source>
          . pp.
          <fpage>3</fpage>
          –
          <lpage>14</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Mobasher</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dao</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nakagawa</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Using Sequential and Non-Sequential Patterns in Predictive Web Usage Mining Tasks</article-title>
          .
          <source>In: Proceedings of the IEEE International Conference on Data Mining - ICDM '02</source>
          . pp.
          <fpage>669</fpage>
          –
          <lpage>672</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Moling</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baltrunas</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ricci</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Optimal radio channel recommendations with explicit and implicit feedback</article-title>
          .
          <source>In: Proceedings of the 6th ACM conference on Recommender systems - RecSys '12</source>
          . p.
          <fpage>75</fpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Muntean</surname>
            ,
            <given-names>C.I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardini</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Silvestri</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baraglia</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>On Learning Prediction Models for Tourists Paths</article-title>
          .
          <source>ACM Transactions on Intelligent Systems and Technology</source>
          <volume>7</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>34</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Russell</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Algorithms for inverse reinforcement learning</article-title>
          .
          <source>In: Proceedings of the 17th International Conference on Machine Learning - ICML '00</source>
          . pp.
          <fpage>663</fpage>
          –
          <lpage>670</lpage>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Palumbo</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rizzo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baralis</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Predicting Your Next Stop-over from Location-based Social Network Data with Recurrent Neural Networks</article-title>
          . In:
          <source>RecSys '17, 2nd ACM International Workshop on Recommenders in Tourism (RecTour'17), CEUR Proceedings Vol. 1906</source>
          . pp.
          <fpage>1</fpage>
          –
          <lpage>8</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Roetzel</surname>
            ,
            <given-names>P.G.</given-names>
          </string-name>
          :
          <article-title>Information overload in the information age: a review of the literature from business administration, business psychology, and related disciplines with a bibliometric approach and framework development</article-title>
          .
          <source>Business Research</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Shani</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heckerman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brafman</surname>
            ,
            <given-names>R.I.</given-names>
          </string-name>
          :
          <article-title>An MDP-based recommender system</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          . pp.
          <fpage>1265</fpage>
          –
          <lpage>1295</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Sutton</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barto</surname>
            ,
            <given-names>A.G.</given-names>
          </string-name>
          :
          <article-title>Reinforcement Learning: An Introduction (Second edition, in progress)</article-title>
          . The MIT Press (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Ziebart</surname>
            ,
            <given-names>B.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bagnell</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dey</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          :
          <article-title>Maximum entropy inverse reinforcement learning</article-title>
          .
          <source>In: Proceedings of the 23rd National Conference on Artificial Intelligence - AAAI'08</source>
          . pp.
          <fpage>1433</fpage>
          –
          <lpage>1438</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>