Maximal Likelihood Itinerary Planning with User
Interaction Data
Keisuke Otaki1 , Yukino Baba2
1
    Toyota Central R&D Labs., Inc., Koraku Mori Building 10F, 1-4-14 Koraku, Bunkyo-ku, Tokyo, 1120004, Japan
2
    The University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo, 1538902, Japan


Abstract
Planning itineraries is a complex task for tourists. While some tourists have their favorite events and plans (e.g., places to visit, ways to travel) precisely in mind, others would like to explore multiple choices of possible events. To improve the user experience of tourism, we develop a novel itinerary planning framework in which users directly interact with our system by editing displayed itineraries. Our idea is to collect edit-based feedback from users, to estimate user preferences from these edits, and to utilize this data when generating personalized itineraries. To implement this framework, we generalize the maximum likelihood planning framework by introducing a new optimization problem that estimates transition probabilities between POIs from both historical and interaction data. To illustrate how the maximum likelihood itinerary planning method works, we report proof-of-concept experiments that aim to provide a new perspective on interactive itinerary planning with user interaction.

                                         Keywords
                                         itinerary recommendation, orienteering problem, maximal likelihood planning, optimization




1. Introduction
Background Planning an itinerary (also called a travel plan or trajectory) is a complex task when a tourist plans a trip. Planning often involves places to visit (e.g., points of interest, POIs), places to stay (i.e., accommodations), how to travel between places (e.g., transportation and its mode), booking, and payments (if needed). While some tourists have their favorite places and/or plans exactly in mind, others would like to explore several choices of places to visit. In the literature, optimization-based methods have been studied as an important component for generating itineraries [1, 2, 3, 4]. A well-known optimization problem called the orienteering problem, or one of its variants, is often employed [5, 6]. The orienteering problem asks for a trajectory (i.e., a sequence of POIs) that maximizes the benefit obtained from the visited places under travel distance/time constraints. An important question behind the orienteering problem is how to evaluate the benefits of POIs for users. Using objective values (e.g., the average rating or staying time of a POI) we can build traditional and common itineraries, while with subjective values (e.g., the rating or staying time of a specific user) we can build personalized itineraries.

RecSys Workshop on Recommenders in Tourism (RecTour 2022), September 22, 2022, co-located with the 16th ACM
Conference on Recommender Systems, Seattle, WA, USA
otaki@mosk.tytlabs.co.jp (K. Otaki); yukino-baba@g.ecc.u-tokyo.ac.jp (Y. Baba)
ORCID: 0000-0001-9431-0867 (K. Otaki)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Figure 1: System overview. Our method generates a ranked list of itineraries. Our key component is a transition probability matrix among POIs. We update this matrix using interaction data: we assume that users interact with our interface (e.g., Web/app), where operations such as Swap, Ins, and Del are allowed for editing itineraries (see also Sec. 3.1 for the edit operations).


Related Work Integrating user preferences for places and/or itineraries into the optimization used to generate personalized itineraries is a promising way to improve the user experience. In previous work, Choudhury et al. [2] constructed sequences of photographic spots using the mean staying times of places to reflect user preferences, and showed that the generated trajectories are comparable with those produced by professionals. Lim et al. [3] used estimated per-user staying times as user preferences when defining the objective function of the orienteering problem, but feedback is not considered. Roy et al. studied the task of interactively planning itineraries with feedback on POIs [7]. They proposed how to model such feedback and utilize it, but the types of supported feedback are limited. Chen et al. [4] adopted a strategy similar to [7], but formally discussed how historical itinerary data are taken into account in the optimization (this framework is referred to as maximum likelihood planning in the literature [8, 9]).

Contributions We develop a new framework that uses both historical data and data collected through interaction with users. An overview of our system is illustrated in Figure 1. We expect that using interaction data is a promising way to improve the user experience, and we therefore generalize the types of feedback to collect richer data from users and estimate their preferences. Our method generates itineraries based on collected itineraries (Historical data in Fig. 1) and learned transition probabilities (Transition matrix in Fig. 1) among points of interest (POIs). Further, in order to incorporate these richer data, we update the learned probabilities. We assume that the interaction is implemented via some Web interface, and in this paper three types of operations on itineraries, Swap, Ins, and Del of POIs (illustrated in Interaction of Fig. 1), are considered. Our proposed method builds on both [7] and [4], but our estimation strategy using user interactions is new in the sense that the transition probabilities can be learned from collected interaction data. In our proof-of-concept experiments, we evaluate how the collected interaction data affect the resulting ranked lists of itineraries, under our assumption that diversity in the resulting list is important when designing an itinerary planning service. We confirm that our framework generates diverse itineraries based on the collected data.
2. Preliminary
Throughout this paper, $[N] = \{1, 2, \ldots, N\}$ for a natural number $N \in \mathbb{Z}$, and sequences are 1-indexed. On a finite set $I$ of symbols representing POIs, a sequence of length $L$ consisting of elements from $I$ is denoted by $\mathbf{X} = \langle X_1, X_2, \ldots, X_L \rangle$, where $X_j \in I$ for every $j \in [L]$ and $L = |\mathbf{X}|$. A sequence represents an itinerary, where a user visits $X_1$ first, $X_2$ second, and $X_L$ last. A direct travel from $i$ to $j$ in $\mathbf{X}$ is written as $i \to j \in \mathbf{X}$; in other words, for some $t \in [|\mathbf{X}| - 1]$ it holds that $X_t = i$ and $X_{t+1} = j$. Further, we write $i \in \mathbf{X}$ if and only if there exists $t \in [|\mathbf{X}|]$ such that $X_t = i$. We naturally extend this relation $\in$ to sets $\mathcal{D} = \{\mathbf{X}_1, \ldots, \mathbf{X}_{|\mathcal{D}|}\}$ of sequences. Our framework generates a ranked list of itineraries; a length-$k$ list $\mathcal{L} = \langle \mathbf{X}^{(1)}, \mathbf{X}^{(2)}, \ldots, \mathbf{X}^{(k)} \rangle$ represents such a ranked list.

2.1. Orienteering problem for itinerary planning
The orienteering problem is a well-studied combinatorial optimization problem [5], and it has been applied to generating itineraries in the literature [2, 3, 6]. Without loss of generality, we assume that $1 \in V$ is the start POI and $N \in V$ is the goal POI when planning itineraries. The basic orienteering problem is defined on a complete graph $G = (V, E)$: the vertex set $V$ represents a set of POIs, the edge set $E$ represents travels among POIs in $V$, and the problem consists of finding a tour on $G$ under some objectives and constraints. We assume that $d_{i,j}$ and $c_{i,j}$ represent the travel time and distance from $i$ to $j$, respectively, $T_{\max}$ is the total travel time budget, and a score $\mathrm{Score}(i)$ is given for each POI $i \in V$. We prepare decision variables $\{x_{i,j} \in \{0, 1\} \mid (i, j) \in E\}$ and $\{u_i \in \mathbb{Z} \mid i \in V\}$, where $x_{i,j} = 1$ if and only if $j$ is visited directly after $i$, and $u_i$ represents the order in which $i$ is visited. The orienteering problem is then formally described below.

$$\max_{x,u} \;\; \sum_{i=2}^{N-1} \sum_{j=2}^{N} \mathrm{Score}(i) \cdot x_{i,j} \tag{1a}$$

$$\sum_{j=2}^{N} x_{1,j} = \sum_{i=1}^{N-1} x_{i,N} = 1, \qquad \sum_{i=1}^{N-1} x_{i,k} = \sum_{j=2}^{N} x_{k,j} \quad (\forall k = 2, \ldots, N-1) \tag{1b}$$

$$\sum_{i=1}^{N-1} \sum_{j=2}^{N} d_{i,j}\, x_{i,j} \le T_{\max} \tag{1c}$$

$$u_1 = 1, \quad u_N = L, \quad 2 \le u_i \le N \quad (\forall i \in [N] \setminus \{1\}) \tag{1d}$$

$$u_i - u_j + 1 \le (N-1)(1 - x_{i,j}) \quad (\forall i, j \in [N] \setminus \{1\}) \tag{1e}$$

$$x_{i,j} \in \{0, 1\}, \quad u_i \in \mathbb{Z} \quad (\forall i, j \in [N]) \tag{1f}$$

Note that the objective in Eq. (1a) rewards visiting popular POIs in $V$. Constraints in Eq. (1b) ensure that the resulting tour is valid. The constraint in Eq. (1c) bounds the total travel time by $T_{\max}$. Constraints in Eq. (1d) and Eq. (1e) are the well-known MTZ constraints to avoid sub-tours [10]. Constraints in Eq. (1f) define the variables.
   To take the distances among POIs into account, a multi-objective function based on Eq. (1a) can be adopted with $\alpha, \beta \in \mathbb{R}$:

$$\max_{x,u} \;\; -\alpha \left( \sum_{i=1}^{N} \sum_{j=1}^{N} c_{i,j} \cdot x_{i,j} \right) + \beta \left( \sum_{i=2}^{N-1} \sum_{j=2}^{N} \mathrm{Score}(i) \cdot x_{i,j} \right) \tag{2}$$

In our implementation, to generate length-$L$ sequences from $1$ to $N$, we replace Eq. (1c) with the constraint $\sum_{j} y_j \le L$, where $y_j \in \{0, 1\}$ for all $j \in [N]$ and $y_j = 1$ means that POI $j$ is visited. In addition, by replacing Eq. (1a) with Eq. (2), we obtain our baseline itinerary generation method based on the orienteering problem.
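To make the formulation concrete, the following is a minimal sketch of this baseline generator (the constraints of Eqs. (1b), (1d)–(1f) with the objective of Eq. (2) and the length-budget variant of Eq. (1c)), assuming the PuLP library with its bundled CBC solver is available. The toy scores, costs, and all variable names are our own illustration, not part of the paper.

```python
import random
import pulp

random.seed(0)
N = 6                                                   # POIs 1..N; 1 = start, N = goal
nodes = list(range(1, N + 1))
# Directed edges; edges into the start and out of the goal are dropped for simplicity.
edges = [(i, j) for i in nodes for j in nodes if i != j and i != N and j != 1]
score = {i: random.uniform(1.0, 5.0) for i in nodes}    # Score(i), e.g. an average rating
cost = {e: random.uniform(1.0, 10.0) for e in edges}    # c_{i,j}: travel cost/distance
alpha, beta, L_max = 1.0, 1.0, 4                        # weights of Eq. (2), length budget L

prob = pulp.LpProblem("orienteering", pulp.LpMaximize)
x = pulp.LpVariable.dicts("x", edges, cat=pulp.LpBinary)                         # x_{i,j}
u = pulp.LpVariable.dicts("u", nodes, lowBound=1, upBound=N, cat=pulp.LpInteger)
y = pulp.LpVariable.dicts("y", nodes, cat=pulp.LpBinary)                          # y_j: j visited?

# Objective of Eq. (2): -alpha * total travel cost + beta * total score of visited POIs.
prob += (-alpha * pulp.lpSum(cost[e] * x[e] for e in edges)
         + beta * pulp.lpSum(score[i] * x[(i, j)] for (i, j) in edges if i != 1))

# Eq. (1b): leave the start once, reach the goal once, and conserve flow in between.
prob += pulp.lpSum(x[(1, j)] for j in nodes if (1, j) in cost) == 1
prob += pulp.lpSum(x[(i, N)] for i in nodes if (i, N) in cost) == 1
for k in nodes:
    if k not in (1, N):
        prob += (pulp.lpSum(x[(i, k)] for i in nodes if (i, k) in cost)
                 == pulp.lpSum(x[(k, j)] for j in nodes if (k, j) in cost))

# Length-budget variant of Eq. (1c): visit at most L_max POIs in total.
prob += y[1] == 1
for j in nodes:
    if j != 1:
        prob += y[j] == pulp.lpSum(x[(i, j)] for i in nodes if (i, j) in cost)
prob += pulp.lpSum(y.values()) <= L_max

# Eqs. (1d)-(1e): MTZ constraints to eliminate sub-tours.
prob += u[1] == 1
for (i, j) in edges:
    if i != 1 and j != 1:
        prob += u[i] - u[j] + 1 <= (N - 1) * (1 - x[(i, j)])

prob.solve(pulp.PULP_CBC_CMD(msg=False))
chosen = sorted(e for e in edges if x[e].value() > 0.5)
print("objective:", round(pulp.value(prob.objective), 3), "edges:", chosen)
```

The selected edges can then be chained from the start POI to recover the itinerary as a sequence.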

2.2. Maximum likelihood planning
Let $\mathbf{X}$ be a (feasible) itinerary and $\mathcal{X}$ be the set of all feasible itineraries. In the following, we focus on the case of $|\mathbf{X}| = L$ for every $\mathbf{X} \in \mathcal{X}$. The goal of maximum likelihood planning is to solve $\max_{\mathbf{X} \in \mathcal{X}} \Pr(\mathbf{X})$. This problem setting has attracted much attention in the literature [4, 8, 9]. Under a first-order Markov chain approximation, $\Pr(\mathbf{X})$ for $\mathbf{X} = \langle X_1, X_2, \ldots, X_L \rangle$ can be approximated as $\Pr(\mathbf{X}) \approx \Pr(X_1)\Pr(X_2 \mid X_1) \cdots \Pr(X_L \mid X_{L-1})$. With the implicit constraint $\Pr(X_1 = 1) = 1$ in our itinerary planning, we obtain the following:

$$\arg\max_{x,u} \Pr(\mathbf{X}) \;\approx\; \arg\min_{x,u} \sum_{(i,j) \in E} -\log \Pr(X_{t+1} = j \mid X_t = i) \cdot x_{i,j} \tag{3}$$

Equation (3) indicates that existing solvers for the orienteering problem are directly applicable to the maximum likelihood planning of Eq. (3) [8, 9]. That is, using a solver with the cost values $\hat{c}_{i,j} := -\log \Pr(X_{t+1} = j \mid X_t = i)$, we can obtain a maximum likelihood route $\mathbf{X}^\star$. In the following, we write $P_{i,j} := \Pr(X_{t+1} = j \mid X_t = i)$ for the sake of simplicity.
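The cost conversion behind Eq. (3) can be sketched in a few lines; the random transition matrix below is only an assumed stand-in for an estimated one, and all names are ours.

```python
# Minimal sketch of Eq. (3): turn transition probabilities into additive edge costs
# c_hat[i, j] = -log P[i, j], so that minimizing the summed cost of a path maximizes
# the (approximate) product of transition probabilities along it.
import numpy as np

rng = np.random.default_rng(0)
N = 5
P = rng.random((N, N))
np.fill_diagonal(P, 0.0)
P = P / P.sum(axis=1, keepdims=True)     # row-normalize: P[i, j] ~ Pr(X_{t+1}=j | X_t=i)

eps = 1e-12                               # guard against log(0) for impossible transitions
c_hat = -np.log(P + eps)
print(c_hat.round(2))                     # feed these costs to the orienteering solver
```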

2.3. Generating lists of solutions
Typical optimization problems and their solvers output a single optimal solution. In applications, however, displaying multiple solutions is preferred (e.g., on a Web interface), so we need to compute (possibly) top-$k$ solutions with existing solvers. Known methods have been proposed for this purpose [11], where one fixes an order among decision variables and iteratively solves sub-problems. Instead of this procedure, which requires some algorithmic design, we use the following implementation to obtain $k$ solutions. Let $\mathcal{L}^{(m)}$ be the ordered set of $m$ solutions and $\mathcal{L}^{(0)} = \emptyset$. To generate the next (i.e., $m$-th) solution, we solve the optimization problem under the additional constraints $\mathbf{X} \ne \mathbf{X}^{(j)}$ for all $\mathbf{X}^{(j)} \in \mathcal{L}^{(m-1)}$. Finally, we obtain an ordered set $\mathcal{L}^{(k)} = \{\mathbf{X}^{(1)}, \mathbf{X}^{(2)}, \ldots, \mathbf{X}^{(k)}\}$ consisting of $k$ itineraries.
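The following sketch illustrates this exclusion-based top-$k$ loop. For self-containedness it enumerates candidate paths by brute force instead of re-invoking a MILP solver, so it only shows the loop structure; with a solver, each previously returned itinerary would instead be excluded by an additional constraint $\mathbf{X} \ne \mathbf{X}^{(j)}$. All names and the toy data are ours.

```python
# Top-k itineraries by repeatedly solving while forbidding previous solutions (Sec. 2.3).
import itertools
import numpy as np

rng = np.random.default_rng(1)
N, L, k = 6, 4, 5                         # POIs 0..N-1 (0 = start, N-1 = goal), length L, list size k
P = rng.random((N, N))
np.fill_diagonal(P, 0.0)
P = P / P.sum(axis=1, keepdims=True)      # toy transition matrix

def best_itinerary(excluded):
    """Most likely length-L path from POI 0 to POI N-1 that is not in `excluded`."""
    best, best_logp = None, -np.inf
    for middle in itertools.permutations(range(1, N - 1), L - 2):
        path = (0,) + middle + (N - 1,)
        if path in excluded:
            continue
        logp = sum(np.log(P[a, b]) for a, b in zip(path, path[1:]))
        if logp > best_logp:
            best, best_logp = path, logp
    return best

ranked = []                               # L^(m) is built by excluding L^(m-1)
for _ in range(k):
    nxt = best_itinerary(set(ranked))
    if nxt is None:
        break
    ranked.append(nxt)
print(ranked)
```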


3. Proposed Framework
We propose a new framework in which users directly interact with our system by editing displayed itineraries. Our motivation in designing this framework is to collect edit-based feedback from users, estimate user preferences from these edits, and use such data when generating personalized itineraries. To implement this framework, we generalize maximum likelihood planning by defining a new optimization problem that estimates transition probabilities among POIs using interaction data.

3.1. User Edits
A user interacts with service interfaces (e.g., Web/mobile app). We collect edit-based feedback from the user representing his/her preferences among itineraries. In our system, the following three types of edits are considered.

Swap For $\mathbf{X} = \langle X_1, X_2, \ldots, X_{|\mathbf{X}|} \rangle$, a swap of two adjacent POIs generates a new itinerary $\mathbf{X}' = \langle X'_1, X'_2, \ldots, X'_{|\mathbf{X}|} \rangle$ such that, for some $j \in [|\mathbf{X}| - 1]$, $X'_j = X_{j+1}$, $X'_{j+1} = X_j$, and $X'_k = X_k$ for all $k \in [|\mathbf{X}|] \setminus \{j, j+1\}$.

Insertion For $\mathbf{X} = \langle X_1, X_2, \ldots, X_{|\mathbf{X}|} \rangle$, an insertion of a new location generates a new itinerary $\mathbf{X}' = \langle X'_1, X'_2, \ldots, X'_{|\mathbf{X}'|} \rangle$ such that $|\mathbf{X}'| = |\mathbf{X}| + 1$ and, for some interior position $j$ (so that the start and goal POIs are unchanged), $X'_j \notin \mathbf{X}$, $X'_k = X_k$ for $k \le j - 1$, and $X'_{k+1} = X_k$ for $k \ge j$.

Deletion For $\mathbf{X} = \langle X_1, X_2, \ldots, X_{|\mathbf{X}|} \rangle$, a deletion of a location $X_j \in \mathbf{X}$ at an interior position $j$ (so that the start and goal POIs are unchanged) generates a new itinerary $\mathbf{X}' = \langle X'_1, X'_2, \ldots, X'_{|\mathbf{X}'|} \rangle$ such that $|\mathbf{X}'| = |\mathbf{X}| - 1$, $X'_k = X_k$ for $k \le j - 1$, and $X'_k = X_{k+1}$ for $k \ge j$.

Note that these edits differ from those in existing work (e.g., [7]), which uses feedback only on POIs.
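As a concrete illustration, the three edit operations can be realized as simple list manipulations. The sketch below (function names and example data are ours) keeps the start and goal POIs fixed, as assumed above.

```python
# Minimal sketch of the three edit operations in Sec. 3.1 on an itinerary
# represented as a tuple of POI ids.
def swap(itin, j):
    """Swap the adjacent POIs at 0-indexed positions j and j+1."""
    out = list(itin)
    out[j], out[j + 1] = out[j + 1], out[j]
    return tuple(out)

def insert(itin, j, poi):
    """Insert a new POI at interior position j (start and goal stay fixed)."""
    assert 0 < j < len(itin) and poi not in itin
    return itin[:j] + (poi,) + itin[j:]

def delete(itin, j):
    """Delete the POI at interior position j (start and goal stay fixed)."""
    assert 0 < j < len(itin) - 1
    return itin[:j] + itin[j + 1:]

X = (1, 5, 3, 9)            # an itinerary <X_1, ..., X_L>
print(swap(X, 1))           # (1, 3, 5, 9): the user prefers the swapped order
print(insert(X, 2, 7))      # (1, 5, 7, 3, 9)
print(delete(X, 2))         # (1, 5, 9)
```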

3.2. Maximum likelihood planning with user edits by optimization
We propose a new itinerary planning method using the feedback from user edits defined in Sec. 3.1. Our idea consists of the following three steps:
   1. casting the estimation of $\{P_{i,j}\}_{i,j \in [N]}$ as an optimization problem,
   2. generalizing the optimization problem from step 1 with penalty functions built from the collected feedback data, and computing modified probabilities $\{\tilde{P}_{i,j}\}_{i,j \in [N]}$, and
   3. adopting the modified probabilities from step 2 when generating maximum likelihood itineraries.

3.2.1. Step 1: Learning-based interpretation
Existing methods estimate $P_{i,j}$ by counting over historical data $\mathcal{D}$ (e.g., historical trajectories or routes) [4, 8]. A simple estimate of $P_{i,j}$ counts the data in $\mathcal{D}$ as $P_{i,j} = \frac{|\{i \to j \in \mathcal{D}\}|}{|\{i \in \mathcal{D}\}|}$. We cast this estimation problem as the following optimization problem:

$$\hat{P} := \arg\min_{P} \sum_{i,j} \left| P_{i,j} - \frac{|\{i \to j \in \mathcal{D}\}|}{|\{i \in \mathcal{D}\}|} \right|^2 \quad \text{subject to} \quad \sum_{j} P_{i,j} = 1 \;\; (\forall i \in [N]) \tag{4}$$

Eq. (4) can be solved in closed form, and its solution is denoted by $\hat{P}$ below. Note that other variants have been discussed [8, 9]; for example, Laplace smoothing with $\alpha > 0$ can be used to estimate $\hat{P}_{i,j}$, and this can also be included in Eq. (4). In the following, we denote the term $\sum_{i,j} \left| P_{i,j} - \frac{|\{i \to j \in \mathcal{D}\}|}{|\{i \in \mathcal{D}\}|} \right|^2$ by the loss function $L_{\mathrm{data}}(P, \mathcal{D})$.
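The counting estimator and its Laplace-smoothed variant mentioned above can be sketched as follows. The toy historical itineraries and all names are ours, and the final renormalization enforces the row-sum constraint of Eq. (4).

```python
# Counting estimate of P_hat from historical itineraries, with optional Laplace smoothing.
import numpy as np

N = 4
D = [(0, 1, 3), (0, 2, 1, 3), (0, 1, 2, 3)]     # toy historical itineraries over POIs 0..N-1

def estimate_P(D, N, alpha=0.0):
    counts = np.zeros((N, N))                    # |{i -> j in D}|
    visits = np.zeros(N)                         # |{i in D}|: itineraries containing i
    for itin in D:
        for i, j in zip(itin, itin[1:]):
            counts[i, j] += 1
        for i in set(itin):
            visits[i] += 1
    P = (counts + alpha) / (visits[:, None] + alpha * N)
    # Rows without outgoing travels (e.g., the goal POI) are only approximately normalized,
    # so we renormalize to satisfy the row-sum constraint of Eq. (4).
    return P / np.clip(P.sum(axis=1, keepdims=True), 1e-12, None)

print(estimate_P(D, N, alpha=0.1).round(3))
```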
3.2.2. Step 2: Generalization
Our key idea in this paper is to generalize Eq. (4) so that user feedback data are taken into account through penalty functions. Intuitively, we define a new objective function of the form $L_{\mathrm{data}}(P, \mathcal{D}) + L(P, \mathcal{D}, \mathcal{D}_{\mathrm{int}})$, where $L(P, \mathcal{D}, \mathcal{D}_{\mathrm{int}})$ is a penalty term involving the estimated probabilities $P$, the historical itineraries $\mathcal{D}$, and the feedback data $\mathcal{D}_{\mathrm{int}}$ collected from users. In practice, we propose the following penalties for the three types of edits.

Swap Consider, as an example, length-4 sequences $\mathbf{X} = \langle X_1, X_2, X_3, X_4 \rangle$ and $\mathbf{X}' = \langle X_1, X_3, X_2, X_4 \rangle$. For $\mathbf{X}$ and $\mathbf{X}'$, we encode the relation $\mathbf{X} \prec \mathbf{X}'$ by $\Pr(\mathbf{X}) < \Pr(\mathbf{X}')$. With our approximation, we have $P_{X_1,X_2} P_{X_2,X_3} P_{X_3,X_4} < P_{X_1,X_3} P_{X_3,X_2} P_{X_2,X_4}$. We therefore adopt a loss term $L_{\mathrm{swap}}(P_{X_1,X_2} P_{X_2,X_3} P_{X_3,X_4} - P_{X_1,X_3} P_{X_3,X_2} P_{X_2,X_4})$ for each 4-tuple $(X_1, X_2, X_3, X_4)$, and add this term to our learning problem in Eq. (4) (see also Swap in Fig. 1).

Insertion For two example itineraries of lengths 3 and 4, $\mathbf{X} = \langle X_1, X_2, X_3 \rangle$ and $\mathbf{X}' = \langle X_1, X_2, X_4, X_3 \rangle$, the insertion is encoded by $\Pr(\mathbf{X}) \le \Pr(\mathbf{X}')$. Similarly, we should have $P_{X_2,X_3} \le P_{X_2,X_4} P_{X_4,X_3}$, and a loss function $L_{\mathrm{ins}}$ is adopted as a penalty term (see also Ins in Fig. 1).

Deletion Conversely, for two example itineraries of lengths 4 and 3, $\mathbf{X} = \langle X_1, X_2, X_3, X_4 \rangle$ and $\mathbf{X}' = \langle X_1, X_3, X_4 \rangle$, we can use a loss function $L_{\mathrm{del}}$ in the same way (see also Del in Fig. 1).

   In summary, we can collect datasets $\mathcal{D}_{\mathrm{int}}$ by designing suitable interfaces, and data such as the 4-tuples $(X_1, X_2, X_3, X_4)$ above for Swap are stored to modify the transition probabilities $P_{i,j}$. We now define a new objective function to estimate $P_{i,j}$ using both $\mathcal{D}$ and $\mathcal{D}_{\mathrm{int}} := \mathcal{D}_{\mathrm{swap}} \sqcup \mathcal{D}_{\mathrm{ins}} \sqcup \mathcal{D}_{\mathrm{del}}$ as follows:

$$\begin{aligned} f := \; & \gamma \cdot L_{\mathrm{data}}(P, \mathcal{D}) + \delta_1 \sum_{(X_1,X_2,X_3,X_4) \in \mathcal{D}_{\mathrm{swap}}} L_{\mathrm{swap}}(X_1, X_2, X_3, X_4) \\ & + \delta_2 \sum_{(X_2,X_3,X_4) \in \mathcal{D}_{\mathrm{ins}}} L_{\mathrm{ins}}(X_2, X_3, X_4) + \delta_3 \sum_{(X_1,X_2,X_3) \in \mathcal{D}_{\mathrm{del}}} L_{\mathrm{del}}(X_1, X_2, X_3). \end{aligned} \tag{5}$$

We write $\tilde{P} := \arg\min_{P} f(P; \gamma, \delta_1, \delta_2, \delta_3)$ subject to $\sum_{j} P_{i,j} = 1$ for all $i \in [N]$.
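One possible way to compute $\tilde{P}$ is to minimize Eq. (5) numerically under the row-sum constraints. The sketch below assumes Swap feedback only ($\delta_2 = \delta_3 = 0$), a squared (Frobenius-type) data term, and a tanh swap penalty (the concrete choices used later in Sec. 3.3), and relies on SciPy's SLSQP solver; the toy data, parameters, and names are ours.

```python
# Sketch of the generalized estimation in Eq. (5) with Swap feedback only.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
N = 6
P_hat = rng.random((N, N)); np.fill_diagonal(P_hat, 0.0)
P_hat /= P_hat.sum(axis=1, keepdims=True)          # stands in for the Eq. (4) estimate

# Swap feedback: for each tuple (a, b, c, d) the user prefers <a, c, b, d> over <a, b, c, d>.
D_swap = [(0, 1, 2, 3), (2, 4, 1, 5)]
gamma, delta1 = 0.25, 16.0

def objective(p_flat):
    P = p_flat.reshape(N, N)
    loss = gamma * np.sum((P - P_hat) ** 2)         # L_data: squared deviation from P_hat
    for a, b, c, d in D_swap:                       # L_swap: push Pr(abcd) below Pr(acbd)
        loss += delta1 * np.tanh(P[a, b] * P[b, c] * P[c, d]
                                 - P[a, c] * P[c, b] * P[b, d])
    return loss

constraints = [{"type": "eq", "fun": (lambda p, i=i: p.reshape(N, N)[i].sum() - 1.0)}
               for i in range(N)]                   # row-sum constraints of Eq. (4)
res = minimize(objective, P_hat.ravel(), method="SLSQP",
               bounds=[(0.0, 1.0)] * (N * N), constraints=constraints)
P_tilde = res.x.reshape(N, N)
print(np.round(P_tilde - P_hat, 3))                 # how the feedback shifted the probabilities
```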


3.2.3. Step 3: Planning with modified probabilities
After solving Eq. (5), we obtain $\tilde{P}$ instead of $\hat{P}$ from Eq. (4), where we expect that $\tilde{P}$ reflects the interaction information in $\mathcal{D}_{\mathrm{int}}$ through soft constraints. We can then obtain different itineraries (e.g., top-$k$ itineraries) by using $\tilde{P}$ rather than $\hat{P}$.

3.3. How our new optimization problem modifies transition matrices
We explain our framework using a toy example. We prepare $N = 10$ synthetic locations and generate $P_{X,Y}$ for $X, Y \in [N]$ randomly with $P_{X,X} = 0$. For our loss functions, we adopt the Frobenius norm for $L_{\mathrm{data}}$ and the tanh function for $L_{\mathrm{swap}}$ in Eq. (5), and show that our proposed method works as expected, using the Swap operation as an example.
   We first prepare a random transition matrix as shown in Fig. 2a. We randomly select 10 tuples $(X_1, X_2, X_3, X_4)$ that violate the Swap condition to build $\mathcal{D}_{\mathrm{swap}}$. Here, $(X_1, X_2, X_3, X_4)$ is neg if $P_{X_1,X_2} P_{X_2,X_3} P_{X_3,X_4} < P_{X_1,X_3} P_{X_3,X_2} P_{X_2,X_4}$, and pos otherwise. We assume that a user says $\langle X_1, X_2, X_3, X_4 \rangle \prec \langle X_1, X_3, X_2, X_4 \rangle$. Using the parameters $\gamma = 0.25$, $\delta_1 = 16$, $\delta_2 = \delta_3 = 0$, we compute a modified matrix $\tilde{P}$ (illustrated in Fig. 2b; $P - \tilde{P}$ is shown in Fig. 2c).
   As a result, we have 2,349 neg and 2,691 pos tuples under $P$ (out of ${}_{10}P_4 = 5{,}040$ tuples in total), and 2,518 neg and 2,522 pos tuples under $\tilde{P}$. Out of the 2,349 neg tuples under $P$, 1,988 tuples remain neg and 361 tuples become pos. Similarly, out of the 2,691 pos tuples under $P$, 530 tuples become neg, while 2,161 tuples remain pos, as summarized in Fig. 2d. For $\mathcal{D}_{\mathrm{swap}}$, $\tilde{P}$ satisfies the condition for 7 tuples out of 10. We thus confirm that 10 interaction samples in $\mathcal{D}_{\mathrm{swap}}$ softly affect a subset of 4-tuples under $\tilde{P}$. Note that other loss functions for both $L_{\mathrm{data}}$ and $L_{\mathrm{swap}}$ are applicable.

[Figure 2 panels: (a) random $P_{X_i,X_j}$; (b) modified $\tilde{P}_{X_i,X_j}$; (c) differences $P - \tilde{P}$; (d) results and transformation of all 4-tuples, summarized as:]

      $P \to \tilde{P}$     neg (2,518)    pos (2,522)
      neg (2,349)              1,988            361
      pos (2,691)                530          2,161

Figure 2: How our problem in Eq. (5) modifies $P$ into $\tilde{P}$ with 10 interaction Swap pairs and $\gamma = 0.25$, $\delta_1 = 16$, $\delta_2 = \delta_3 = 0$.
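The before/after check summarized in Fig. 2d can be reproduced with a short script: classify every ordered 4-tuple as neg or pos under two matrices and cross-tabulate the labels. The matrices below are random stand-ins; in practice $\tilde{P}$ would come from Eq. (5), and all names are ours.

```python
# Cross-tabulate neg/pos labels of all ordered 4-tuples under P and a modified matrix.
import itertools
import numpy as np

rng = np.random.default_rng(0)
N = 10
def random_P():
    P = rng.random((N, N)); np.fill_diagonal(P, 0.0)
    return P / P.sum(axis=1, keepdims=True)
P, P_tilde = random_P(), random_P()        # P_tilde would normally be the Eq. (5) solution

def is_neg(M, a, b, c, d):
    # neg: the original order is less likely than the swapped one (Sec. 3.3).
    return M[a, b] * M[b, c] * M[c, d] < M[a, c] * M[c, b] * M[b, d]

results = [(is_neg(P, *t), is_neg(P_tilde, *t))
           for t in itertools.permutations(range(N), 4)]    # 10P4 = 5,040 tuples
table = np.zeros((2, 2), dtype=int)
for before, after in results:
    table[0 if before else 1, 0 if after else 1] += 1
print(table)   # rows: neg/pos under P; columns: neg/pos under P_tilde (cf. Fig. 2d)
```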


4. Proof-of-concept experiments
We demonstrate how our proposed framework works with crawled real data. In the following experiments, we keep the two loss functions (the Frobenius norm for $L_{\mathrm{data}}$ and tanh for $L_{\mathrm{swap}}$) and focus only on Swap in Eq. (5). In this paper, we only evaluate how the collected interaction data $\mathcal{D}_{\mathrm{swap}}$ affect the computed ranked lists of itineraries. To do so, we compute the resulting lists under multiple settings and compare them quantitatively.

Setup We extracted user-generated itineraries from TripHobo and rating data from TripAdvisor1. We searched for itineraries tagged with Tokyo and collected the individual itineraries. An itinerary consists of several days (i.e., day 1, day 2, etc.), so we divided each multi-day itinerary into a set of one-day itineraries to focus on planning within a day. We collected these one-day itineraries as our historical data. From this whole dataset, we kept only POIs in a selected area of Tokyo around Asakusa2, which leaves 245 itineraries in our $\mathcal{D}$.

    1
     https://www.triphobo.com/ and https://www.tripadvisor.jp/ (accessed June 2022).
    2
     By selecting POIs whose locations are included in the area of latitude [35.443674, 35.825408] and longitude
[139.514896, 139.927981].
[Figure 3 panels: (a) $\hat{P}$; (b)–(f) 1st–5th itineraries $T_1, T_2, T_3, T_4, T_5$; (g)–(k) 1st–5th itineraries $T_2, T_1, T_6, T_7, T_8$; (l) $\tilde{P}$; (m)–(q) 1st–5th itineraries $T_9, T_{10}, T_{11}, T_{12}, T_{13}$.]

Figure 3: The two matrices $\hat{P}$ (Fig. 3a) and $\tilde{P}$ (Fig. 3l), and the resulting itineraries: with $\hat{P}$ and $\beta = 0$ (top row, Figs. 3b–3f), with $\hat{P}$ and $\beta = 1$ (middle row, Figs. 3g–3k), and with $\tilde{P}$ and $\beta = 0$ (bottom row, Figs. 3m–3q).


With the extracted itineraries, we also collected the set $[N]$ of all POIs in the data ($N = 29$). For each POI $i \in [N]$, we obtained $\mathrm{Score}(i)$ from the star ratings recorded in TripAdvisor.
   We implemented our top-$k$ itinerary planning algorithm (as in Sec. 2.3), set $k = 5$, and tested $\alpha = 1$ and $\beta \in \{0, 1\}$. To evaluate our method, we compared the obtained lists of top-5 itineraries under $\hat{P}$ and $\tilde{P}$. To learn $\tilde{P}$, we randomly sampled 300 pairs from the neg 4-tuples as simulated interaction data $\mathcal{D}_{\mathrm{swap}}$. The parameter settings were the same as those in Sec. 3.3.

Visualization of generated itineraries For random start and goal POIs out of the 29 POIs, with $\hat{P}$ for both the $\beta = 0$ and $\beta = 1$ cases, the baseline method generated 10 itineraries in total, as illustrated in Fig. 3, of which 8 were unique. Using the identifiers depicted in Fig. 3, we obtained $\mathcal{L}_1 = \langle T_1, T_2, T_3, T_4, T_5 \rangle$ with $\hat{P}$ and $\beta = 0$, and $\mathcal{L}_2 = \langle T_2, T_1, T_6, T_7, T_8 \rangle$ with $\hat{P}$ and $\beta = 1$. With $\tilde{P}$ and $\beta = 0$, Fig. 3 illustrates the new itineraries generated through our framework, giving a new list $\mathcal{L}_3 = \langle T_9, T_{10}, T_{11}, T_{12}, T_{13} \rangle$. With $\tilde{P}$ and $\beta = 1$, we obtain another list $\mathcal{L}_4 = \langle T_{10}, T_9, T_{11}, T_{13}, T_2 \rangle$. Note that $\mathcal{L}_4$ is not illustrated in Fig. 3 because its itineraries already appear there.
Evaluations To evaluate the itineraries in terms of scores ($\sum_{i=2}^{N-1}\sum_{j=2}^{N} \mathrm{Score}(i)\, x_{i,j}$) and rankings (i.e., $\mathcal{L}_1$, $\mathcal{L}_2$, and $\mathcal{L}_3$), we first measure the total score and travel cost of each itinerary. Figure 4a shows a scatter plot of the two terms of Eq. (2); the x-axis shows the total travel distance of each itinerary and the y-axis shows the total score it obtains. Next, we evaluate $\mathcal{L}_3$ with different sizes of $\mathcal{D}_{\mathrm{swap}}$. Figure 4b shows how the top-5 lists vary as the number of neg samples increases (x-axis, from 0 to 500); the y-axis represents the top-$k$ ranking with black circles, and a line between two circles indicates that the two itineraries are the same.

[Figure 4 panels: (a) scatter plot (score and cost); (b) rankings for different scores and sample sizes.]

Figure 4: Comparison of itineraries and rankings. Fig. 4a shows a scatter plot of scores versus travel costs. Fig. 4b shows how the ranking lists vary as the number of training samples in $\mathcal{D}_{\mathrm{swap}}$ increases.
   As a result, Fig. 3, Fig. 4a, and Fig. 4b indicate that our approach can generate a variety of itineraries. In other words, our proposed method diversified the ranking results based on the interaction data.


5. Conclusion
We proposed a new framework in which users directly interact with our system by editing displayed itineraries. Our idea is to collect rich edit-based feedback from users and to utilize such data when generating personalized itineraries. Through our proof-of-concept experiments, we confirmed that our method can diversify the rankings produced by top-$k$ itinerary generation via the orienteering problem.
   In future work, we will investigate learning-based methods using interaction data more deeply, and plan a quantitative user study to develop an interaction- and optimization-based itinerary planning method similar to [2].


References
 [1] A. A. da Silva, R. Morabito, V. Pureza, Optimization approaches to support the planning
     and analysis of travel itineraries, Expert Systems with Applications 112 (2018) 321–330.
     URL: https://www.sciencedirect.com/science/article/pii/S0957417418303920.
     doi:10.1016/j.eswa.2018.06.045.
 [2] M. De Choudhury, M. Feldman, S. Amer-Yahia, N. Golbandi, R. Lempel, C. Yu, Automatic
     construction of travel itineraries using social breadcrumbs, in: Proceedings of the 21st
     ACM Conference on Hypertext and Hypermedia, HT ’10, Association for Computing
     Machinery, New York, NY, USA, 2010, pp. 35–44.
 [3] K. H. Lim, J. Chan, C. Leckie, S. Karunasekera, Personalized tour recommendation based
     on user interests and points of interest visit durations, in: Proceedings of the 24th
     International Conference on Artificial Intelligence, IJCAI’15, AAAI Press, Buenos Aires,
     2015, pp. 1778–1784.
 [4] D. Chen, C. S. Ong, L. Xie, Learning points and routes to recommend trajectories, in:
     Proceedings of the 25th ACM International on Conference on Information and Knowledge
     Management, CIKM ’16, Association for Computing Machinery, New York, NY, USA, 2016,
     pp. 2227–2232.
 [5] P. Vansteenwegen, W. Souffriau, D. Van Oudheusden, The orienteering problem: A survey,
     European Journal of Operational Research 209 (2011) 1–10.
 [6] Z. Friggstad, S. Gollapudi, K. Kollias, T. Sarlos, C. Swamy, A. Tomkins, Orienteering algo-
     rithms for generating travel itineraries, in: Proceedings of the Eleventh ACM International
     Conference on Web Search and Data Mining, WSDM ’18, Association for Computing
     Machinery, New York, NY, USA, 2018, pp. 180–188.
 [7] S. Basu Roy, G. Das, S. Amer-Yahia, C. Yu, Interactive itinerary planning, in: 2011 IEEE
     27th International Conference on Data Engineering, ICDE ’11, IEEE Computer Society,
     USA, 2011, pp. 15–26. doi:10.1109/ICDE.2011.5767920.
 [8] R. Canoy, T. Guns, Vehicle routing by learning from historical solutions, in: Proc. of
     CP2019, 2019, pp. 54–70.
 [9] J. Mandi, R. Canoy, V. Bucarey, T. Guns, Data driven VRP: A neural network model to learn
     hidden preferences for VRP, in: Proc. of CP2021, 2021.
[10] C. E. Miller, A. W. Tucker, R. A. Zemlin, Integer programming formulation of traveling
     salesman problems, Journal of the ACM (JACM) 7 (1960) 326–329.
[11] E. L. Lawler, A procedure for computing the k best solutions to discrete optimization
     problems and its application to the shortest path problem, Management science 18 (1972)
     401–405.