Itinerary Recommendation for Cruises: User Study
Diana Nurbakova∗
LIRIS - INSA Lyon
20 avenue Albert Einstein
Villeurbanne 69621 cedex, France
diana.nurbakova@insa-lyon.fr

Léa Laporte
LIRIS - INSA Lyon
20 avenue Albert Einstein
Villeurbanne 69621 cedex, France
lea.laporte@insa-lyon.fr

Sylvie Calabretto
LIRIS - INSA Lyon
20 avenue Albert Einstein
Villeurbanne 69621 cedex, France
sylvie.calabretto@insa-lyon.fr

Jérôme Gensel
Université Grenoble Alpes, CNRS, Grenoble INP, LIG
Grenoble F-38000, France
jerome.gensel@univ-grenoble-alpes.fr

∗ D. Nurbakova held a doctoral fellowship from la Région Auvergne-Rhône-Alpes.

RecTour 2017, August 27th, 2017, Como, Italy. Copyright held by the author(s).

ABSTRACT
Vacations and leisure activities constitute an important part of human life. Nowadays, a lot of
attention is paid to cruising, which is reported to be a favourite vacation choice for families with
kids and for Millennials. Like other distributed events (events that gather multiple activities
distributed in space and time under one umbrella) such as big festivals, conventions, and
conferences, cruises offer a vast variety of simultaneous on-board activities for all ages and
tastes. This results in information overload for cruisers, in particular given the very limited
availability of activities. Recommender systems appear as a desirable solution in such an
environment. Due to the number of time constraints, it is more convenient to get a personalised
itinerary of activities rather than a top-n list. In this paper, we present a user study conducted
in order to create a preliminary dataset that simulates users’ attendance of a cruise and sheds
light on activity selection behaviour. We discuss the challenges faced by itinerary recommendation
and illustrate them with examples from the user study.

CCS CONCEPTS
• Information systems → Personalization;

KEYWORDS
recommendation of leisure activities, itinerary recommendation

1    INTRODUCTION
Nowadays, the field of leisure activities is experiencing substantial growth. In this context, a
rising phenomenon is that of distributed events, which gather various activities under one umbrella.
They attract more and more attendees. Examples of such events are cruises, festivals, big
conferences, conventions, etc.
   Attendees of distributed events are overwhelmed by the number of ongoing parallel activities and
are looking for a personalised experience. Recommender systems appear as a natural solution in such
an environment. Note that, given the density of activities and their limited availability,
participants are interested in a personalised itinerary (a sequence of activities to undertake)
rather than in a list of top-n activities that may compete in terms of time.
   In this work, we consider the case of a cruise. According to the Florida-Caribbean Cruise
Association (F-CCA) [6], about 25.3M passengers are expected to cruise globally in 2017, showing a
7% average annual passenger growth rate over the last 30 years. Cruising has become a preferred
vacation choice for families, especially with kids, making the cruiser population younger and more
diverse than non-cruisers. F-CCA reports [6] that cruising is the favourite choice of Millennials
and Generation X. Cruisers appreciate the opportunity to relax and get away from it all, and to see
and do new things. Cruise lines offer a vast variety of activities on board, as well as in ports of
call.
   In this paper, we focus on itinerary recommendation and present a user study based on a 7-night
Disney Fantasy cruise. More precisely, we aim at answering the following research questions.
   RQ1: What is itinerary recommendation and what makes it challenging?
   RQ2: What are the characteristics of the data treated by itinerary recommendation? Is there any
dataset that could be used as is?
   The remainder of the paper is organised as follows. In Section 2 we define the itinerary
recommendation problem and the challenges it faces. Section 3 gives an overview of existing
datasets, presents our user study that simulates users’ attendance of a cruise, and discusses the
conducted analysis. Section 4 concludes the paper.

2   PROBLEM STATEMENT AND CHALLENGES
In this paper, we aim at finding a personalised itinerary for a given user that maximises the
user’s satisfaction and takes into account spatio-temporal constraints. More precisely, given a set
of activities with their locations, descriptions, time windows of their availability, duration, and
a vector of categories, a set of users, and a binary user history (attendance) matrix, find for
every given user a feasible sequence of activities (or itinerary) that maximises the user’s
satisfaction. The user’s satisfaction with respect to an itinerary is defined as the sum of the
user’s satisfaction scores over all the activities within the itinerary. For more details on the
itinerary recommendation problem, see [9].
   Itinerary recommendation faces the following challenges.
   C-1: Implicit Feedback. Given that activities take place in the future, as in the case of event
recommendation [8], there is very little information to work with and there are far fewer user-item
interactions than in traditional recommendation scenarios. We deal with implicit feedback, meaning
that the degree to which a user likes or dislikes an item is not known. The use of multiple
contexts may increase the recommendation performance of the algorithms.
   C-2: Interest vs. Attendance. Due to the limited availability and multiple parallel activities,
we deal with attendance bias, as a user may miss an activity of interest to him/her or, in
contrast, may join an activity that is of no particular interest to him/her.
   C-3: List vs. Itinerary. Activities are competitive and short-lived, which results in the user’s
preference for one activity over the others in a given time slot. In this context, an itinerary (a
feasible sequence of activities) is more desirable than a list of interesting activities.
   We will illustrate the challenges in the next section.

[Figure 1 chart omitted. x-axis: User (1 to 23); y-axis: #activities (0 to 300); legend:
Interested & Going, Not Interested & Going, Interested & Not Going, Not Interested & Not Going.]
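The objective stated in Section 2 can be illustrated with a minimal Python sketch. It assumes that times are expressed in minutes from midnight and that an itinerary is feasible when each activity can be completed inside its time window after travelling from the previous venue; the exact feasibility model is the one described in [9] and may differ. The `Activity` fields, the `constant_travel` helper, the durations and the scores are hypothetical; only the two time windows and locations are taken from the survey examples in Tab. 2.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Activity:
    name: str
    location: str
    opens: int     # start of the availability window, minutes from midnight
    closes: int    # end of the availability window, minutes from midnight
    duration: int  # service time in minutes (hypothetical values below)


def itinerary_satisfaction(itinerary: List[Activity], scores: Dict[str, float]) -> float:
    """Sum of the user's satisfaction scores over the activities in the itinerary."""
    return sum(scores[a.name] for a in itinerary)


def is_feasible(itinerary: List[Activity],
                travel_minutes: Callable[[str, str], int],
                start_time: int = 0) -> bool:
    """Check that the activities can be attended in the given order."""
    clock = start_time
    previous = None
    for activity in itinerary:
        if previous is not None:
            clock += travel_minutes(previous.location, activity.location)
        clock = max(clock, activity.opens)  # wait until the activity becomes available
        if clock + activity.duration > activity.closes:
            return False  # the activity cannot be completed inside its window
        clock += activity.duration
        previous = activity
    return True


def constant_travel(origin: str, destination: str) -> int:
    """Assumed travel time: 10 minutes between any two distinct venues."""
    return 0 if origin == destination else 10


tickets = Activity("Character Meet & Greet Ticket Distribution",
                   "Port Adventures Desk", 11 * 60 + 30, 15 * 60, 15)
sailing = Activity("Sailing Away", "Deck Stage", 16 * 60 + 30, 17 * 60 + 15, 45)
scores = {tickets.name: 0.25, sailing.name: 0.5}  # hypothetical satisfaction scores

plan = [tickets, sailing]
plan_is_feasible = is_feasible(plan, constant_travel)
reversed_is_feasible = is_feasible([sailing, tickets], constant_travel)
plan_score = itinerary_satisfaction(plan, scores)
```

Under these assumptions the order matters: attending the ticket distribution first and Sailing Away afterwards is feasible, whereas the reverse order is not, because the ticket distribution window closes before Sailing Away ends.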
3     USER STUDY
In this section, we formulate a list of characteristics of a dataset satisfying the needs of the
target problem, provide a comparison of available datasets (see Tab. 1) and describe a user study
conducted in order to collect data with the desired characteristics.

3.1     Data Characteristics and Existing Datasets
We categorise the existing datasets w.r.t. the focus of the data into 3 groups: Single Item,
Schedule, and Sequence. We define a list of characteristics (column "Characteristics" in Tab. 1)
based on the activity attributes and the consecutive nature of activities performed during
distributed events. We cluster the characteristics into 5 types w.r.t. the entity they describe:
item (unit under consideration), sequence (ordered sequence of items), user (information about
users), user-item (user-item interactions), and user-user (relations between users). We
distinguish 5 essential characteristics (given in italics in Tab. 1): (1) time windows (start and
end time of activity availability), (2) coordinates (geographical location of an activity),
(3) service time (duration of an activity), (4) categories (associated categories), and (5) users’
historical data. Though we indicate only 5 elements as essential, all the others listed in Tab. 1
are also important, as they may enhance the recommendation. As can be seen, none of the existing
datasets contains all the essential characteristics. Thus, we have made an attempt to create an
integral dataset that contains all the required features and provides an insight into users’
behaviour.

3.2     Data Collection
In order to collect the required data, we performed a user study via an online survey.
Participants were recruited via a link to the online questionnaire sent by email to several
research and university mailing lists. The stated aim of the study was to create a dataset that
simulates cruise attendance and could be used to make personalised recommendations of itineraries.
The list of activities used in the survey was taken from the personal navigators of Disney’s
Fantasy 7-night Eastern Caribbean cruise. Activities dedicated exclusively to kids were excluded
from the list. The original personal navigators can be found online^3. The deck plan of the ship
can be found on the web^4. The questionnaire consisted of 4 parts. An overview of the survey with
examples of questions is given in Tab. 2.
   In total, 23 contributions were collected. Statistics concerning the participants are provided
in Tab. 3. The main statistics of the dataset are given in Tab. 4. The average duration of an
activity is 45 minutes. The average number of ongoing simultaneous activities is 5.

3.3      Data Analysis
The conducted user study gives a more practical insight into personalised itinerary recommendation
and the activity selection process. In the following, we illustrate the challenges from Section 2.
   C-2: Interest vs. Attendance. Figure 1 displays, for each user, the distribution of the number
of activities that the user: (1) was interested in (rating ≥ 4, or rating = 3 if the highest rating
given by the user to any activity is equal to 3) and joined (Interested & Going), (2) was
interested in but did not join (Interested & Not Going), (3) was not interested in but joined (Not
Interested & Going), and (4) was not interested in and did not join (Not Interested & Not
Going)^5. The chart shows that individuals miss many activities that interest them: the number of
Interested & Not Going activities is almost twice as high (a ratio of 1.7621) as the number of
Interested & Going activities. It is also surprising that Not Interested & Going activities
constitute about 43% of all joined activities.

Figure 1: Distribution of interest in activities and attendance per user.

   C-3: List vs. Itinerary. Let us consider the following setting. We compare several top-n item
recommendation algorithms against an itinerary recommendation algorithm from the literature. As
history data we consider a binary attendance matrix.
   - Category-based: This algorithm ranks the candidate activities based on the weighted frequency
of their corresponding categories.
   - Content-based: The candidate activities are ranked in descending order of their textual
similarity with the user’s past activities. An activity is represented as a TF-IDF vector. The
user’s profile is built over the TF-IDF vectors of the activities joined by the user in the past.
   - Logistic Regression: We fed a vector of the aforementioned scores into a logistic regression
model.

^1 Yelp challenge dataset, http://www.yelp.com/dataset_challenge
^2 https://github.com/jalbertbowden/foursquare-user-dataset
^3 http://disneycruiselineblog.com/2015/07/personal-navigators-7-night-eastern-caribbean-cruise-on-disney-fantasy-itinerary-a-june-20-2015/
^4 http://disneycruiselineblog.com/ships/deck-plans-disney-dream-disney-fantasy/
^5 Ratings are used only for this part of the study. We do not consider them when estimating a
user’s interest in activities, as we assume that only a binary attendance matrix exists.
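The four-way split used for challenge C-2 and Figure 1 can be sketched in Python as follows. The interest rule is the one stated in the text (rating ≥ 4, or rating = 3 when 3 is the user's highest rating). The function names, the assumption that every rated activity also has an attendance decision, and the toy ratings are hypothetical and are not taken from the collected dataset.

```python
from collections import Counter
from typing import Dict, Set


def interest_threshold(ratings: Dict[str, int]) -> int:
    """Lowest rating that counts as 'interested' for this user."""
    # Rule from the text: rating >= 4, or rating = 3 if the user's top rating is 3.
    return 3 if max(ratings.values()) == 3 else 4


def split_by_interest_and_attendance(ratings: Dict[str, int], going: Set[str]) -> Counter:
    """Count one user's activities in each of the four Figure 1 groups."""
    threshold = interest_threshold(ratings)
    counts = Counter()
    for activity, rating in ratings.items():
        interest = "Interested" if rating >= threshold else "Not Interested"
        attendance = "Going" if activity in going else "Not Going"
        counts[interest + " & " + attendance] += 1
    return counts


# Toy data for one hypothetical user (not from the study).
ratings = {"a": 5, "b": 4, "c": 4, "d": 2, "e": 1}
going = {"a", "d"}
counts = split_by_interest_and_attendance(ratings, going)

# Ratio of missed interesting activities to attended interesting activities.
missed_ratio = counts["Interested & Not Going"] / counts["Interested & Going"]
# Share of joined activities that the user was not interested in.
joined = counts["Interested & Going"] + counts["Not Interested & Going"]
uninterested_share = counts["Not Interested & Going"] / joined
```

Summing such counters over all users would give the aggregate quantities discussed in the text, assuming the ratio and the share are computed over pooled counts.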


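Table 1: Comparison of the available datasets. ✓ = characteristic present, - = absent.
* = essential characteristic (set in italics in the original layout).
Dataset groups: Single Item = TREC CS’13 to Meetup; Schedule = MCTOPMTW, Other OP-TW, Other OP-based; Sequence = TripBuilder, GeoLife.

Entity    Characteristic              TREC CS’13 [3]  TREC CS’14 [4]  TREC CS’15 [2]  Yelp^1  Foursquare_1 [14]  Foursquare_2^2  Flickr [11]  Twitter [5]  Meetup [7]  MCTOPMTW [10]  Other OP-TW [12]  Other OP-based [12]  TripBuilder [1]  GeoLife [15]
Item      Time windows*               -               -               -               ✓       -                  -               -            -            -           ✓              ✓                 -                    -                -
          Coordinates*                ✓               ✓               ✓               ✓       ✓                  ✓               ✓            ✓            ✓           ✓              ✓                 ✓                    ✓                -
          Service Time*               -               -               -               -       -                  -               -            -            -           ✓              ✓                 ✓                    -                -
          Categories*                 -               -               -               -       -                  -               -            -            ✓           ✓              -                 -                    ✓                -
          Price                       -               -               -               -       ✓                  -               -            -            -           ✓              -                 -                    ✓                -
          Item Additional Attributes  -               -               -               ✓       -                  -               ✓            -            ✓           -              -                 -                    -                -
          Description                 -               -               -               -       -                  -               ✓            ✓            -           -              -                 -                    -                -
Sequence  Time budget                 -               -               -               -       -                  -               -            -            -           ✓              ✓                 ✓                    -                -
          Starting/Ending Point       -               -               -               -       -                  -               -            -            -           ✓              ✓                 ✓                    -                ✓
          Tour Additional Attributes  -               -               ✓               -       -                  -               -            -            -           -              -                 -                    -                ✓
User      User’s personal data        -               -               ✓               -       -                  -               -            -            ✓           -              -                 -                    -                -
User-Item Historical Data*            ✓               ✓               ✓               -       ✓                  ✓               ✓            ✓            ✓           -              -                 -                    ✓                ✓
          Score                       ✓               ✓               ✓               -       ✓                  ✓               -            -            -           ✓              ✓                 ✓                    -                -
User-User Social links                -               -               -               ✓       ✓                  ✓               -            -            ✓           -              -                 -                    -                -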

Table 2: Description of the parts of the survey. Qnt denotes the number of questions in a section.

Section            Qnt  Description                                           Question Examples
User Profile       10   Questions on basic user’s features and their          Your gender: [ ] Female  [ ] Male
                        cruising experience                                   Have you already experienced DCL (Disney Cruise Line)?
                                                                              Are you aiming to attend the maximum amount of activities
                                                                              mentioned in your Personal Navigator or just a few must-see?

Users Preferences  311  User’s evaluation of a list of proposed activities    Sailing Away. Don’t Miss Event.
                        by selecting one of the grades for the listed         Description: It’s time to go Sailing Away! Join Mickey and Minnie
                        activities: 1 - Never (not interested at all and      along with Tinker Bell and the rest of the gang as they welcome
                        won’t recommend to anyone to attend it); 2 - Not      you aboard the Disney Fantasy.
                        interested; 3 - Neutral; 4 - Interested; 5 - Won’t    Available: Day 1, 16:30-17:15, Location: Deck Stage
                        miss                                                  Never [rating buttons] Won’t miss

Itinerary Planner  593  Organisation of the activities into a day-wise        Event: 11:30 - 15:00. Character Meet & Greet Ticket
                        itinerary. Given an ordered list of activities with   Distribution. Category: Characters. Location: Port
                        their availability hours, the respondents were        Adventures Desk. Don’t Miss Event
                        asked to indicate their intention to join the         Going: [ ]   Not going: [ ]
                        activity or not by clicking on "Going" or "Not
                        going".

Afterwards         5    Conclusion questions                                  When you were having a choice among different activities of
                                                                              your interest, did you consider the distance to the venue while
                                                                              making your choice?
                                                                              How do you usually manage the list of activities to perform
                                                                              during your vacations?


   - ILS+Scores: We also tested a state-of-the-art itinerary construction algorithm [9] that is
based on the Iterated Local Search (ILS) algorithm [13] with activities scores calculated using
hybrid scores (content-based, category-based and time-based) and transition probabilities between
activities.
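The two simplest baselines listed above (content-based and category-based) can be sketched in plain Python. The text does not specify the TF-IDF variant, how the user profile aggregates the vectors of past activities, or how category frequencies are weighted, so the choices below (term frequency times log(N/df) + 1, a summed profile compared by cosine similarity, relative category frequency) are assumptions made for illustration only. All function names and the toy data are hypothetical. The logistic regression and ILS+Scores methods are not sketched here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: {activity: non-empty list of tokens} -> {activity: {term: weight}}."""
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    vectors = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def content_based_ranking(docs, attended, candidates):
    """Rank candidates by similarity to a profile summed over attended activities."""
    vectors = tfidf_vectors(docs)
    profile = Counter()
    for name in attended:
        for term, weight in vectors[name].items():
            profile[term] += weight
    scored = {name: cosine(profile, vectors[name]) for name in candidates}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


def category_scores(attended_categories, candidate_categories):
    """Score candidates by the relative frequency of their categories in the history."""
    freq = Counter(c for cats in attended_categories for c in cats)
    total = sum(freq.values())
    if total == 0:
        return {name: 0.0 for name in candidate_categories}
    return {name: sum(freq[c] for c in cats) / total
            for name, cats in candidate_categories.items()}


# Toy data (hypothetical activities, not from the study).
docs = {
    "trivia_a": ["disney", "trivia", "quiz"],
    "trivia_b": ["movie", "trivia", "quiz"],
    "bingo": ["bingo", "cash", "prize"],
}
ranked = content_based_ranking(docs, attended=["trivia_a"], candidates=["bingo", "trivia_b"])
cat_scores = category_scores(
    attended_categories=[["Trivia", "Family"], ["Trivia"]],
    candidate_categories={"trivia_b": ["Trivia"], "bingo": ["Games"]},
)
```

In this toy example both baselines place the second trivia activity ahead of bingo, since bingo shares neither terms nor categories with the user's history. Note that such rankings ignore time windows, which is precisely the gap that itinerary construction is meant to fill.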


Table 3: Participants Statistics

Statistic                                                               Value
# Female users                                                          7
# Users already experienced DCL                                         1
# Users already experienced any cruise                                  4
# Users considering the distance between venues                         8
Managing Activities. Not-to-miss List : Daily planning : No planning    14 : 4 : 5
Age group: 21-30 : > 30                                                 16 : 7

Table 4: Dataset Statistics

# Activities   # Days   # Users   # Locations   # Categories
593            7        23        47            52

3.1 will serve as the basis for the new dataset. Another direction of future work consists in
proposing a more accurate solution for itinerary recommendation that would cover all aspects of
the problem and address all of its challenges.

REFERENCES
[1] Igo Brilhante, Jose Antonio Macedo, Franco Maria Nardini, Raffaele Perego, and Chiara Renso.
    2013. Where Shall We Go Today?: Planning Touristic Tours with Tripbuilder. In Proc. of the 22nd
    ACM International Conference on Information & Knowledge Management (CIKM ’13). 757–762.
[2] Adriel Dean-Hall, Charles L. A. Clarke, Jaap Kamps, Julia Kiseleva, and Ellen Voorhees. 2015.
    Overview of the TREC 2015 Contextual Suggestion Track. In Proc. of the 24th Text REtrieval
    Conference (TREC 2015).
[3] Adriel Dean-Hall, Charles L. A. Clarke, Jaap Kamps, Paul Thomas, Nicole Simone, and Ellen
    Voorhees. 2013. Overview of the TREC 2013 Contextual Suggestion Track. In NIST Special
    Publication 500-302: The Twenty-Second Text REtrieval Conference Proceedings (TREC 2013),
    Ellen M. Voorhees (Ed.).
[4] Adriel Dean-Hall, Charles L. A. Clarke, Jaap Kamps, Paul Thomas, and Ellen Voorhees. 2014.
    Overview of the TREC 2014 Contextual Suggestion Track. In NIST Special Publication 500-308:
    The Twenty-Third Text REtrieval Conference Proceedings (TREC 2014), Ellen M. Voorhees and
    Angela Ellis (Eds.).
[5] Jacob Eisenstein, Brendan O’Connor, Noah A Smith, and Eric P Xing. 2010. A latent variable
    model for geographic lexical variation. In Proc. of the 2010 Conference on Empirical Methods
    in Natural Language Processing. Association for Computational Linguistics, 1277–1287.
[6] Florida-Caribbean Cruise Association (FCCA). 2017. Cruise Industry Overview. 11200 Pines Blvd.,
    Suite 201 - Pembroke Pines, Florida 33026.
    http://www.f-cca.com/downloads/2017-Cruise-Industry-Overview-Cruise-Line-Statistics.pdf
                                                                                  [7] Xingjie Liu, Qi He, Yuanyuan Tian, Wang-Chien Lee, John McPherson, and Jiawei
                                                                                      Han. 2012. Event-based Social Networks: Linking the Online and Offline Social
                                                                                      Worlds. In Proc. of the 18th ACM SIGKDD conference on Knowledge Discovery and
                                                                                      Data Mining (2012) (KDD’12).
                                                                                  [8] Augusto Q. Macedo, Leandro B. Marinho, and Rodrygo L.T. Santos. 2015. Context-
                                                                                      Aware Event Recommendation in Event-based Social Networks. In Proc. of the
                                                                                      9th ACM Conference on Recommender Systems (RecSys ’15). 123–130.
                                                                                  [9] Diana Nurbakova, Léa Laporte, Sylvie Calabretto, and Jérôme Gensel. 2017. Rec-
     Figure 2: Precision w.r.t. the number of history days.                           ommendation of Short-Term Activity Sequences During Distributed Events. Pro-
                                                                                      cedia Computer Science 108 (2017), 2069 – 2078. International Conference on
                                                                                      Computational Science, {ICCS} 2017, 12-14 June 2017, Zurich, Switzerland.
                                                                                 [10] Wouter Souffriau, Pieter Vansteenwegen, Greet Vanden Berghe, and Dirk
   The algorithms were evaluated in terms of their precision. We                      Van Oudheusden. 2013. The Multiconstraint Team Orienteering Problem with
returned top-20 activities for each day6 using top-n recommenda-                      Multiple Time Windows. Transportation Science 47, 1 (Feb. 2013), 53–63.
tion algorithms. Figure 2 displays the recommendation power of                   [11] Bart Thomee, David A. Shamma, Gerald Friedland, Benjamin Elizalde, Karl Ni,
                                                                                      Douglas Poland, Damian Borth, and Li-Jia Li. 2016. YFCC100M: The New Data
each algorithm with varying number of history days (from 1 to                         in Multimedia Research. Commun. ACM 59, 2 (2016), 64–73.
6). Itinerary recommendation algorithm shows higher precision,                   [12] Pieter Vansteenwegen, Wouter Souffriau, and Dirk Van Oudheusden. 2011. The
proving that an itinerary satisfies better the user’s needs.                          orienteering problem: A survey. Eur J Oper Res 209, 1 (2011), 1 – 10.
                                                                                 [13] Pieter Vansteenwegen, Wouter Souffriau, Greet Vanden Berghe, and Dirk
                                                                                      Van Oudheusden. 2009. Iterated Local Search for the Team Orienteering Problem
4     DISCUSSION AND CONCLUSION                                                       with Time Windows. Comput. Oper. Res. 36, 12 (Dec. 2009), 3281–3290.
                                                                                 [14] Dingqi Yang, Daqing Zhang, and Bingqing Qu. 2016. Participatory Cultural
In this paper we have considered the problem of personalised                          Mapping Based on Collective Behavior Data in Location-Based Social Networks.
itinerary recommendation with special interest for cruises. We                        ACM Transactions on Intelligent Systems and Technology (TIST) 7, 3 (2016), 30.
                                                                                 [15] Yu Zheng, Lizhu Zhang, Xing Xie, and Wei-Ying Ma. 2009. Mining Interest-
have distinguished the characteristics of data used in itinerary rec-                 ing Locations and Travel Sequences from GPS Trajectories. In Proc. of the 18th
ommendation and have presented an overview of available datasets.                     International Conference on World Wide Web (WWW ’09). 791–800.
To the best of our knowledge, this is the first attempt to classify and
summarise the existing datasets and to describe them with respect to
the aforementioned characteristics. Moreover, we have undertaken
a user study in order to build a preliminary dataset that satisfies
all the characteristics and that helps to understand individuals’
behaviour in the activity selection process. Though the discussed dataset
is not large-scale, the user study reveals general trends
in users’ behaviour while on board a cruise or while attending a
distributed event. We have also discussed the challenges faced
in itinerary recommendation and have illustrated
them with the performed data analysis.
    As future work, we plan to create a dataset via crowdsourcing
using the CrowdFlower platform. The characteristics presented in Sec.
6 The average number of joined activities per day is 18.
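
As an illustration of the precision measure used in the evaluation above, the following minimal Python sketch computes precision over the top-n recommended activities for a growing number of history days. The paper does not spell out the exact formula or protocol, so several things here are assumptions: the standard precision@n definition, the use of the day following the history as the test day, and the names precision_at_n, precision_by_history and most_frequent. The last of these is a toy frequency-based stand-in, not one of the evaluated algorithms.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Set


def precision_at_n(recommended: Sequence[str], attended: Set[str], n: int = 20) -> float:
    """Fraction of the top-n recommended activities that the user actually joined."""
    top = list(recommended)[:n]
    if not top:
        return 0.0
    hits = sum(1 for activity in top if activity in attended)
    return hits / len(top)


def precision_by_history(
    days: List[Set[str]],
    recommend: Callable[[List[Set[str]], int], List[str]],
    n: int = 20,
) -> Dict[int, float]:
    """Precision for one user with 1, 2, ... history days.

    Assumption: with h history days, the recommender is evaluated on day h + 1.
    """
    results = {}
    for h in range(1, len(days)):
        history, target = days[:h], days[h]
        results[h] = precision_at_n(recommend(history, n), target, n)
    return results


def most_frequent(history: List[Set[str]], n: int) -> List[str]:
    """Hypothetical stand-in recommender: rank activities by frequency in the history.

    Ties are broken alphabetically so that the ranking is deterministic.
    """
    counts = Counter(activity for day in history for activity in day)
    ranked = sorted(counts, key=lambda a: (-counts[a], a))
    return ranked[:n]


# Toy data for a single user over three days (activity identifiers are made up).
days = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}]
print(precision_by_history(days, most_frequent, n=2))  # {1: 0.5, 2: 1.0}
```

With a seven-day dataset such as the one described in Table 4, the same loop would yield one precision value for each history length from 1 to 6, which could then be averaged over users.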


RecTour 2017, August 27th, 2017, Como, Italy. Copyright held by the author(s).