I Know What You Did Next Summer:
  Challenges in Travel Destination Recommendation

  DMITRI GOLDENBERG, SARAI MIZRACHI, and ADAM HOROWITZ, Booking.com, Tel Aviv, Israel
  IOANNIS KANGAS, OR LEVKOVICH, MAUD SCHWOERER, ALESSANDRO MOZZATO, MICHELE
  FERRETTI, PANAGIOTIS KORVESIS, and LUCAS BERNARDI, Booking.com, Amsterdam, Netherlands


                             Fig. 1. An example of destination recommendations on the Booking.com website

  Picking a destination is the first step of any trip planning. While some travelers have an exact destination in mind, others have
  some degree of open-mindedness and can thus benefit from receiving destination recommendations. Online travel platforms often
  actively recommend a wide variety of travel destinations to their customers; however, customers at different stages of trip planning
  may expect different recommendations to serve different needs such as good alternative destinations or destinations for extending
  a trip. This makes recommending a destination challenging, requiring a complex synthesis of available contextual data along with
  consideration of a customer’s needs and the holistic user experience of receiving recommendations. At a technical level, destinations
  also represent complex entities with ambiguous geographical properties and naming conventions. In this paper, we discuss the
  technical and customer-centric challenges of building real-life destination recommendation systems. We supplement this by providing
  applied solution examples from Booking.com, one of the world’s leading online travel platforms.

  CCS Concepts: • Information systems → Personalization; Recommender systems.

  Additional Key Words and Phrases: Personalization, Travel, Recommender Systems


  1   INTRODUCTION
  Online travel platforms offer a vast variety of products that serve different needs of travelers. Such platforms often
  rely on recommender systems [5] to assist various customer decisions, seeking to help narrow down diverse offerings


Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
            by suggesting well-fit options in a personalized manner [24]. Picking a destination is a high-impact decision in trip
            planning, affecting subsequent decisions about transportation, accommodations or things to do while travelling.
                Choosing a travel destination is not a trivial task. Availability, information gaps, budget, timing, preferences and
            even weather constraints introduce non-negligible complexity [19]. While some travelers have an exact destination in
            mind, others have some degree of open-mindedness. Destination recommendations can help travelers navigate a vast
            landscape of available options. However, customers are likely to benefit from destination recommendations differently
            in different contexts. For example, the way in which destination recommendations can help someone who has not begun
            travel planning ("I’m not even thinking about traveling yet") is not the same as for someone who already has a specific
            trip in mind ("I want to go to Barcelona"), and different again for someone who is somewhere in between ("I just need
            a break on the beach"). Because customers aren’t used to expressing their travel intent explicitly when interacting with
            online travel platforms, destination recommendation systems need to be able to interpret this full spectrum of intent
            using implicit cues as inputs, including cues set by user context [22].
                From a customer point of view, the choice of a travel destination can be viewed as a decision-making problem,
            where one or multiple travel destinations are chosen out of a large but finite set of available alternatives [19, 20].
            Destination recommendation systems may help address such decision-making problems by highlighting relevant
            travel alternatives.
                Recommending a destination thus requires a complex consideration of available context data [33], customer needs,
            recommendation purpose, and holistic user experience [17]. Further, destination items themselves represent complex
            entities with ambiguous geographical properties and naming conventions [15].
                Recommending travel destinations therefore presents unique challenges including, but not limited to, continuous
            cold-start [4], seasonality, and long time windows between an initial interaction and trip completion [39]. Moreover,
            recommending destinations often requires a system to capture unique information about travel trends, trip preferences,
            and timing [27], generally with limited context data to rely on [37]. These challenges necessitate employing sophisticated
            modeling approaches, such as sequential learning [25, 26], combining different feature types within a model [40, 45], and
            introducing online adaptive algorithms [3]. Together with domain knowledge on travel behavior and destination
            geo-properties, these methods can offer unique, tailor-made solutions to the challenge of destination recommendations.
                This work provides an overview of the domain-specific challenges in building real-life destination recommendation
            systems. We discuss the unique characteristics of travel recommendations and, in particular, destination selection. We
            also explain the special context of customer interaction with such recommender systems and survey the complexity of
            problems related to destination items and to mining travel patterns. We demonstrate modeling approaches alongside
            engineering and user experience solutions to overcome these challenges, including by discussing a set of real, applied
            recommender system production examples from Booking.com.

            2     CHALLENGES
            In this section, we focus on specific challenges that we have encountered while building destination recommendation
            systems. We break them down into three categories, according to their relationship to user travel patterns and interactions
            with the recommender system (or lack thereof), destination characteristics, and customer behavior and needs.

            2.1    Travel Patterns
            Mining and understanding travel behavior patterns of users can surface critical challenges that exist in destination
            recommendation systems.


  2.1.1   Continuous Cold Start.
  One of the most discussed problems in industrial recommendation systems is the so-called ’Cold Start’ problem [38, 44].
  During a cold start, a recommender system has little or no information about users or items, which makes it
  difficult to produce accurate predictions for them. Destination recommendation systems tend to suffer
  from a Continuous Cold Start (CoCoS) problem [4], wherein sufficient information may be lacking even for known
  users and/or items. Specifically, since travel tends to be infrequent (with most travellers booking trips once or twice
  per year), what is contextually relevant at present may differ considerably from previous interactions, even for
  existing customers. For customers who do travel more often, different trips still often have different purposes and
  configurations. For instance, what is contextually relevant will differ for a family beach holiday versus a solo business
  trip.

  2.1.2 Trip Structure. Trips are often conceptualized as travel to a single destination, yet travelers often visit
  multiple destinations within a single trip. Unlike on classical e-commerce platforms, a sequence of reservations
  on an online travel platform often reflects a sequence of physical movements by a traveler. Take, for example, a
  multi-destination vacation in Italy, as in Figure 2. The trip starts in Milan, continues to Venice, Florence, Rome, Naples,
  and Reggio Calabria, and finishes in Palermo. The sequential order of destinations follows their geographical
  locations, and this order is therefore more likely than alternative reshuffled sequences of the same
  cities. This sequence will not necessarily be the proper fit for all travelers, however. The specific destinations selected
  in building an itinerary can depend on various factors, such as when a person is traveling, what sites or landmarks
  they are aware of, and other personal travel preferences. A recommender system can help a customer extend their trip
  by adding new destinations to their itinerary, but must do so in a way that accounts for this variability.
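
  The geographical intuition behind the Italy example can be checked with a minimal sketch. It compares the total
  great-circle length of the itinerary above with one arbitrarily reshuffled ordering of the same cities. The
  coordinates are approximate city-centre values, and the function names and the reshuffled route are illustrative
  assumptions rather than part of any production system.

```python
import math

# Approximate city-centre coordinates (latitude, longitude) in degrees.
# These are rounded, illustrative values, not data from the paper.
CITIES = {
    "Milan": (45.46, 9.19),
    "Venice": (45.44, 12.32),
    "Florence": (43.77, 11.26),
    "Rome": (41.90, 12.50),
    "Naples": (40.85, 14.27),
    "Reggio Calabria": (38.11, 15.65),
    "Palermo": (38.12, 13.36),
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def route_length_km(route):
    """Sum of the leg distances along an ordered list of city names."""
    return sum(haversine_km(CITIES[a], CITIES[b]) for a, b in zip(route, route[1:]))


# The order described in the text, and one hypothetical reshuffle of the same cities.
ITINERARY = ["Milan", "Venice", "Florence", "Rome", "Naples", "Reggio Calabria", "Palermo"]
RESHUFFLED = ["Milan", "Palermo", "Venice", "Naples", "Florence", "Reggio Calabria", "Rome"]

if __name__ == "__main__":
    print(f"itinerary:  {route_length_km(ITINERARY):.0f} km")
    print(f"reshuffled: {route_length_km(RESHUFFLED):.0f} km")
```

  Straight-line distance is only a proxy for travel cost, so a sketch like this supports the ordering argument but
  says nothing about the personal factors that also shape an itinerary.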


                                         [Figure 2 image: the stops Milan, Venice, Florence, Rome, Naples, Reggio Calabria, and Palermo, laid out by location]


                                         Fig. 2. Example Multi-Destination-Trip in Italy


            2.1.3 Unique needs and preferences. A successful recommendation system needs to take into consideration the special
            needs and preferences of its users. In the context of destination recommendation systems, some travel
            requirements have traditionally been neglected, such as accessibility [9], while others have recently been gaining
            importance, such as sustainability [8]. In some instances, these overlooked needs and preferences can be
            strong enough to act as a filtering mechanism for a customer. Hence, accounting for typically overlooked needs and
            preferences can increase the quality of a recommendation system’s output [23].


            2.2     Destinations
            Destination recommender systems are further challenged by the complexity of the item set.

            2.2.1    Geospatial Characteristics.
            Destinations differ in their characteristics, offering different things to see, do, and experience. Geospatial
            differences are highly consequential for travel: beach destinations and ski destinations will gain or lose relevance during
            different times of year, for example. This makes it essential for destination recommendations to be aware of geospatial
            distinctions. For instance, when recommending an alternative to a beach destination, the recommendations should
            be similarly beach-based.
               To efficiently arrange geospatial information, traditional approaches rely on gazetteers, dictionaries of named locations
            that are used to link unstructured data to geographic space [1, 28]. Gazetteers pinpoint geo-identifiers, such as locally
            relevant points-of-interest. While gazetteers provide a straightforward way of organising spatial data, they have
            unclear boundaries and no spatial scale. This can cause overlapping spatial units to confound recommendations,
            with relevant items missing from the appropriate classes and non-relevant ones surfacing inappropriately. "New
            York City" is an example of such a class: it can be defined to include Manhattan only or to
            include all five of the city’s boroughs.


            Fig. 3. Property locations in Tuscany (Italy), color-coded according to the gazetteer nomenclature. Beyond such artificial separations,
            subjective factors influence a user’s perception of and preference for a given destination (e.g., where is Florence, actually?).


     Further, a mismatch between a destination and its real location introduces noise in the feature values used to classify
  destination attributes (such as distance to certain points-of-interest, nearby destinations, etc.). Figure 3 demonstrates
  the challenge of recommending popular travel destinations in Tuscany, Italy. Given the absence of boundaries, property
  assignment within the recommended destination is no longer just a data-driven decision. Rather, multiple subjective
  factors, such as a partner’s preference for a certain market and the user’s perception of a destination, shape
  the perception of where an item actually is.
     Naming conventions further aggravate the issue. While property locations may surface technical details about
  locality names, customers are often interested in specific regions due to unique travel intentions. For
  example, while destination names like Magny-le-Hongre or Bussy-Saint-Georges may not be familiar to an international
  traveler, recommending them as the Disneyland area in France sets the right context for the customer.

  2.2.2   Availability.
  An interesting challenge arises from the fact that destinations are not the final product being sold by an online
  travel platform. For example, at Booking.com, a user chooses a destination in order to book accommodation.
  Therefore, a recommendation engine is required to ensure that there is sufficient inventory in the proposed destinations.
  This becomes crucial in large-scale batch recommendation scenarios such as marketing. For instance, emails with
  recommendations sent to millions of subscribers through batch jobs can trigger millions of searches for destinations
  with limited room capacity or poor inventory quality. Consequently, if availability is not taken into account, many
  subscribers will end up landing on pages for destinations with few properties or, even worse, with no available
  accommodations for their dates of interest.
     Furthermore, it is even more challenging to recommend destinations whose supply is actually relevant for
  the customer. To provide the best service for our customers and avoid a bad user experience, it is not enough to have
  sufficient availability of properties in a destination. Instead, there should be availability of "high quality" items, as
  judged against a particular guest’s preferences; namely, do the recommended destinations include a sufficient number of
  relevant properties that users would be interested in booking? The disruption to the travel industry caused by
  COVID-19 further reinforced a shift from quantity to quality, as a decrease in supply caused a focus on quality
  items [35].

  2.2.3   Dynamic Environment and Constraints.
  Creators of a destination recommender system also need to take into consideration the constantly changing nature of


  Fig. 4. Dynamicity and constraints due to seasonal and special events: (i) the left panel shows the shift in demand due to COVID-19 in a small
  village in Greece; (ii) the right panel shows demand in a European city hosting a major sporting event (May 2018), which gave way to an
  irregular pattern after COVID-19 hit. The data are collected from Booking.com.


            the item set, influenced by two main categories of factors: dynamic factors, such as seasonality or special events, and
            constraints, such as legal restrictions on travelling (e.g., due to COVID-19). An ideal system should be able to factor in
            both, increasing recommendations of in-season items while filtering out destinations to which travel
            is restricted. Among the first set of factors, seasonality plays a major role, leading to extreme peaks and troughs
            in demand [2]. The added complexity comes in understanding and incorporating the causal factors of these demand
            fluctuations, such as weather conditions and public holidays. An example of a seasonal destination is shown in the left
            panel of Figure 4, with the typical seasonal booking pattern of a top summer destination in a small Greek village. In
            addition to seasonality, other special events might periodically or temporarily increase the demand for some destinations,
            thereby throwing off recommendations. For example, the right panel of Figure 4 shows the peak of May 2018 caused by
            football’s Champions League final in a major European city.
               Moreover, dynamic constraints also play a critical role in destination recommendation. This has been highlighted during
            the COVID-19 pandemic, in which traveler choices are restricted and constantly revised. The effect of COVID-19 is
            shown in Figure 4: the in-season reservations of the summer destination shifted by almost two months in
            2020 compared to the previous two years, and the city that hosted the Champions League final showed an irregular demand pattern
            in 2020 compared to the previous two years.


            2.3     Customer-centricity
            To build an effective recommender system, it’s essential to look at how users interact with an online travel platform and
            what they expect from a destination recommendation, not only in terms of which recommendations are offered, but
            also in terms of how they are presented.

            2.3.1    Matching Goal and Purpose.
            Customers arrive at online travel platforms at different stages of their travel planning and booking process. At each
            stage, customers have different primary goals. The choice of a travel destination can be viewed as a
            decision-making process in which the customer chooses one or multiple destinations per trip out of a finite set of
            alternative destinations [20]. The customers’ travel destination decisions can be influenced by various factors, such
            as their own characteristics and travel planning demands or needs, the characteristics of the travel destinations, the
            available set of destinations, and the information the customer has available per destination [19]. The purpose of a
            destination recommendation - or the meaning it will bring to a customer - will consequently differ per travel planning
            stage.
               We might, for instance, consider the following phases. Discovery is a phase in which customers are still collecting options
            that might suit their travel. In this phase, a customer may have expressed only minimal intent to travel, for instance by
            making an initial visit to the online travel platform. Here, the purpose of a recommendation would be to reinforce
            travel intent and reduce friction in choosing a destination.


                                            Fig. 5. Multi-Destinations Trip extension recommendation bar


      Alternatives is a phase of consideration in which a customer may be amenable to considering alternative
  destinations that are potentially better matched to them, perhaps because their first choice of destination was not available or
  did not fit their budget. Here, the purpose of a recommendation is to surface alternative destinations that the customer
  is likely to view as acceptable replacements for an original choice.
      Another phase of consideration exists when a customer has completed one booking in one destination but has
  not yet completed the itinerary planning in full. Here, the purpose of a recommendation can be to help the customer
  select the next, complementary destination to visit before or after the stay that was purchased (such as in Figure 5).
      Correspondingly, an important challenge for the author of a destination recommender system becomes navigating
  between a wide set of different system types, each of which has a purpose tailored to a specific travel planning goal.

  2.3.2   The HOW is as Important as the WHAT.
  How recommendations are presented on a page is often more important than the way they are generated. Copywriting,
  design, and UI can be adapted to and optimized for different user needs. For instance, the fit of a recommendation
  may only be understandable through the relevant information displayed with it (such as availability in Figure 8 or
  thematic context in Figure 9). Distance from a searched location, ease of public transport connections, the number of
  available accommodations, the average price of accommodation, or potential discounts may be relevant, but may also
  cause information overload if all presented at once. Decisions about what information is relevant and in what context must
  accordingly be considered.

  3     SOLUTION APPROACHES
  Following the above-mentioned challenges, this section demonstrates uniquely tailored approaches to serve destination
  recommendation. We present applied examples of utilizing domain-specific knowledge, designing user experience and
  specific modeling approaches for destination recommendation challenges.

  3.1     Domain Knowledge
  3.1.1   Cross Learning.
  Given a wide catalogue of travel products (accommodations, flights, cars, taxis, attractions), we face a unique opportunity
  to provide users with the most relevant destination recommendations across these various products. Cross-learning
  between travel products can be applied to mitigate the bias introduced by missing data on the user journey as well
  as the Continuous Cold Start problem (subsubsection 2.1.1). We can utilize data from user interactions and bookings
  of multiple products to infer user travel patterns for other, non-observed behaviours and bookings, and improve our
  cross-product recommendations. For example, by linking data on airport car-rental pick-up locations with
  hotel bookings, we are able to learn about potential airport catchment areas and apply these areas as a basis for
  flight-destination recommendation (see more details in subsubsection 3.1.3).

  3.1.2   Dealing with Dynamic and Supply Constraints.


      An example of dealing with supply constraints in destination recommendation, given the dynamic properties of the
  destination set itself (as discussed in subsubsection 2.2.3), can be seen in our featured "domestic trending destinations"
  recommender. This system is able to provide tailored recommendations by drawing on items from a near-past window
  to capture recent and seasonal information. Moreover, this solution is agnostic to destination type; customers choose
  from a pool of recommendations ranging from entire regions to small islands, which partially addresses supply constraints.


            Finally, to adapt to travel disrupted by COVID-19 restrictions, we retrained our model on destinations within a
            fixed maximum distance and introduced hard constraints online to retrieve only destinations within the same country.
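
            The combination of a near-past window with a same-country hard constraint can be sketched as follows. This is a
            deliberately simple frequency count under stated assumptions, not the trained model described above: the function
            name `trending_domestic`, the tuple layout of `bookings`, the 28-day window, and the sample data are all hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def trending_domestic(bookings, user_country, today, window_days=28, top_k=3):
    """Rank destinations by recent bookings, restricted to the user's country.

    bookings: iterable of (booking_date, destination, destination_country).
    window_days and top_k are illustrative defaults, not values from the paper.
    """
    cutoff = today - timedelta(days=window_days)
    counts = Counter(
        destination
        for booked_on, destination, country in bookings
        # Near-past window: keep only recent bookings.
        # Hard constraint: keep only destinations in the same country.
        if cutoff <= booked_on <= today and country == user_country
    )
    return [destination for destination, _ in counts.most_common(top_k)]


if __name__ == "__main__":
    sample = [
        (date(2021, 6, 20), "Texel", "NL"),
        (date(2021, 6, 21), "Texel", "NL"),
        (date(2021, 6, 22), "Zeeland", "NL"),
        (date(2021, 6, 22), "Paris", "FR"),
        (date(2021, 1, 5), "Maastricht", "NL"),  # falls outside the window
    ]
    print(trending_domestic(sample, "NL", today=date(2021, 6, 30)))
```

            Because destinations are plain labels here, the same function works whether an item is an entire region or a small
            island, which mirrors the type-agnostic design mentioned above.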
               Another aspect of supply-constraint challenges, related to the availability of hotel properties (see subsubsection 2.2.2), can
            be addressed by factoring such constraints directly into the modelling process. This can be done by first identifying
            and measuring the accommodation supply "quality" of destinations. A proposed potential metric for supply quality
            considers the proportion of availability of top properties, out of the total availability of properties within an area. This
            allows measuring the optimal level and mix of the supply, and its relevance to the user, instead of just the quantity. 1
            Integrating this metric into the recommender model can better reflect the supply-limited nature of travel destinations
            and set a hard constraint that avoids recommending destinations that have poor-quality availability or are sold out.
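
            As a minimal sketch of the metric described above, supply quality can be computed as the share of available top
            properties among all available properties, and then used as a filter. How a "top" property is defined is left open
            here, and the function names and both thresholds (`min_available`, `min_quality`) are hypothetical choices made
            for illustration only.

```python
def supply_quality(available_top, available_total):
    """Share of available top properties among all available properties.

    Returns 0.0 for a sold-out destination to avoid dividing by zero.
    """
    if available_total == 0:
        return 0.0
    return available_top / available_total


def filter_by_supply(destinations, min_available=5, min_quality=0.2):
    """Keep destinations with enough availability and enough quality supply.

    destinations: dict mapping name -> (available_top, available_total).
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    return [
        name
        for name, (top, total) in destinations.items()
        if total >= min_available and supply_quality(top, total) >= min_quality
    ]


if __name__ == "__main__":
    candidates = {
        "A": (10, 40),  # ample supply, a quarter of it top properties
        "B": (0, 0),    # sold out
        "C": (1, 50),   # plenty of rooms, but few top properties
        "D": (2, 3),    # good mix, but too little availability overall
    }
    print(filter_by_supply(candidates))
```

            In practice the counts would be computed for the user’s travel dates and stated preferences, as the footnote to
            this paragraph suggests.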

            3.1.3    Exploiting Geographical Knowledge.
            To address geo-data-related challenges, we experimented with several solutions. As a general solution, we leveraged
            industry-adopted standards such as spatial indexes. These are a class of space-partitioning algorithms that allow us to
            efficiently query items at a global scale, and that also serve as a basis (e.g. as aggregation units) for analytical tasks [7]. On top
            of this framework we built custom solutions. A prime example is our destination recommendation in flights, which
            can be divided roughly into three main cases: recommendation of flight destinations to book, recommendation of the
            relevant origin airport for a flight given the user’s geo-location,2 and recommendation of travel destinations for users who
            have already booked a flight. In all cases, a main challenge is recommending an airport and the destinations in the
            area served by that airport, which requires understanding the geographical extent of the recommended class.


                      Fig. 6. Delineation of main city-level IATA code catchment areas for destination recommendation classes in Italy


            1 Calculating and evaluating this metric could consider additional dimensions such as the supply quality within specific travel dates, and over stated user
            preferences such as hotel or room attributes filtering.
            2 Recommending the relevant origin airport is important for air travellers, particularly given the seasonal activity of airports and
            carrier routes and fast-changing travel restrictions (which can create a barrier to crossing borders to reach an
            airport, a practice that is common in many parts of Europe).


     Flights users choose airports as destinations, but airports themselves are only a stop in the traveller’s journey
  before reaching their destinations. For this reason, it’s important to understand how travel destinations are linked with
  airports by delineating catchment areas.3 To assign destinations to airport service areas, we followed several modeling
  approaches. The obtained catchment areas (see an example for the main airports in Italy in Figure 6) are then used to group
  destinations into classes for recommending destinations to flights users:
  Flights travel patterns data and cross-learning between travel products. Airport catchment areas can be
  identified by linking flight user-search data to identified geo-locations. Additionally, service areas can be delineated by
  applying cross-learning from other travel products. One example is using data on users’ rental-car
  airport pick-up locations and linking it to data on their chosen travel destinations (hotel bookings).
  Gravity model approach. Given low data availability, a simple non-constrained gravity model approach for modeling
  spatial interaction [21, 29, 32] can link travel destinations to the most relevant service airport. The links’ "strength" is
  positively related to airport traffic, and negatively related to travel costs to destinations (proxied by travel distance),
  national border crossing and geo-barriers.
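
  One possible form of such an unconstrained gravity model is sketched below. The text specifies only the direction
  of each effect, so the functional form (traffic divided by a power of distance, with multiplicative penalties), the
  parameter values, and the airport codes "AAA" and "BBB" are assumptions made for illustration.

```python
def link_strength(airport_traffic, distance_km, crosses_border=False,
                  has_geo_barrier=False, beta=2.0,
                  border_penalty=0.5, barrier_penalty=0.5):
    """Gravity-style link strength between a destination and an airport.

    Strength grows with airport traffic and decays with distance (a proxy
    for travel cost); border crossings and geographical barriers reduce it
    further. beta and both penalties are hypothetical parameters.
    """
    # Clamp very short distances so the strength does not blow up.
    strength = airport_traffic / max(distance_km, 1.0) ** beta
    if crosses_border:
        strength *= border_penalty
    if has_geo_barrier:
        strength *= barrier_penalty
    return strength


def best_airport(candidates):
    """Return the airport code with the strongest link to a destination.

    candidates: dict mapping airport code -> keyword arguments of link_strength.
    """
    return max(candidates, key=lambda code: link_strength(**candidates[code]))


if __name__ == "__main__":
    # Hypothetical airports serving one destination.
    candidates = {
        "AAA": {"airport_traffic": 1_000_000, "distance_km": 50},
        "BBB": {"airport_traffic": 5_000_000, "distance_km": 200},
    }
    print(best_airport(candidates))
```

  Selecting a single best airport is a simplification: since catchment areas can overlap, one could instead keep
  every airport whose strength exceeds a threshold, so that a destination belongs to several classes.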

  3.2     Design and User Experience
  UX and UI choices are key to the success of recommender systems. Rapid and iterative experimentation on both ML
  and UX is crucial for reaching impactful solutions.

  3.2.1    Adapting UI according to Customer Needs.
  When a customer first comes to a travel platform, there is little context or user history to use for modeling. The
  resulting generic recommendations can be made more tailored as user interactions progress. At Booking.com,
  for instance, a searched or clicked recommendation on the main landing page becomes a seed to adapt the next
  recommendations shown, even within the same session, as illustrated in Figure 7.


          Fig. 7. An example flow of recommendations refinement after a single customer search interaction with the website


     UI optimizations, moreover, can more seamlessly integrate recommendations with other useful tools suited to
  where the customer is in their planning process. As an example, Booking.com ties destination recommendations to
  recent searches, allowing users to easily either revisit a previous search or explore another recommended destination.
     Booking.com also modifies the displayed destination recommendation in search results, depending on the level of a
  search and the number of results available for the search. For customers predicted to still be making destination-level
  decisions, as indicated by searching at the region level, the layout of the page gives higher priority to the destination
  3 Airport coverage areas can often overlap (a single travel destination can be served by multiple "competing" airports). This means that in this context a
  travel destination can be included in more than one recommendation class.


                                     Fig. 8. Additional destination information within a recommendation element


            recommendations by putting them on top. On the other hand, customers looking for a defined city will see destination
            recommendations at the bottom of the page.
               As mentioned in subsubsection 2.3.2, the complementary information displayed with a destination
            recommendation is important, especially for customers looking for suitable alternatives or for a next destination to extend their
            trip. Figure 8 shows an enhanced version of a destination recommendation. Adding relevant information about the
            recommended destination (such as the main reasons to visit or the number of accommodations available) and providing
            a direct link to view accommodations in a recommended destination brought significant improvement in business
            metrics. Destination recommendations also change in relevance over time, so recommender systems also
            need to consider how and when to renew and discard recommendations, for instance after the customer has searched
            or booked the recommended destination. Here, there is also a rationale for letting users interact directly with
            recommendations, such as by allowing them to discard inadequate ones.

            3.2.2   Leveraging Themes to Recommend Destinations.
            In some instances, an initial user input can help to provide a better destination recommendation. At Booking.com,
            we’ve found this to be particularly useful for customers who know a region they’d like to visit, but not specific towns
            or cities within that region. By using reviews and other user-generated data, together with accommodation and
            room amenities, to score and match destinations and accommodations, we enabled customers to narrow their search


                                           Fig. 9. Example of thematic trip recommendations for Tuscany.


  according to themes, such as beach, nature, spa or relaxation, as displayed in Figure 9. The applied themes were selected
  by relying on user research and data analysis to understand growing trends. Available themes can grow and adapt as
  customers’ behavior and preferences continue to be observed, as was relevant during the COVID-19 period.
  Personalizing the themes, the destinations within them, or both generates a unique recommendation for each customer. This
  can also account for each traveller’s unique needs, as discussed in subsubsection 2.1.3.

  3.3    Modeling Approaches
  While many of the challenges in destination recommendation are addressed by designing the right user experience and
  better understanding of the recommended items, some of the patterns in the travel data can be addressed with unique
  modeling techniques. Here we present how adaptive online models and sequential learning models help to provide
  more relevant recommendations.

  3.3.1 Multi-Armed Bandits Modeling. Recommendations can be naturally formulated as a multi-armed bandit problem
  [10, 46]. Bandit algorithms balance exploration of items with low-confidence estimates against exploitation of
  items with high-confidence estimates, with user feedback continuously fed back to update the model parameters. In
  destination recommendation, we select an arm (destination) and observe its reward. A main advantage of this
  approach over models built on historical data is that the latter tend to recommend very popular destinations and thus
  suffer from limited diversity. Moreover, such models are trained on historical reservation data, which are not directly
  caused by our interventions and may introduce bias. Setting up the framework for a successful bandit is not trivial,
  and at Booking.com we have deployed bandits in several parts of our business [3]. Here we list key challenges that we
  identified in the context of destination recommendation in email marketing campaigns:
        • Reward definition. Our goal is to help subscribers complete their booking, and therefore the obvious reward
          for this bandit is the booking itself. However, attributing which bookings are a result of our intervention is not
          trivial, because the customer journey does not always follow the linear path [Search → Open email → Click on
          recommendation → Book recommendation]. In fact, our subscribers might open the email, see the recommendations
          and complete their booking without landing on the website by clicking on any of them. Furthermore, they might
          click on a recommendation that they like, start browsing and then book another destination.
        • Delayed feedback. In many cases the feedback/reward is not (or cannot be) reported immediately [39]. The
          subscribers don’t always interact with our emails immediately after receiving them.
        • Off-policy updates. As a result of the delayed feedback, online learning is infeasible and thus off-policy learning
          techniques have to be employed.
        • Multiple recommendations - Slates. Recommendations usually come as slates [18] instead of single items.
          We recommend hundreds of thousands of destinations and consequently the set of possible slates is enormous.
        • Productionization. Sending millions of emails means that we have to get millions of predictions from our
          models, which makes efficiency a critical factor. With stochastic policies, there is an additional layer of complexity,
          which comes from sampling (without replacement) from the output of our models.
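
  To make the off-policy item in the list above more concrete, the following is a minimal sketch of a clipped inverse-propensity estimate of the mean reward that a new policy would obtain, computed from feedback logged under an older policy. The record fields, the function name `ips_value` and the clipping constant `max_weight` are assumptions made for illustration; this is a generic single-item estimator and not the production implementation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class LoggedFeedback:
    """One logged recommendation (all field names are hypothetical)."""
    logging_prob: float  # probability the logging policy gave to the shown arm
    target_prob: float   # probability the policy under evaluation gives to it
    reward: float        # observed reward, e.g. 1.0 for a click, 0.0 otherwise


def ips_value(logs: Iterable[LoggedFeedback], max_weight: float = 10.0) -> float:
    """Clipped inverse-propensity estimate of the target policy's mean reward."""
    total, count = 0.0, 0
    for rec in logs:
        if rec.logging_prob <= 0.0:
            raise ValueError("logging propensity must be positive")
        # The importance weight corrects for the mismatch between the policies.
        # Clipping (an assumption of this sketch) trades some bias for variance.
        weight = min(rec.target_prob / rec.logging_prob, max_weight)
        total += weight * rec.reward
        count += 1
    return total / count if count else 0.0
```

  The estimate is only meaningful if the logging propensities are stored at recommendation time, since they generally cannot be reconstructed once the logging policy has been replaced.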
  Our email recommendations are built using an internal feedback-loop bandits platform [3], supporting large scale,
  distributed online learning. The architecture of our recommendation platform is shown in Figure 10. This platform
  (predictions and feedback tracking box in Figure 10) keeps track of all user interaction data, such as the destinations in
  the emails (recommendation model’s predictions), the features used to get the predictions, and the interactions with
  each recommended item/destination. Our model is based on the top-k off-policy approach presented in [11], tailored to


                                                   Fig. 10. Recommendation platform architecture


            our business needs. We selected clicks as the reward, which, although a proxy for our original goal, is more direct and
            less sparse than bookings. Clicks are collected over a period of three days after sending each email.
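
            The click-based reward with a three-day collection window can be sketched as follows. The event format (pairs of destination identifier and click time) and the function name `slate_rewards` are assumptions for illustration only; the actual tracking schema is not described in this paper.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

CLICK_WINDOW = timedelta(days=3)  # collection period stated in the text


def slate_rewards(
    sent_at: datetime,
    slate: List[str],
    clicks: Iterable[Tuple[str, datetime]],
    window: timedelta = CLICK_WINDOW,
) -> List[float]:
    """Return a 0/1 reward for each destination recommended in one email.

    `clicks` holds (destination_id, clicked_at) pairs; this format is a
    hypothetical stand-in for the real interaction log.
    """
    deadline = sent_at + window
    # Only clicks that fall inside the collection window count as reward.
    clicked = {dest for dest, ts in clicks if sent_at <= ts <= deadline}
    return [1.0 if dest in clicked else 0.0 for dest in slate]
```

            Clicks on destinations that were not part of the slate, or that arrive after the window closes, contribute no reward in this sketch.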
               Given a set of features $x \in \mathbb{R}^d$ for a subscriber, the model and policy produce a probability distribution
            $p(a_1, a_2, \ldots, a_N \mid x)$ over the $N$ possible destinations (arms) $a_i$, which in our case is a DNN with a softmax output. Similar to
            [11], we used Boltzmann exploration in order to maintain good training data for our policy. To generate
            $k$ recommendations naively, one has to sample $K \gg k$ items from a categorical distribution parametrized by the model’s
            output $p(\cdot \mid x)$, and then filter out duplicate destinations. To avoid sampling with replacement, we employed the
            Gumbel-Top-k trick [34]: we add Gumbel noise to the log-probabilities and select the top-$k$ outputs,
            i.e. $\operatorname{arg\,top}_k \{\log p(\cdot \mid x) + g\}$ with $g \sim \mathrm{Gumbel}(0, 1)$ drawn independently for each destination. This modeling improved the efficiency of our recommendation engine.
            Our model is periodically updated using the reward received over the last few days. As mentioned before, the feedback
            is delayed, which means that the policy being optimised in each iteration might deviate from the logging policy
            that produced the feedback used for training in that iteration. Similar to the original paper [11], we use inverse
            propensity scoring, where the propensities are stored along with the feedback by the system.
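
            As an illustration of the sampling step described above, the sketch below draws $k$ distinct arms by perturbing log-probabilities with independent Gumbel(0, 1) noise and keeping the top $k$. The function names, the `temperature` parameter used to model Boltzmann exploration, and the pure-Python implementation are assumptions; a production system would presumably vectorise this over the DNN output.

```python
import math
import random
from typing import List, Optional, Sequence


def log_softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Log-probabilities of a Boltzmann (softmax) distribution over arms."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_norm for s in scaled]


def gumbel_top_k(
    log_probs: Sequence[float], k: int, rng: Optional[random.Random] = None
) -> List[int]:
    """Sample k distinct arm indices without replacement.

    Each log-probability is perturbed with independent Gumbel(0, 1) noise and
    the indices of the k largest perturbed values are returned, best first.
    """
    if not 0 < k <= len(log_probs):
        raise ValueError("k must be between 1 and the number of arms")
    rng = rng or random.Random()
    perturbed = []
    for idx, log_p in enumerate(log_probs):
        u = max(rng.random(), 1e-300)      # guard against log(0)
        gumbel = -math.log(-math.log(u))   # inverse-CDF sample of Gumbel(0, 1)
        perturbed.append((log_p + gumbel, idx))
    perturbed.sort(reverse=True)
    return [idx for _, idx in perturbed[:k]]
```

            A single pass over the $N$ arms yields a duplicate-free slate, so there is no need to oversample and filter. In this sketch, lowering `temperature` concentrates the distribution on high-scoring arms, while raising it increases exploration.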

            3.3.2   Sequence Modeling.
            Many types of recommendation problems can be naturally viewed as sequence extension problems. A classic example
            is a language model offering real-time next-word recommendations while typing. Similarly, when recommending a
            travel destination, it is reasonable to expect that not only the set of previously booked destinations, but also their
            exact order would be a useful input for making the next destination prediction (see subsubsection 2.1.2). One of the
            recommendation problems that we try to solve, called the multi-destination trips problem, arises when we have information
            about the beginning of the trip and we want to suggest a trip extension (Figure 5). Recurrent neural networks (RNNs),
            including their gated variants (e.g. LSTM, GRU), are commonly used for such sequence completion tasks. In general,
            sequential models have been rapidly adopted in the field of recommendation systems [12, 36, 42, 47].
               The problem becomes more complex if we want to personalize our recommendations. For example, if we want to
            recommend where to go next after visiting Amsterdam and London, the recommendation for a user from Japan might
            differ from that for a user from Germany. Recent studies show how to use contextual data in an RNN-based
            recommender system [6, 40] and how to combine the recurrent part within a complex architecture. Going back to our


                            Fig. 11. Concatenating features embedding to the main token embedding


  multi-destination problem, we can obtain better recommendations by using, in addition to past bookings, features
  such as user country, length of stay, season or accommodation type. A popular and straightforward approach for
  merging item and feature embeddings is simply to concatenate them (as shown in Figure 11).
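
  The concatenation approach can be sketched as below. The embedding tables are randomly initialised stand-ins for learned weights, and the helper names (`make_table`, `embed_step`), feature names and dimensions are all hypothetical choices for illustration.

```python
import random
from typing import Dict, Iterable, List

Vector = List[float]


def make_table(keys: Iterable[str], dim: int, rng: random.Random) -> Dict[str, Vector]:
    """Randomly initialised embedding table (stand-in for learned weights)."""
    return {key: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for key in keys}


def embed_step(
    destination: str,
    features: Dict[str, str],
    dest_table: Dict[str, Vector],
    feature_tables: Dict[str, Dict[str, Vector]],
) -> Vector:
    """Concatenate the destination embedding with each feature embedding."""
    vector = list(dest_table[destination])
    for name in sorted(feature_tables):  # fixed order keeps the layout stable
        vector.extend(feature_tables[name][features[name]])
    return vector


rng = random.Random(0)
dest_table = make_table(["amsterdam", "london", "paris"], 8, rng)
feature_tables = {
    "user_country": make_table(["JP", "DE"], 3, rng),
    "season": make_table(["summer", "winter"], 2, rng),
}
step_input = embed_step(
    "london", {"user_country": "JP", "season": "summer"}, dest_table, feature_tables
)
# The RNN input at this step has 8 + 2 + 3 = 13 dimensions.
```

  The resulting vector is what would be fed to the recurrent layer at each step of the sequence; its width grows with every added feature, which is the main cost of this simple scheme.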
     In a general "where to go next" recommendation, which is not necessarily within a trip, we may use any of the user’s
  historical interactions, including searches and bookings. To distinguish between searches and bookings, we concatenate
  an identifier feature to each destination embedding before passing it to the network. Another important feature
  we add, similar to Latent Cross [6], is the difference in days between the items that a user has searched or booked.
  We observed that a recommendation for a user whose last booking was a few months ago tends to be more generic and
  "popular", while a user with a recent booking tends to be recommended a destination near the last booking.
     Another useful approach is the Wide and Deep framework [13], which allows joint training of wide linear models and
  deep neural networks. The framework combines the benefits of memorisation and generalisation for recommender


                                             Fig. 12. Wide and Deep Architecture


            systems. Our architecture (Figure 12) includes three main components: wide, deep and sequential (contextual RNN).
            However, modeling the problem with a generic deep-learning solution brings additional challenges:

                 • High cardinality. At Booking.com we recommend hundreds of thousands of destinations. To deal with the
                   high-cardinality space of destinations, we map the long tail of low-frequency destinations to an unknown token
                   during training. Then, during prediction, the unknown destinations are replaced in an offline manner, based
                   on geographical proximity.
                 • Different space of recommendations. Some situations require predictions in an output space which is
                   different from the input space. For example, we may want to recommend a country to visit next based on the user’s
                   history of accommodation bookings, or to recommend other products of a connected trip (e.g.
                   attractions, flights, car rentals and taxis) based on past accommodation bookings. To address this, we use an
                   RNN Encoder→Decoder architecture, where the decoder can be as simple as a softmax regression in the case of a
                   categorical prediction [6, 40].
                 • Non-chronological sequences. Some customers make reservations in an order that differs from the
                   chronological order of the stays, which opens up a new challenge in the form of "gap filling". The missing part of the
                   trip is surrounded by two existing bookings, such as in Figure 5. We use an RNN Encoder→Decoder, where the
                   sequence is encoded with a mask on the gap and the decoder fills in the missing destination.
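
            The preprocessing implied by the first and last items above can be sketched as follows. The special tokens, the frequency threshold `min_count`, the helper names and the nearest-neighbour rule used to resolve rare destinations are assumptions made for illustration; the text only states that unknown destinations are replaced offline based on geographical proximity, without specifying the rule.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Optional, Sequence, Set, Tuple

UNK, MASK = "<unk>", "<mask>"  # hypothetical special tokens


def build_vocab(trips: Iterable[Sequence[str]], min_count: int) -> Set[str]:
    """Keep only destinations seen at least `min_count` times in training."""
    counts = Counter(dest for trip in trips for dest in trip)
    return {dest for dest, seen in counts.items() if seen >= min_count}


def encode(trip: Sequence[str], vocab: Set[str], gap_index: Optional[int] = None) -> List[str]:
    """Map long-tail destinations to UNK and optionally mark a gap with MASK."""
    tokens = [dest if dest in vocab else UNK for dest in trip]
    if gap_index is not None:
        tokens.insert(gap_index, MASK)  # position of the missing stay
    return tokens


def haversine_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_known(dest: str, coords: Dict[str, Tuple[float, float]], vocab: Set[str]) -> str:
    """Closest in-vocabulary destination to `dest` (assumed proximity rule)."""
    candidates = (d for d in vocab if d != dest)
    return min(candidates, key=lambda d: haversine_km(coords[dest], coords[d]))
```

            Under these assumptions, `encode` prepares the encoder input, with `MASK` marking the gap between two existing bookings, while `nearest_known` can be precomputed offline for every long-tail destination.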

                A recent Booking.com contest [25] conducted on the dataset of the multi-destination trip problem [26] provided
            benchmark results for the recommendation problem and opened up an opportunity to explore various state-of-the-art
            machine learning techniques. The best-performing team [45] achieved a score of 0.5939, a 30% improvement
            over the median result. Their solution used a blend of three different neural network architectures:
            XLNet Transformers [48], Gated Recurrent Units [14], and feed-forward networks with a shared Session-based Matrix
            Factorization head (SMF). Drawing on their past experience [31], the authors discuss the superior performance of
            deep-learning-based algorithms in the challenge compared to classical recommender approaches. The runner-up solution
            is based on an Efficient Manifold Density Estimator (EMDE) for the recommendation task [16]. Other leading solutions
            used various deep-learning and recommender techniques, including Transformers, LSTM networks [30, 49], LambdaRank
            [41, 43], and further state-of-the-art recommender methods.


            4   CONCLUSION
            In this work, we presented a set of unique challenges in destination recommender systems, highlighting the underlying
            need for domain-specific approaches to resolve them. These challenges span customer
            context as well as unique data, modeling, and measurement properties. While destination recommender systems can
            benefit from state-of-the-art recommender methods, they also require unique and creative approaches for tailor-made
            solutions. Resolving these challenges therefore calls for adaptation of classical recommender methods. The lessons
            and solutions described here provide a first layer of applied insights for implementing destination recommendations in
            real-life systems and lay the groundwork for further research in the field.


            ACKNOWLEDGMENTS
            We thank Thomas Bosman, Klajdi Qirko and Jake Mooney for their help in editing and reviewing this paper, as well as the
            UX Research, Product, Development and Design teams at Booking.com who contributed to bringing these ideas to life.


  REFERENCES
   [1] Elise Acheson, Michele Volpi, and Ross S. Purves. 2020. Machine learning for cross-gazetteer matching of natural features. International Journal of
       Geographical Information Science 34, 4 (2020), 708–734. https://doi.org/10.1080/13658816.2019.1599123
   [2] Ahmad Alshuqaiqi and Shida Irwana Omar. 2019. Causes and Implication of Seasonality in Tourism. Journal of Advanced Research in Dynamical and
       Control Systems 11 (03 2019), 1480–1486.
   [3] Lucas Bernardi, Pablo Estevez, Matias Eidis, and Eqbal Osama. 2020. Recommending Accommodation Filters with Online Learning. In Proceedings of
       the Workshop on Online Recommender Systems and User Modeling (ORSUM @ RecSys 2020).
   [4] Lucas Bernardi, Jaap Kamps, Julia Kiseleva, and Melanie JI Müller. 2015. The Continuous Cold Start Problem in e-Commerce Recommender Systems.
   [5] Lucas Bernardi, Themistoklis Mavridis, and Pablo Estevez. 2019. 150 Successful Machine Learning Models: 6 Lessons Learned at Booking.com. In
       Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 1743–1751.
   [6] Alex Beutel, Paul Covington, Sagar Jain, Can Xu, Jia Li, Vince Gatto, and Ed H Chi. 2018. Latent cross: Making use of context in recurrent
       recommender systems. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. 46–54.
   [7] B Bondaruk, SA Roberts, and C Robertson. 2019. Discrete Global Grid Systems: Operational Capability of the Current State of the Art. In Proceedings
       of the 7th Conference on Spatial Knowledge and Information Canada (SKI2019), Vol. 2323.
   [8] Booking.com. 2021. New research reveals an increased desire to travel more sustainably. https://partner.booking.com/en-us/click-magazine/new-
       research-reveals-increased-desire-travel-more-sustainably Accessed: 2021-06-30.
   [9] Luchiana C. Brodeala. 2020. Online Recommender System for Accessible Tourism Destinations. In Fourteenth ACM Conference on Recommender Systems
       (Virtual Event, Brazil) (RecSys ’20). Association for Computing Machinery, New York, NY, USA, 787–791. https://doi.org/10.1145/3383313.3411450
  [10] Olivier Chapelle and Lihong Li. 2011. An Empirical Evaluation of Thompson Sampling. In Proceedings of the 24th International Conference on Neural
       Information Processing Systems (Granada, Spain) (NIPS’11). Curran Associates Inc., USA, 2249–2257. http://dl.acm.org/citation.cfm?id=2986459.
       2986710
  [11] Minmin Chen, Alex Beutel, Paul Covington, Sagar Jain, Francois Belletti, and Ed H. Chi. 2019. Top-K Off-Policy Correction for a REINFORCE
       Recommender System. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (Melbourne VIC, Australia)
       (WSDM ’19). Association for Computing Machinery, New York, NY, USA, 456–464. https://doi.org/10.1145/3289600.3290999
  [12] Qiwei Chen, Huan Zhao, Wei Li, Pipei Huang, and Wenwu Ou. 2019. Behavior sequence transformer for e-commerce recommendation in alibaba. In
       Proceedings of the 1st International Workshop on Deep Learning Practice for High-Dimensional Sparse Data. 1–4.
  [13] Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa
       Ispir, et al. 2016. Wide & deep learning for recommender systems. In Proceedings of the 1st workshop on deep learning for recommender systems. 7–10.
  [14] Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. 2014. Empirical evaluation of gated recurrent neural networks on sequence
       modeling. arXiv preprint arXiv:1412.3555 (2014).
  [15] Amine Dadoun, Raphaël Troncy, Olivier Ratier, and Riccardo Petitti. 2019. Location Embeddings for Next Trip Recommendation. In Companion
       Proceedings of The 2019 World Wide Web Conference (San Francisco, USA) (WWW ’19). Association for Computing Machinery, New York, NY, USA,
       896–903. https://doi.org/10.1145/3308560.3316535
  [16] Michał Daniluk, Barbara Rychalska, Konrad Gołuchowski, and Jacek Dąbrowski. 2021. Modeling Multi-Destination Trips with Sketch-Based Model.
       In Proceedings of the ACM WSDM Workshop on Web Tourism (WSDM Webtour’21).
  [17] Manoj Reddy Dareddy. 2016. Challenges in Recommender Systems for Tourism. In Proceedings of ACM RecSys Workshop on Recommenders in Tourism
       (RecTour 2016) (Boston, MA, USA).
  [18] Maria Dimakopoulou, Nikos Vlassis, and Tony Jebara. 2019. Marginal Posterior Sampling for Slate Bandits.. In IJCAI. 2223–2229.
  [19] Thi Hai Ninh Do and Wurong Shih. 2016. Destination Decision-Making Process Based on a Hybrid MCDM Model Combining DEMATEL and ANP:
       The Case of Vietnam as a Destination. Modern Economy 7 (2016), 966–983.
  [20] Ercan Ezin, Iván Palomares, and James Neve. 2019. Group Decision Making with Collaborative-Filtering ‘in the loop’: interaction-based preference
       and trust elicitation. (2019), 4044–4049. https://doi.org/10.1109/SMC.2019.8914224
  [21] A Stewart Fotheringham and Morton E O’Kelly. 1989. Spatial interaction models: formulations and applications. Vol. 1. Kluwer Academic Publishers
       Dordrecht.
  [22] Dmitri Goldenberg. 2021. Putting the Role of Personalization into Context. In Proceedings of the 44th International ACM SIGIR Conference on Research
       and Development in Information Retrieval (SIGIR ’21). https://doi.org/10.1145/3404835.3464929
  [23] Dmitri Goldenberg. 2021. Putting the Role of Personalization into Context. In Proceedings of the 44th International ACM SIGIR Conference on Research
       and Development in Information Retrieval. 2631–2632.
  [24] Dmitri Goldenberg, Kostia Kofman, Javier Albert, Sarai Mizrachi, Adam Horowitz, and Irene Teinemaa. 2021. Personalization in Practice: Methods
       and Applications. In Proceedings of the 14th International Conference on Web Search and Data Mining.
  [25] Dmitri Goldenberg, Kostia Kofman, Pavel Levin, Sarai Mizrachi, Maayan Kafry, and Guy Nadav. 2021. Booking.com WSDM WebTour 2021 Challenge.
       http://www.bookingchallenge.com. In Proceedings of the ACM WSDM Workshop on Web Tourism (WSDM Webtour ’21).
  [26] Dmitri Goldenberg and Pavel Levin. 2021. Booking.com Multi-Destination Trips Dataset. In Proceedings of the 44th International ACM SIGIR
       Conference on Research and Development in Information Retrieval (SIGIR ’21). https://doi.org/10.1145/3404835.3463240


            [27] Dmitri Goldenberg, Guy Tsype, Igor Spivak, Javier Albert, and Amir Tzur. 2021. Learning to Persist: Exploring the Tradeoff Between Model
                 Optimization and Experience Consistency. In Companion Proceedings of the Web Conference 2021. Association for Computing Machinery, New York,
                 NY, USA, 527–529. https://doi.org/10.1145/3442442.3452051
            [28] M. F. Goodchild and L. L. Hill. 2008. Introduction to digital gazetteer research. International Journal of Geographical Information Science 22, 10 (2008),
                 1039–1044. https://doi.org/10.1080/13658810701850497
            [29] Kingsley E Haynes and A Stewart Fotheringham. 1985. Gravity and spatial interaction models. Regional Research Institute, West Virginia University.
                 Reprint. Edited by Grant Ian Thrall. WVU Research Repository, 2020.
            [30] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural computation 9, 8 (1997), 1735–1780.
            [31] Dietmar Jannach, Gabriel de Souza P. Moreira, and Even Oldridge. 2020. Why Are Deep Learning Models Not Consistently Winning Recommender
                 Systems Competitions Yet? A Position Paper. In Proceedings of the Recommender Systems Challenge 2020. 44–49.
            [32] Jameel Khadaroo and Boopen Seetanah. 2008. The role of transport infrastructure in international tourism development: A gravity model approach.
                 Tourism management 29, 5 (2008), 831–840.
            [33] Julia Kiseleva, Melanie JI Mueller, Lucas Bernardi, Chad Davis, Ivan Kovacek, Mats Stafseng Einarsen, Jaap Kamps, Alexander Tuzhilin, and Djoerd
                 Hiemstra. 2015. Where to go on your next trip? Optimizing travel destinations based on user preferences. In Proceedings of the 38th International
                 ACM SIGIR Conference on Research and Development in Information Retrieval. 1097–1100.
            [34] Wouter Kool, Herke van Hoof, and Max Welling. 2020. Estimating Gradients for Discrete Random Variables by Sampling without Replacement. In
                 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net. https://openreview.
                 net/forum?id=rklEj2EFvB
            [35] Dewi Ayu Kusumaningrum and Suci Sandi Wachyuni. 2020. The shifting trends in travelling after the COVID 19 pandemic. International Journal of
                 Tourism & Hospitality Review (2020).
            [36] Tobias Lang and Matthias Rettenmeier. 2017. Understanding consumer behavior with recurrent neural networks. In Workshop on Machine Learning
                 Methods for Recommender Systems.
            [37] Neal Lathia. 2017. Bootstrapping a Destination Recommender System. In Proceedings of the Eleventh ACM Conference on Recommender Systems
                 (Como, Italy) (RecSys ’17). Association for Computing Machinery, New York, NY, USA, 341. https://doi.org/10.1145/3109859.3109924
            [38] Blerina Lika, Kostas Kolomvatsos, and Stathes Hadjiefthymiades. 2014. Facing the cold start problem in recommender systems. Expert Systems with
                 Applications 41, 4, Part 2 (2014), 2065–2073. https://doi.org/10.1016/j.eswa.2013.09.005
            [39] Themis Mavridis, Soraya Hausl, Andrew Mende, and Roberto Pagano. 2020. Beyond algorithms: Ranking at scale at Booking.com. In Proceedings of
                 the Fourth Workshop on Recommendation in Complex Scenarios. CEUR-WS (Virtual Event, Brazil).
            [40] Sarai Mizrachi and Pavel Levin. 2019. Combining Context Features in Sequence-Aware Recommender Systems.. In Late-Breaking Results of the 13th
                 ACM Conference on Recommender Systems (Copenhagen, Denmark) (RecSys ’19).
            [41] Aleksandr Petrov and Yuriy Makarov. 2021. Attention-based neural re-ranking approach for next city in trip recommendations. In Proceedings of the
                 ACM WSDM Workshop on Web Tourism (WSDM Webtour ’21).
            [42] Massimo Quadrana, Paolo Cremonesi, and Dietmar Jannach. 2018. Sequence-aware recommender systems. ACM Computing Surveys (CSUR) 51, 4
                 (2018), 1–36.
            [43] Christopher J. C. Burges, Robert Ragno, and Quoc Viet Le. 2007. Learning to rank with nonsmooth cost functions. Proceedings of the Advances in Neural Information Processing Systems 19
                 (2007), 193–200.
            [44] Andrew I. Schein, Alexandrin Popescul, Lyle H. Ungar, and David M. Pennock. 2002. Methods and Metrics for Cold-Start Recommendations. In
                 Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (Tampere, Finland) (SIGIR
                 ’02). Association for Computing Machinery, New York, NY, USA, 253–260. https://doi.org/10.1145/564376.564421
            [45] Benedikt Schifferer, Chris Deotte, Jean-Francois Puget, Gabriel de Souza Pereira Moreira, Gilberto Titericz, Jiwei Liu, and Ronay Ak. 2021. Using
                 Deep Learning to Win the Booking.com WSDM WebTour21 Challenge on Sequential Recommendations. In Proceedings of the ACM WSDM Workshop
                 on Web Tourism (WSDM Webtour ’21).
            [46] Aleksandrs Slivkins. 2019. Introduction to Multi-Armed Bandits. CoRR abs/1904.07272 (2019). arXiv:1904.07272 http://arxiv.org/abs/1904.07272
            [47] Fei Sun, Jun Liu, Jian Wu, Changhua Pei, Xiao Lin, Wenwu Ou, and Peng Jiang. 2019. BERT4Rec: Sequential recommendation with bidirectional
                 encoder representations from transformer. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management.
                 1441–1450.
            [48] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is
                 All You Need. https://arxiv.org/pdf/1706.03762.pdf
            [49] Yuanzhe Zhou, Shikang Wu, and Chenyang Zheng. 2021. Explore next destination prediction. In Proceedings of the ACM WSDM Workshop on Web
                 Tourism (WSDM Webtour ’21).

