A Travel Recommender System for Combining Multiple
               Travel Regions to a Composite Trip

                                  Daniel Herzog                                                     Wolfgang Wörndl
                                    TU München                                                          TU München
                                   Boltzmannstr. 3                                                     Boltzmannstr. 3
                                   85748 Garching                                                      85748 Garching
                                      Germany                                                             Germany
                             herzogd@in.tum.de                                                    woerndl@in.tum.de

ABSTRACT                                                                              Keywords
Internet-based services are available to recommend destina-                           case-based recommender system, tourist trip design prob-
tions and activities for organized trips. Only few systems                            lem, travel recommendation, knapsack problem, similarity
support travelers when creating composite trips consisting
of multiple destinations or activities. The idea in this work                         1. INTRODUCTION
is to select travel regions that maximize the value of the
                                                                                         Recommender systems are an established technology in
composite trip for the user while still respecting her limita-
                                                                                      online shops to provide users with a list of recommended
tions in time and money. The value of a travel region can be
                                                                                      items during their shopping experience. In comparison to
determined by the similarity between a specified user query
                                                                                      recommending single items, travel recommendation turns
and the cases in a travel region database. The recommen-
                                                                                      out to be a more complicated issue since tourists have to
dation algorithm needs to find a decent routing between the
                                                                                      deal with both limited budget and time frame. If users call
regions while still satisfying diversity of the whole trip. We
                                                                                      for an individual trip composed of multiple regions, the task
developed an algorithm based on an approximation for the
                                                                                      becomes even more complex. An exemplary situation il-
knapsack problem and extended it to recognize dependencies
                                                                                      lustrates the difficulty: A person is looking for a trip for
between the regions while calculating best combinations. It
                                                                                      four weeks in September with a budget of 2000 Euros; she
is able to determine the optimal duration of stay per region
                                                                                      likes nature and hiking with some cultural highlights as a
and its performance improves when benefiting from the hi-
                                                                                      plus. A high number of possible routes could be packaged
erarchical structure of our travel database. In an expert
                                                                                      to a composite trip and the constraints in time and money
study, we verified the results of our approach. The study
                                                                                      have to be taken into account. This is a special case of the
proves that our algorithm for composite trips delivers good
                                                                                      so called tourist trip design problem (TTDP) [14]. TTDP
recommendations which satisfied an expert user more than
                                                                                      is an extension of the orienteering problem (OP): A score
baseline algorithms. Regions in the composite trip fit to-
                                                                                      which determines the value for the user is assigned to every
gether better and a decent routing between the regions can
                                                                                      location in a sequence. The goal is to maximize the sum of
be ensured. Nevertheless, the algorithm leaves room for im-
                                                                                      the scores of the selected locations while still meeting the
provement by combining less similar regions in a composite
                                                                                      given limitations [16].
trip, thus leading to a higher diversity of the recommenda-
                                                                                         Recommender systems use di↵erent algorithms to find a
tion.
                                                                                      suitable list of recommendations in a feasible time. A sub-
                                                                                      area of content-based recommender systems is case-based
                                                                                      recommender systems. Case-based recommenders rely on
Categories and Subject Descriptors                                                    items structured as cases using a well defined set of features
I.2.8 [Problem Solving, Control Methods, and Search]:                                 and feature values. Those cases are compared to a user
Dynamic programming                                                                   query or her profile and the most similar cases are delivered
                                                                                      as a recommendation [13].
                                                                                         These recommenders are already applied for travel recom-
                                                                                      mendation. In order to recommend a longer trip, the cases
General Terms                                                                         have to be combined to maximize the value while still re-
Algorithms                                                                            specting the users’ limitations. The simplest approach to
                                                                                      determine the value of a composite recommendation is to
                                                                                      add the values of all single parts of the proposed trip. This
                                                                                      approach does not consider dependencies between the parts,
Permission to make digital or hard copies of all or part of this work for             for instance travel activities which cannot be executed at the
personal or classroom use is granted without fee provided that copies are             same time as other activities. Furthermore, determining the
not made or distributed for profit or commercial advantage and that copies            maximum score of all possible combinations is not compu-
bear this notice and the full citation on the first page. To copy otherwise, to       tationally feasible. This is the reason why these algorithms
Copyright 2014 for the individual papers by the paper’s authors.
republish, to post on servers or to redistribute to lists, requires prior specific    need to work with heuristics.
Copying permitted
permission             for private and academic purposes. This volume is
             and/or a fee.
published and
CBRecSys        copyrighted
            2014, October 6, by   its editors.
                               2014,  Silicon Valley, CA, USA.                           This paper aims to investigate possible solutions for com-
CBRecSys 2014
Copyright   2014,byOctober   6, 2014, Silicon Valley, CA, USA.
                     the author(s).                                                   bining multiple travel regions to a composite trip for in-


                                                                                 42
dependent travelers. The more specific research problem           thermore, it calculates the optimal duration of stay per re-
is whether an algorithm which recognizes dependencies be-         gion in the composite trip and benefits from the hierarchical
tween items and is based on a hierarchically structured data      structure of the underlying data model.
model can lead to better recommendations of a sequence of
items. The rest of this paper is organized as follows. Section    3.    UNDERLYING DATA MODEL
2 looks at related work in the field of travel recommenda-
tion. In Section 3 we present the underlying data model we           In this paper, we aim to recommend trips composed of
developed for testing our algorithm. Section 4 explains the       multiple regions for individual traveling. We have developed
main idea, the single steps and the implementation of our         a travel database with realistic data in order to test the
algorithm. This algorithm is evaluated in Section 5. Section      proposed algorithm. The data model is composed of several
6 concludes our work and presents future work.                    layers ordered in a hierarchical manner as explained in this
                                                                  section.
                                                                     In our model, a region is always a subregion of another
2. RELATED WORK                                                   region while the world is the parent region of every region.
   Research already focuses on travel recommendation and          Regions contain the necessary information for recommenda-
bundling k items to a recommendation package. Related             tion:
work is [7] who developed the TAST model which represents
the interests of tourists as well as extracted features of re-         • How good a region matches traveler types such as Free
gions like the location or travel seasons. Travel packages are           spirits or Cultural explorer, on a 5-point Likert scale
recommended to the user based on this model combined with
additional information like prices. [17] developed a compos-           • Minimum and a maximum recommended duration of
ite recommender system with access to one or more under-                 stay
lying recommender systems and therefore devises two algo-              • Minimum and optimal budget a traveler has to spend
rithms for generating top-k packages as recommendations,                 per pay
both with high quality and one being much faster than the
other. Furthermore, [1] explains how to bundle recommen-               • Recommended periods of traveling in the region as 5-
dation packages by regarding relationships and associations              point Likert scale per month
between the entities. Hence, the best recommendations for
a given set of keywords can be determined. Early pruning               • Security for travelers (crime level), on a 5-point Likert
and terminations strategies are used to develop an efficient             scale
algorithm.
   [14] introduces the TTDP, shows the connection to the             If a region lacks specified attributes, it inherits the values
OP and presents a fast algorithm that produces solutions of       from the parent region. We also model the connections be-
better quality by using a guided local search meta-heuristic.     tween regions by specifying the necessary e↵ort (time and
[5] tries to solve the TTDP by presenting a cost-aware travel     cost) in one number to travel from one region to another.
tour recommender, while Trip-Mine is looking for optimal          The connection cost between neighboring regions is 0. Re-
trips by satisfying the user’s travel time constraints [8].       gions are part of a country or can be spread over several
   [9] shows that case-based recommenders can be used to          countries. For example, larger countries such as the USA
bundle travel items. The developed recommendation method-         are divided in several travel regions, while smaller countries
ology Trip@advice stores recommendation sessions as cases.        such as the Netherlands, Belgium and Luxembourg are com-
Furthermore, the user can give direct feedback to invoke sug-     bined into one region. Furthermore, regions contain one or
gested query changes during the recommendation process.           more routes, i.e. concrete itineraries to visit a region. At
[12] confirms that case-based reasoning and making associ-        this time, our systems recommends regions only, but in fu-
ation rules are solutions for recommending tourism travel         ture work, routes can be considered for recommendation as
packages. [3] devises a hybrid algorithm that additionally        well. The model can also be extended with additional at-
includes collaborative filtering in the recommendation pro-       tributes such as spoken languages in regions. Regions can
cess. The bundling of trips to packages in [15] is based on       be assigned to the countries they belong to and store addi-
an object orientation solution which faces the high variation     tional information like visa requirements, but we do not use
in travel requirements.                                           countries for recommendation at this time.
   [10] developed DieToRecs, a travel recommender system             Figure 1 illustrates the data model by showing an example
which o↵ers personalized interaction during the recommen-         with several layers of regions. USA Southwest, USA Pacific
dation process to create individual bundles of trips. [11]        Northwest and Western Canada are subregions which can
shows the application of automatically collected constraints      be recommended by our algorithm. When developing and
and preferences which can be added to user queries in order       testing our algorithm, our database was composed of a total
to improve the results of complex trip recommender systems.       of 152 regions with 124 leaves in the region tree which can
Further approaches make use of geo-location and pictures of       be recommended in a trip. The data was complied based on
the travel entities [4].                                          various Internet sources.
   Our developed algorithm combines multiple travel regions          A user who asks for a recommendation chooses one trav-
to a composite trip. It is based on an approximation for          eler type which expresses her expectations towards trips.
the knapsack problem and extends the existing approaches          The o↵ered traveler types such as Free spirits or Cultural
by taking dependencies between single regions into account.       explorer are inspired by a market segmentation tool of the
The algorithm considers the fact that the value of each re-       Canadian Tourism Commission1 . Based on the described
gion in a composite trip is dependent on the presence or          1
                                                                    http://en-corporate.canada.travel/resources-
absence of other regions in the same recommendation. Fur-         industry/explorer-quotient


                                                             42
                                                                  the user’s requirements and preferences for her trip. Hence,
                                                                  a travel recommendation algorithm needs to collect infor-
                                                                  mation like the user’s traveling type or preferred activities,
                                                                  her intended period of traveling and her monetary and time
                                                                  limitations. The algorithm needs a predefined way to calcu-
                                                                  late the value by using this information. To decide whether
                                                                  a region will be considered for a trip, we calculate a rating
                                                                  (see below).
                                                                     In our case, the problem gets more complicated than the
                                                                  standard knapsack problem because the value of a region
                                                                  is not only determined by the user query. Rather, it de-
                                                                  pends on the presence or absence of other regions in the
                                                                  recommended composite trip. This extension of the knap-
        Figure 1: Data example from region tree                   sack problem is called the Oregon Trail Knapsack Problem
                                                                  [2]. Imagine a travel sequence that recommends visiting Ger-
                                                                  many, Austria and Switzerland. If the user could spend more
preferences and suggested activities for each traveler type,      time and money, the system can add further regions to this
we rated each region in the database to express how it matches    trip. Depending on the user’s preferences, additional re-
the traveler types on a 5-point Likert scale.                     gions in or close to Central Europe should be recommended.
                                                                  Only exceptional circumstances legitimate an additional re-
4. ALGORITHM FOR THE RECOMMEN-                                    gion far from Central Europe because the e↵ort of visiting
                                                                  this region during this trip is disproportional. The value of
   DATION OF COMPOSITE TRIPS                                      the composite trip itself is lower than the sum of the region
   In this section, we present our algorithm to recommend         values because the distance between regions has a negative
trips composed of multiple travel regions. This algorithm         impact on the composite trip.
can serve as a basis for a fully functional travel recommen-         In addition, a region’s value in a composite trip influences
dation application.                                               the diversity of the sequence. For instance, if the recom-
                                                                  mender accepts di↵erent activities as user input, the algo-
4.1 Basic idea and goals                                          rithm should focus on a decent coverage of all demanded
  Composite trip recommendation can be seen as a special          activities. Downgrading the values of similar regions, when
case of the knapsack problem to find best combinations of         combining them in a recommendation, allows avoiding situ-
multiple travel items [17]. The underlying idea is to combine     ations where only a few activities are served. If a user wants
as many single travel items like regions, routes or activities    to go skiing and swimming, a recommender should not only
as necessary to maximize the benefit for the user while still     recommend regions for one of these activities, but both.
respecting existing limitations like time and money. The             In this paper, regions instead of routes are recommended.
problem can be described formally as:                             Regions in our data model are not limited to specific activ-
  maximize                                                        ities, diversity is mainly expressed by the the duration of
                        X n                                       stay in each recommended region. Every additional week
                            fi (xi , Xi )                         a user stays in a region leads to a lower value for the user
                        i=1                                       because the probability that she explored this region enough
  subject to                                                      is increasing. Hence, another goal of the algorithm is deter-
                                                                  mining the optimal duration of stay per region in the travel
                       n
                       X                                          sequence.
                             di · xi  D
                                                                     To conclude, a region reaches its maximum value for a
                       i=1
                                                                  given query when regarding it exclusively, and not within
  and                                                             a composite trip. Other regions in the composite trip can
                       n
                       X                                          decrease this value because of increasing distances and lack
                              bi · xi  B                         of diversity. This negative impact can be calculated by a
                       i=1                                        penalty function. Hence, the value function for the compos-
  with                                                            ite trip recommendation problem extends the value func-
                                                                  tions of the Oregon Trail Knapsack Problem by a penalty
                         xi 2 {0, 1}                              function [2]. The resulting value function can be described
                                                                  formally as:
   where fi (xi , Xi ) is the value function for item i and Xi
is the set of items upon which the value of item i depends.
                                                                                                 X
xi is either 0 or 1 which means that each travel item can              fi (xi , Xi ) = xi · vi          (t(i, e) · xi · vi · [xe > 0])
be added to the travel sequence only once (or not). di is                                        e2Xi
item i’s recommended duration of stay and bi is item i’s
recommended daily budget. D is the maximum possible                  where vi is the value of region i for the specified user
duration of stay and B is the maximum budget the user can         query and t(i, e) is the penalty function for two regions i
spend on her trip. In this paper, travel items are represented    and e. [xe > 0] represents a Boolean value whether item
by regions from all over the world.                               i is in the knapsack. The algorithm needs to provide an
   The first challenge that arises is to determine the value a    implementation of t(i, e).
single region o↵ers to the user. This value is dependent on          After determining the best regions and the durations of


                                                             43
stay for the final composite trip, an optimal routing should
be ensured. The problem of combining regions to a com-
posite trip is part of the orienteering problem. Basically,
the orienteering problem extends the knapsack problem by
charging time and money for the routing between the travel
items [14]. Hence, finding the shortest and cheapest rout-
ing between regions reduces loss of time and money when
executing the algorithm.
   We have developed a travel sequence recommender based
on the explained data model (cf. Section 3). The algo-
rithm exploits the hierarchical structure. For instance, if the
user wants to travel through Europe but already explored
Scandinavia, there is no need to execute any calculations for
other continents or Scandinavian regions. As a consequence
                                                                     Figure 2: Asymmetric similarity metric (adapted
strictly respecting the hierarchical structure promises im-
                                                                     from [13])
provements in the algorithm’s runtime.
   The basic idea is implemented in an algorithm that aims
to achieve these targets. It is composed of three phases
which are gradually executed on the available database:                                           P
                                                                                                      i=1...n wi · simi (qi , ci )
  1. Reduce number of regions                                                Similarity(q, c) =            P
                                                                                                              i=1...n wi

  2. Rate regions                                                       [13]. In our case, the traveling type and the traveling
                                                                     period are weighted higher than other features like crime
  3. Calculate the best combination of regions
                                                                     level because the assumption is that the first two features
   First, the approach reduces the number of considered re-          are more important for the decision.
gions for the recommendation process. Users can explicitly              [13] presents two examples of similarity metrics which con-
exclude regions in their query in our approach. If one or            sider deviation of a case feature from the user query. A
more regions are excluded, all subregions are removed from           symmetric similarity metric reduces the similarity by the
consideration as well, utilizing the hierarchical structure of       same value if the query feature is lower or higher than the
our travel database. Reducing the number of regions means            case feature. An asymmetric metric prefers either higher or
fewer items for the next two steps and therefore, the algo-          lower values. For example, the price of an item usually cor-
rithm’s runtime is improved.                                         responds to an asymmetric metric. If a user is prepared to
   In the following, we describe the other two steps in more         pay 1000 euros for an item, a price of 800 euros satisfies her
detail.                                                              requirement better than a price of 1200 euros.
                                                                        For our recommender, we use the following asymmetric
4.2 Rate regions                                                     similarity metric. All features of the regions are rated in
   The remaining travel items of the pruned region tree are          a range between 0 and 1. For example, the perfect month
rated in the following step. In our case, only the leaves            to visit a region is rated with 1, while the worst possible
are considered for recommendation, reducing the number of            rating is 0 (the particular month is not recommended at all).
items in the rating phase. This behavior can be adapted              The deviation of the query from a case reduces the overall
according to the recommender’s use case.                             similarity if the case feature is rated lower than the expected
   Travel regions in our scenario can be rated in several ways.      value which usually is 1, the best value. If a user expects a
For case-based recommender systems, the similarity between           region only to fit somewhat with regard to a certain feature,
the user query and the case - the region - can be calculated         the query feature will be rated with a value lower than 1. A
in order to determine the value of a region for a specified          higher value of the case feature always leads to a similarity of
user query. [13] explains that the assessment of similarity          1 since no user would complain about a feature being better
in case-based recommender systems involves combining the             served by a region than expected. Figure 2 illustrates this
individual feature level similarities for relevant features. Rel-    metric. The similarity reaches the maximum when the case
evant features in our travel recommender are the traveling           feature is rated greater or equal 1.
type served by a region, recommended traveling periods etc.             At the end of step 2, regions with a low rating are removed
(cf. Section 3). Budget and duration of stay are not taken           from consideration in order to reduce the number of items
into account at this step because they are used as the con-          for step 3. In our implementation, we exclude regions with a
straints for the knapsack algorithm (cf. Section 4.3). The           rating lower than 0.7 on the scale from 0 to 1. This approach
similarity between single features of the query and the case         also prevents composite trips including regions which do not
can be calculated with the following formula:                        satisfy the user’s demands but could be cheap enough to
                                                                     be considered in a recommendation only for the use of free
                                          |fq fc |                   capacities.
            simf eature (fq , fc ) = 1
                                         max(fq , fc )               4.3    Calculate the best combination of regions
  [13], where fq is the feature expectation defined in the user        In step 3, the remaining regions in the recommendation
query and fc is the actual feature value in the case. The            process are combined in a way that maximizes the value for
feature level similarities determine the similarity between          the user while still respecting her limitations in time and
the query and the case by weighting all features:                    money.


                                                                44
   The algorithm extends the classic knapsack problem in
order to achieve two additional objectives. First, the value       Table 1: Recommended composite trip for an exam-
of a composite trip should be decreased according to the           ple query
distances of regions. After testing several variants, we im-         North Argentina and Paraguay 2 weeks EUR 560
plemented a penalty function t(i, e) which decreases the to-         Bolivia                      3 weeks EUR 525
tal value of the composite trip by a percentage depending            Peru                         3 weeks EUR 630
on the connection costs between the regions i and e. This            Sum                          8 weeks EUR 1715
implementation is evaluated in the expert study in Section
5.
   The second objective is the determination of the opti-          specifying individual queries for composite trip recommen-
mal duration of the stay per region while calculating the          dation. The user can enter one predefined traveling type, her
best combination of regions. We assume that travelers stay         preferred month of traveling, her budget and average spend-
longer in regions with a significant higher value than in other    ing as well as the maximum possible duration of traveling.
regions of the same composite trip. On the other hand, sim-        Furthermore, regions can be excluded from the recommen-
ply recommending the maximum possible duration of stay             dation process.
in the best-rated region of phase two is not the best solution        Table 1 illustrates the outcome of an example query2 .
either because this could decrease the diversity of the rec-       Imagine, a user sees herself as cultural explorer who wants
ommended trip when visiting only a low number of regions.          to travel in August. She has a budget of 2000 euro and
   Therefore, we suggest regarding every week of every region      usually spends a low amount of money when traveling. Her
separately. The value of staying in a region for an additional     maximum time of traveling is eight weeks. As she already
week is lower than in the week before. We decided to de-           knows Europe and Asia very well, she excludes these regions
crease the value after every week by a constant between 5          in her query. Every query is composed of this information
and 10 percent. The lower the maximum duration of travel-          and it is automatically extended by an expectation of the
ing, the higher the deduction. This ensures diversity even for     lowest possible crime rate. In this example, the best recom-
short trips. Splitting regions in one week blocks means that       mendation is composed of three di↵erent regions with total
the algorithm is executed on an increased number of items          costs of 1715 euro. The costs already include local transport
which extends the runtime. Nevertheless, this approach al-         within the region and to the borders. The selected regions
lows determining the optimal durations of stay. If diversity       can be visited overland by passing mutual borders. This is
is only determined by the duration of stay in each region,         why there are no connection costs in this case. Additional
there is no need to extend the penalty function t(i, e) be-        costs would occur if the recommendation demands traveling
cause deductions are already stored in the one week blocks.        to a non-neighboring country or region. For future work,
If, for instance, routes characterized by activities are recom-    we suggest extending the model by specific costs for other
mended, the penalty function can be extended by a formula          means of transportation like buses or trains in order to cal-
which calculates the diversity of routes.                          culate the overall costs more accurately. The long-distance
   The knapsack problem itself can be solved by di↵erent           travel from the starting point of the trip to the first region
approximation. We decided to implement a dynamic pro-              is never included.
gramming solution which promises good results by splitting
the problem into smaller subproblems [6]. This approach it-        5.    EXPERT STUDY
erates over the number of available regions n as well over all        The research question of this paper is whether an algo-
limits, in our case the budget B and maximum duration of           rithm which recognizes dependencies between items and is
stay D. The runtime of this algorithm is O(n · B · D). When        based on a hierarchically structured data model can improve
executing, it creates a three dimensional matrix with n·B ·D       travel recommendations. We developed an algorithm that is
subproblems. The maximum possible value for a composite            able to consider distances between single regions and ensure
trip can then be derived from this matrix. We store the            diversity by calculating optimal durations of stay. We con-
selected regions in every entry of the matrix to access the        ducted an expert study to evaluate its performance. The
regions of our final recommendation after the algorithm has        expert had to be familiar with the available traveler types
terminated.                                                        and has knowledge in the travel domain.
   Shortest path algorithms can be applied on the final rec-
ommendation in order to optimize the routing. For our sce-         5.1   Procedure of the study
nario, we implemented a simple approach for finding the best          Our user study is composed of 56 sample queries. Ba-
sequence of regions in the trip. For each added region in a        sically, the queries are selected randomly but we tried to
knapsack with at least one region, the algorithm calculates        define 7 queries per traveler type. Furthermore, each query
at which position the new region includes the lowest addi-         in all of the 8 traveling type groups changes only one or two
tional costs. This region is then inserted at the identified       features compared to the precedent query like an increased
position.                                                          budget or less time to travel. This allows to understand how
                                                                   a recommendation changes if you modify certain features.
4.4 Implementation                                                    The developed Java application executes each query and
  We developed a Java 1.7 application in order to imple-           presents the best-rated recommendation, based on the ex-
ment the algorithm and to test it in a real scenario. The          plained procedure. In addition, two additional (baseline) al-
travel data of our data model is stored in a NoSQL database.       gorithms deliver two further recommendations. All of these
The application accesses regions, connections and the corre-       2
                                                                    The stated cost covers the minimum a traveler would need
sponding data by parsing an online provided JSON file.             to spend in these regions, the cost is (much) higher when
  The application o↵ers a simple user interface which allows       the user specifies average or high amount of spending


                                                              45
three recommendations are presented in a random order in
order to prevent the expert from relating the recommenda-
tions to the algorithms. The recommendations are presented
to the expert like in table 1, with the duration of stay per
region and the total costs of the trip.
   The expert rated all recommendations in these four cat-
egories on a 5-point Likert scale: 1. General satisfaction
with the recommendation, 2. the diversity of the recom-
mendation, 3. how well the single recommendation items fit
together and 4. if the routing between the singles items is
reasonable.

5.2 Comparative algorithms                                        Figure 3: Evaluated travel recommendation algo-
   Two further algorithms deliver recommendations based           rithms
on two di↵erent approaches: A baseline algorithm that is
a standard implementation to solve the knapsack problem,
and a top-k algorithm that selects regions sequentially from      for more than 6 weeks. For higher durations of stay, the dif-
an ordered list of rated regions. Both algorithms are used        ference in the overall satisfaction with the recommendations
for comparison in the expert study.                               is comparable, but our travel recommender algorithm rates
   Related works already implement variations of the clas-        better in how regions fit together (?: 4.41, : 0.84) and in
sical knapsack problem [17] or refer travel package recom-        terms of the routing (?: 4.26, : 1.06). On the other hand,
mendation to the orienteering problem [14]. The baseline          queries with a higher maximum duration of stay are rated
algorithm works like our presented travel recommendation          lower in terms of diversity (?: 3.00, : 0.92).
algorithm but does not include our extensions. Thus, the             25 queries asked for recommendations with a maximum
values of items in a composite trip are not dependent on the      budget per week of lower or equal than 500 euro. This does
presence or absence of other items and no weekly calculated       not have a significant impact on the quality of the recom-
penalties are considered. This algorithm allows evaluating        mendation except in terms of routing which is a bit better
the influence our extensions have on the quality of composite     for lower budget inputs (?: 4.28, : 0.94) than for higher
trip recommendations.                                             inputs (?: 4.06, : 0.96).
   The top-k algorithm ranks all available regions by their          Our travel recommendation algorithm o↵ers the possibil-
value in a list. It is able to implement weekly calculated        ity to limit the recommendation process to preselected re-
penalties and to decrease the total value by the distances        gions (step 1 in the process). In the expert study, 16 of the 56
to be covered, but it tries to find the best combination of       queries had some constraints on the regions. These queries
regions in a simpler approach. The algorithm inserts ev-          deliver recommendations that satisfied the expert less than
ery region from top to bottom in the ordered list into the        average (?: 3.75, : 1.00). Furthermore, single regions fit
recommendation sequence. If a region cannot be inserted           together less (?: 3.81, : 0.83) and the routing gets worse
because of the constraints with regard to money or time,          (?: 3.75, : 1.06) in these cases.
the next region in the list is checked. Hence, the algorithm
goes through the list of regions exactly once. The top-k          6.   CONCLUSION
algorithm allows determining if a simpler approach of the            Recommending composite trips can be understood as a
knapsack problem can lead to similar results as our more          knapsack problem with extensions. Single recommendation
complex travel recommendation algorithm.                          items like regions or routes o↵er value to the user, depend-
                                                                  ing on the similarity between the items and the user query.
5.3 Results                                                       Multiple regions can be combined to a composite trip which
   Figure 3 illustrates the performance of the three algo-        respect the user’s limitations in time and money. The total
rithms in each of the four categories. Our travel recom-          value for the user can not be determined by summing up
mendation algorithm resulted in the highest overall satis-        the single values of all travel stages. Regions in a compos-
faction to the expert (?: 4.02, : 0.82). Furthermore, it          ite trip are dependent on the presence or absence of other
is best-rated in the categories regions fit together (?: 4.29,    regions in the same recommendation. The distance between
  : 0.76) and routing (?: 4.16, : 0.95). In terms of di-          regions decreases the possible value for a specified user query
versity, our travel recommendation algorithm (?: 3.11, :          because of the costs of the connection.
0.91) is rated somewhat lower than the two comparative al-           In this paper, we present a travel recommendation algo-
gorithms. In this category, the baseline algorithm is ranked      rithm for composite trips that addresses the Oregon Trail
first place (?: 3.57, : 0.84). The top-k algorithm is ranked      Knapsack Problem and applies the problem to the travel
second in every category. The results show that in 3 out          domain. The recommendations generated by our algorithm
of 4 categories, our extensions lead to significant improve-      satisfies an expert user more than comparative algorithms.
ments. Some improvements can also be observed when ap-            It is able to combine regions and can determine the opti-
plying the top-k algorithm over the baseline, but the more        mal duration of stay per region. We showed that the rec-
complex knapsack based travel recommendation algorithm            ommended regions fit together and with the availability of
performs better than both.                                        connection costs between regions, a decent routing can be
   The expert study delivers further insights into our travel     ensured. A high maximum duration of stay allows choos-
recommendation algorithm. 29 queries asked for trips with a       ing among a higher number of regions and this improves the
maximum duration of less or equal than six weeks, 27 queries      selection of regions and allows better routing. A low max-


                                                             46
imum budget avoids high connection costs and thus also              [8] E. H.-C. Lu, C.-Y. Lin, and V. S. Tseng. Trip-Mine:
promises better routing. In terms of diversity, our algo-               An Efficient Trip Planning Approach with Travel
rithm performs below average. We integrated mechanisms                  Time Constraints. In 2011 IEEE 12th International
which penalize long stays in the same region. Nevertheless,             Conference on Mobile Data Management, pages
the recommended trips o↵er less diversity than the baseline             152–161. Ieee, June 2011.
algorithms. Further e↵ort is necessary in order to under-           [9] F. Ricci, D. Cavada, N. Mirzadeh, and A. Venturini.
stand the preferences of potential users and to find out how            Case-based travel recommendations. In Destination
a better diversity can be ensured without recommending re-              recommendation systems: behavioural foundations and
gions which fit less together. One possibility is extending             applications., pages 67–93. CABI, 2006.
the penalty function in our algorithm.                             [10] F. Ricci, D. R. Fesenmeier, N. Mirzadeh,
   Restrictions are useful for users when they want to ignore           H. Rumetshofer, E. Schaumlechner, A. Venturini,
specific regions in the recommendation process. Reducing                K. W. Wöber, and A. H. Zins. DieToRecs: a
the number of regions for the recommendation process im-                Case-based Travel Advisory System. In Destination
proves the algorithm’s runtime but also reduces the qual-               recommendation systems: behavioural foundations and
ity of recommendations. Regional constraints made it more               applications., pages 227–239. CABI, 2006.
difficult to find regions which match the user’s preferences.      [11] A. Savir, R. Brafman, and G. Shani. Recommending
Moreover, a limited choice of regions makes it more difficult           improved configurations for complex objects with an
to find regions which fit together and allow a decent routing.          application in travel planning. In Proceedings of the
   Future work is to extend the system by recommending                  7th ACM conference on Recommender systems -
routes or concrete itineraries in addition to travel regions.           RecSys ’13, pages 391–394, New York, New York,
In addition, an improved version of the algorithm could also            USA, 2013. ACM Press.
take possible individual activities of travelers such as hiking    [12] M. Schumacher and J.-P. Rey. Recommender systems
or shopping into account. In this case, di↵erent routes with            for dynamic packaging of tourism services. In R. Law,
similar activities could result in additional deduction of the          M. Fuchs, and F. Ricci, editors, ENTER, pages 13–23.
rating and thus reducing the value of this trip. Furthermore,           Springer Vienna, 2011.
we plan to conduct a more extensive user study with poten-
                                                                   [13] B. Smyth. Case-based recommendation. The adaptive
tial users in order to test the recommendations of the travel
                                                                        web, 4321:342–376, 2007.
recommendation algorithm for composite trips.
                                                                   [14] W. Sou↵riau, P. Vansteenwegen, J. Vertommen, G. V.
                                                                        Berghe, and D. V. Oudheusden. a Personalized
7. REFERENCES                                                           Tourist Trip Design Algorithm for Mobile Tourist
 [1] A. Angel, S. Chaudhuri, G. Das, and N. Koudas.                     Guides. Applied Artificial Intelligence, 22(10):964–985,
     Ranking Objects Based on Relationships and Fixed                   Oct. 2008.
     Associations. In EDBT ’09 Proceedings of the 12th             [15] C. Tan, Q. Liu, E. Chen, H. Xiong, and X. Wu.
     International Conference on Extending Database                     Object-oriented Travel Package Recommendation.
     Technology: Advances in Database Technology, pages                 ACM Transactions on Intelligent Systems and
     910–921, 2009.                                                     Technology, 5(3):26, 2014.
 [2] J. J. Burg, J. Ainsworth, B. Casto, and S.-D. Lang.           [16] P. Vansteenwegen and D. V. Oudheusden. The mobile
     Experiments with the ”Oregon Trail Knapsack                        tourist guide: an OR opportunity. OR Insights,
     Problem”. Electronic Notes in Discrete Mathematics,                20(3):21–27, 2007.
     1:26–35, 1999.                                                [17] M. Xie, L. V. Lakshmanan, and P. T. Wood. Breaking
 [3] J.-H. Chen, K.-M. Chao, and N. Shah. Hybrid                        out of the box of recommendations: from items to
     Recommendation System for Tourism. In 2013 IEEE                    packages. In Proceedings of the fourth ACM conference
     10th International Conference on e-Business                        on Recommender systems - RecSys ’10, pages 151–158.
     Engineering (ICEBE), pages 156–161. Ieee, Sept. 2013.              ACM Press, 2010.
 [4] M. D. Choudhury, M. Feldman, S. Amer-Yahia,
     N. Golbandi, R. Lempel, and C. Yu. Automatic
     construction of travel itineraries using social
     breadcrumbs. In HT ’10 Proceedings of the 21st ACM
     conference on Hypertext and hypermedia, pages 35–44,
     2010.
 [5] Y. Ge, Q. Liu, H. Xiong, A. Tuzhilin, and J. Chen.
     Cost-aware travel tour recommendation. In
     Proceedings of the 17th ACM SIGKDD international
     conference on Knowledge discovery and data mining -
     KDD ’11, pages 983–991, New York, New York, USA,
     2011. ACM Press.
 [6] H. Kellerer, U. Pferschy, and D. Pisinger. Knapsack
     Problems. Springer, 2004.
 [7] Q. Liu, Y. Ge, Z. Li, E. Chen, and H. Xiong.
     Personalized Travel Package Recommendation. In
     2011 IEEE 11th International Conference on Data
     Mining, pages 407–416. Ieee, Dec. 2011.


                                                              47