A Travel Recommender System for Combining Multiple Travel Regions to a Composite Trip Daniel Herzog Wolfgang Wörndl TU München TU München Boltzmannstr. 3 Boltzmannstr. 3 85748 Garching 85748 Garching Germany Germany herzogd@in.tum.de woerndl@in.tum.de ABSTRACT Keywords Internet-based services are available to recommend destina- case-based recommender system, tourist trip design prob- tions and activities for organized trips. Only few systems lem, travel recommendation, knapsack problem, similarity support travelers when creating composite trips consisting of multiple destinations or activities. The idea in this work 1. INTRODUCTION is to select travel regions that maximize the value of the Recommender systems are an established technology in composite trip for the user while still respecting her limita- online shops to provide users with a list of recommended tions in time and money. The value of a travel region can be items during their shopping experience. In comparison to determined by the similarity between a specified user query recommending single items, travel recommendation turns and the cases in a travel region database. The recommen- out to be a more complicated issue since tourists have to dation algorithm needs to find a decent routing between the deal with both limited budget and time frame. If users call regions while still satisfying diversity of the whole trip. We for an individual trip composed of multiple regions, the task developed an algorithm based on an approximation for the becomes even more complex. An exemplary situation il- knapsack problem and extended it to recognize dependencies lustrates the difficulty: A person is looking for a trip for between the regions while calculating best combinations. It four weeks in September with a budget of 2000 Euros; she is able to determine the optimal duration of stay per region likes nature and hiking with some cultural highlights as a and its performance improves when benefiting from the hi- plus. A high number of possible routes could be packaged erarchical structure of our travel database. In an expert to a composite trip and the constraints in time and money study, we verified the results of our approach. The study have to be taken into account. This is a special case of the proves that our algorithm for composite trips delivers good so called tourist trip design problem (TTDP) [14]. TTDP recommendations which satisfied an expert user more than is an extension of the orienteering problem (OP): A score baseline algorithms. Regions in the composite trip fit to- which determines the value for the user is assigned to every gether better and a decent routing between the regions can location in a sequence. The goal is to maximize the sum of be ensured. Nevertheless, the algorithm leaves room for im- the scores of the selected locations while still meeting the provement by combining less similar regions in a composite given limitations [16]. trip, thus leading to a higher diversity of the recommenda- Recommender systems use di↵erent algorithms to find a tion. suitable list of recommendations in a feasible time. A sub- area of content-based recommender systems is case-based recommender systems. Case-based recommenders rely on Categories and Subject Descriptors items structured as cases using a well defined set of features I.2.8 [Problem Solving, Control Methods, and Search]: and feature values. Those cases are compared to a user Dynamic programming query or her profile and the most similar cases are delivered as a recommendation [13]. These recommenders are already applied for travel recom- mendation. In order to recommend a longer trip, the cases General Terms have to be combined to maximize the value while still re- Algorithms specting the users’ limitations. The simplest approach to determine the value of a composite recommendation is to add the values of all single parts of the proposed trip. This approach does not consider dependencies between the parts, Permission to make digital or hard copies of all or part of this work for for instance travel activities which cannot be executed at the personal or classroom use is granted without fee provided that copies are same time as other activities. Furthermore, determining the not made or distributed for profit or commercial advantage and that copies maximum score of all possible combinations is not compu- bear this notice and the full citation on the first page. To copy otherwise, to tationally feasible. This is the reason why these algorithms Copyright 2014 for the individual papers by the paper’s authors. republish, to post on servers or to redistribute to lists, requires prior specific need to work with heuristics. Copying permitted permission for private and academic purposes. This volume is and/or a fee. published and CBRecSys copyrighted 2014, October 6, by its editors. 2014, Silicon Valley, CA, USA. This paper aims to investigate possible solutions for com- CBRecSys 2014 Copyright 2014,byOctober 6, 2014, Silicon Valley, CA, USA. the author(s). bining multiple travel regions to a composite trip for in- 42 dependent travelers. The more specific research problem thermore, it calculates the optimal duration of stay per re- is whether an algorithm which recognizes dependencies be- gion in the composite trip and benefits from the hierarchical tween items and is based on a hierarchically structured data structure of the underlying data model. model can lead to better recommendations of a sequence of items. The rest of this paper is organized as follows. Section 3. UNDERLYING DATA MODEL 2 looks at related work in the field of travel recommenda- tion. In Section 3 we present the underlying data model we In this paper, we aim to recommend trips composed of developed for testing our algorithm. Section 4 explains the multiple regions for individual traveling. We have developed main idea, the single steps and the implementation of our a travel database with realistic data in order to test the algorithm. This algorithm is evaluated in Section 5. Section proposed algorithm. The data model is composed of several 6 concludes our work and presents future work. layers ordered in a hierarchical manner as explained in this section. In our model, a region is always a subregion of another 2. RELATED WORK region while the world is the parent region of every region. Research already focuses on travel recommendation and Regions contain the necessary information for recommenda- bundling k items to a recommendation package. Related tion: work is [7] who developed the TAST model which represents the interests of tourists as well as extracted features of re- • How good a region matches traveler types such as Free gions like the location or travel seasons. Travel packages are spirits or Cultural explorer, on a 5-point Likert scale recommended to the user based on this model combined with additional information like prices. [17] developed a compos- • Minimum and a maximum recommended duration of ite recommender system with access to one or more under- stay lying recommender systems and therefore devises two algo- • Minimum and optimal budget a traveler has to spend rithms for generating top-k packages as recommendations, per pay both with high quality and one being much faster than the other. Furthermore, [1] explains how to bundle recommen- • Recommended periods of traveling in the region as 5- dation packages by regarding relationships and associations point Likert scale per month between the entities. Hence, the best recommendations for a given set of keywords can be determined. Early pruning • Security for travelers (crime level), on a 5-point Likert and terminations strategies are used to develop an efficient scale algorithm. [14] introduces the TTDP, shows the connection to the If a region lacks specified attributes, it inherits the values OP and presents a fast algorithm that produces solutions of from the parent region. We also model the connections be- better quality by using a guided local search meta-heuristic. tween regions by specifying the necessary e↵ort (time and [5] tries to solve the TTDP by presenting a cost-aware travel cost) in one number to travel from one region to another. tour recommender, while Trip-Mine is looking for optimal The connection cost between neighboring regions is 0. Re- trips by satisfying the user’s travel time constraints [8]. gions are part of a country or can be spread over several [9] shows that case-based recommenders can be used to countries. For example, larger countries such as the USA bundle travel items. The developed recommendation method- are divided in several travel regions, while smaller countries ology Trip@advice stores recommendation sessions as cases. such as the Netherlands, Belgium and Luxembourg are com- Furthermore, the user can give direct feedback to invoke sug- bined into one region. Furthermore, regions contain one or gested query changes during the recommendation process. more routes, i.e. concrete itineraries to visit a region. At [12] confirms that case-based reasoning and making associ- this time, our systems recommends regions only, but in fu- ation rules are solutions for recommending tourism travel ture work, routes can be considered for recommendation as packages. [3] devises a hybrid algorithm that additionally well. The model can also be extended with additional at- includes collaborative filtering in the recommendation pro- tributes such as spoken languages in regions. Regions can cess. The bundling of trips to packages in [15] is based on be assigned to the countries they belong to and store addi- an object orientation solution which faces the high variation tional information like visa requirements, but we do not use in travel requirements. countries for recommendation at this time. [10] developed DieToRecs, a travel recommender system Figure 1 illustrates the data model by showing an example which o↵ers personalized interaction during the recommen- with several layers of regions. USA Southwest, USA Pacific dation process to create individual bundles of trips. [11] Northwest and Western Canada are subregions which can shows the application of automatically collected constraints be recommended by our algorithm. When developing and and preferences which can be added to user queries in order testing our algorithm, our database was composed of a total to improve the results of complex trip recommender systems. of 152 regions with 124 leaves in the region tree which can Further approaches make use of geo-location and pictures of be recommended in a trip. The data was complied based on the travel entities [4]. various Internet sources. Our developed algorithm combines multiple travel regions A user who asks for a recommendation chooses one trav- to a composite trip. It is based on an approximation for eler type which expresses her expectations towards trips. the knapsack problem and extends the existing approaches The o↵ered traveler types such as Free spirits or Cultural by taking dependencies between single regions into account. explorer are inspired by a market segmentation tool of the The algorithm considers the fact that the value of each re- Canadian Tourism Commission1 . Based on the described gion in a composite trip is dependent on the presence or 1 http://en-corporate.canada.travel/resources- absence of other regions in the same recommendation. Fur- industry/explorer-quotient 42 the user’s requirements and preferences for her trip. Hence, a travel recommendation algorithm needs to collect infor- mation like the user’s traveling type or preferred activities, her intended period of traveling and her monetary and time limitations. The algorithm needs a predefined way to calcu- late the value by using this information. To decide whether a region will be considered for a trip, we calculate a rating (see below). In our case, the problem gets more complicated than the standard knapsack problem because the value of a region is not only determined by the user query. Rather, it de- pends on the presence or absence of other regions in the recommended composite trip. This extension of the knap- Figure 1: Data example from region tree sack problem is called the Oregon Trail Knapsack Problem [2]. Imagine a travel sequence that recommends visiting Ger- many, Austria and Switzerland. If the user could spend more preferences and suggested activities for each traveler type, time and money, the system can add further regions to this we rated each region in the database to express how it matches trip. Depending on the user’s preferences, additional re- the traveler types on a 5-point Likert scale. gions in or close to Central Europe should be recommended. Only exceptional circumstances legitimate an additional re- 4. ALGORITHM FOR THE RECOMMEN- gion far from Central Europe because the e↵ort of visiting this region during this trip is disproportional. The value of DATION OF COMPOSITE TRIPS the composite trip itself is lower than the sum of the region In this section, we present our algorithm to recommend values because the distance between regions has a negative trips composed of multiple travel regions. This algorithm impact on the composite trip. can serve as a basis for a fully functional travel recommen- In addition, a region’s value in a composite trip influences dation application. the diversity of the sequence. For instance, if the recom- mender accepts di↵erent activities as user input, the algo- 4.1 Basic idea and goals rithm should focus on a decent coverage of all demanded Composite trip recommendation can be seen as a special activities. Downgrading the values of similar regions, when case of the knapsack problem to find best combinations of combining them in a recommendation, allows avoiding situ- multiple travel items [17]. The underlying idea is to combine ations where only a few activities are served. If a user wants as many single travel items like regions, routes or activities to go skiing and swimming, a recommender should not only as necessary to maximize the benefit for the user while still recommend regions for one of these activities, but both. respecting existing limitations like time and money. The In this paper, regions instead of routes are recommended. problem can be described formally as: Regions in our data model are not limited to specific activ- maximize ities, diversity is mainly expressed by the the duration of X n stay in each recommended region. Every additional week fi (xi , Xi ) a user stays in a region leads to a lower value for the user i=1 because the probability that she explored this region enough subject to is increasing. Hence, another goal of the algorithm is deter- mining the optimal duration of stay per region in the travel n X sequence. di · xi  D To conclude, a region reaches its maximum value for a i=1 given query when regarding it exclusively, and not within and a composite trip. Other regions in the composite trip can n X decrease this value because of increasing distances and lack bi · xi  B of diversity. This negative impact can be calculated by a i=1 penalty function. Hence, the value function for the compos- with ite trip recommendation problem extends the value func- tions of the Oregon Trail Knapsack Problem by a penalty xi 2 {0, 1} function [2]. The resulting value function can be described formally as: where fi (xi , Xi ) is the value function for item i and Xi is the set of items upon which the value of item i depends. X xi is either 0 or 1 which means that each travel item can fi (xi , Xi ) = xi · vi (t(i, e) · xi · vi · [xe > 0]) be added to the travel sequence only once (or not). di is e2Xi item i’s recommended duration of stay and bi is item i’s recommended daily budget. D is the maximum possible where vi is the value of region i for the specified user duration of stay and B is the maximum budget the user can query and t(i, e) is the penalty function for two regions i spend on her trip. In this paper, travel items are represented and e. [xe > 0] represents a Boolean value whether item by regions from all over the world. i is in the knapsack. The algorithm needs to provide an The first challenge that arises is to determine the value a implementation of t(i, e). single region o↵ers to the user. This value is dependent on After determining the best regions and the durations of 43 stay for the final composite trip, an optimal routing should be ensured. The problem of combining regions to a com- posite trip is part of the orienteering problem. Basically, the orienteering problem extends the knapsack problem by charging time and money for the routing between the travel items [14]. Hence, finding the shortest and cheapest rout- ing between regions reduces loss of time and money when executing the algorithm. We have developed a travel sequence recommender based on the explained data model (cf. Section 3). The algo- rithm exploits the hierarchical structure. For instance, if the user wants to travel through Europe but already explored Scandinavia, there is no need to execute any calculations for other continents or Scandinavian regions. As a consequence Figure 2: Asymmetric similarity metric (adapted strictly respecting the hierarchical structure promises im- from [13]) provements in the algorithm’s runtime. The basic idea is implemented in an algorithm that aims to achieve these targets. It is composed of three phases which are gradually executed on the available database: P i=1...n wi · simi (qi , ci ) 1. Reduce number of regions Similarity(q, c) = P i=1...n wi 2. Rate regions [13]. In our case, the traveling type and the traveling period are weighted higher than other features like crime 3. Calculate the best combination of regions level because the assumption is that the first two features First, the approach reduces the number of considered re- are more important for the decision. gions for the recommendation process. Users can explicitly [13] presents two examples of similarity metrics which con- exclude regions in their query in our approach. If one or sider deviation of a case feature from the user query. A more regions are excluded, all subregions are removed from symmetric similarity metric reduces the similarity by the consideration as well, utilizing the hierarchical structure of same value if the query feature is lower or higher than the our travel database. Reducing the number of regions means case feature. An asymmetric metric prefers either higher or fewer items for the next two steps and therefore, the algo- lower values. For example, the price of an item usually cor- rithm’s runtime is improved. responds to an asymmetric metric. If a user is prepared to In the following, we describe the other two steps in more pay 1000 euros for an item, a price of 800 euros satisfies her detail. requirement better than a price of 1200 euros. For our recommender, we use the following asymmetric 4.2 Rate regions similarity metric. All features of the regions are rated in The remaining travel items of the pruned region tree are a range between 0 and 1. For example, the perfect month rated in the following step. In our case, only the leaves to visit a region is rated with 1, while the worst possible are considered for recommendation, reducing the number of rating is 0 (the particular month is not recommended at all). items in the rating phase. This behavior can be adapted The deviation of the query from a case reduces the overall according to the recommender’s use case. similarity if the case feature is rated lower than the expected Travel regions in our scenario can be rated in several ways. value which usually is 1, the best value. If a user expects a For case-based recommender systems, the similarity between region only to fit somewhat with regard to a certain feature, the user query and the case - the region - can be calculated the query feature will be rated with a value lower than 1. A in order to determine the value of a region for a specified higher value of the case feature always leads to a similarity of user query. [13] explains that the assessment of similarity 1 since no user would complain about a feature being better in case-based recommender systems involves combining the served by a region than expected. Figure 2 illustrates this individual feature level similarities for relevant features. Rel- metric. The similarity reaches the maximum when the case evant features in our travel recommender are the traveling feature is rated greater or equal 1. type served by a region, recommended traveling periods etc. At the end of step 2, regions with a low rating are removed (cf. Section 3). Budget and duration of stay are not taken from consideration in order to reduce the number of items into account at this step because they are used as the con- for step 3. In our implementation, we exclude regions with a straints for the knapsack algorithm (cf. Section 4.3). The rating lower than 0.7 on the scale from 0 to 1. This approach similarity between single features of the query and the case also prevents composite trips including regions which do not can be calculated with the following formula: satisfy the user’s demands but could be cheap enough to be considered in a recommendation only for the use of free |fq fc | capacities. simf eature (fq , fc ) = 1 max(fq , fc ) 4.3 Calculate the best combination of regions [13], where fq is the feature expectation defined in the user In step 3, the remaining regions in the recommendation query and fc is the actual feature value in the case. The process are combined in a way that maximizes the value for feature level similarities determine the similarity between the user while still respecting her limitations in time and the query and the case by weighting all features: money. 44 The algorithm extends the classic knapsack problem in order to achieve two additional objectives. First, the value Table 1: Recommended composite trip for an exam- of a composite trip should be decreased according to the ple query distances of regions. After testing several variants, we im- North Argentina and Paraguay 2 weeks EUR 560 plemented a penalty function t(i, e) which decreases the to- Bolivia 3 weeks EUR 525 tal value of the composite trip by a percentage depending Peru 3 weeks EUR 630 on the connection costs between the regions i and e. This Sum 8 weeks EUR 1715 implementation is evaluated in the expert study in Section 5. The second objective is the determination of the opti- specifying individual queries for composite trip recommen- mal duration of the stay per region while calculating the dation. The user can enter one predefined traveling type, her best combination of regions. We assume that travelers stay preferred month of traveling, her budget and average spend- longer in regions with a significant higher value than in other ing as well as the maximum possible duration of traveling. regions of the same composite trip. On the other hand, sim- Furthermore, regions can be excluded from the recommen- ply recommending the maximum possible duration of stay dation process. in the best-rated region of phase two is not the best solution Table 1 illustrates the outcome of an example query2 . either because this could decrease the diversity of the rec- Imagine, a user sees herself as cultural explorer who wants ommended trip when visiting only a low number of regions. to travel in August. She has a budget of 2000 euro and Therefore, we suggest regarding every week of every region usually spends a low amount of money when traveling. Her separately. The value of staying in a region for an additional maximum time of traveling is eight weeks. As she already week is lower than in the week before. We decided to de- knows Europe and Asia very well, she excludes these regions crease the value after every week by a constant between 5 in her query. Every query is composed of this information and 10 percent. The lower the maximum duration of travel- and it is automatically extended by an expectation of the ing, the higher the deduction. This ensures diversity even for lowest possible crime rate. In this example, the best recom- short trips. Splitting regions in one week blocks means that mendation is composed of three di↵erent regions with total the algorithm is executed on an increased number of items costs of 1715 euro. The costs already include local transport which extends the runtime. Nevertheless, this approach al- within the region and to the borders. The selected regions lows determining the optimal durations of stay. If diversity can be visited overland by passing mutual borders. This is is only determined by the duration of stay in each region, why there are no connection costs in this case. Additional there is no need to extend the penalty function t(i, e) be- costs would occur if the recommendation demands traveling cause deductions are already stored in the one week blocks. to a non-neighboring country or region. For future work, If, for instance, routes characterized by activities are recom- we suggest extending the model by specific costs for other mended, the penalty function can be extended by a formula means of transportation like buses or trains in order to cal- which calculates the diversity of routes. culate the overall costs more accurately. The long-distance The knapsack problem itself can be solved by di↵erent travel from the starting point of the trip to the first region approximation. We decided to implement a dynamic pro- is never included. gramming solution which promises good results by splitting the problem into smaller subproblems [6]. This approach it- 5. EXPERT STUDY erates over the number of available regions n as well over all The research question of this paper is whether an algo- limits, in our case the budget B and maximum duration of rithm which recognizes dependencies between items and is stay D. The runtime of this algorithm is O(n · B · D). When based on a hierarchically structured data model can improve executing, it creates a three dimensional matrix with n·B ·D travel recommendations. We developed an algorithm that is subproblems. The maximum possible value for a composite able to consider distances between single regions and ensure trip can then be derived from this matrix. We store the diversity by calculating optimal durations of stay. We con- selected regions in every entry of the matrix to access the ducted an expert study to evaluate its performance. The regions of our final recommendation after the algorithm has expert had to be familiar with the available traveler types terminated. and has knowledge in the travel domain. Shortest path algorithms can be applied on the final rec- ommendation in order to optimize the routing. For our sce- 5.1 Procedure of the study nario, we implemented a simple approach for finding the best Our user study is composed of 56 sample queries. Ba- sequence of regions in the trip. For each added region in a sically, the queries are selected randomly but we tried to knapsack with at least one region, the algorithm calculates define 7 queries per traveler type. Furthermore, each query at which position the new region includes the lowest addi- in all of the 8 traveling type groups changes only one or two tional costs. This region is then inserted at the identified features compared to the precedent query like an increased position. budget or less time to travel. This allows to understand how a recommendation changes if you modify certain features. 4.4 Implementation The developed Java application executes each query and We developed a Java 1.7 application in order to imple- presents the best-rated recommendation, based on the ex- ment the algorithm and to test it in a real scenario. The plained procedure. In addition, two additional (baseline) al- travel data of our data model is stored in a NoSQL database. gorithms deliver two further recommendations. All of these The application accesses regions, connections and the corre- 2 The stated cost covers the minimum a traveler would need sponding data by parsing an online provided JSON file. to spend in these regions, the cost is (much) higher when The application o↵ers a simple user interface which allows the user specifies average or high amount of spending 45 three recommendations are presented in a random order in order to prevent the expert from relating the recommenda- tions to the algorithms. The recommendations are presented to the expert like in table 1, with the duration of stay per region and the total costs of the trip. The expert rated all recommendations in these four cat- egories on a 5-point Likert scale: 1. General satisfaction with the recommendation, 2. the diversity of the recom- mendation, 3. how well the single recommendation items fit together and 4. if the routing between the singles items is reasonable. 5.2 Comparative algorithms Figure 3: Evaluated travel recommendation algo- Two further algorithms deliver recommendations based rithms on two di↵erent approaches: A baseline algorithm that is a standard implementation to solve the knapsack problem, and a top-k algorithm that selects regions sequentially from for more than 6 weeks. For higher durations of stay, the dif- an ordered list of rated regions. Both algorithms are used ference in the overall satisfaction with the recommendations for comparison in the expert study. is comparable, but our travel recommender algorithm rates Related works already implement variations of the clas- better in how regions fit together (?: 4.41, : 0.84) and in sical knapsack problem [17] or refer travel package recom- terms of the routing (?: 4.26, : 1.06). On the other hand, mendation to the orienteering problem [14]. The baseline queries with a higher maximum duration of stay are rated algorithm works like our presented travel recommendation lower in terms of diversity (?: 3.00, : 0.92). algorithm but does not include our extensions. Thus, the 25 queries asked for recommendations with a maximum values of items in a composite trip are not dependent on the budget per week of lower or equal than 500 euro. This does presence or absence of other items and no weekly calculated not have a significant impact on the quality of the recom- penalties are considered. This algorithm allows evaluating mendation except in terms of routing which is a bit better the influence our extensions have on the quality of composite for lower budget inputs (?: 4.28, : 0.94) than for higher trip recommendations. inputs (?: 4.06, : 0.96). The top-k algorithm ranks all available regions by their Our travel recommendation algorithm o↵ers the possibil- value in a list. It is able to implement weekly calculated ity to limit the recommendation process to preselected re- penalties and to decrease the total value by the distances gions (step 1 in the process). In the expert study, 16 of the 56 to be covered, but it tries to find the best combination of queries had some constraints on the regions. These queries regions in a simpler approach. The algorithm inserts ev- deliver recommendations that satisfied the expert less than ery region from top to bottom in the ordered list into the average (?: 3.75, : 1.00). Furthermore, single regions fit recommendation sequence. If a region cannot be inserted together less (?: 3.81, : 0.83) and the routing gets worse because of the constraints with regard to money or time, (?: 3.75, : 1.06) in these cases. the next region in the list is checked. Hence, the algorithm goes through the list of regions exactly once. The top-k 6. CONCLUSION algorithm allows determining if a simpler approach of the Recommending composite trips can be understood as a knapsack problem can lead to similar results as our more knapsack problem with extensions. Single recommendation complex travel recommendation algorithm. items like regions or routes o↵er value to the user, depend- ing on the similarity between the items and the user query. 5.3 Results Multiple regions can be combined to a composite trip which Figure 3 illustrates the performance of the three algo- respect the user’s limitations in time and money. The total rithms in each of the four categories. Our travel recom- value for the user can not be determined by summing up mendation algorithm resulted in the highest overall satis- the single values of all travel stages. Regions in a compos- faction to the expert (?: 4.02, : 0.82). Furthermore, it ite trip are dependent on the presence or absence of other is best-rated in the categories regions fit together (?: 4.29, regions in the same recommendation. The distance between : 0.76) and routing (?: 4.16, : 0.95). In terms of di- regions decreases the possible value for a specified user query versity, our travel recommendation algorithm (?: 3.11, : because of the costs of the connection. 0.91) is rated somewhat lower than the two comparative al- In this paper, we present a travel recommendation algo- gorithms. In this category, the baseline algorithm is ranked rithm for composite trips that addresses the Oregon Trail first place (?: 3.57, : 0.84). The top-k algorithm is ranked Knapsack Problem and applies the problem to the travel second in every category. The results show that in 3 out domain. The recommendations generated by our algorithm of 4 categories, our extensions lead to significant improve- satisfies an expert user more than comparative algorithms. ments. Some improvements can also be observed when ap- It is able to combine regions and can determine the opti- plying the top-k algorithm over the baseline, but the more mal duration of stay per region. We showed that the rec- complex knapsack based travel recommendation algorithm ommended regions fit together and with the availability of performs better than both. connection costs between regions, a decent routing can be The expert study delivers further insights into our travel ensured. A high maximum duration of stay allows choos- recommendation algorithm. 29 queries asked for trips with a ing among a higher number of regions and this improves the maximum duration of less or equal than six weeks, 27 queries selection of regions and allows better routing. A low max- 46 imum budget avoids high connection costs and thus also [8] E. H.-C. Lu, C.-Y. Lin, and V. S. Tseng. Trip-Mine: promises better routing. In terms of diversity, our algo- An Efficient Trip Planning Approach with Travel rithm performs below average. We integrated mechanisms Time Constraints. In 2011 IEEE 12th International which penalize long stays in the same region. Nevertheless, Conference on Mobile Data Management, pages the recommended trips o↵er less diversity than the baseline 152–161. Ieee, June 2011. algorithms. Further e↵ort is necessary in order to under- [9] F. Ricci, D. Cavada, N. Mirzadeh, and A. Venturini. stand the preferences of potential users and to find out how Case-based travel recommendations. In Destination a better diversity can be ensured without recommending re- recommendation systems: behavioural foundations and gions which fit less together. One possibility is extending applications., pages 67–93. CABI, 2006. the penalty function in our algorithm. [10] F. Ricci, D. R. Fesenmeier, N. Mirzadeh, Restrictions are useful for users when they want to ignore H. Rumetshofer, E. Schaumlechner, A. Venturini, specific regions in the recommendation process. Reducing K. W. Wöber, and A. H. Zins. DieToRecs: a the number of regions for the recommendation process im- Case-based Travel Advisory System. In Destination proves the algorithm’s runtime but also reduces the qual- recommendation systems: behavioural foundations and ity of recommendations. Regional constraints made it more applications., pages 227–239. CABI, 2006. difficult to find regions which match the user’s preferences. [11] A. Savir, R. Brafman, and G. Shani. Recommending Moreover, a limited choice of regions makes it more difficult improved configurations for complex objects with an to find regions which fit together and allow a decent routing. application in travel planning. In Proceedings of the Future work is to extend the system by recommending 7th ACM conference on Recommender systems - routes or concrete itineraries in addition to travel regions. RecSys ’13, pages 391–394, New York, New York, In addition, an improved version of the algorithm could also USA, 2013. ACM Press. take possible individual activities of travelers such as hiking [12] M. Schumacher and J.-P. Rey. Recommender systems or shopping into account. In this case, di↵erent routes with for dynamic packaging of tourism services. In R. Law, similar activities could result in additional deduction of the M. Fuchs, and F. Ricci, editors, ENTER, pages 13–23. rating and thus reducing the value of this trip. Furthermore, Springer Vienna, 2011. we plan to conduct a more extensive user study with poten- [13] B. Smyth. Case-based recommendation. The adaptive tial users in order to test the recommendations of the travel web, 4321:342–376, 2007. recommendation algorithm for composite trips. [14] W. Sou↵riau, P. Vansteenwegen, J. Vertommen, G. V. Berghe, and D. V. Oudheusden. a Personalized 7. REFERENCES Tourist Trip Design Algorithm for Mobile Tourist [1] A. Angel, S. Chaudhuri, G. Das, and N. Koudas. Guides. Applied Artificial Intelligence, 22(10):964–985, Ranking Objects Based on Relationships and Fixed Oct. 2008. Associations. In EDBT ’09 Proceedings of the 12th [15] C. Tan, Q. Liu, E. Chen, H. Xiong, and X. Wu. International Conference on Extending Database Object-oriented Travel Package Recommendation. Technology: Advances in Database Technology, pages ACM Transactions on Intelligent Systems and 910–921, 2009. Technology, 5(3):26, 2014. [2] J. J. Burg, J. Ainsworth, B. Casto, and S.-D. Lang. [16] P. Vansteenwegen and D. V. Oudheusden. The mobile Experiments with the ”Oregon Trail Knapsack tourist guide: an OR opportunity. OR Insights, Problem”. Electronic Notes in Discrete Mathematics, 20(3):21–27, 2007. 1:26–35, 1999. [17] M. Xie, L. V. Lakshmanan, and P. T. Wood. Breaking [3] J.-H. Chen, K.-M. Chao, and N. Shah. Hybrid out of the box of recommendations: from items to Recommendation System for Tourism. In 2013 IEEE packages. In Proceedings of the fourth ACM conference 10th International Conference on e-Business on Recommender systems - RecSys ’10, pages 151–158. Engineering (ICEBE), pages 156–161. Ieee, Sept. 2013. ACM Press, 2010. [4] M. D. Choudhury, M. Feldman, S. Amer-Yahia, N. Golbandi, R. Lempel, and C. Yu. Automatic construction of travel itineraries using social breadcrumbs. In HT ’10 Proceedings of the 21st ACM conference on Hypertext and hypermedia, pages 35–44, 2010. [5] Y. Ge, Q. Liu, H. Xiong, A. Tuzhilin, and J. Chen. Cost-aware travel tour recommendation. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’11, pages 983–991, New York, New York, USA, 2011. ACM Press. [6] H. Kellerer, U. Pferschy, and D. Pisinger. Knapsack Problems. Springer, 2004. [7] Q. Liu, Y. Ge, Z. Li, E. Chen, and H. Xiong. Personalized Travel Package Recommendation. In 2011 IEEE 11th International Conference on Data Mining, pages 407–416. Ieee, Dec. 2011. 47