=Paper= {{Paper |id=Vol-1441/recsys2015_poster19 |storemode=property |title=Recommender Systems for Product Bundling |pdfUrl=https://ceur-ws.org/Vol-1441/recsys2015_poster19.pdf |volume=Vol-1441 |dblpUrl=https://dblp.org/rec/conf/recsys/BeladevSR15 }} ==Recommender Systems for Product Bundling== https://ceur-ws.org/Vol-1441/recsys2015_poster19.pdf

Recommender Systems for Product Bundling
Moran Beladev, Ben-Gurion University of the Negev, belachde@bgu.ac.il
Bracha Shapira, Ben-Gurion University of the Negev, bshapira@bgu.ac.il
Lior Rokach, Ben-Gurion University of the Negev, liorrk@bgu.ac.il

Copyright is held by the author(s). RecSys 2015 Poster Proceedings, September 16-20, 2015, Vienna, Austria.

ABSTRACT
Recommender systems (RSs) enhance e-commerce sales by recommending relevant products to their customers. RSs aim at implementing the firm's web-based marketing strategy to increase revenues. Generating bundles is an example of a marketing strategy that aims to satisfy consumer needs and preferences and, at the same time, to increase customers' buying scope and the firm's income. Thus, finding and recommending an optimal and personal bundle becomes very important. In this paper we introduce a novel model of bundle recommendations that integrates collaborative filtering (CF) techniques, personalized demand functions, and price modeling. This model provides a recommendation list by finding pairs of products that maximize both the probability of their purchase by the user and the revenue received by selling these bundles.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - Information Filtering.

Keywords
Bundle Recommendation, Recommender Systems, E-Commerce, Collaborative Filtering, SVD.

1. INTRODUCTION
Bundling refers to the practice of selling two or more items together as a package at a price that is below the sum of the independent prices. Optimal bundling would combine items into bundles that best fit the retailer's needs and the user's preferences, and maximize product compatibility within the bundle. Thus, a single price $P_{A+B} < P_A + P_B$ is set for the two products (A, B) if purchased jointly. One challenge is to suggest a price for a bundle that fits both the customer's reservation price, i.e., the maximal price buyers are willing to pay, and the retailer's revenue [1].

Very few studies have combined a bundling strategy with recommender systems (RSs). The field of frequent item set mining and association rules deals with finding a basket of items that are frequently bought together [2]. However, these techniques are not personalized and thus not applicable to RSs. The recommendation of bundles was presented as a tailored solution for the tourism domain using case-based reasoning, where case models representing the travel plan bundle were matched against the user profile and preferences [3]. The authors of [4] presented a bundle optimization using a genetic algorithm to maximize the compatibility of the products within a bundle. However, these studies did not measure the recommendation aspect, i.e., whether it is at all feasible and beneficial to predict bundle purchasing. The study presented in [5] introduces a bundle recommendation problem whose solution is a set of items that maximizes some total expected reward. However, the price aspect was not considered in that model. Our paper maximizes the expected revenue by considering item-to-item cross dependencies, user-item collaborative filtering techniques and the demand-price function, resulting in a recommendation of the best bundle and price proposal for the user.

2. BUNDLE RECOMMENDATION MODEL
We maximize the following retailer expected revenue function:

(1) $\mathit{ExpectedRevenue} = P_i(A, B, T) \cdot (T - \mathit{cost}_A - \mathit{cost}_B)$

where $P_i(A, B, T)$ is the probability that user i will purchase the bundle, which is composed of products A and B, at price T. Here $\mathit{cost}_A$ is the retailer's cost for product A and $\mathit{cost}_B$ is the retailer's cost for product B. The proposed bundle and the price T for user i are set to maximize the expected revenue:

(2) $(A, B, T) = \operatorname{argmax}_{\forall A, B, T} \mathit{ExpectedRevenue}(A, B, T)$

In order to find $P_i(A, B, T)$ we find the corresponding prices $C_A$ of product A and $C_B$ of product B that sum to the bundle price T:

(3) $P_i(A, B, T) = \max_{\forall C_A, C_B \mid C_A + C_B = T} P_i(A \cap B \cap C_A \cap C_B)$

Thus, we have to find the prices $C_A$ and $C_B$ that maximize the probability that user i buys products A and B while paying those prices. According to Bayes' law:

(4) $P_i(\text{like } A \cap \text{willing to pay } C_A \text{ for } A) = P_i(\text{like } A) \cdot P_i(\text{willing to pay } C_A \text{ for } A \mid \text{like } A)$

According to the Jaccard measure:

(5) $J_{A,B} = \frac{P(A \cap B)}{P(A \cup B)}$

Using the inclusion-exclusion principle from combinatorial mathematics:

(6) $P(A \cap B) = P(A) + P(B) - P(A \cup B)$

Substituting $P(A \cup B) = P(A \cap B) / J_{A,B}$ from equation (5) into equation (6) and solving for $P(A \cap B)$ gives:

(7) $P(A \cap B) = \frac{P(A) + P(B)}{1 + \frac{1}{J_{A,B}}}$

Using Bayes' law and equation (7):

(8) $P_i((A \cap C_A) \cap (B \cap C_B)) = \frac{P_i(A) \cdot P_i(C_A \mid A) + P_i(B) \cdot P_i(C_B \mid B)}{1 + \frac{1}{J_{A,B}}}$

We assume that the Jaccard measure $J_{A,B}$, which denotes the products' compatibility, is not affected by the price. The probabilities $P_i(A)$ and $P_i(B)$ are found using the CF technique; $P_i(C_A \mid A)$ and $P_i(C_B \mid B)$ are found from the personalized demand graph described next.

2.1 Personalized Demand Graph
We assume that each customer has his/her own demand graph for each product, based on his/her preferences. Thus, we developed heuristics for estimating the "personalized" demand graph for user i and item j using very sparse data. Figure 1 demonstrates the demand of a generic customer versus an enthusiastic one (i.e., one that would pay high prices) as well as an indifferent one. We assume that the difference between the demand graphs can be reflected by the following:

(9) $P_i(C_A \mid A) = \min(P_{\cdot,A}(C_A) \times \alpha_{i,A},\ 100\%)$
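Equations (1) to (3), (8) and (9) can be combined into a short Python sketch. It is only an illustration under stated assumptions: the paper does not say how the argmax in equation (2) is computed, so an exhaustive search over item pairs and a small price grid is assumed here. All names (`like`, `generic_demand`, `alpha`, `jaccard`, `cost`, `price_grid`) are hypothetical inputs standing in for the CF scores, the generic demand graphs, the bias factors, the Jaccard measures, the retailer costs and the candidate prices.

```python
from itertools import combinations


def personalized_acceptance(generic_prob, alpha):
    """Equation (9): generic demand probability scaled by the user's bias factor, capped at 1."""
    return min(generic_prob * alpha, 1.0)


def bundle_probability(p_a, p_b, accept_a, accept_b, jaccard_ab):
    """Equation (8): probability that the user buys A at C_A and B at C_B."""
    if jaccard_ab <= 0.0:
        # Assumption: items that never co-occur get zero joint probability.
        return 0.0
    return (p_a * accept_a + p_b * accept_b) / (1.0 + 1.0 / jaccard_ab)


def best_bundle(items, like, generic_demand, alpha, jaccard, cost, price_grid):
    """Equations (1)-(3): search all pairs and price splits for the highest expected revenue.

    For a fixed total T the margin is constant, so maximizing revenue over
    (C_A, C_B) directly is equivalent to taking the max in equation (3) first.
    Returns (revenue, a, b, total_price, c_a, c_b), or None if there are no pairs.
    """
    best = None
    for a, b in combinations(items, 2):
        j_ab = jaccard.get(frozenset((a, b)), 0.0)
        for c_a in price_grid[a]:
            accept_a = personalized_acceptance(generic_demand[a](c_a), alpha[a])
            for c_b in price_grid[b]:
                accept_b = personalized_acceptance(generic_demand[b](c_b), alpha[b])
                prob = bundle_probability(like[a], like[b], accept_a, accept_b, j_ab)
                total = c_a + c_b
                revenue = prob * (total - cost[a] - cost[b])  # equation (1)
                if best is None or revenue > best[0]:
                    best = (revenue, a, b, total, c_a, c_b)
    return best
```

Keying `jaccard` by `frozenset` keeps the measure symmetric in A and B. To obtain the paper's second strategy's counterpart, maximizing purchase probability rather than revenue, the comparison could use `prob` instead of `revenue`.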
In equation (9), $P_{\cdot,A}(C_A)$ is the generic demand graph for item A evaluated at price $C_A$, and $\alpha_{i,A}$ is the personalized bias factor for user i and item A. In order to find the personal bias factor $\alpha_{i,A}$, we scan each customer's previous purchases or his/her highest bid on an item. We compare his/her price to the median of the generic graph. If customer i purchased item A for price $C_A^*$, then his/her bias factor is estimated as:

(10) $\alpha_{i,A} = \frac{0.5}{P_{\cdot,A}(C_A^*)}$

For example (Figure 1), assume that a customer purchased the item for $C_A^* = 1300$; according to the generic graph, this price would be accepted by only 35% of the interested population. Thus the personalized bias for this user is calculated as $\alpha_{i,A} = \frac{0.5}{0.35} \approx 1.43$.

Figure 1. Personalized demand for various customer types

We can create a bias matrix for all purchases of items by users. The bias factor of products that have not been purchased by the customer can be predicted using the SVD method. Given the complete matrix, we can infer the personalized demand graph of each user i and item A by taking the generic demand graph calculated for item A and multiplying it by the predicted alpha.

3. EVALUATION
Our model was evaluated on two datasets. Dataset 1 consists of transactions from a shopping website that sells electronics and furniture. Dataset 2 is a supermarket dataset from Kaggle (https://www.kaggle.com/c/acquire-valued-shoppers-challenge). We used offline evaluation and compared our model to SVD and CF as baseline models. We evaluated:
(i) The personal demand function, by using a validation set to test the predicted alphas against the actual alphas with the RMSE measure, and by comparing the personal demand graph probability to 0.5 (the median probability) for all purchased products in the test set, also with the RMSE measure;
(ii) The product bundling recommendation, by comparing the top 5 bundles to the top 5 items recommended by the CF and SVD algorithms. For this we used precision, recall, the average quantity that was recommended and purchased, and the average price paid for the recommended and purchased products;
(iii) The price bundling recommendation, by comparing the recommended price to the actual price the user paid in the test set, measuring the sum of the absolute differences. The recommended price was compared to the mean price of the product.

We also compared two strategies: (1) maximizing the bundle buying probability of the user, and (2) maximizing the expected revenue. For both datasets we evaluated our model on the top 1,000 customers and top 300 products. The first dataset resulted in 3,425 transactions and the second in 836,846 transactions. A recommended bundle is considered a hit in the test set if the two products were purchased by the user within a week.

For dataset 1, the personal demand graph gave an RMSE of 0.072 for alpha and an RMSE of 0.261 compared to the median. Thus, the personal graphs are consistent with the users' preferences. For dataset 2, the personal demand graph gave an RMSE of 1.067 for alpha and an RMSE of 0.34 compared to the median. The results for the product evaluation are presented in Tables 1 and 3, and the results for the price evaluation are presented in Tables 2 and 4. Bundle (1) and Bundle (2) represent the two strategies of maximizing the probability and maximizing the expected revenue, respectively. In Tables 1 and 3, Q is the average quantity that was recommended and purchased, and Price is the average price paid for the recommended and purchased products.

Table 1. Product bundling results for dataset 1
            Precision  Recall  Q      Price
CF          0.027      0.012   0.133  12.133
SVD         0.013      0.033   0.067  70.533
Bundle (1)  0.088      0.09    0.8    469.133
Bundle (2)  0.071      0.08    0.6    457.467

Table 2. Price bundling recommendation results for dataset 1
            Recommended price  Mean price
Bundle (1)  0.043              788.89
Bundle (2)  29.989             788.89

Table 3. Product bundling results for dataset 2
            Precision  Recall  Q      Price
CF          0.052      0.003   0.26   1.852
SVD         0.58       0.024   2.9    29.981
Bundle (1)  0.728      0.018   4.44   38.069
Bundle (2)  0.2        0.004   1.02   12.882

Table 4. Price bundling recommendation results for dataset 2
            Recommended price  Mean price
Bundle (1)  170.8              87.1
Bundle (2)  117.44             104.057

4. CONCLUSIONS AND FUTURE WORK
Our results demonstrate that bundles are predictable and may increase users' purchase scope. The first dataset is more difficult to predict, but the bundle model is at least comparable to state-of-the-art algorithms and is even superior in some cases. The personal demand graph tends to be very accurate, as observed in the price recommendation accuracy. The second dataset contains commodities data, so a personal demand graph is more difficult to predict, and the recommended price was not as accurate as in the first dataset. However, for dataset 2 the products are more predictable and the first bundle strategy yields the best results. For both datasets, maximizing the probability of the user's purchase is more effective than maximizing the expected revenue. Future work will aim at improving the personal demand graph for dataset 2, examining more datasets and conducting live user experiments.

5. REFERENCES
[1] Guiltinan, J. 1987. The price bundling of services: a normative framework. The Journal of Marketing, 74-85.
[2] Agrawal, R., Imielinski, T., and Swami, A.N. 1993. Mining association rules between sets of items in large databases. ACM SIGMOD, volume 22,2 of SIGMOD Record, 207-216.
[3] Ricci, F. 2002. ITR: a case-based travel advisory system. Advances in Case-Based Reasoning. Springer Berlin Heidelberg, 613-627.
[4] Birtolo, C. 2013. Searching optimal product bundles by means of GA-based Engine and Market Basket Analysis. IFSA
[5] Zhu, T. 2014. Bundle recommendation in ecommerce. SIGIR 2014, 657-666.
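
As a closing illustration of Section 2.1, the following Python sketch works through equations (9) and (10) with the numbers from the text: a purchase at 1300, where the generic graph gives 35%. It is a minimal sketch, not the authors' implementation. The function names are hypothetical, and every point on `generic_curve` other than (1300, 0.35) is made up for illustration.

```python
def bias_factor(generic_prob_at_paid_price):
    """Equation (10): alpha = 0.5 / P_{.,A}(C_A*), the median over the generic probability."""
    if generic_prob_at_paid_price <= 0.0:
        raise ValueError("generic probability at the paid price must be positive")
    return 0.5 / generic_prob_at_paid_price


def personalized_demand(generic_curve, alpha):
    """Equation (9) applied point-wise to a generic curve given as {price: probability}."""
    return {price: min(prob * alpha, 1.0) for price, prob in generic_curve.items()}


# Hypothetical generic demand graph; only the point (1300, 0.35) comes from the text.
generic_curve = {700: 0.8, 1000: 0.5, 1300: 0.35, 1600: 0.2}

alpha = bias_factor(generic_curve[1300])  # 0.5 / 0.35, about 1.43
personal_curve = personalized_demand(generic_curve, alpha)
```

With this alpha the user's own purchase price of 1300 maps to a probability of about 0.5 on the personalized curve, which is the intent of comparing the paid price to the median, while the cap in equation (9) keeps low prices from exceeding 100%.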