Next-Basket Recommendation Constrained by Total Cost Brainstorming session report Oleg Lashinin1 , Marina Ananyeva1,2 1 Tinkoff, Moscow, Russia 2 National Research University Higher School of Economics, Moscow, Russia Abstract The next-basket recommendations are a well-studied problem in the academic field. As a rule, we want to predict top-K goods that might be added to the last known basket of the user based on the previous purchase history. However, the existing studies do not take into account that each user can be limited by the amount of money. Therefore, clients of online websites might be interested in the opportunity to receive personal recommendations with a given fixed cost. For example, we can take the average spending over the history of customer’s purchases as an upper bound. In this article, we consider addressing the problem of making next-basket recommendations with set total cost restrictions. Keywords next-basket recommendation, constrained recommendations 1. Motivation by using filters or other UI tools and obtaining refreshed recommendations with a total sum up to the given con- The majority of online platforms and e-commerce stores straint. Otherwise, the need to use constrained recom- feature recommender systems that consider users’ previ- mendations is evident for most online platforms even ous purchases and try to predict their future purchases without updating the user interfaces. For example, the by solving the next-basket recommendation task. e-commerce site might recalculate the lists of recommen- The evaluation process of the next-basket recommen- dations for different online shopping cart states in real dations implies that the considered method generates time. Once a client adds or deletes any good in a basket, recommendations of some predefined length 𝐾. Lists of we can update the block with recommendations under recommendations are compared to the last basket in a the basket taking into account the updated total sum. test sample by computing the ranking metrics, e.g. such Depending on the business demands, the total cost can as π‘…π‘’π‘π‘Žπ‘™π‘™@𝐾 and 𝑁 𝐷𝐢𝐺@𝐾. These lists have a fixed be formalized in a task as the maximum sum of the user’s length as far as e-commerce marketplaces offer millions purchase or as the total cost of recommendations only. of items, which makes it almost impossible to view all of In this extended abstract, we propose to use a new ap- them. Moreover, most user interfaces, especially mobile proach that helps users to view affordable recommended applications, have very limited space to display item lists. goods and e-commerce platforms to use the available Thus, it is crucial to maximize both the high precision slots for recommendations wisely, maximizing the prob- and ranking of recommendations with a fixed 𝐾 length. ability of purchases. Meanwhile, we could take into consideration another restriction for the next-basket formalization task. Each user usually has a certain amount of money that he or 2. Proposed approach she can afford to spend at the moment of purchase. The following might be one of the possible user inter- We formulate the problem of constrained next-basket faces for personal recommendations with total cost con- recommendations as follows. Let there be π‘ˆ - a set of straints. If the user specifically sets this limit, we provide users and 𝐼 - a set of items. Each user 𝑒 has a list of a list with recommended items of some arbitrary length, predicted relevance scores for items π‘Ÿπ‘’,𝑖 = {π‘Ÿπ‘’,1 , … , π‘Ÿπ‘’,|𝐼 | }. not exceeding the indicated total cost. Furthermore, the It is implied, that any recommender model can be used client may specify such constraints in a real-time session to obtain these scores, which is beyond our scope. Also, we obtain a list of items prices 𝑝𝑖 = {𝑝1 , … , 𝑝|𝐼 | }. For each user, we have a true basket with some purchased items ORSUM@ACM RecSys 2022: 5th Workshop on Online Recommender Systems and User Modeling, jointly with the 16th ACM Conference on in the last transaction 𝑏𝑒,𝑗 = {𝑏𝑒,1 , … , 𝑏𝑒,|𝐽 | } where |𝐡| is a Recommender Systems, September 23rd, 2022, Seattle, WA, USA number of goods in the last purchased basket. The aim is Envelope-Open o.a.lashinin@tinkoff.ru (O. Lashinin); m.ananyeva@tinkoff.ai to predict items in 𝑏𝑒,𝑗 with a 𝐾-length recommendations (M. Ananyeva) π‘Ÿπ‘’,π‘˜ Μ‚ = {π‘Ÿπ‘’,1 Μ‚ , … , π‘Ÿπ‘’,𝐾 Μ‚ }. 𝑇 𝐢𝑒 is a total cost for a user 𝑒. We Orcid 1234-5678-9012 (M. Ananyeva) Β© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License generate the list based on the restrictions for each user: Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) Figure 1: Results of experiments on the Ta-Feng dataset. From left to right, the first histogram demonstrates the average rate of the used total cost in recommended baskets. The second histogram presents the distribution of the number of recommended items. The line graph presents Recall@k w.r.t. different k values both for the classical NBR problem and the proposed next-basket with constraints task. 𝐾 3. Experiments βˆ‘ π‘Ÿπ‘’,𝑖 β†’ π‘šπ‘Žπ‘₯ (1) Dataset. We conducte preliminary experiments on the 𝑖=1 [1, 2] Ta-Feng dataset1 , which is extensively used in rec- 𝐾 ommender systems. βˆ‘ 𝑝𝑒,𝑖 < 𝑇 𝐢𝑒 (2) It includes purchase history of goods with given pric- 𝑖=1 ing. Users in the Ta-Feng dataset can purchase different However, a few remarks are required here. Usually, we quantities of the same goods. Nonetheless, we will con- have no information on how many items |𝐡| a customer sider proposing a product based on the dataset’s average will add to the next cart. To overcome this problem, we price. Furthermore, we assume that the recommenda- can generate the lists π‘Ÿπ‘’,π‘˜ Μ‚ for different 𝐾 values and select tions cannot contain duplicates, which means that each those that maximize the equation (1). In our implemen- item can only be suggested once and in a single occur- tation, we generate 𝐾-length vectors with recommenda- rence. We employ the same data preparation strategy as tions for each user and drop extra items Alternatively, in TIFU-KNN [1] to evaluate our approach, and we use a we can try to predict the number of items |𝐡| and make leave-one-basket [3] scenario. the lists based on the predicted values. Metrics. We compute the ranking metrics (Recall@k, Also, the 𝑇 𝐢𝑒 parameter can be varied by the user in NDCG@k) that are commonly used in the next-basket real-time. The user can change the thresholds of the total recommendation evaluation [1, 4, 2]. cost and select the best option with personal recommen- Experiment Settings. dations. In offline evaluation settings, the 𝑇 𝐢𝑒 value can be We use a genetic algorithm in order to find the optimal chosen by the researchers. For instance, we could predict π‘Ÿπ‘’,π‘˜ Μ‚ . The generated populations of multi-hot vectors with the total cost of the last known basket for each user or 𝑁 size from the most relevant items π‘Ÿπ‘’,𝑖 = {π‘Ÿπ‘’,1 , … , π‘Ÿπ‘’,𝑁 } vary the values of 𝑇 𝐢𝑒 across some range on a dataset. for each user. The 𝑖-th component of a generated vec- In particular, it is possible to calculate the average basket tor (gene of a chromosome) is supposed to be equal to cost from the training set for each user and take this 1 and the corresponding 𝑖-th element of the recommen- aggregated statistic as a constraint. In our case, 𝑇 𝐢𝑒 dations list will be included to the final predictions. If equals the true cost of the last basket βˆ‘ 𝑝𝑔𝑒,𝑗 of each the condition (2) is not met, the fitness function is equal customer. It approximates the case when the customer to negative infinity or βˆ‘ π‘Ÿπ‘’,π‘˜ otherwise. Therefore, the knows exactly how much he is can afford or is willing to genetic algorithm will maximize (1) with the respect to spend. the condition (2). We leave other possible approaches to We adopt the genetic algorithm, using the implemen- finding the optimal solution for future work. tation from the open-source framework geneticalgo- rithm22 . For solving this task on the Ta-Feng dataset, we use only the 20 most relevant items and population 1 https://www.kaggle.com/datasets/chiranjivdas09/ta-feng-grocery- dataset 2 https://github.com/PasaOpasen/geneticalgorithm2 size of 40 vectors. PersonTopFreq approach is applied reward, and the conditions for buying the entire bundle as a recommender model that returns the most popular or only certain items from it are open. In addition, the items from the baskets of all users, demonstrating the different cases which allow a user to add or delete an high performance in TIFU-KNN [1]. Also, we use nega- item allow the platforms to make a user-friendly inter- tive ranks of items as π‘Ÿπ‘’,𝑖 because negativity is needed for face with updates in recommendations in real time. In maximizing the 𝑇 𝐢𝑒 of items included in the final predic- particular, after each customer’s action with the basket, tions. Application of other recommendation approaches the recommended bundle of items can be recalculated for the next basket task is left for future work. entirely or iteratively, by replacing each added or deleted Performance Comparison. Figure 1 (a) shows the item with a new item of the same price in order not average rate of the used total cost, which is quite close to exceed the total price. Finally, we could use bundle to the cost of the basket from the testing sample. In this recommendations in the offline experiments as baseline graph, all predicted baskets with a higher total cost are solutions or adapt them to this formalized task. neglected. The costs of most collected baskets are close Approaches. Instead of genetic algorithms, we could to 𝑇 𝐢𝑒 and satisfy the user’s needs. also try to adopt a graph-based A* algorithm, integer pro- Figure 1 (b) demonstrates the number of goods that gramming, or knapsack problem with dynamic program- were included in the baskets, which is close to the Gaus- ming techniques to address the problem of generating sian distribution. next-basket recommendations with a total cost constraint. Figure 1 (c) compares the values of Recall@k w.r.t. the Still, some open issues should be discussed. For instance, different π‘˜ values for two cases. The results when we should the sum of relevance scores be equal to the total recommend only top-k relevant items are in green. The relevance score and be constrained by the lower bound Recall@k values with top-k items with the total cost re- of relevance or not? What are the reasonable restrictions strictions from the 20 most relevant items are in blue. for relevance scores considering their different ranges There is no extreme difference between the two meth- of values, negative and positive scores from the recom- ods observed, but our recommendations fit additional mender algorithms? These are only a few examples of restrictions in the TCANBR scenario. open questions. Total Cost values. In this paper, we suggested a way to set a total cost value. However, we could think of 4. Discussion predicting the total cost for each user, for example, using linear regression or other machine learning models. In During the discussion panel of this brainstorming ides, user cases, when the customer updates the maximum that took place at the 5th Workshop on Online Recom- affordable cost, we can take this value in real-time and use mender Systems and User Modeling (Seattle, WA, USA, it on the inference step. Otherwise, we could also think of 2022)3 , we can summarize the open issues and several using the segments of customers instead of personalized future directions for future work. values of the total cost. For example, the customers with Additional constraints. Taking into account the for- low, medium, and high income and three different values malization problem mentioned above, we could think of for this constraint correspondingly. additional restrictions that can be applied to bring the so- Item quantity. One of the problems, which we ne- lution closer to real conditions. Firstly, we could extend glect for the simplicity, is the challenge to take into ac- to the lower and upper bounds of the price for recom- count the different quantities for each item in the lists of mended items. It can either be set as hyperparameters recommendations. In case we want to add this additional or as extra constraints in our task. It would help to solve restriction to the task, we should think carefully about a problem of making the list of recommendations only how to include this constraint and optimize the function. with cheap or expensive items. Secondly, we could extend We also should think about whether we allow duplicated the task to introducing the profit margins, which both items. For example, the milk, but of different brands and customers and enterprises pursue considering business prices. In addition, some items could be bought always metrics. in single pieces, but some others are taken in multiple Bundle recommendations. Considering the fact that pieces or are the goods sold by weight. In our case, we we want to generate a list of recommended items, we implied no duplicates and only 1 allowed piece of each might think about it as bundle recommendations. In par- item for simplicity. ticular, it could enhance the diversity or complementarity The list of problems mentioned above is not exhaustive of predicted items. Thus, the approaches for bundle rec- and can be continued and developed into comprehensive ommendations could also bee used to solve the task of versions of formalization task and be addressed by differ- constrained recommendations with total cost. However, ent approaches. the questions on how to evaluate the bundle, attribute the 3 https://orsum.inesctec.pt/orsum2022/ 5. Conclusion recommendation, in: Proceedings of the 43rd In- ternational ACM SIGIR Conference on Research In this article, we briefly introduced a formalization prob- and Development in Information Retrieval, 2020, pp. lem of the next-basket constrained recommendations 1071–1080. problem. The tested approach simulates a case when a [2] G. Chen, Z. Li, A new method combining pattern customer of an online marketplace can spend a particu- prediction and preference prediction for next basket lar upper bound of the amount of money. Alternatively, recommendation, Entropy 23 (2021) 1430. some online services can fit the recommendations into [3] Z. Meng, R. McCreadie, C. Macdonald, I. Ounis, Ex- a suggested budget for each user. This problem task ploring data splitting strategies for the evaluation of suggests generating item recommendations that do not recommendation models, in: Fourteenth ACM con- exceed the specified maximum total cost. ference on recommender systems, 2020, pp. 681–686. [4] M. Ariannezhad, S. Jullien, M. Li, M. Fang, S. Schelter, M. de Rijke, Recanet: A repeat consumption-aware References neural network for next basket recommendation in [1] H. Hu, X. He, J. Gao, Z.-L. Zhang, Modeling person- grocery shopping (2022). alized item frequency information for next-basket