Next-Basket Recommendation Constrained by Total Cost
Brainstorming session report

Oleg Lashinin1 , Marina Ananyeva1,2
1
    Tinkoff, Moscow, Russia
2
    National Research University Higher School of Economics, Moscow, Russia


                                             Abstract
                                             The next-basket recommendations are a well-studied problem in the academic field. As a rule, we want to predict top-K
                                             goods that might be added to the last known basket of the user based on the previous purchase history. However, the existing
                                             studies do not take into account that each user can be limited by the amount of money. Therefore, clients of online websites
                                             might be interested in the opportunity to receive personal recommendations with a given fixed cost. For example, we can
                                             take the average spending over the history of customer’s purchases as an upper bound. In this article, we consider addressing
                                             the problem of making next-basket recommendations with set total cost restrictions.

                                             Keywords
                                             next-basket recommendation, constrained recommendations


1. Motivation                                                                                                     by using filters or other UI tools and obtaining refreshed
                                                                                                                  recommendations with a total sum up to the given con-
The majority of online platforms and e-commerce stores straint. Otherwise, the need to use constrained recom-
feature recommender systems that consider users’ previ- mendations is evident for most online platforms even
ous purchases and try to predict their future purchases without updating the user interfaces. For example, the
by solving the next-basket recommendation task.                                                                   e-commerce site might recalculate the lists of recommen-
              The evaluation process of the next-basket recommen- dations for different online shopping cart states in real
dations implies that the considered method generates time. Once a client adds or deletes any good in a basket,
recommendations of some predefined length 𝐾. Lists of we can update the block with recommendations under
recommendations are compared to the last basket in a the basket taking into account the updated total sum.
test sample by computing the ranking metrics, e.g. such Depending on the business demands, the total cost can
as 𝑅𝑒𝑐𝑎𝑙𝑙@𝐾 and 𝑁 𝐷𝐶𝐺@𝐾. These lists have a fixed be formalized in a task as the maximum sum of the user’s
length as far as e-commerce marketplaces offer millions purchase or as the total cost of recommendations only.
of items, which makes it almost impossible to view all of                                                            In this extended abstract, we propose to use a new ap-
them. Moreover, most user interfaces, especially mobile proach that helps users to view affordable recommended
applications, have very limited space to display item lists. goods and e-commerce platforms to use the available
Thus, it is crucial to maximize both the high precision slots for recommendations wisely, maximizing the prob-
and ranking of recommendations with a fixed 𝐾 length. ability of purchases.
              Meanwhile, we could take into consideration another
restriction for the next-basket formalization task. Each
user usually has a certain amount of money that he or 2. Proposed approach
she can afford to spend at the moment of purchase.
              The following might be one of the possible user inter- We formulate the problem of constrained next-basket
faces for personal recommendations with total cost con- recommendations as follows. Let there be 𝑈 - a set of
straints. If the user specifically sets this limit, we provide users and 𝐼 - a set of items. Each user 𝑢 has a list of
a list with recommended items of some arbitrary length, predicted relevance scores for items 𝑟𝑢,𝑖 = {𝑟𝑢,1 , … , 𝑟𝑢,|𝐼 | }.
not exceeding the indicated total cost. Furthermore, the It is implied, that any recommender model can be used
client may specify such constraints in a real-time session to obtain these scores, which is beyond our scope. Also,
                                                                                                                  we obtain a list of items prices 𝑝𝑖 = {𝑝1 , … , 𝑝|𝐼 | }. For each
                                                                                                                  user, we have a true basket with some purchased items
ORSUM@ACM RecSys 2022: 5th Workshop on Online Recommender
Systems and User Modeling, jointly with the 16th ACM Conference on in the last transaction 𝑏𝑢,𝑗 = {𝑏𝑢,1 , … , 𝑏𝑢,|𝐽 | } where |𝐵| is a
Recommender Systems, September 23rd, 2022, Seattle, WA, USA                                                       number of goods in the last purchased basket. The aim is
Envelope-Open o.a.lashinin@tinkoff.ru (O. Lashinin); m.ananyeva@tinkoff.ai                                        to predict items in 𝑏𝑢,𝑗 with a 𝐾-length recommendations
(M. Ananyeva)                                                                                                     𝑟𝑢,𝑘
                                                                                                                   ̂ = {𝑟𝑢,1
                                                                                                                           ̂ , … , 𝑟𝑢,𝐾
                                                                                                                                    ̂ }. 𝑇 𝐶𝑢 is a total cost for a user 𝑢. We
Orcid 1234-5678-9012 (M. Ananyeva)
                     © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License generate the list based on the restrictions for each user:
                                       Attribution 4.0 International (CC BY 4.0).
    CEUR
    Workshop
    Proceedings
                  http://ceur-ws.org
                  ISSN 1613-0073
                                       CEUR Workshop Proceedings (CEUR-WS.org)
Figure 1: Results of experiments on the Ta-Feng dataset. From left to right, the first histogram demonstrates the average rate
of the used total cost in recommended baskets. The second histogram presents the distribution of the number of recommended
items. The line graph presents Recall@k w.r.t. different k values both for the classical NBR problem and the proposed
next-basket with constraints task.


                        𝐾
                                                                3. Experiments
                       ∑ 𝑟𝑢,𝑖 → 𝑚𝑎𝑥                       (1)   Dataset. We conducte preliminary experiments on the
                       𝑖=1
                                                                [1, 2] Ta-Feng dataset1 , which is extensively used in rec-
                        𝐾
                                                                ommender systems.
                       ∑ 𝑝𝑢,𝑖 < 𝑇 𝐶𝑢                      (2)      It includes purchase history of goods with given pric-
                       𝑖=1
                                                                ing. Users in the Ta-Feng dataset can purchase different
   However, a few remarks are required here. Usually, we        quantities of the same goods. Nonetheless, we will con-
have no information on how many items |𝐵| a customer            sider proposing a product based on the dataset’s average
will add to the next cart. To overcome this problem, we         price. Furthermore, we assume that the recommenda-
can generate the lists 𝑟𝑢,𝑘
                        ̂ for different 𝐾 values and select     tions cannot contain duplicates, which means that each
those that maximize the equation (1). In our implemen-          item can only be suggested once and in a single occur-
tation, we generate 𝐾-length vectors with recommenda-           rence. We employ the same data preparation strategy as
tions for each user and drop extra items Alternatively,         in TIFU-KNN [1] to evaluate our approach, and we use a
we can try to predict the number of items |𝐵| and make          leave-one-basket [3] scenario.
the lists based on the predicted values.                           Metrics. We compute the ranking metrics (Recall@k,
   Also, the 𝑇 𝐶𝑢 parameter can be varied by the user in        NDCG@k) that are commonly used in the next-basket
real-time. The user can change the thresholds of the total      recommendation evaluation [1, 4, 2].
cost and select the best option with personal recommen-            Experiment Settings.
dations.                                                           In offline evaluation settings, the 𝑇 𝐶𝑢 value can be
   We use a genetic algorithm in order to find the optimal      chosen by the researchers. For instance, we could predict
𝑟𝑢,𝑘
 ̂ . The generated populations of multi-hot vectors with        the total cost of the last known basket for each user or
𝑁 size from the most relevant items 𝑟𝑢,𝑖 = {𝑟𝑢,1 , … , 𝑟𝑢,𝑁 }   vary the values of 𝑇 𝐶𝑢 across some range on a dataset.
for each user. The 𝑖-th component of a generated vec-           In particular, it is possible to calculate the average basket
tor (gene of a chromosome) is supposed to be equal to           cost from the training set for each user and take this
1 and the corresponding 𝑖-th element of the recommen-           aggregated statistic as a constraint. In our case, 𝑇 𝐶𝑢
dations list will be included to the final predictions. If      equals the true cost of the last basket ∑ 𝑝𝑔𝑢,𝑗 of each
the condition (2) is not met, the fitness function is equal     customer. It approximates the case when the customer
to negative infinity or ∑ 𝑟𝑢,𝑘 otherwise. Therefore, the        knows exactly how much he is can afford or is willing to
genetic algorithm will maximize (1) with the respect to         spend.
the condition (2). We leave other possible approaches to           We adopt the genetic algorithm, using the implemen-
finding the optimal solution for future work.                   tation from the open-source framework geneticalgo-
                                                                rithm22 . For solving this task on the Ta-Feng dataset,
                                                                we use only the 20 most relevant items and population
                                                                1
                                                                    https://www.kaggle.com/datasets/chiranjivdas09/ta-feng-grocery-
                                                                    dataset
                                                                2
                                                                    https://github.com/PasaOpasen/geneticalgorithm2
size of 40 vectors. PersonTopFreq approach is applied          reward, and the conditions for buying the entire bundle
as a recommender model that returns the most popular           or only certain items from it are open. In addition, the
items from the baskets of all users, demonstrating the         different cases which allow a user to add or delete an
high performance in TIFU-KNN [1]. Also, we use nega-           item allow the platforms to make a user-friendly inter-
tive ranks of items as 𝑟𝑢,𝑖 because negativity is needed for   face with updates in recommendations in real time. In
maximizing the 𝑇 𝐶𝑢 of items included in the final predic-     particular, after each customer’s action with the basket,
tions. Application of other recommendation approaches          the recommended bundle of items can be recalculated
for the next basket task is left for future work.              entirely or iteratively, by replacing each added or deleted
   Performance Comparison. Figure 1 (a) shows the              item with a new item of the same price in order not
average rate of the used total cost, which is quite close      to exceed the total price. Finally, we could use bundle
to the cost of the basket from the testing sample. In this     recommendations in the offline experiments as baseline
graph, all predicted baskets with a higher total cost are      solutions or adapt them to this formalized task.
neglected. The costs of most collected baskets are close          Approaches. Instead of genetic algorithms, we could
to 𝑇 𝐶𝑢 and satisfy the user’s needs.                          also try to adopt a graph-based A* algorithm, integer pro-
   Figure 1 (b) demonstrates the number of goods that          gramming, or knapsack problem with dynamic program-
were included in the baskets, which is close to the Gaus-      ming techniques to address the problem of generating
sian distribution.                                             next-basket recommendations with a total cost constraint.
   Figure 1 (c) compares the values of Recall@k w.r.t. the     Still, some open issues should be discussed. For instance,
different 𝑘 values for two cases. The results when we          should the sum of relevance scores be equal to the total
recommend only top-k relevant items are in green. The          relevance score and be constrained by the lower bound
Recall@k values with top-k items with the total cost re-       of relevance or not? What are the reasonable restrictions
strictions from the 20 most relevant items are in blue.        for relevance scores considering their different ranges
There is no extreme difference between the two meth-           of values, negative and positive scores from the recom-
ods observed, but our recommendations fit additional           mender algorithms? These are only a few examples of
restrictions in the TCANBR scenario.                           open questions.
                                                                  Total Cost values. In this paper, we suggested a way
                                                               to set a total cost value. However, we could think of
4. Discussion                                                  predicting the total cost for each user, for example, using
                                                               linear regression or other machine learning models. In
During the discussion panel of this brainstorming ides,
                                                               user cases, when the customer updates the maximum
that took place at the 5th Workshop on Online Recom-
                                                               affordable cost, we can take this value in real-time and use
mender Systems and User Modeling (Seattle, WA, USA,
                                                               it on the inference step. Otherwise, we could also think of
2022)3 , we can summarize the open issues and several
                                                               using the segments of customers instead of personalized
future directions for future work.
                                                               values of the total cost. For example, the customers with
   Additional constraints. Taking into account the for-
                                                               low, medium, and high income and three different values
malization problem mentioned above, we could think of
                                                               for this constraint correspondingly.
additional restrictions that can be applied to bring the so-
                                                                  Item quantity. One of the problems, which we ne-
lution closer to real conditions. Firstly, we could extend
                                                               glect for the simplicity, is the challenge to take into ac-
to the lower and upper bounds of the price for recom-
                                                               count the different quantities for each item in the lists of
mended items. It can either be set as hyperparameters
                                                               recommendations. In case we want to add this additional
or as extra constraints in our task. It would help to solve
                                                               restriction to the task, we should think carefully about
a problem of making the list of recommendations only
                                                               how to include this constraint and optimize the function.
with cheap or expensive items. Secondly, we could extend
                                                               We also should think about whether we allow duplicated
the task to introducing the profit margins, which both
                                                               items. For example, the milk, but of different brands and
customers and enterprises pursue considering business
                                                               prices. In addition, some items could be bought always
metrics.
                                                               in single pieces, but some others are taken in multiple
   Bundle recommendations. Considering the fact that
                                                               pieces or are the goods sold by weight. In our case, we
we want to generate a list of recommended items, we
                                                               implied no duplicates and only 1 allowed piece of each
might think about it as bundle recommendations. In par-
                                                               item for simplicity.
ticular, it could enhance the diversity or complementarity
                                                                  The list of problems mentioned above is not exhaustive
of predicted items. Thus, the approaches for bundle rec-
                                                               and can be continued and developed into comprehensive
ommendations could also bee used to solve the task of
                                                               versions of formalization task and be addressed by differ-
constrained recommendations with total cost. However,
                                                               ent approaches.
the questions on how to evaluate the bundle, attribute the
3
    https://orsum.inesctec.pt/orsum2022/
5. Conclusion                                                    recommendation, in: Proceedings of the 43rd In-
                                                                 ternational ACM SIGIR Conference on Research
In this article, we briefly introduced a formalization prob-     and Development in Information Retrieval, 2020, pp.
lem of the next-basket constrained recommendations               1071–1080.
problem. The tested approach simulates a case when a [2] G. Chen, Z. Li, A new method combining pattern
customer of an online marketplace can spend a particu-           prediction and preference prediction for next basket
lar upper bound of the amount of money. Alternatively,           recommendation, Entropy 23 (2021) 1430.
some online services can fit the recommendations into [3] Z. Meng, R. McCreadie, C. Macdonald, I. Ounis, Ex-
a suggested budget for each user. This problem task              ploring data splitting strategies for the evaluation of
suggests generating item recommendations that do not             recommendation models, in: Fourteenth ACM con-
exceed the specified maximum total cost.                         ference on recommender systems, 2020, pp. 681–686.
                                                             [4] M. Ariannezhad, S. Jullien, M. Li, M. Fang, S. Schelter,
                                                                 M. de Rijke, Recanet: A repeat consumption-aware
References                                                       neural network for next basket recommendation in
[1] H. Hu, X. He, J. Gao, Z.-L. Zhang, Modeling person-          grocery shopping (2022).
     alized item frequency information for next-basket