=Paper= {{Paper |id=Vol-2440/paper5 |storemode=property |title=Simple Objectives Work Better |pdfUrl=https://ceur-ws.org/Vol-2440/paper5.pdf |volume=Vol-2440 |authors=Joaquin Delgado,Samuel Lind,Carl Radecke,Satish Konijeti |dblpUrl=https://dblp.org/rec/conf/recsys/DelgadoLRK19 }} ==Simple Objectives Work Better== https://ceur-ws.org/Vol-2440/paper5.pdf
                                      Simple Objectives Work Better*
                                                         Joaquin Delgado1, Samuel Lind,
                                                          Carl Radecke, Satish Konijeti
                                                                   Groupon, Inc.
                                                          2445 Augustine Dr, Santa Clara, CA
                                                                       95054
                                                         joaquin.delgado@gmail.com
                                                              {slind, cradecke,
                                                           bkonijeti}@groupon.com

ABSTRACT                                                                     1.        Introduction
                                                                             Groupon is a large global e-commerce company, operating via the
Groupon is a dynamic two-sided marketplace where millions of
deals organized in three different lines of businesses or verticals:         web and the popular Groupon Mobile App. Currently serving 15
Local, Goods and Getaways, using various taxonomies, are                     countries and more than 100 million monthly active users
matched with customers’ demand across 15 countries around the                worldwide, Groupon is the place you start when you want to buy
world. Customers discover deals by directly entering the search              just about anything, anytime, anywhere. Groupon offers physical
query or browsing on the mobile or desktop devices. Relevance is             merchandise through its Goods business, travel deals through its
Groupon’s homegrown search and recommendation engine,                        Getaways business, and is the market leader in Local e-commerce.
tasked to find the best deals for its users while ensuring the               Groupon is trying to develop a robust marketplace, and as such,
business objectives are also met at the same time. Hence the                 needs to understand at an individual level the supply and service
objective function is designed to calibrate the score to meet the            needed to develop a daily habit for the company’s customers.
needs of multiple stakeholders. Currently, the function is                   How does featuring the local burger place down the block
comprised of multiple weighted factors that are combined to                  compare to featuring a big chain when it comes to increasing a
                                                                             user’s future spending? Given the number of local choices a
satisfy the needs of the respective stakeholders in the
multi-objective scorer, a key component of Groupon’s ranking                 customer has, how many Groupon options are provided to
pipeline.                                                                    promote a daily habit? When is it appropriate to recommend a
                                                                             product over a trip? In essence, what are the underlying objectives
The purpose of this paper is to describe various techniques                  and forces that power Relevance, the company’s search and
explored by Groupon’s Relevance team to improve various parts                recommendation ranking engine?
of Search and Ranking algorithms specifically related to the
multi-objective scorer. It is for research only, and it does not             An objective function is a mathematical expression which
reflect the views, plans, policy or practices of Groupon.                    implicitly reflects certain tradeoffs for outcomes. The design of an
                                                                             objective function must take into consideration three important
The main contributions of this paper are in the areas of                     points. The first is that, as a mathematical object, the outcomes
factorization of the different abstract objectives and the                   that one includes must be capable of being quantified. The second
simplification of the objective function to capture the essence of           observation is that these outcomes, in addition to being
short, mid and long term benefits while preserving fairness and              quantifiable, must also be observable and in certain cases
moving users forward in the customer lifecycle.                              predictable. The third is that, insofar as an objective function
                                                                             determines decision making, care must be taken as to which
                                                                             outcomes are included in light of Goodhart’s Law [1], which is
CCS CONCEPTS                                                                 the idea that “when a measure becomes a target, it ceases to be a
                                                                             good measure” (as phrased by Marilyn Strathern).
•Information systems → Recommender systems; Retrieval
effectiveness; Computing methodologies; Applied computing →                  These considerations lead naturally to constraints on the types of
Electronic commerce                                                          factors that can and ought to be included in an objective function
                                                                             and bear on all approaches to designing and iterating on objective
KEYWORDS                                                                     functions in concrete ways.
Multi-stakeholder Recommendations, Recommender Systems,
Algorithmic Fairness, Marketplace, Ranking, E-commerce1
                                                                             2.        Groupon’s Situation
                                                                             So far all of this is abstract and unlikely to be new to anyone
1
    This work was done while the author was at Groupon                       reading this paper, but it is important to get the trivial things out
                                                                             of the way.
* © Copyright 2019 for this paper by its authors. Use permitted under
Creative Commons License Attribution 4.0 International (CC BY 4.0).          Now we consider how these abstractions impact the actual
Presented at the RMSE workshop held in conjunction with the 13th ACM
                                                                             situation faced by Groupon. As a two-sided marketplace, the
Conference on Recommender Systems (RecSys), 2019, in Copenhagen,
Denmark.
terms that might naturally exist in any overarching objective           While traditional recommender systems generally aim at solving
function are not hard to conceptualize at a high level: Groupon         the low-intent “surprise me” recommendation use case, we see the
must please its users, please its merchants, and make a profit.         ranking problem as something to solve in multiple places
                                                                        throughout the purchasing funnel continuum. To capture the
Following the abstractions described in the previous section, such
an objective function must take into consideration how these            different aspects of ranking in a multi-stakeholder environment
objectives can be quantified, the level of accuracy at which they       we have modeled the ranking problem as a multi-stage pipeline
                                                                        that combines machine learning (learning to rank or LTR [2])
can be quantified both retrospectively and in prediction, and what
distortions these quantifications may introduce to the market’s         based predictions with the objective function.
behavior.
The objective function’s rubber meets the road when it comes to         2.2           The Ranking Pipeline
deciding how to allocate limited resources to meet those                Groupon has a sophisticated real-time ranking pipeline that
objectives. In the case of ranking deals, the limited resources are
                                                                        includes query understanding for search and both response
chiefly impressions: we want to allocate these in the most efficient
                                                                        prediction and optimization phases for generating a per-item
way possible, where the meaning of “efficiency” is more or less         score, as shown in Figure 2 below, to form a ranked list of deals
defined by optimizing an objective function.                            presented to the user.
Furthermore, determining relevant deals for a given user at a
given time introduces novel constraints on an objective function.
In particular, computing such an objective function must be
efficient and fast when applied to all eligible deals per user with
thousands of requests occurring every second, and furthermore,
there must be some mechanism for predicting some terms of an
objective function before being able to measure such terms.
For instance, we naturally want to weigh the financial benefit of a
deal being purchased into a deal’s score. Financial benefit can be
easily quantified after the fact. However, predicting a deal’s
financial benefit, even assuming it is purchased, can be tricky -
there are often multiple prices for a given deal, depending on
quantity sold, the day of the week you wish to reserve a hotel,
different options etc.
                                                                                      Figure 2: Illustrative Ranking Pipeline2
So an objective function for ranking deals ought to only include
quantities that we can (i) quantify in a clearly defined way and (ii)   For a particular set of deals (i.e. the candidate set), a customer and
predict in a clearly defined and accurate way.                          a given context, the output of the response prediction phase is a
                                                                        list of per-deal likelihoods that the customer will view or purchase
                                                                        (i.e. respond to or take action on) the deal that is offered, under
2.1       Recommending Deals                                            that specific context. This likelihood or probability is then used as
                                                                        an input to the optimization stage, which computes a final score
The art & science of recommending deals that delight customers
                                                                        that considers multiple stakeholders’ goals in the Multi-Objective
is one exercised throughout different touch points on Groupon’s
                                                                        Scorer followed by Diversity Management that ensures diversity
web and mobile apps. As shown in Figure 1, there are multiple
                                                                        and fairness.
use cases. Whether it's personalized recommendations in the home
feed, keyword search, browse or upsell/cross-sell opportunities,
ranking deals and other items (e.g. query autocomplete) is at the
front and center of the user experience and is what Relevance           3.            Response Prediction
does.                                                                   User response prediction is a central problem in the computational
                                                                        advertising and e-commerce domains. Quantifying user intent
                                                                        allows advertisers and merchants to target offers towards the right
                                                                        users. This leads to a judicious use of marketing dollars and also
                                                                        renders a pleasant user experience.
                                                                        We believe it is important to highlight how computational
                                                                        advertising, and in particular, response prediction relates to the
                                                                        evolution of recommender systems.
                                                                        Despite recent advances in context-aware recommender systems
                                                                        [3], traditional item-based and user-based collaborative filtering
                                                                        approaches to recommender systems fail to factor in context, such
                                                                        as time-of-day, geo-location or session-based information to
                                                                        generate more accurate recommendations. Moreover, they also
 Figure 1: Ranking Throughout the Purchasing Funnel                     fail to recognize that recommendations don't happen in a vacuum

                                                                        2
                                                                            Illustrative only; Groupon may consider different factors.
and as such may require the evaluation of business constraints and     An alternative/normalized Form of Objective Function:
objectives. With the advent of learning to rank (LTR) and the
application of other shallow and deep machine learning
                                                                        $\mathrm{score} = \mathrm{eCVR} \cdot \left( a + \mathrm{price}^{\mathrm{price\_exponent}} \cdot ( b + c \cdot \mathrm{margin\%} ) \right)$
techniques to recommender systems, the world of recommender            Here are a few key points to highlight about the various factors:
systems, advertising and e-commerce has finally converged [4][5].           ●    The values of these components are context specific
In order to produce meaningful features used as input to an online               to provide flexibility to match specific goals for each
response prediction model, we developed and deployed ML                          context.
models used to generate offline deal features, such as Deal Quality         ●    Price used in the calculation above is adjusted with a
Score (DQS), a prior computed for each new deal, distance and                    price_exponent to reduce its overpowering effect for
customer-gender triple, as well as Customer-Deal Interaction
                                                                                 high priced deals.
models that use more traditional Collaborative Filtering (Matrix
Factorization [6]) techniques to establish deal-category propensity         ●    The price and margin for the deals are calculated based
used as customer features. More recently, we have been                           on the nuances within each channel or vertical.
experimenting with deep learning and the implementation of an               ●    The constants used as weights (a,b and c in the equation
embedding framework to generate item (deal, user, context and                    above)    are    normalized     and    represent the
combined) embeddings similar to those developed at Pinterest [7]                 post-normalized relative importance given by the
and Twitter [8].                                                                 business to orders/purchase velocity (conversion),
As shown in Figure 2, the final response prediction scores are                   revenue for the merchant (bookings) and revenue for
computed using a shallow, low-latency oriented Gradient                          company (margins%). In this paper, we do not use any
Boosting Machine (GBM) [9] that takes in a few raw and some                      other business metrics and/or constraints used to
engineered Context, Deal and Customer features and produces an                   optimally compute these values.
online score for each qualified deal in an LTR plugin we developed          ●    For new and anonymous visitors, the emphasis is
and use in Groupon’s ElasticSearch deal catalog cluster.                         entirely on conversion in order to drive activations.
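To make the relationship between the weighted-sum scorer and its normalized form concrete, here is a minimal Python sketch that computes both and checks that they agree. The symbols `a`, `b`, `c` and `price_exponent` come from the formulas in the text; the function names, the sample weights, the exponent of 0.5 and the two deals are illustrative assumptions, not production settings.

```python
def weighted_sum_score(ecvr, price, margin_pct, a, b, c, price_exponent):
    """score = a*eCVR + b*eBooking + c*eValue (the weighted-sum form)."""
    damped_price = price ** price_exponent       # exponent dampens high prices
    e_booking = ecvr * damped_price              # eBooking = eCVR * price^price_exponent
    e_value = ecvr * margin_pct * damped_price   # eValue = eCVR * margin% * price^price_exponent
    return a * ecvr + b * e_booking + c * e_value


def normalized_score(ecvr, price, margin_pct, a, b, c, price_exponent):
    """score = eCVR * (a + price^price_exponent * (b + c*margin%))."""
    return ecvr * (a + price ** price_exponent * (b + c * margin_pct))


# Hypothetical weights and deals, chosen only for illustration.
weights = dict(a=0.6, b=0.3, c=0.1, price_exponent=0.5)
cheap_deal = dict(ecvr=0.05, price=16.0, margin_pct=0.25)
pricey_deal = dict(ecvr=0.01, price=400.0, margin_pct=0.25)

scores = {
    name: (weighted_sum_score(**deal, **weights), normalized_score(**deal, **weights))
    for name, deal in (("cheap", cheap_deal), ("pricey", pricey_deal))
}
```

With an exponent below 1, a 25x price difference turns into only a 5x difference in the price term, which is the dampening effect described in the key points above.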
                                                                       While this approach provides the necessary levers to adjust the
4.         The Multi-Objective Scorer                                  scores for different use-cases and scenarios, it is complex, requires
                                                                       interpretation of the input price and margin data, and lacks the
Simply put, the multi-objective scorer is implemented as a
                                                                       mathematical rigor that clearly states the measurable trade-offs
weighted average of all the different factors signifying the needs
                                                                       and allows for optimizing the objectives of multiple stakeholders.
of each of the stakeholders. The factors considered in the
objective function are:
     1.    eCVR (estimated Conversion Rate): This score is             5.          A Simplified Formulation
           Groupon’s prediction for the likelihood of a transaction    A simpler and more principled formulation of Groupon’s objective
           of this deal by this user. The score is the output of all   function, used in computational advertising, is to produce a bid or
           relevance machine-learned models that include               score that represents the expected gain (in $ amount) for each
           multiple features.                                          deal-impression based on goals/actions and the probability of
     2.    Estimated Bookings: The estimated booking is factored       achieving the goals:
           in to solve for the business objective of optimizing
           bookings in addition to conversion. This factor is
           calculated using the price of the deal and the estimated
           conversion to evaluate the likely amount of booking $.
     3.    Estimated Value: Similar to estimated booking,
           estimated value is also a business objective that aims to
           incorporate net value into the mix. This factor is
           calculated using a predicted $ operational value (OV)       b = bid value/expected gain,
           for each deal adjusted by the estimated conversion to
           evaluate which deals have the highest potential to          g = goal/action,
           contribute to company goals. It is important to note        λ_g = probability of the goal/action happening,
           that the scope of the scorer is to determine which          v_g = value/gain from the goal/action happening (in $ amount)
           deals are more likely to contribute to company goals
           relative to other deals, and not as a tool to forecast      Examples of such goals include, but are not limited to:
           actual impact to those goals.                                    ●    Activation: The meaning of activation varies according
The function as implemented is defined below                                     to user segments. It can be defined as a sign-up action

          $\mathrm{score} = a \cdot \mathrm{eCVR} + b \cdot \mathrm{eBooking} + c \cdot \mathrm{eValue}$      for anonymous users, first purchase for new users who
                                                                                 have already signed up and first purchase after 365 days
where                                                                            of inactivity for reactivatable users. We definitely want
                                                                                 Groupon users to perform the activation action
     ●     $\mathrm{eBooking} = \mathrm{eCVR} \cdot \mathrm{price}^{\mathrm{price\_exponent}}$      associated with their respective segments.
     ●     $\mathrm{eValue} = \mathrm{eCVR} \cdot \mathrm{margin\%} \cdot \mathrm{price}^{\mathrm{price\_exponent}}$
       ●   Conversion: We want to show deals that users are more                7.        Predicted OV
           likely to purchase.                                                  OV can be easily calculated in hindsight. However, during the
       ●   Value: This represents short term revenue gain from the              scoring time, not all data is statically available. The predictive OV
           sale of a deal. We prefer to show deals that have the                model predicts tomorrow’s OV per unit for each active deal option
           potential to make more money.                                        factoring known business changes (e.g. discount campaigns) and
                                                                                uploads the data for relevance to use in tomorrow’s live ranking of
       ●   Engagement: The more engaged users are with
                                                                                deals.
           Groupon’s platform, the greater the likelihood that they
           keep making purchases which in turn would generate                   This data aims to replace both financial components of the
           more revenue.                                                        objective function (margin and sell price) as Predicted OV better
                                                                                approximates a deal’s potential value to Groupon.
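As a minimal sketch of the simplified formulation, assume the bid for a deal-impression is the sum over goals of probability times value, b = sum over g of λ_g * v_g. That is how the definitions of b, g, λ_g and v_g are read here, not a formula confirmed by the text. The goal names, probabilities and dollar values below are hypothetical.

```python
from typing import List, Mapping, Tuple

# Each goal g maps to (lambda_g, v_g): the probability that the goal happens
# and the value in $ gained if it does.
GoalEstimates = Mapping[str, Tuple[float, float]]


def expected_gain(goals: GoalEstimates) -> float:
    """Bid b for one deal-impression: sum of lambda_g * v_g over all goals g."""
    return sum(prob * value for prob, value in goals.values())


def rank_by_bid(deals: Mapping[str, GoalEstimates]) -> List[str]:
    """Return deal ids ordered by decreasing expected gain."""
    return sorted(deals, key=lambda deal_id: expected_gain(deals[deal_id]), reverse=True)


# Hypothetical estimates for two deals shown to the same user.
deals = {
    "local_burger": {"conversion": (0.04, 3.0), "engagement": (0.20, 0.10)},
    "getaway": {"conversion": (0.005, 40.0), "engagement": (0.05, 0.10)},
}
ranking = rank_by_bid(deals)
```

Under these made-up numbers the getaway ranks first despite its lower purchase probability, because its value per purchase is much higher; a different cohort could use a different goal set or different values.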
Considering that these are goals that we will consider for our v0
version, we need to define λ_g and v_g for each goal g.                         In the overall OV calculation, the predictive components are only
                                                                                OD amount and CD amount. Our target variables for the ML
                                                                                model are OD orders percent and CD percent.
6.         Operational Value                                                    The model calculates as many values as possible by inputting data
Given the simplified formulation, the challenge can be divided                  points specific to each deal from standard data sources and only
into two: a) build a model to estimate the probability of the                   predicts values when no standard data sources are available (e.g.
action/goal occurring and b) build a separate model to estimate the             Open Discounts).
actual value of the action/goal, should it occur. Going back to the
                                                                                A primary factor that impacts a deal’s OV from one day to the
original multi-objective formulation, price and margin are used for
                                                                                next is discounting.
margin value estimation, whereas a machine learning model
trained on impression and purchase data is used to predict the                  The ML model used for predicting the percentage of orders that
likelihood of a customer buying a deal. However, there are many                 will use an OD code and the average CD percent is also a GBM.
factors, other than price and margin, that may affect the true value            Important features are found to include, among others, the
of the transaction. For example, there are additional                           following:
processing/booking fees, marketing costs (i.e. discounts) and
variable considerations that can affect the value of a transaction                   ●    Lags (past behavior)
and are vertical dependent.                                                          ●    Vertical
                                                                                     ●    Vertical sub-category
To deal with value estimation, we utilize the concept of                             ●    OD day or not
Operational Value or OV. The table below contains the main                           ●    Day of the week
assumptions and components of OV:                                                    ●    Week number


    Operational Gross Revenue    Unit Selling Price * Quantity + Fees           8.        Experiments

    Operational Net Revenue      Operational Gross Revenue - OD - CD - Shipping Costs

    Operational Value (OV)       Operational Net Revenue - Transactional Costs
                  Table 2: OV and its Components3
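The arithmetic in Table 2 can be sketched in a few lines of Python. The function and field names are illustrative assumptions; only the three formulas come from the table, and the sample amounts are made up.

```python
from typing import NamedTuple


class OVBreakdown(NamedTuple):
    gross_revenue: float
    net_revenue: float
    operational_value: float


def operational_value(unit_selling_price, quantity, fees, od_amount, cd_amount,
                      shipping_costs, transactional_costs) -> OVBreakdown:
    """Apply the three rows of Table 2 in order."""
    # Operational Gross Revenue = Unit Selling Price * Quantity + Fees
    gross = unit_selling_price * quantity + fees
    # Operational Net Revenue = Gross Revenue - OD - CD - Shipping Costs
    net = gross - od_amount - cd_amount - shipping_costs
    # Operational Value = Net Revenue - Transactional Costs
    return OVBreakdown(gross, net, net - transactional_costs)


# Hypothetical order: 3 units at $20 with $2 in fees.
example = operational_value(unit_selling_price=20.0, quantity=3, fees=2.0,
                            od_amount=5.0, cd_amount=1.0, shipping_costs=4.0,
                            transactional_costs=2.0)
```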

                                                                                                 Table 3: Predicted Variables
OD stands for Open Discount, which is available on Groupon.com
via promo code for all the users on a given day, and CD stands for              For a given deal on a given day, we want to predict:
Closed Discount, which is available through marketing/targeting                      1.   Percent of orders that will use an open discount (when
the customers based on marketing strategies, and it is available                          available)
only for a certain set of customers, not everybody.                                            ○ OD orders pct = orders with open discount/
While most of the variables to OV are direct inputs calculated per                                   total orders
their definition on aggregated and historical data, OD and CD                                  ○ OD per unit = min(cost_to_user * OD %, OD
need to be predicted as there is no way to know beforehand                                           $ cap) * OD orders pct
whether a customer will use a promo code or will be targeted for                     2.   Closed discount as a percent of the Sell Price
additional marketing discounts.                                                           (applicable for all days)
                                                                                               ○ CD pct = closed discount amount/ total
                                                                                                     amount
3
  Operational Gross Revenue, Operational Net Revenue, and Operational                          ○ CD per unit = cost_to_user * CD pct
Value are not financial measures under GAAP and are not intended as a
substitute for revenue or other financial metrics reported in accordance with   For this, the data is aggregated at deal level and day level for OD
GAAP.                                                                           and CD separately. We then constructed this problem as a
time-series regression problem with historical information as                  For predicted CD percent (actual mean of the entire test set = 1.69%
independent variables. We used a sample of 1.2M data points                    per deal)
drawn from around 20M data points; the full dataset covers                           ●   R2 = 8.5%
1 year of data. We split the sample into train (70%), validation                     ●   RMSE = 7.3%
(15%), and test (15%) datasets.                                                      ●   MAE = 2.7%
We trained a GBM model on the training data and used the
validation set to tune regularization so that the model generalizes.
                                                                               For deals with avg total orders per day >= 30, (actual mean =
Finally, all the metrics shown are measured on the hold-out                    1.2%), (around 1.7% of the test data), R2 = 30.1%, RMSE = 2.2%,
(test) dataset.                                                                MAE = 1.1%.
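The per-deal, per-day quantities defined for the prediction task (OD orders pct, OD per unit, CD pct and CD per unit) can be sketched as follows. The function and parameter names are illustrative assumptions, the guard against zero denominators is an added assumption, and the sample numbers are hypothetical.

```python
def od_orders_pct(orders_with_open_discount: int, total_orders: int) -> float:
    """Share of orders that used an open discount (0.0 if there were no orders)."""
    return orders_with_open_discount / total_orders if total_orders else 0.0


def od_per_unit(cost_to_user: float, od_pct: float, od_cap: float,
                od_orders_share: float) -> float:
    """OD per unit = min(cost_to_user * OD %, OD $ cap) * OD orders pct."""
    return min(cost_to_user * od_pct, od_cap) * od_orders_share


def cd_pct(closed_discount_amount: float, total_amount: float) -> float:
    """Closed discount as a share of the total amount (0.0 if nothing was sold)."""
    return closed_discount_amount / total_amount if total_amount else 0.0


def cd_per_unit(cost_to_user: float, cd_share: float) -> float:
    """CD per unit = cost_to_user * CD pct."""
    return cost_to_user * cd_share


# Hypothetical deal: $100 cost to user, 20% promo code capped at $15,
# 29 of 100 orders used the code, $17 of closed discounts on $1000 sold.
od_share = od_orders_pct(29, 100)
od_unit = od_per_unit(100.0, 0.20, 15.0, od_share)
cd_unit = cd_per_unit(100.0, cd_pct(17.0, 1000.0))
```

Here the cap binds (20% of $100 exceeds $15), so the expected open discount per unit is the capped amount weighted by the share of orders that use the code.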


8.1        Baseline Results4
As a baseline, we used a model that calculates the percentage of
OD orders and CD based on the average of the past behavior.
Overall average OD orders percentage is 29% per deal (average %
of entire data).
      ●    R2 = 2.5%
      ●    RMSE = 39%
      ●    MAE = 28%
                                                                                                Table 4: Results per Vertical
For deals with avg total orders per day >= 5, R2 = 41%, RMSE =                 As seen in Table 4, the ML Model improved the baseline model in
21%, MAE = 14% (around 16% of test data) (actual mean = 24%)                   all the metrics (RMSE, MAE and R2 ), especially for the
                                                                               Getaways vertical where discounts typically have a higher impact
For deals with avg total orders per day >= 15, R2 = 56%, RMSE =                on the bottom line.
14%, MAE = 8% (around 4.5% of test data) (actual mean = 16%)
Overall average CD percentage is 1.7% per deal
      ●    R2 = -16%, Adjusted R2 = -16% (n = 230k, k = 6)
                                                                               8.3       A/B Experiment Results
      ●    RMSE = 8%                                                           We also conducted a full A/B test with a 50/50 split of customer
      ●    MAE = 3%                                                            sessions on web and mobile traffic where we substituted the
                                                                               previous multi-objective scorer with the simplified objective
                                                                               function based only on value maximization for registered users
For deals with avg total orders per day >= 30 (actual mean =                   (existing customers) and conversion/activation maximization for
1.2%), R2 = 4.5%, RMSE = 2.77%, MAE = 1.39%                                    non-registered (new users). This resulted in improvement for all
                                                                               verticals with an overall statistically significant lift of:
While our primary metrics are MAE and RMSE, we are using R2
to track model fit and it’s especially useful for comparing category                 ●   Conversion Lift: 1.56%
level model fit. The R2 values are low (or negative) as the straight                 ●   OV Lift: 1.43%
line average method based on historical data is a very poor fit.
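For reference, the three evaluation metrics can be computed as in the sketch below, which also shows how R2 turns negative whenever predictions fit worse than simply predicting the mean of the evaluated data. The data points are invented for illustration and do not come from the experiments.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Squared Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical OD order shares for four deal-days, and a poor
# historical-average style prediction for each of them.
actual = [0.10, 0.30, 0.50, 0.20]
historical_average = [0.40, 0.10, 0.20, 0.50]

baseline_r2 = r2(actual, historical_average)      # negative: worse than the mean
mean_predictor_r2 = r2(actual, [sum(actual) / len(actual)] * len(actual))  # about 0
```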
                                                                               We believe that these results stem from improved financial
                                                                               estimates used for this experiment as well as the use of a simpler
8.2        ML Model Results                                                    optimization function that has fewer moving pieces but is more in
For predicted OD orders percent (actual mean of the entire test                line with clear goals and objectives.
data = 29.6% per deal)
      ●    R2 = 22%
      ●    RMSE = 35%                                                          9.        Future Directions
      ●    MAE = 27%                                                           In this section, we discuss various future directions we will be
                                                                               investigating.
For deals with avg total orders per day >= 5, R2 = 50%, RMSE =
19%, MAE = 13% (around 16% of test data) (actual mean = 24%)                   9.1    Moving Users Through the Customer
For deals with avg total orders per day >= 15, R2 = 65%, RMSE =                Lifecycle
13%, MAE = 7% (around 4.5% of test data) (actual mean = 16%)                   Let’s first identify the stage at which a user currently is, in their
We can observe that, prediction accuracy increases as avg total                customer lifecycle. Then, identify the event (quantifiable) that
orders per day increases.                                                      would push the user to the next stage. Finally, consider this event
                                                                               as an objective and optimize for it. In other words, use a different
4
  To do the evaluation we used standard statistical metrics for regressions,
                                                                               objective for a different cohort of users based on where they are
such as Root Mean Squared Error (RMSE), Mean Averaged Precision (MAE)          currently in their customer lifecycle.
and Coefficient of Determination (R2).
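The cohort-specific objective idea of Section 9.1 can be illustrated with a minimal sketch. It assumes only the two cohorts used in the experiment (new users ranked for conversion, existing users ranked for value) and assumes that "value maximization" means expected gain, i.e. conversion probability times estimated value. The names DealEstimates, expected_gain, rank_deals and the sample numbers are hypothetical and are not taken from Groupon's pipeline.

```python
from dataclasses import dataclass


@dataclass
class DealEstimates:
    """Hypothetical per-impression estimates for one deal."""
    deal_id: str
    e_cvr: float    # estimated probability that the impression converts
    e_value: float  # estimated value in dollars, given a conversion


def expected_gain(deal: DealEstimates, cohort: str) -> float:
    """Score a deal with the objective assigned to the user's cohort."""
    if cohort == "new":
        # Assumption: new users are ranked purely for conversion/activation.
        return deal.e_cvr
    if cohort == "existing":
        # Assumption: existing users are ranked by expected value,
        # i.e. probability of the goal times the value of the goal.
        return deal.e_cvr * deal.e_value
    raise ValueError(f"unknown cohort: {cohort}")


def rank_deals(deals, cohort):
    """Return deals sorted by the cohort's objective, best first."""
    return sorted(deals, key=lambda d: expected_gain(d, cohort), reverse=True)


# Made-up example data, for illustration only.
DEALS = [
    DealEstimates("a", e_cvr=0.10, e_value=5.0),
    DealEstimates("b", e_cvr=0.04, e_value=30.0),
    DealEstimates("c", e_cvr=0.07, e_value=10.0),
]

new_order = [d.deal_id for d in rank_deals(DEALS, "new")]
existing_order = [d.deal_id for d in rank_deals(DEALS, "existing")]
```

Adding a lifecycle stage in this sketch would amount to adding one branch with its own quantifiable objective, which is the part that Section 9.1 identifies as the open challenge.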
       Figure 3: Purchase Behavior User Segmentation⁵

One of the main advantages of this approach is that it eliminates
the manual procedure of determining the weights present in our
base approach. Once the objective is clear for each cohort of
users, we can use the simplified formulation to combine multiple
objectives according to the goals that correspond to the given
cohort.
Amongst the challenges, we need to create cohorts representing
stages of the customer lifecycle, like those shown in Fig. 2, and
we need to figure out a quantifiable objective for each cohort.

job, however, they also need to consider the candidate's intent
in their recommendations, to make sure the candidates they
recommend are going to respond to the job poster. They define a
parametric function that combines the semantic match score and
the intent score; this is the objective they want to optimize.
Then, they try to find a set of parameters that maximizes this
objective under the constraint that the distance between the
ranked list generated by the new multi-objective function and the
ranked list generated by the semantic match score alone is less
than some acceptable value.
We can incorporate user segmentation by learning different
parameters for each segment. We relax the constraint based on
what we think is the maximum acceptable violation of the ideal
ranking per segment.
The form of the objective would be similar to the following:

\[
\max \; AG_k\big[\, f(E[\mathrm{Profit}],\, E[\mathrm{Margin}],\, \ldots,\, \alpha, \beta, \gamma, \ldots) \,\big]
\]
\[
\text{s.t.} \quad NDCG\big[\, f(E[\mathrm{Profit}],\, E[\mathrm{Margin}],\, \ldots),\; f(eCVR) \,\big] > \Delta
\]
\[
AG_k(f) = \frac{1}{|\mathrm{queries}|} \sum_{q=1}^{|\mathrm{queries}|} \frac{1}{k} \sum_{i=1}^{k} f\big(q,\, \pi_i(f, q)\big)
\]

where π(f, q) is the ranked list produced for user q by ranking
function f, and π_i(f, q) is the i-th item of that list.
Given this form, we can make the constraint Δ stricter or more
relaxed for different user segments, based on what kind of
treatment we envision for these segments.
We can re-use our offline evaluation framework to measure the
distance between two ranked lists (e.g. MAP[11], NDCG[12]).
Amongst the challenges we face is the need to create cohorts of
users and to figure out what objectives contribute to “long term
profitability” and how to combine them. Finally, it is a
Constrained Optimization Learning problem that would need to be
correctly modeled and implemented.
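To make the constrained formulation of Section 9.2 concrete, the following sketch computes AG_k and searches a small grid of parameters subject to the NDCG constraint. Several choices here are assumptions and do not come from the paper: f is a convex blend of eCVR and a normalized expected profit with a single weight w, the NDCG constraint is evaluated with eCVR as the gain and averaged over queries, the search is a plain grid search, and every query is assumed to have at least k candidate deals. All function names and the sample data are hypothetical.

```python
import math


def score(deal, w):
    """Hypothetical f: convex blend of eCVR and normalized expected profit."""
    return (1.0 - w) * deal["ecvr"] + w * deal["profit"]


def ranked(deals, key):
    """Return deals sorted by key, best first."""
    return sorted(deals, key=key, reverse=True)


def average_gain_at_k(queries, w, k):
    """AG_k(f): mean over queries of the mean score of the top-k deals."""
    total = 0.0
    for deals in queries:
        top = ranked(deals, lambda d: score(d, w))[:k]
        total += sum(score(d, w) for d in top) / k
    return total / len(queries)


def ndcg_vs_ecvr(deals, w, k):
    """NDCG@k of the list ranked by f, using eCVR as the gain.

    A value of 1.0 means the top-k matches the pure eCVR ranking.
    """
    def dcg(items):
        return sum(d["ecvr"] / math.log2(i + 2) for i, d in enumerate(items[:k]))

    ideal = dcg(ranked(deals, lambda d: d["ecvr"]))
    actual = dcg(ranked(deals, lambda d: score(d, w)))
    return actual / ideal if ideal > 0 else 1.0


def best_weight(queries, k, delta, grid):
    """Grid search: maximize AG_k subject to mean NDCG > delta."""
    best = None
    for w in grid:
        mean_ndcg = sum(ndcg_vs_ecvr(q, w, k) for q in queries) / len(queries)
        if mean_ndcg <= delta:
            continue  # violates the ranking-distance constraint
        gain = average_gain_at_k(queries, w, k)
        if best is None or gain > best[1]:
            best = (w, gain)
    return best  # None if no grid point satisfies the constraint


# Made-up candidate deals for two users (queries), for illustration only.
QUERIES = [
    [{"ecvr": 0.9, "profit": 0.2}, {"ecvr": 0.5, "profit": 1.0}, {"ecvr": 0.2, "profit": 0.6}],
    [{"ecvr": 0.6, "profit": 0.3}, {"ecvr": 0.4, "profit": 0.9}, {"ecvr": 0.1, "profit": 0.7}],
]
GRID = [0.0, 0.5, 1.0]

strict = best_weight(QUERIES, k=2, delta=0.95, grid=GRID)   # stays close to eCVR order
relaxed = best_weight(QUERIES, k=2, delta=0.50, grid=GRID)  # allowed to favor profit
```

Running the search once per user segment with a segment-specific delta mirrors the idea of making the constraint stricter or more relaxed per segment. A production version would need a proper constrained optimizer rather than a grid.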


     Table 5: User Segmentation and Goal Combination
As seen above in Table 5, different objectives can be applied to
different cohorts of users to move them through the customer
lifecycle.

9.2           A Hybrid Parametric Function
We can think of the objective as some parametric function of
multiple objectives, e.g. Financial Value, Repurchase Tendency,
Expected Margin, etc. Our task is to find a set of parameters that
maximizes the value gained from the ranking produced by this
function, subject to the constraint that the distance between the
list ranked purely by e-CVR and the one ranked by the output of
this function is less than some acceptable value.

This is similar to the approach presented in Multiple Objective
Optimization in Recommender Systems [10], a paper from LinkedIn
that explains how their system for recommending candidates to job
posters optimizes multiple objectives. Their core system outputs a
semantic matching between a candidate and a

5 Illustrative Only

9.3       Other Factors to Consider
In addition to estimated CVR (e-CVR) and estimated Value
(eValue), which we have already optimized for, we could also
consider the following factors as goals/estimates in Groupon’s
objective function:
      ●   Estimated CTR (e-CTR): An estimate of the click-through
          rate that can be a proxy to measure customer
          engagement. However, we need to evaluate whether it is
          redundant or adds valuable information alongside
          e-CVR.
      ●   Affinity to Cause Revisit: A measure of the capability of
          a deal to create a liking for the company that causes
          the user to come back.
      ●   Price: Absolute Price/Price Range is a measure of
          revenue. Moreover, at a user segment level, there could
          be certain segments whose behavior is highly correlated
          with price changes, while other segments are more
          agnostic to price changes. How the learned weight on
          this factor plays out for different user segments could be
          insightful.
      ●   Merchant ROI: In addition to increasing sales and other
          reasons, merchants sign up with Groupon to a) bring in
          more new customers and b) to have customers come
          back again and again...
      ●   Available Merchant Inventory: Groupon might not want
          good deals to sell out fast, so as to maintain a rich
          inventory of good deals at all times. Groupon might also
          want to reserve these good deals to activate/reactivate
          users by limiting their exposure to power users. This
          calls for some measure that represents the selling
          rate/inventory left.
      ●   Exposure to categories: A combination of a user’s
          affinity to explore and the exploration level in the deal’s
          category. We might want to do more exploration for
          power users to gain more confidence in a deal’s
          performance, but not so much for less active users.

10.       Conclusion
In this paper, we first described considerations we took at
Groupon when defining an objective function designed to
calibrate the score to meet the needs of multiple stakeholders in
the company’s two-sided deal marketplace. We then described the
logic behind the multi-objective scorer which is part of Groupon’s
current ranking pipeline. Subsequently, we provided a simplified
formulation of the objective function, making it more principled
and centered around the concept of expected gain. To optimize the
output ranked list of deal-impressions, the function produces a
per-deal bid/score that represents the expected gain (in $ amount)
for each deal-impression based on given goals/actions and the
probability of achieving such goals.
Focusing first on maximizing conversion and financial value, we
went ahead and defined Operational Value (OV) as a unified
calculation of value per deal to be plugged into the simplified
objective function. We then trained, built and evaluated a separate
machine-learned Gradient Boosted Machine (GBM) model to
estimate the percentage of users exposed to open/closed discounts,
a key component in the OV estimation.
Finally, we reported experimental results and discussed future
directions.

REFERENCES
[1]   Goodhart’s Law
      [https://towardsdatascience.com/unintended-consequences-and-goodharts-law-68d60a94705c]
[2]   Alexandros Karatzoglou, Linas Baltrunas, and Yue Shi. 2013. Learning to rank
      for recommender systems. In Proceedings of the 7th ACM conference on
      Recommender systems (RecSys '13). ACM, New York, NY, USA, 493-494.
[3]   Adomavičius, G., Mobasher, B., Ricci, F., & Tuzhilin, A. (2011). Context-Aware
      Recommender Systems. AI Magazine, 32(3), 67-80.
[4]   Si ying Diana Hu and Joaquin Delgado. 2015. Scalable Recommender
      Systems: Where Machine Learning Meets Search. In Proceedings of the 9th
      ACM Conference on Recommender Systems (RecSys '15). ACM, New York,
      NY, USA, 365-366.
[5]   Delgado, Joaquin A. Scalable Advertising * Recommender Systems. ACM
      Bay Area Professional Chapter Talk:
      [https://www.slideshare.net/joaquindelgado1/scalable-advertising-recommender-systems]
      [https://www.youtube.com/watch?v=zxYDaI1vu-0]
[6]   Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix Factorization
      Techniques for Recommender Systems. Computer 42, 8 (August 2009), 30-37.
[7]   Applying Deep Learning to Related Pins, Pinterest Engineering
      [https://medium.com/the-graph/applying-deep-learning-to-related-pins-a6fee3c92f5e]
[8]   Embeddings@Twitter, Twitter Engineering
      [https://blog.twitter.com/engineering/en_us/topics/insights/2018/embeddingsattwitter.html]
[9]   Jerome H. Friedman. 2002. Stochastic gradient boosting. Comput. Stat. Data
      Anal. 38, 4 (February 2002), 367-378.
[10]  Rodríguez, M., Posse, C., & Zhang, E. (2012). Multiple objective optimization
      in recommender systems. In Proceedings of the sixth ACM conference on
      Recommender systems (RecSys '12).
[11]  Mean Average Precision (MAP)
      [https://en.wikipedia.org/wiki/Evaluation_measures_(information_retrieval)#Mean_average_precision]
[12]  Normalized Discounted Cumulative Gain (NDCG)
      [https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG]


DISCLAIMER
This paper has been kept intentionally broad and does not describe
in detail any specific product feature nor does it promise the
delivery of one. It bears no direct influence on the Relevance
development roadmap or any other Groupon products for that
matter. It is a research paper, exploratory in nature, that represents
the discussions and ideas solely attributed to the authors and does
not represent the views, plans, policies or practices of Groupon.
As used herein, “we” and “our” means the authors of this paper
and not Groupon or any of its subsidiaries.