Online Learning of a Ranking Formula for Revenue and Advertiser ROI Optimization

Or Levi
ebay/Marktplaats
olevi@ebay.com

ABSTRACT

A standard model for sponsored search comprises ranking ads by their expected revenue, that is, the product of their bid price and estimated click-through rate (CTR). In this work, we introduce two complementary use cases for ranking ads at an online classifieds site and aim to optimize a ranking formula which extends the traditional one.

First, we address the task of ranking ads on the search results page for revenue optimization. While most works address this challenge by improving CTR estimation, we consider the effectiveness of the CTR estimation as a given and presume that if CTR estimation is somewhat ineffective, it can be compensated by applying a larger weight to the bid factor.

Second, we aim to improve advertiser return on investment (ROI) while keeping a similar level of revenues for ads ranking on the home page feed. To this end, we introduce into the standard ranking formula a factor that favors ads with a higher click-out rate and serves as an effective tie-breaker in cases of two competing ads with relatively similar revenue expectations.

To optimize the ranking formula, for each case, we propose an online learning procedure in a multi-armed bandit setting. Empirical evaluation attests to the merits of this approach compared to the existing ranking in production, which is based on the traditional formula, and validates our reasoning, first, regarding the relationship between CTR estimation effectiveness and the learned weights, and second, on the contribution of the click-out factor to the increase in advertiser ROI.

KEYWORDS

Online Learning-to-Rank, Sponsored Search

ACM Reference format:
Or Levi. 2017. Online Learning of a Ranking Formula for Revenue and Advertiser ROI Optimization. In Proceedings of the First International Workshop on LEARning Next gEneration Rankers (LEARNER '17), Amsterdam, The Netherlands, October 1, 2017, 4 pages.

LEARNER'17, October 1, 2017, Amsterdam, The Netherlands. Copyright ©2017 for this paper by its authors. Copying permitted for private and academic purposes. ICTIR '17, Amsterdam, Netherlands.

1 INTRODUCTION

Sponsored search is a major monetization source for commercial search engines. The ranking of sponsored ads determines which ads will be displayed and in which order, and thus plays a crucial part in the optimization of revenues, user experience and advertiser efficiency.

Our work aims to optimize a formula for ranking ads at Marktplaats.nl, one of the largest sites in the ebay classifieds group. The site employs a pay-per-click advertising model, where advertisers bid for ads to be displayed and are charged by the bid amount, also known as cost-per-click (CPC), once a user clicks on an ad, which in turn generates revenue for the site. Ads can appear on multiple devices, including desktop, mobile applications and tablets, and at different placements on the site, such as the top of the search results page, interleaved between organic results, and also on the home page feed. After clicking on an ad, users visit the view item page (VIP), where they can click-out to the advertiser website (Figure 1). Advertiser return on investment (ROI) is directly related to the cost per user click-out, also known as cost-per-action (CPA), calculated by dividing the total cost by the total number of click-outs. Hence, optimizing advertiser ROI is equivalent to minimizing the CPA. This setting reflects a potential conflict between the interests of the site and advertisers, which we aim to balance.

Similar to the standard model of sponsored search, ads in our system are ranked by multiplying their bid and estimated CTR. The same bid price applies for ranking an ad independent of query keywords and across all devices and placements. An ad's CTR estimation is calculated using past click-through data, independent of query keywords but separately per each device and placement, as those might exhibit inherently different user behavior. While bids are exact and given by advertisers, CTRs are difficult to estimate, because clicks are rare events and new ads frequently enter the system. Specifically, in our setting, click-through data for tablets is relatively sparse, which can result in less effective estimation.

We introduce two use cases for optimization of ads ranking. In the first use case, we aim to optimize revenues for ads ranking on the search results page. A major line of research in sponsored search focuses on improving CTR estimation [1, 5, 6], which is also a major focus of our future work. In this paper, however, we take a different view. We treat the CTR estimation effectiveness as a given and acknowledge that it varies across the multiple devices and placements. Consequently, we re-examine the standard ranking formula and presume that applying a weighting scheme, where the bid factor is more dominant than the CTR estimation, can yield superior revenues. Moreover, we presume that the less effective the CTR estimation, the larger the weight that should be applied to the bid, and the inversely smaller the weight to the CTR.

In the second use case, we consider ads ranking on the home page feed. The majority of the feed traffic is on the mobile apps, where click-out rates are significantly lower than on desktop. Therefore, in this task we aim to improve advertiser ROI while maintaining relatively similar revenues. We propose that in cases where two competing ads have relatively similar revenue expectations, advertiser efficiency will be increased by favoring the ad with a higher historical click-out rate, and introduce this factor into the ranking formula.

Figure 1: Marktplaats.nl sponsored search. Sponsored results are displayed at the top of the search results page (SRP).
Clicking on a result opens the view item page (VIP), where a user might click-out to an advertiser website.

These two use cases reflect our aspiration to optimize near-term revenues while improving advertiser ROI to sustain business relationships in the longer term.

Inspired by recent approaches that model online learning-to-rank as a contextual multi-armed bandit problem [2], we treat the challenge of optimizing weights for the ranking formula as an online learning process. Under this model, we attempt to learn the best action, that is, weights for the ranking formula, per each context, namely device and placement on the site, while observing revenues resulting from user clicks.

We show, through an empirical evaluation, that our approach in both use cases outperforms the existing ranking in production, which is based on the traditional ranking formula. The evaluation also validates the underlying premises of our approach. In the revenue optimization task, we point to the correlation between the CTR estimation effectiveness and the weights learned across the different devices and placements. In the task of optimizing advertiser ROI while keeping revenues unchanged, we show the contribution of the click-out factor to the increase in advertiser ROI.

2 RELATED WORK

Works in sponsored search that address the challenge of revenue optimization mostly focus on improving click-through rate estimation [1, 5, 6]. Predicting CTR for ads is typically based on machine-learned models trained using past click-through data. Examples of such models are logistic regression [1, 5], probit regression [6] and boosted trees [3]. These models employ multiple features that might affect the probability of a user clicking on an ad, such as textual match between the user's query and ad content, historical ad performance and personal user preferences. There has also been some work on employing learning-to-rank methods for the CTR estimation task [8], where a statistical model is learned offline.

Our work, on the contrary, treats the CTR estimation effectiveness as a given, and aims to maximize revenues through online learning of a ranking formula that applies a larger weight to the bid factor to compensate for possibly ineffective CTR estimation.

The most relevant work to ours is that of Lahaie and Pennock [4]. It showed that applying an exponent substantially smaller than one to the CTR estimation can yield superior revenue in equilibrium under certain conditions. There are several differences from our setting, such that it cannot be compared as a baseline. The main difference is that they study keyword auctions, where the bid and CTR estimation are keyword specific, and so is the fine-tuned exponent. In contrast, our ranking function employs weights for both the CTR and the bid factor, which are optimized through online learning, independent of query keywords, and in the context of each placement on our site. Consequently, we demonstrate through an empirical evaluation that the optimized weights are not affected by keyword-specific conditions, but rather by the degree of CTR estimation effectiveness, which varies across the different placements.

Advertiser efficiency in sponsored search is generally not considered as a separate objective. The common premise is that advertiser revenues are directly related to CTR, and thus improving CTR estimation also increases efficiency with respect to advertisers. Wang et al. [7] recognized that the objectives of the site and advertisers are not always consistent and proposed to model ads ranking as a multi-objective optimization problem. However, similar to the common premise, they also used CTR as the objective for optimizing advertiser utility. In our work, we focus on balancing revenues and advertiser ROI, wherein the latter is related more directly to cost-per-action (CPA) than to CTR. CPA is affected to a large degree by the effectiveness of the advertiser view item page, but can also be improved directly through the ranking formula, as we demonstrate in the next sections.

3 METHOD

We revisit the traditional ranking formula in sponsored search, where the ranking score of an ad is equal to its bid multiplied by its estimated CTR, and introduce the following weighting scheme:

    CPC^w1 * CTR^w2    (1)

such that w1 + w2 = 1.

For the revenue optimization task, we presume that if CTR estimation is somewhat ineffective, it can be compensated by applying a larger weight to the bid factor. Accordingly, we study sets of weights where w1 > w2. It can be expected that reducing the CTR weight in the ranking would result in a lower click-through rate. However, if the CTR estimation is indeed ineffective, this drop should be relatively mild and should be more than compensated by an increase in average CPC, yielding higher revenues.

For the task of balancing revenues and advertiser ROI, we employ the traditional product of CPC and CTR, which captures the expected revenue from an ad, and introduce a third factor into the formula to capture an ad's click-out rate:

    CPC * CTR * w1 + CLICK-OUT * w2    (2)

such that w1 + w2 = 1. Our premise is that the click-out factor can serve as a tie-breaker in cases where two competing ads have relatively similar revenue expectations, as illustrated in Figure 2. An effective tie-breaker will allow us to keep revenues relatively stable while significantly increasing advertiser efficiency.

Figure 2: Example of two ads with relatively similar revenue expectation but a large difference in click-out rate. In these cases, using the click-out factor we can display more ads with a higher click-out rate to improve advertiser ROI.

For each of these two ranking formulas, our aim is to find an optimal set of weights per each context, namely a certain device and placement on our site. We model this challenge as a contextual multi-armed bandit problem, where each slot represents a set of weights, and our objective is to play the best slot for each context. The best slot would be the one that maximizes the expected reward, that is, revenues in the case of Formula 1, or the ratio of advertiser efficiency to revenues in the case of Formula 2. We also consider the effect on the relevancy of the results to users and impose a bound on the click-through rate drop that would be tolerated.

To optimize the weights, we devise a split test setting, with equally sized buckets representing different sets of weights, and propose the following online learning procedure with an epsilon-first strategy of pure exploration followed by pure exploitation.

First, we set weights as per the above-mentioned premises. Specifically, for the revenue optimization task, the weight of the bid factor is set to values in {0.6, 0.7, 0.8, 0.9}, based on our presumption that the bid factor should have a larger weight than that of the CTR estimation factor. For the task of balancing revenues and advertiser ROI, we set the weight of the click-out rate to values in {0.025, 0.05, 0.075}. These relatively minor weights reflect our premise that this factor should serve as a marginal tie-breaker.

Next, we observe each slot's performance over a one-week period, to account for seasonal factors, and select the weighting scheme that achieves the maximal improvement with a statistically significant difference to the baseline, and such that click-through rate does not drop by more than 10%. Our main performance objective is revenue per mille impressions (RPM) in the revenue optimization task, and the ratio of CPA to RPM for advertiser ROI optimization. If no set of weights meets these criteria, we keep the baseline, which is the traditional sponsored search ranking formula.

Finally, we also consider more fine-grained weights adjacent to the best performing solution. If, for example, a bid factor weight of 0.6 has given the best performance, we consider weights of 0.55 and 0.65, and run another iteration with the best performing solution and the fine-grained weights.

4 EVALUATION

We evaluate our methods using the following online experiments on the classifieds site Marktplaats.nl. Each of the two use cases we introduced in Section 1 is evaluated separately.

First, we optimize the weights for each device and placement following the procedure described in Section 3. The learning phase takes two weeks, one week for initial weights and one week for fine-grained weights. Training data overall includes more than 50 million impressions and more than 1 million unique ads. Subsequent to the learning phase, we run an online A/B test to evaluate the ranking formula with the learned weights against a baseline of the traditional sponsored search ranking formula, which is the existing method in production. Each alternative is assigned an equal share of the traffic, divided randomly by user ID. Lastly, we collect data for the evaluation over a one-month period and report the following measures: revenue per mille impressions, click-through rate, average cost-per-click and click-out rate. Statistical significance of performance differences is determined using a two-tailed paired t-test with p = 0.05.

Table 1: Performance on the revenue optimization task. Change in click-through rate (CTR), average cost-per-click (Avg CPC), revenue per mille impressions (RPM) and click-out rate between ranking using the learned weights and a baseline of the traditional ranking formula, for five different devices and two placements on the SRP, at the top and interleaved with organic results. Overall performance changes that are statistically significant are marked with '*'.

    Device          Placement  CTR      Avg CPC  RPM     Click-Out
    Desktop         Top        -1.10%    3.43%   2.29%   -1.05%
    Desktop         Inter.     -2.87%    6.89%   3.82%    1.06%
    iPhone-App      Top        -1.92%    4.40%   2.40%    0.24%
    iPhone-App      Inter.     -8.41%   10.15%   0.89%    1.59%
    Android-App     Top        -4.05%    8.93%   4.52%    3.86%
    Android-App     Inter.     -7.43%   11.99%   3.67%    3.87%
    iOS-Tablet      Top        -4.72%   10.17%   4.97%    2.07%
    iOS-Tablet      Inter.     -2.09%    4.30%   2.13%    4.33%
    Android-Tablet  Top        -4.39%   10.59%   5.73%    2.05%
    Android-Tablet  Inter.     -3.39%    6.19%   2.59%    4.71%
    Overall                    -3.18%*   6.43%*  3.05%*   1.66%

Table 1 presents the performance on the revenue optimization task. As expected following the overweighting of the bid factor, click-through rate drops for all the devices and placements, but this is more than compensated by an increase in average CPC, such that overall revenues increase. We see an increase in RPM across all the devices and placements, contributing to a statistically significant increase of 3% in overall RPM.

Table 2: CTR estimation effectiveness, measured by AUC, and the best performing weight of the bid factor per each device and placement.

    Device          Placement    AUC    Best Performing Bid Weight
    Desktop         Top          0.723  0.55
    Desktop         Interleaved  0.698  0.6
    iPhone App      Top          0.715  0.55
    iPhone App      Interleaved  0.678  0.65
    Android App     Top          0.704  0.6
    Android App     Interleaved  0.634  0.7
    iOS Tablet      Top          0.541  0.8
    iOS Tablet      Interleaved  0.552  0.8
    Android Tablet  Top          0.571  0.8
    Android Tablet  Interleaved  0.538  0.8

Next, we study the correlation between the effectiveness of CTR estimation per each device and placement, and the best performing weight of the bid factor. To evaluate the CTR estimation effectiveness we use the AUC measure calculated on past click data. Table 2 presents the AUC and best performing weight per each device and placement. We see a clear correlation between the two (-0.985 Pearson correlation); specifically, the less effective the CTR estimation (lower AUC), the larger the weight of the bid factor. As expected, CTR estimation for tablets is substantially less effective than for desktop and mobile apps, due to click sparsity. Accordingly, they are assigned the largest bid factor weights. We also see that the AUC for the interleaved results is generally slightly lower than that of the top results, and the learned bid factor weight for them is generally larger.

For the task of balancing revenues and advertiser ROI on the home page feed using ranking Formula 2, we present the overall RPM and CPA performance for three weights of the click-out factor (Figure 3). As expected, RPM drops in all three alternatives, but the drop in CPA is much more substantial, which presents an attractive trade-off for improving advertiser ROI, especially in the low-weight alternative, where revenues are essentially unchanged. Moreover, we can see that the larger the weight of the click-out factor, the larger the drop in CPA, or equivalently, the more advertiser ROI is improved.

Figure 3: Performance on the task of balancing revenues and advertiser ROI. Change in overall revenue per mille impressions (RPM) and cost-per-action (CPA) for three weights of the click-out factor in Formula 2, compared to a baseline with no click-out factor. Note that a CPA drop is equivalent to an improvement in advertiser ROI. Performance changes that are statistically significant are marked with '*'.

5 CONCLUSIONS

We introduced two use cases for ads ranking at a classifieds website that complement each other as part of our aspiration to optimize revenues in the near term while improving advertiser ROI to sustain long-term business. To address these challenges, we re-examine the traditional sponsored search ranking formula, introducing a weighting scheme where the bid factor is more dominant, to compensate for CTR estimation ineffectiveness, in the first case, and introducing a click-out factor as a tie-breaker, in the second one. Subsequently, we proposed an online learning procedure in a multi-armed bandit setting to optimize the ranking formula for each case. Online experiments showed that our ranking formula with the learned weights outperforms a baseline of the traditional ranking formula, which is currently implemented in production. Furthermore, we demonstrated that the underlying premises of our approach are evident in the results. First, the less effective the CTR estimation for a specific device or placement, the larger the learned weight of the bid factor. Second, setting minor weights for the click-out factor can serve to increase advertiser ROI directly through the ranking formula, and the larger the weight, the bigger the increase in advertiser ROI.

As avenues for future work, we plan to extend the learning method to online Bayesian bandits. In addition, we plan to test more advanced CTR estimation methods and subsequently revisit this analysis. It can be expected that once more effective CTR estimation is reached, it would be less beneficial to outweigh the bid factor, if at all.

REFERENCES
[1] H. Cheng and E. Cantu-Paz. 2010. Personalized click prediction in sponsored search. In Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM).
[2] A. Grotov and M. de Rijke. 2016. Online Learning to Rank for Information Retrieval. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1215-1218.
[3] I. Trofimov, A. Kornetova, and V. Topinskiy. 2012. Using boosted trees for click-through rate prediction for sponsored search. In Proceedings of the Sixth International Workshop on Data Mining for Online Advertising and Internet Economy (ADKDD).
[4] S. Lahaie and D. M. Pennock. 2007. Revenue analysis of a family of ranking rules for keyword auctions. In Proceedings of the 8th ACM Conference on Electronic Commerce. 50-56.
[5] M. Richardson, E. Dominowska, and R. Ragno. 2007. Predicting clicks: estimating the click-through rate for new ads. In Proceedings of the 16th International Conference on World Wide Web (WWW '07). 521-530.
[6] T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich. 2010. Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine. In Proceedings of the 27th International Conference on Machine Learning.
[7] Y. Wang, B. Wei, J. Yan, Z. Chen, and Q. Du. 2012. Multi-objective optimization for sponsored search. In Proceedings of the Sixth International Workshop on Data Mining for Online Advertising and Internet Economy.
[8] Y. Zhu, G. Wang, J. Yang, D. Wang, J. Yan, J. Hu, and Z. Chen. 2009. Optimizing search engine revenue in sponsored search. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 588-595.
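The two ranking formulas of Section 3 and the selection step of the epsilon-first procedure can be sketched in a few lines. The code below is our own illustrative reconstruction, not the production system; the function names and the example numbers are hypothetical, and the selection rule simplifies the paper's procedure by omitting the significance test and keeping only the RPM objective and the CTR-drop bound.

```python
def score_revenue(cpc, ctr, w1):
    """Formula 1: CPC^w1 * CTR^w2, with w1 + w2 = 1.
    With w1 = w2 = 0.5 this preserves the traditional bid*CTR ordering;
    w1 > 0.5 shifts weight toward the bid."""
    return (cpc ** w1) * (ctr ** (1.0 - w1))


def score_roi(cpc, ctr, click_out, w2):
    """Formula 2: expected revenue (CPC * CTR) plus a small click-out
    tie-breaker, with w1 + w2 = 1 and w2 kept minor (e.g. 0.025-0.075)."""
    return cpc * ctr * (1.0 - w2) + click_out * w2


def select_weight(arms, baseline_rpm, baseline_ctr, max_ctr_drop=0.10):
    """Exploitation step of an epsilon-first bandit: given per-arm
    observations {weight: (observed RPM, observed CTR)} from the
    exploration phase, pick the weight with maximal RPM whose CTR
    dropped at most max_ctr_drop relative to the baseline.
    Returns None to signal keeping the baseline formula."""
    best, best_rpm = None, baseline_rpm
    for w, (rpm, ctr) in arms.items():
        if ctr >= baseline_ctr * (1.0 - max_ctr_drop) and rpm > best_rpm:
            best, best_rpm = w, rpm
    return best


# Two ads with equal expected revenue (1.0 * 0.1 == 0.5 * 0.2): under
# Formula 1 with a bid weight above 0.5, the higher-bid ad ranks first.
print(score_revenue(1.0, 0.1, 0.6) > score_revenue(0.5, 0.2, 0.6))   # True

# Under Formula 2 the same revenue tie is broken by the click-out rate.
print(score_roi(1.0, 0.1, 0.5, 0.05) > score_roi(0.5, 0.2, 0.1, 0.05))  # True

# The 0.8 arm has the highest RPM but is excluded by the 10% CTR bound.
arms = {0.6: (1.05, 0.95), 0.7: (1.10, 0.93), 0.8: (1.15, 0.85)}
print(select_weight(arms, baseline_rpm=1.0, baseline_ctr=1.0))  # 0.7
```

A second iteration over fine-grained arms (here, 0.65 and 0.75 around the winner 0.7) would then follow, as described in Section 3.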