Online Learning of a Ranking Formula for Revenue and Advertiser ROI Optimization

Or Levi
ebay/Marktplaats
olevi@ebay.com

ABSTRACT

A standard model for sponsored search comprises ranking ads by their expected revenue, that is, the product of their bid price and estimated click-through rate (CTR). In this work, we introduce two complementary use cases for ranking ads at an online classifieds site and aim to optimize a ranking formula which extends the traditional one.

First, we address the task of ranking ads on the search results page for revenue optimization. While most works address this challenge by improving CTR estimation, we consider the effectiveness of the CTR estimation as a given and presume that if CTR estimation is somewhat ineffective, it can be compensated by applying a larger weight to the bid factor.

Second, we aim to improve advertiser return on investment (ROI) while keeping a similar level of revenues for ads ranking on the home page feed. To this end, we introduce into the standard ranking formula a factor that favors ads with a higher click-out rate and serves as an effective tie-breaker in cases of two competing ads with relatively similar revenue expectations.

To optimize the ranking formula, for each case, we propose an online learning procedure in a multi-armed bandit setting. Empirical evaluation attests to the merits of this approach compared to the existing ranking in production, which is based on the traditional formula, and validates our reasoning, first, regarding the relationship between CTR estimation effectiveness and the learned weights, and second, on the contribution of the click-out factor to the increase in advertiser ROI.

KEYWORDS

Online Learning-to-Rank, Sponsored Search

ACM Reference format:
Or Levi. 2017. Online Learning of a Ranking Formula for Revenue and Advertiser ROI Optimization. In Proceedings of the First International Workshop on LEARning Next gEneration Rankers (LEARNER '17), Amsterdam, The Netherlands, October 1, 2017, 4 pages.

LEARNER'17, October 1, 2017, Amsterdam, The Netherlands. Copyright ©2017 for this paper by its authors. Copying permitted for private and academic purposes. ICTIR '17, Amsterdam, Netherlands.

1 INTRODUCTION

Sponsored search is a major monetization source for commercial search engines. The ranking of sponsored ads determines which ads will be displayed and in which order, and thus plays a crucial part in the optimization of revenues, user experience and advertiser efficiency.

Our work aims to optimize a formula for ranking ads at Marktplaats.nl, one of the largest sites in the ebay classifieds group. The site employs a pay-per-click advertising model, where advertisers bid for ads to be displayed and are charged by the bid amount, also known as cost-per-click (CPC), once a user clicks on an ad, which in turn generates revenue for the site. Ads can appear on multiple devices, including desktop, mobile applications and tablets, and at different placements on the site, such as the top of the search results page, interleaved between organic results, and also on the home page feed. After clicking on an ad, users visit the view item page (VIP), where they can click-out to the advertiser website (Figure 1). Advertiser return on investment (ROI) is directly related to the cost per user click-out, also known as cost-per-action (CPA), calculated by dividing the total cost by the total number of click-outs. Hence, optimizing advertiser ROI is equivalent to minimizing the CPA. This setting reflects a potential conflict between the interests of the site and advertisers, which we aim to balance.

Similar to the standard model of sponsored search, ads in our system are ranked by multiplying their bid and estimated CTR. The same bid price applies for ranking an ad independent of query keywords and across all devices and placements. An ad's CTR estimation is calculated using past click-through data, independent of query keywords but separately per each device and placement, as those might exhibit inherently different user behavior. While bids are exact and given by advertisers, CTRs are difficult to estimate, because clicks are rare events and new ads frequently enter the system. Specifically, in our setting, click-through data for tablets is relatively sparse, which can result in less effective estimation.

We introduce two use cases for optimization of ads ranking. In the first use case, we aim to optimize revenues for ads ranking on the search results page. A major line of research in sponsored search focuses on improving CTR estimation [1, 5, 6], which is also a major focus of our future work. In this paper, however, we take a different view. We treat the CTR estimation effectiveness as a given and acknowledge that it varies across the multiple devices and placements. Consequently, we re-examine the standard ranking formula and presume that applying a weighting scheme, where the bid factor is more dominant than the CTR estimation, can yield superior revenues. Moreover, we presume that the less effective the CTR estimation, the larger the weight that should be applied to the bid, and the inversely smaller the weight to the CTR.

In the second use case, we consider ads ranking on the home page feed. The majority of the feed traffic is on the mobile apps, where click-out rates are significantly lower than on desktop. Therefore, in this task we aim to improve advertiser ROI while maintaining relatively similar revenues. We propose that in cases where two competing ads have relatively similar revenue expectations, advertiser efficiency will be increased by favoring the ad with a higher historical click-out rate, and introduce this factor into the ranking formula.

Figure 1: Marktplaats.nl sponsored search. Sponsored results are displayed at the top of the search results page (SRP).
Clicking on a result opens the view item page (VIP), where a user might click-out to an advertiser website.

These two use cases reflect our aspiration to optimize near-term revenues while improving advertiser ROI to sustain business relationships in the longer term.

Inspired by recent approaches that model online learning-to-rank as a contextual multi-armed bandit problem [2], we treat the challenge of optimizing weights for the ranking formula as an online learning process. Under this model, we attempt to learn the best action, that is, weights for the ranking formula, per each context, namely device and placement on the site, while observing revenues resulting from user clicks.

We show, through an empirical evaluation, that our approach in both use cases outperforms the existing ranking in production, which is based on the traditional ranking formula. The evaluation also validates the underlying premises of our approach. In the revenue optimization task, we point to the correlation between the CTR estimation effectiveness and the weights learned across the different devices and placements. In the task of optimizing advertiser ROI while keeping revenues unchanged, we show the contribution of the click-out factor to the increase in advertiser ROI.

2 RELATED WORK

Works in sponsored search that address the challenge of revenue optimization mostly focus on improving click-through rate estimation [1, 5, 6]. Predicting CTR for ads is typically based on machine-learned models trained using past click-through data. Examples of such models are logistic regression [1, 5], probit regression [6] and boosted trees [3]. These models employ multiple features that might affect the probability of a user clicking on an ad, such as textual match between the user's query and ad content, historical ad performance and personal user preferences. There has also been some work on employing learning-to-rank methods for the CTR estimation task [8], where a statistical model is learned offline.

Our work, on the contrary, treats the CTR estimation effectiveness as a given, and aims to maximize revenues through online learning of a ranking formula that applies a larger weight to the bid factor to compensate for possibly ineffective CTR estimation.

The most relevant work to ours is that of Lahaie and Pennock [4]. It showed that applying an exponent substantially smaller than one to the CTR estimation can yield superior revenue in equilibrium under certain conditions. There are several differences from our setting, such that it cannot be compared as a baseline. The main difference is that they study keyword auctions, where the bid and CTR estimation are keyword specific, and so is the fine-tuned exponent. In contrast, our ranking function employs weights for both the CTR and the bid factor, which are optimized through online learning, independent of query keywords, and in the context of each placement on our site. Consequently, we demonstrate through an empirical evaluation that the optimized weights are not affected by keyword-specific conditions, but rather by the degree of CTR estimation effectiveness, which varies across the different placements.

Advertiser efficiency in sponsored search is generally not considered as a separate objective. The common premise is that advertiser revenues are directly related to CTR, and thus improving CTR estimation also increases efficiency with respect to advertisers. Wang et al. [7] recognized that the objectives of the site and advertisers are not always consistent and proposed to model ads ranking as a multi-objective optimization problem. However, similar to the common premise, they also used CTR as the objective for optimizing advertiser utility. In our work, we focus on balancing revenues and advertiser ROI, wherein the latter is related more directly to cost-per-action (CPA) than to CTR. CPA is affected to a large degree by the effectiveness of the advertiser view item page, but can also be improved directly through the ranking formula, as we demonstrate in the next sections.

3 METHOD

We revisit the traditional ranking formula in sponsored search, where the ranking score of an ad is equal to its bid multiplied by its estimated CTR, and introduce the following weighting scheme:

    CPC^w1 * CTR^w2    (1)

such that w1 + w2 = 1.

For the revenue optimization task, we presume that if CTR estimation is somewhat ineffective, it can be compensated by applying a larger weight to the bid factor. Accordingly, we study sets of weights where w1 > w2. It can be expected that reducing the CTR weight in the ranking would result in a lower click-through rate. However, if the CTR estimation is indeed ineffective, this drop should be relatively mild and should be more than compensated by an increase in average CPC, yielding higher revenues.

For the task of balancing revenues and advertiser ROI, we employ the traditional product of CPC and CTR, which captures the expected revenue from an ad, and introduce a third factor into the formula to capture an ad's click-out rate:

    CPC * CTR * w1 + CLICK-OUT * w2    (2)

such that w1 + w2 = 1. Our premise is that the click-out factor can serve as a tie-breaker in cases where two competing ads have relatively similar revenue expectations, as illustrated in Figure 2. An effective tie-breaker will allow us to keep revenues relatively stable while significantly increasing advertiser efficiency.

Figure 2: Example of two ads with relatively similar revenue expectation but a large difference in click-out rate. In these cases, using the click-out factor we can display more ads with a higher click-out rate to improve advertiser ROI.

For each of these two ranking formulas, our aim is to find an optimal set of weights per each context, namely a certain device and placement on our site. We model this challenge as a contextual multi-armed bandit problem, where each slot represents a set of weights, and our objective is to play the best slot for each context. The best slot would be the one that maximizes the expected reward, that is, revenues in the case of Formula 1, or the ratio of advertiser efficiency to revenues in the case of Formula 2. We also consider the effect on the relevancy of the results to users and impose a bound on the click-through rate drop that would be tolerated.

To optimize the weights, we devise a split test setting, with equally sized buckets representing different sets of weights, and propose the following online learning procedure with an epsilon-first strategy of pure exploration followed by pure exploitation.

First, we set weights as per the above-mentioned premises. Specifically, for the revenue optimization task, the weight of the bid factor is set to values in {0.6, 0.7, 0.8, 0.9}, based on our presumption that the bid factor should have a larger weight than that of the CTR estimation factor. For the task of balancing revenues and advertiser ROI, we set the weight of the click-out rate to values in {0.025, 0.05, 0.075}. These relatively minor weights reflect our premise that this factor should serve as a marginal tie-breaker.

Next, we observe each slot's performance over a one-week period, to account for seasonal factors, and select the weighting scheme that achieves the maximal improvement with a statistically significant difference to the baseline, and such that click-through rate does not drop by more than 10%. Our main performance objective is revenue per mille impressions (RPM) in the revenue optimization task, and the ratio of CPA to RPM for advertiser ROI optimization. If no set of weights meets these criteria, we keep the baseline, which is the traditional sponsored search ranking formula.

Finally, we also consider more fine-grained weights adjacent to the best performing solution. If, for example, a bid factor weight of 0.6 has given the best performance, we consider weights of 0.55 and 0.65, and run another iteration with the best performing solution and the fine-grained weights.

4 EVALUATION

We evaluate our methods using the following online experiments on the classifieds site Marktplaats.nl. Each of the two use cases we introduced in Section 1 is evaluated separately.

First, we optimize the weights for each device and placement following the procedure described in Section 3. The learning phase takes two weeks, one week for initial weights and one week for fine-grained weights. Training data overall includes more than 50 million impressions and more than 1 million unique ads. Subsequent to the learning phase, we run an online A/B test to evaluate the ranking formula with the learned weights against a baseline of the traditional sponsored search ranking formula, which is the existing method in production. Each alternative is assigned an equal share of the traffic, divided randomly by user ID. Lastly, we collect data for the evaluation over a one-month period and report the following measures: revenue per mille impressions, click-through rate, average cost-per-click and click-out rate. Statistical significance of performance differences is determined using a two-tailed paired t-test with p = 0.05.

Table 1: Performance on the revenue optimization task. Change in click-through rate (CTR), average cost-per-click (Avg CPC), revenue per mille impressions (RPM) and click-out rate between ranking using the learned weights and a baseline of the traditional ranking formula, for five different devices and two placements on the SRP, at the top and interleaved with organic results. Overall performance changes that are statistically significant are marked with '*'.

    Device          Placement  CTR      Avg CPC  RPM     Click-Out
    Desktop         Top        -1.10%    3.43%   2.29%   -1.05%
    Desktop         Inter.     -2.87%    6.89%   3.82%    1.06%
    iPhone-App      Top        -1.92%    4.40%   2.40%    0.24%
    iPhone-App      Inter.     -8.41%   10.15%   0.89%    1.59%
    Android-App     Top        -4.05%    8.93%   4.52%    3.86%
    Android-App     Inter.     -7.43%   11.99%   3.67%    3.87%
    iOS-Tablet      Top        -4.72%   10.17%   4.97%    2.07%
    iOS-Tablet      Inter.     -2.09%    4.30%   2.13%    4.33%
    Android-Tablet  Top        -4.39%   10.59%   5.73%    2.05%
    Android-Tablet  Inter.     -3.39%    6.19%   2.59%    4.71%
    Overall                    -3.18%*   6.43%*  3.05%*   1.66%

Table 1 presents the performance on the revenue optimization task. As expected following the overweighting of the bid factor, click-through rate drops for all the devices and placements, but this is more than compensated by an increase in average CPC, such that overall revenues increase. We see an increase in RPM across all the devices and placements, contributing to a statistically significant increase of 3% in overall RPM.

Table 2: CTR estimation effectiveness, measured by AUC, and the best performing weight of the bid factor per each device and placement.

    Device          Placement    AUC    Best Performing Bid Weight
    Desktop         Top          0.723  0.55
    Desktop         Interleaved  0.698  0.6
    iPhone App      Top          0.715  0.55
    iPhone App      Interleaved  0.678  0.65
    Android App     Top          0.704  0.6
    Android App     Interleaved  0.634  0.7
    iOS Tablet      Top          0.541  0.8
    iOS Tablet      Interleaved  0.552  0.8
    Android Tablet  Top          0.571  0.8
    Android Tablet  Interleaved  0.538  0.8

Next, we study the correlation between the effectiveness of CTR estimation per each device and placement, and the best performing weight of the bid factor. To evaluate the CTR estimation effectiveness we use the AUC measure calculated on past click data. Table 2 presents the AUC and best performing weight per each device and placement. We see a clear correlation between the two (-0.985 Pearson correlation); specifically, the less effective the CTR estimation (lower AUC), the larger the weight of the bid factor. As expected, CTR estimation for tablets is substantially less effective than for desktop and mobile apps, due to click sparsity. Accordingly, they are assigned the largest bid factor weights. We also see that the AUC for the interleaved results is generally slightly lower than that of the top results, and the learned bid factor weight for them is generally larger.

For the task of balancing revenues and advertiser ROI on the home page feed using ranking Formula 2, we present the overall RPM and CPA performance for three weights of the click-out factor (Figure 3). As expected, RPM drops in all three alternatives, but the drop in CPA is much more substantial, which presents an attractive trade-off for improving advertiser ROI, especially in the low-weight alternative, where revenues are essentially unchanged. Moreover, we can see that the larger the weight of the click-out factor, the larger the drop in CPA, or equivalently, the more advertiser ROI is improved.

Figure 3: Performance on the task of balancing revenues and advertiser ROI. Change in overall revenue per mille impressions (RPM) and cost-per-action (CPA) for three weights of the click-out factor in Formula 2, compared to a baseline with no click-out factor. Note that a CPA drop is equivalent to an improvement in advertiser ROI. Performance changes that are statistically significant are marked with '*'.

5 CONCLUSIONS

We introduced two use cases for ads ranking at a classifieds website that complement each other as part of our aspiration to optimize revenues in the near term while improving advertiser ROI to sustain long-term business. To address these challenges, we re-examine the traditional sponsored search ranking formula, introducing a weighting scheme where the bid factor is more dominant, to compensate for CTR estimation ineffectiveness, in the first case, and introducing a click-out factor as a tie-breaker, in the second one. Subsequently, we proposed an online learning procedure in a multi-armed bandit setting to optimize the ranking formula for each case. Online experiments showed that our ranking formula with the learned weights outperforms a baseline of the traditional ranking formula, which is currently implemented in production. Furthermore, we demonstrated that the underlying premises of our approach are evident in the results. First, the less effective the CTR estimation for a specific device or placement, the larger the learned weight of the bid factor. Second, setting minor weights for the click-out factor can serve to increase advertiser ROI directly through the ranking formula, and the larger the weight, the bigger the increase in advertiser ROI.

As avenues for future work, we plan to extend the learning method to online Bayesian bandits. In addition, we plan to test more advanced CTR estimation methods and subsequently revisit this analysis. It can be expected that once more effective CTR estimation is reached, it would be less beneficial to outweigh the bid factor, if at all.

REFERENCES
[1] H. Cheng and E. Cantu-Paz. 2010. Personalized click prediction in sponsored search. In Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM).
[2] A. Grotov and M. de Rijke. 2016. Online Learning to Rank for Information Retrieval. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1215-1218.
[3] I. Trofimov, A. Kornetova, and V. Topinskiy. 2012. Using boosted trees for click-through rate prediction for sponsored search. In Proceedings of the Sixth International Workshop on Data Mining for Online Advertising and Internet Economy (ADKDD).
[4] S. Lahaie and D. M. Pennock. 2007. Revenue analysis of a family of ranking rules for keyword auctions. In Proceedings of the 8th ACM Conference on Electronic Commerce. 50-56.
[5] M. Richardson, E. Dominowska, and R. Ragno. 2007. Predicting clicks: estimating the click-through rate for new ads. In Proceedings of the 16th International Conference on World Wide Web (WWW '07). 521-530.
[6] T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich. 2010. Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine. In Proceedings of the 27th International Conference on Machine Learning.
[7] Y. Wang, B. Wei, J. Yan, Z. Chen, and Q. Du. 2012. Multi-objective optimization for sponsored search. In Proceedings of the Sixth International Workshop on Data Mining for Online Advertising and Internet Economy.
[8] Y. Zhu, G. Wang, J. Yang, D. Wang, J. Yan, J. Hu, and Z. Chen. 2009. Optimizing search engine revenue in sponsored search. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 588-595.
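The two ranking formulas of Section 3 and the selection step of the epsilon-first procedure can be sketched in a few lines. The code below is our own illustrative reconstruction, not the production system; the function names and the example numbers are hypothetical, and the selection rule simplifies the paper's procedure by omitting the significance test and keeping only the RPM objective and the CTR-drop bound.

```python
def score_revenue(cpc, ctr, w1):
    """Formula 1: CPC^w1 * CTR^w2, with w1 + w2 = 1.
    With w1 = w2 = 0.5 this preserves the traditional bid*CTR ordering;
    w1 > 0.5 shifts weight toward the bid."""
    return (cpc ** w1) * (ctr ** (1.0 - w1))


def score_roi(cpc, ctr, click_out, w2):
    """Formula 2: expected revenue (CPC * CTR) plus a small click-out
    tie-breaker, with w1 + w2 = 1 and w2 kept minor (e.g. 0.025-0.075)."""
    return cpc * ctr * (1.0 - w2) + click_out * w2


def select_weight(arms, baseline_rpm, baseline_ctr, max_ctr_drop=0.10):
    """Exploitation step of an epsilon-first bandit: given per-arm
    observations {weight: (observed RPM, observed CTR)} from the
    exploration phase, pick the weight with maximal RPM whose CTR
    dropped at most max_ctr_drop relative to the baseline.
    Returns None to signal keeping the baseline formula."""
    best, best_rpm = None, baseline_rpm
    for w, (rpm, ctr) in arms.items():
        if ctr >= baseline_ctr * (1.0 - max_ctr_drop) and rpm > best_rpm:
            best, best_rpm = w, rpm
    return best


# Two ads with equal expected revenue (1.0 * 0.1 == 0.5 * 0.2): under
# Formula 1 with a bid weight above 0.5, the higher-bid ad ranks first.
print(score_revenue(1.0, 0.1, 0.6) > score_revenue(0.5, 0.2, 0.6))   # True

# Under Formula 2 the same revenue tie is broken by the click-out rate.
print(score_roi(1.0, 0.1, 0.5, 0.05) > score_roi(0.5, 0.2, 0.1, 0.05))  # True

# The 0.8 arm has the highest RPM but is excluded by the 10% CTR bound.
arms = {0.6: (1.05, 0.95), 0.7: (1.10, 0.93), 0.8: (1.15, 0.85)}
print(select_weight(arms, baseline_rpm=1.0, baseline_ctr=1.0))  # 0.7
```

A second iteration over fine-grained arms (here, 0.65 and 0.75 around the winner 0.7) would then follow, as described in Section 3.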