=Paper= {{Paper |id=Vol-2410/paper8.pdf |storemode=property |title=Organic Ponies and Sponsored Batteries: A Category-Based CTR Optimization Model |pdfUrl=https://ceur-ws.org/Vol-2410/paper8.pdf |volume=Vol-2410 |authors=Or Levi |dblpUrl=https://dblp.org/rec/conf/sigir/Levi19 }} ==Organic Ponies and Sponsored Batteries: A Category-Based CTR Optimization Model== https://ceur-ws.org/Vol-2410/paper8.pdf
Organic Ponies and Sponsored Batteries: A Category-Based CTR Optimization Model
Or Levi
ebay/Marktplaats
olevi@ebay.com

ABSTRACT
A common challenge for E-commerce sites is the allocation of available digital real estate between organic and sponsored results. While methods for optimizing each type of result in isolation have been extensively studied, selective presentation of these two types to optimize overall performance has been largely unexplored.

Our work aims to address this allocation challenge at Marktplaats.nl, one of the largest sites in the ebay classifieds group. To this end, we explore the interplay between organic and sponsored results across a variety of item categories while reflecting on findings by previous works. We hypothesize that in categories of niche items, such as Ponies, organic results perform better than sponsored results, while in categories of commoditized items, such as Batteries, the opposite is true.

Based on our findings, we propose a simple and adaptive allocation model to improve the overall CTR performance. Empirical evaluation attests to the merits of our model, compared to the existing method in production, with a significantly higher click-through rate for both organic and sponsored results.

For future work, we consider the challenges of optimizing the allocation for profitability, rather than clicks, and taking into account additional factors beyond category, such as personal user preferences.

KEYWORDS
E-commerce, Sponsored Advertising, Click Prediction

ACM Reference format:
Or Levi. 2019. Organic Ponies and Sponsored Batteries: A Category-Based CTR Optimization Model. In Proceedings of the SIGIR 2019 Workshop on eCommerce (SIGIR 2019 eCom), 5 pages.

Copyright © 2019 by the paper’s authors. Copying permitted for private and academic purposes.
In: J. Degenhardt, S. Kallumadi, U. Porwal, A. Trotman (eds.): Proceedings of the SIGIR 2019 eCom workshop, July 2019, Paris, France, published at http://ceur-ws.org

1 INTRODUCTION
E-commerce sites often present users with two types of results: organic and sponsored. Both serve important business functions: organic results represent consumer-to-consumer listings that help to maintain an active user base, whereas sponsored results represent business-to-consumer ads which allow for monetization. This gives rise to the challenge of allocating available digital real estate between these two types of results. Previous works [2, 5, 6, 8] have addressed the challenge of optimizing organic and sponsored results in isolation. However, selective presentation of these two types to optimize overall performance has been largely unexplored.

Our work aims to address this allocation challenge at Marktplaats.nl, one of the largest sites in the ebay classifieds group. The Marktplaats homepage feed, presented in figure 1, is the largest placement on the site in terms of traffic and revenues. It employs a paradigm that allocates equal amounts of organic and sponsored impressions on a per category basis. The homepage feed holds two desirable traits for our study on the relevancy of results. First, unlike the search result pages, where the presentation order of organic and sponsored results can affect the performance, all results in each page of the feed are shuffled together, producing a random order and eliminating position bias towards one type of results. Second, while the sponsored results on search result pages are marked with a badge, there is no similar mark for sponsored results on the feed, removing the disclosure effect on user behavior.

To address the allocation challenge, we study the relationship between the item category and the relative performance of the two types of results. Our hypothesis is that in some categories organic results perform better than sponsored results while in others the opposite is true, due to the different nature of these two types. Organic results usually reflect more second-hand or niche items, while sponsored results are geared more towards new products and commoditized items. For example, users looking for Ponies are more likely to be interested in the organic results, while users looking for Batteries are likely to find the sponsored results more relevant.
SIGIR 2019 eCom, July 2019, Paris, France                                                                              Or Levi




Figure 1: The Marktplaats Homepage Feed

Our main contribution is a framework for selective presentation of organic and sponsored results to optimize the overall performance of an E-commerce site operator. We show through empirical evaluation that our method outperforms the existing method in production, with a significantly higher click-through rate for both organic and sponsored results.

2   RELATED WORK
Previous works [1, 4, 7] have studied the interplay between organic and sponsored results on the search results page. Yang and Ghose [7] studied whether the presence of organic listings on a search engine is associated with a positive, a negative, or no effect on the click-through rates of paid search advertisements. Their findings suggest that clicks on organic listings have a positive interdependence with clicks on paid listings, and vice versa, and that this positive interdependence is asymmetric such that the impact of organic clicks on increases in utility from paid clicks is much stronger.

Danescu-Niculescu-Mizil et al. [4] investigated the perceived relative usefulness of the results with respect to the nature of the query. They found that when both sources focus on the same intent, for navigational queries there is a clear competition between ads and organic results, while for non-navigational queries this competition turns into synergy. Similarly, Agarwal et al. [1] found that an increase in organic competition leads to a decrease in the click performance of sponsored advertisements. However, organic competition helps the conversion performance of sponsored ads and leads to higher revenue.

3   DATA EXPLORATION
To test our hypothesis, we collect a dataset based on the responses of users to several hundred million impressions, across more than one thousand categories, over a one-month period, from the logs at Marktplaats.nl. We use the data to explore the relative performance of organic and sponsored results across the different categories while also reflecting on findings by previous works.

Our study reveals, in accordance with prior findings [3], that organic results generally attain a higher click-through rate than sponsored results, as shown in figure 2. However, the ratio of organic CTR to sponsored CTR varies to a large degree across the different categories.

Figure 2: Ratio of organic CTR to sponsored CTR per category, sorted in descending order. For instance, organic results perform better for ’Ponies’, but sponsored results are relatively better for ’Batteries’. Since organic results outperform across the majority of categories, our method employs a normalization by the median CTR ratio, such that in half of the categories we show more organic and in the other half more sponsored.

Rank   Category
1      Animals and Accessories | Ponies
2      Animals and Accessories | Dogs
3      Mopeds | Honda
4      Animals and Accessories | Cats
5      Computer Games | Nintendo Game Boy
Table 1: The top 5 item categories with highest organic CTR compared to the sponsored CTR

Rank   Category
1      Cell Phones | Chargers and car chargers
2      Audio, TV and Photography | Batteries
3      Holiday homes | Italy
4      Car miscellaneous | Stickers
5      Services and Professionals | Movers
Table 2: The top 5 item categories with highest sponsored CTR compared to the organic CTR

Tables 1 and 2 show the top 5 item categories (translated from Dutch) with the highest organic CTR compared to the
sponsored CTR, and vice versa. Among the categories with
the highest relative organic CTR, ’Animals and Accessories’
is very dominant. This might be a result of users generally
looking for animals while the sponsored ads are selling ac-
cessories, such as dog harnesses and horse food. The other
categories with relatively high organic CTR are ’Mopeds |
Honda’ and ’Computer Games | Nintendo Game Boy’. A quick
examination of the inventory of ads in these categories re-
veals that there is no business-to-consumer seller of ’Honda
mopeds’ or ’Nintendo Game Boy’, but only ads for moped
parts and console games, respectively.
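
The per-category comparison behind Figure 2 and Tables 1 and 2 can be sketched as follows. This is only a minimal illustration, not the pipeline used in the paper: the record layout, the function names `ctr_ratios` and `rank_by_ratio`, and the sample numbers are all assumptions made up for the example.

```python
from collections import defaultdict


def ctr_ratios(records):
    """Return {category: organic CTR / sponsored CTR}.

    `records` is an iterable of (category, result_type, impressions, clicks)
    tuples, with result_type "organic" or "sponsored" (hypothetical layout).
    """
    # Per category and type: [total impressions, total clicks].
    totals = defaultdict(lambda: {"organic": [0, 0], "sponsored": [0, 0]})
    for category, result_type, impressions, clicks in records:
        totals[category][result_type][0] += impressions
        totals[category][result_type][1] += clicks

    ratios = {}
    for category, by_type in totals.items():
        org_imp, org_clk = by_type["organic"]
        spo_imp, spo_clk = by_type["sponsored"]
        # Skip categories where a CTR or the ratio itself is undefined.
        if org_imp == 0 or spo_imp == 0 or spo_clk == 0:
            continue
        ratios[category] = (org_clk / org_imp) / (spo_clk / spo_imp)
    return ratios


def rank_by_ratio(ratios):
    """Categories sorted by CTR ratio, highest relative organic CTR first."""
    return sorted(ratios, key=ratios.get, reverse=True)


# Made-up sample counts, for illustration only.
sample = [
    ("Ponies", "organic", 1000, 60),
    ("Ponies", "sponsored", 1000, 20),
    ("Batteries", "organic", 1000, 10),
    ("Batteries", "sponsored", 1000, 40),
]
ratios = ctr_ratios(sample)
ranking = rank_by_ratio(ratios)
```

Reading the ranking from the top gives the categories where organic results do relatively best, and reading it from the bottom gives those where sponsored results do relatively best.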
   Among the categories with the highest relative sponsored CTR, it is not surprising to see ’Batteries’, ’Phone Chargers’ and ’Car Stickers’, given that users normally do not buy these items second hand. Further examination of more categories with relatively high sponsored CTR reveals multiple examples in ’Holiday homes’ and ’Services and Professionals’. It could be that users value the reputation and expertise of a business-to-consumer seller in these categories in particular. Overall, this confirms our hypothesis with regard to the different nature of organic and sponsored results, and the potential to adjust the allocation on a per category basis.

4   METHOD
Our work aims to allocate impressions between organic and sponsored results to improve the overall performance. While the profitability of clicks on sponsored results is straightforward to measure, it is much more difficult to evaluate how much clicks on organic results are worth, given that organic results do not generate revenue directly, but help to maintain an active user base.

Consequently, we make two simplifying assumptions. First, we focus on optimizing for clicks, rather than profitability, as a common denominator for organic and sponsored results. Our assumption is that more clicks would translate to more leads with organic results, and more revenues with sponsored results. Second, we bypass the question of how much an organic click is worth compared to a sponsored click, by keeping the preexisting overall balance of organic and sponsored impressions while showing, on a per category basis, more of the type that is expected to perform better. In other words, we maintain the same total numbers of organic and sponsored impressions, but only allocate them in a smarter way between the categories, such that the click-through rate, and the clicks, for both organic and sponsored results increase.

Given that organic results generally attain a higher click-through rate than sponsored results, as discussed in section 3, a straightforward allocation model based on historical CTR performance is likely to impair the overall balance of impressions, resulting in significantly more organic results.

Our model, on the contrary, uses historical CTR performance to allocate the impressions between the two types of results, proportionally to their expected relative performance, while normalizing with the expected relative performance of the median category. This helps to account for the a priori preference of users towards organic results and maintain the preexisting overall balance of organic and sponsored impressions, given that the correlation between the relative performance and the category size is not significant (0.03 Pearson correlation). We also apply a multiplier of 0.5 such that the impressions of the median category are divided equally.

\[ \text{Allocation}(x) = 0.5 \cdot \frac{\text{CTR Ratio}(x)}{\text{Median CTR Ratio}} \qquad (1) \]

where CTR Ratio(x) = Organic CTR(x) / Sponsored CTR(x) for a category x and such that 80% ≥ Allocation(x) ≥ 20%.

We limit the proposed allocation, such that we never show less than 20% of the impressions from one type. This guarantees that we will have sufficient data regarding the performance of both the organic and sponsored results in each category, to continue updating the model. Contrary to the naive equal amount method, the proposed allocation per category is highly consistent with the actual CTR ratio, as presented in figure 3.

Figure 3: The proposed allocation per category based on our method. Contrary to a naive equal amount allocation, the new allocation is highly consistent with the actual CTR ratio (shown overlaid)

To employ the proposed allocation, we produce a table with the calculated ratio of organic to sponsored results for each category. This table is then loaded into an ElasticSearch index in production. At query time, we look up the ratio for the relevant category and the impressions are allocated between organic and sponsored results accordingly. We use Apache Spark to build a pipeline for collecting the data and
calculating the optimal allocation. This process runs end-to-                      Precision Recall F1
end offline, which allows for a simple and scalable solution.                          0.82     0.81     0.81
The Spark job runs weekly to support dynamic allocation            Table 3: The allocation challenge as a classification
that adapts based on changes in performance.                       task. While a naive equal amount allocation has no
                                                                   predictive ability, our model is able to predict between
                                                                   sponsored and organic results with f1-score of 0.81
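
Equation (1) and its 20% to 80% bounds can be sketched in a few lines. This is a hedged illustration rather than the production code: it reads Allocation(x) as the organic share of impressions in category x (our interpretation, since the ratio grows with organic CTR), and the function names, the sample ratios and the 0.5 fallback for unseen categories are assumptions.

```python
from statistics import median


def allocate(ctr_ratio, lower=0.2, upper=0.8):
    """Map {category: organic/sponsored CTR ratio} to an organic impression share.

    Implements Allocation(x) = 0.5 * CTR Ratio(x) / Median CTR Ratio,
    clipped to the [lower, upper] range.
    """
    med = median(ctr_ratio.values())
    allocation = {}
    for category, ratio in ctr_ratio.items():
        share = 0.5 * ratio / med
        # Never show less than `lower` of the impressions from either type.
        allocation[category] = min(upper, max(lower, share))
    return allocation


def organic_share_for(category, allocation, default=0.5):
    """Query-time lookup; the equal-split fallback is an assumption."""
    return allocation.get(category, default)


# Hypothetical CTR ratios, for illustration only.
ratios = {
    "Ponies": 3.0,
    "Dogs": 2.0,
    "Bicycles": 1.5,
    "Stickers": 1.0,
    "Batteries": 0.25,
}
alloc = allocate(ratios)
```

With these made-up ratios the median category ('Bicycles') is split equally, 'Ponies' is capped at the 80% bound, and 'Batteries' is held at the 20% floor, so both types keep receiving impressions in every category.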
5   EVALUATION
The allocation challenge can be seen as a classification task
of predicting on a per category basis, whether sponsored
results will perform better or worse than organic results. We          Organic Results Sponsored Results Overall
use the data collected in section 3 to evaluate the predictions             5.98%*               8.31%*          7.10%*
of our model in an offline setting. We split the data by weeks     Table 4: Main Results. Increase in click-through rate
and use each consecutive pair of weeks as the train and test       based on our method compared to the existing method
sets, predicting based on the historical CTR of the prior week     in production. Statistically significant differences are
and evaluating using the next one. For this classification task,   marked with ’*’
the baseline with an equal amount of organic and sponsored
impressions has no predictive ability, meaning that it does
not provide any insight regarding the relative performance of
organic and sponsored results per category. On the contrary,       in clicks reflects that the results are generally more relevant
our model is able to predict between sponsored and organic         to the users and is translated, as assumed, to an increase
results with precision of 0.82 and recall of 0.81, as shown in     in leads of 0.9% with the organic results and an increase in
table 3.                                                           revenues of 1.1% with the sponsored results.
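
The offline evaluation on consecutive weeks can be sketched as below. The paper does not spell out the exact labeling rule, so this sketch assumes that a category counts as favoring sponsored results when its CTR ratio falls below that week's median; the function names and weekly ratios are likewise hypothetical.

```python
from statistics import median


def favors_sponsored(ctr_ratio):
    """Label a category True when its organic/sponsored CTR ratio is below the median."""
    med = median(ctr_ratio.values())
    return {category: ratio < med for category, ratio in ctr_ratio.items()}


def precision_recall_f1(predicted, actual):
    """Score boolean predictions against boolean labels on the shared categories."""
    shared = predicted.keys() & actual.keys()
    tp = sum(predicted[c] and actual[c] for c in shared)
    fp = sum(predicted[c] and not actual[c] for c in shared)
    fn = sum(not predicted[c] and actual[c] for c in shared)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up CTR ratios for a train week and the following test week.
week1 = {"Ponies": 3.0, "Dogs": 2.0, "Bicycles": 1.5, "Stickers": 1.0, "Batteries": 0.25}
week2 = {"Ponies": 2.5, "Dogs": 1.2, "Bicycles": 1.4, "Stickers": 1.6, "Batteries": 0.3}

# Predict from the prior week, evaluate against the next one.
scores = precision_recall_f1(favors_sponsored(week1), favors_sponsored(week2))
```

Repeating this over every consecutive pair of weeks and averaging the scores would mirror the train/test split described above.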
   We further evaluate our model through an online A/B
test over a two-week period. Each group is assigned with           6   CONCLUSION AND FUTURE WORK
an equal size of the traffic divided randomly by user ID. The      Our work addressed the challenge of allocating digital real
evaluation demonstrates the superiority of our model, com-         estate between organic and sponsored results. We studied
pared to the existing method in production of equal amount         the interplay between these two types of results across dif-
allocation, with a significantly higher click-through rate for     ferent categories, and found that organic results generally
both organic and sponsored results, as shown in table 4. The       attain higher CTR, in accordance with prior findings, but this
two-tailed paired t-test with a 0.05 significance level was        varies to a large degree across the different categories, con-
used for testing statistical significance of performance differ-   firming our hypothesis with regard to the different nature of
ences. Further examination confirms that the overall balance       organic and sponsored results. Based on these findings, we
of organic and sponsored impressions remains unchanged             proposed a simple and adaptive impression allocation model
as planned.                                                        that accounts for the a priori preference of users towards
   To illustrate why the CTR increases for both organic and        organic results and is highly consistent with the actual CTR
sponsored results, consider the following ’toy’ example. As-       ratio per category. Empirical evaluation demonstrated the
suming we have two categories: A and B, and in each we             superiority of our model, compared to the existing method
show 100 impressions, of which 50 organic and 50 sponsored.        in production, with a significant increase in click-through
Moreover, if we assume that in the initial state, users clicked    rate for both organic and sponsored results, that has made a
on all the sponsored results in category A, and only those,        great impact on the relevancy of the results and revenues at
and vice versa with the organic results in category B, then        Marktplaats.nl.
the initial CTRs for both organic and sponsored, across both          As avenues for future work, we plan to extend this work
categories, are 50%. With our method, we allocate 80% of           to further placements on the site. Specifically, this work has
the 100 impressions in category A to sponsored results and         focused on the homepage feed. Next, we plan to experiment
80% of the 100 impressions in category B to organic results        with the impression allocation method on the search result
(respecting the 20% lower bound). If the user behavior would       pages.
remain 100% consistent, we would expect the CTR for both              Furthermore, in this work we have made a couple of sim-
organic and sponsored results to increase to 80%. In prac-         plifying assumptions due to the difficulty of estimating the
tice, the behavior is not fully consistent due to temporal         worth of clicks on organic results. Consequently, we em-
changes in the ad inventory and user preferences, however          ployed a constraint to keep the overall balance of organic and
this approximation allows us to shift the allocation in a desir-   sponsored impressions. This leaves room for future work to
able direction, as demonstrated in our results. The increase       propose models for estimating the monetary value of organic
clicks, and remove this constraint, to optimize for overall prof-
itability directly.
   Lastly, a generalization of our approach could employ a
confidence-based classifier to predict how good the organic
or sponsored results in a category are. Note that this would
still require a normalization scheme, perhaps using the a pri-
ori class probabilities. The features for this method can be
based on historical performance as in our work. We also plan
to study the effect of factors, such as user preferences with
regard to price, buying new versus second hand, and more,
on the interplay between organic and sponsored results. We
envision that these features could be utilized in a contextual
bandit setting to learn a personalized optimal allocation, per
user and category.

7    ACKNOWLEDGMENT
We thank our colleagues at Marktplaats.nl and especially the
Finding team for their support in implementation and set up
of the experiment.

REFERENCES
 [1] A. Agarwal, K. Hosanagar, and M. Smith. 2015. Do Organic Results
     Help or Hurt Sponsored Search Performance? In Information Systems
     Research. 291–300.
 [2] A. Grotov, A. Chuklin, I. Markov, L. Stout, F. Xumara, and M. de Rijke.
     2015. A Comparative Study of Click Models for Web Search. In CLEF.
     78–90.
 [3] B. Jansen and M. Resnick. 2006. An examination of searcher’s percep-
     tions of nonsponsored and sponsored links during ecommerce Web
     searching. In J. Assoc. Inf. Sci. Technol.
 [4] C. Danescu-Niculescu-Mizil, A.Z. Broder, E. Gabrilovich, V. Josifovski,
     and B. Pang. 2010. Competing for users’ attention: on the interplay
     between organic and sponsored search results. In WWW. 291–300.
 [5] H. Cheng and E. Cantu-Paz. 2010. Personalized click prediction in spon-
     sored search. In Proceedings of the third ACM international conference
     on Web search and data mining, WSDM.
 [6] T. Joachims. 2002. Optimizing Search Engines using Clickthrough
     Data. In KDD. 133–142.
 [7] S. Yang and A. Ghose. 2010. Analyzing the Relationship Between Organic
     and Sponsored Search Advertising: Positive, Negative, or Zero Inter-
     dependence? In Journal Marketing Science.
 [8] T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich. 2010. Web-scale
     bayesian click-through rate prediction for sponsored search adver-
     tising in microsoft’s bing search engine. In Proceedings of the 27th
     International Conference on Machine Learning.