<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Quality Ad Selection: A Model-based Approach to Performance Filtering</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mandar S. Chaudhary</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Javad Nejati</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mahmuda Rahman</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gajanan Adalinge</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abraham Bagherjeiran</string-name>
        </contrib>
        <aff>eBay Inc.</aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Major e-commerce platforms display advertisements (ads) on the search results page through a three-phase approach: retrieval, selection, and ranking. The effectiveness of ad selection algorithms directly impacts the quality of candidates available for ranking on the search results page, thereby heavily influencing buyer engagement and advertiser performance. Ad selection algorithms need to efficiently prune the set of retrieved ads due to limited latency and resource capacity at the ranking stage. In order to pass better quality ads to the ranking stage, we propose a two-stage ad selection algorithm which filters ads that degrade the buyer experience on the search engine results page (SERP). Our algorithm is based on an ad performance filter which presents a novel approach for identifying and filtering low-performing ads. First, we formulate eligibility criteria to select ads with sufficient exposure on SERPs, and second, we leverage these criteria to identify ads with low buyer engagement. We demonstrate the efficacy of our approach by conducting online experiments using the A/B test framework of a major e-commerce platform. Results show that the proposed two-stage ad performance filter significantly improves Click-Through-Rate, which highlights the impact of a well-designed ad selection filter on enhancing buyer and advertiser experiences.</p>
      </abstract>
      <kwd-group>
        <kwd>Sponsored products quality</kwd>
        <kwd>Low-performing ads filter</kwd>
        <kwd>label generation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In the rapidly evolving landscape of e-commerce, the effectiveness of sponsored search results is
paramount to both advertisers and platform providers. Sponsored ads serve as a crucial revenue stream
for e-commerce companies, while simultaneously influencing consumer purchasing decisions. These
results, often displayed prominently on search engine results page (SERP), are a significant revenue
driver for e-commerce platforms and also a critical tool for advertisers to reach potential customers. The
quality of these sponsored listings impacts buyer satisfaction and engagement, making it imperative for
platforms to ensure that the ads presented are relevant and engaging to the user.</p>
      <p>
        Major e-commerce platforms follow a well-known multi-stage approach shown in Figure 1 to display
ads on SERP [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1, 2, 3</xref>
        ]. This approach primarily consists of the ad retrieval, selection, and ranking stages that aim to maximize buyer engagement as well as the total value of ads displayed on the SERP. The retrieval phase matches the buyer query with the sponsored ads' keywords specified by the advertiser and usually fetches millions of ad listings. The ad selection algorithm should efficiently trim this ad space and maintain quality listings for the final ranking stage. However, achieving this balance poses a complex challenge. On the one hand, e-commerce platforms must prioritize the buyer experience by showcasing high-quality ads, while on the other hand advertisers are keen on maximizing the exposure of their listings to enhance brand visibility and drive sales. This duality necessitates a sophisticated approach to ad selection, where the interests of both users and advertisers are harmonized. Additionally, this approach must also address the cold-start problem, since advertisers onboard thousands of new ad listings in the sponsored ads program daily.
      </p>
      <p>The goal of ad selection algorithms is to provide sufficient opportunities for each ad listing to surface on the SERP and, once the ads have accumulated sufficient exposure, to efficiently filter those that are unlikely to receive buyer engagement. It is very difficult to quantify these sufficient opportunities for each ad listing because the true performance of an ad is not known until it has surfaced on the SERP and accumulated a few hundred or even thousands of impressions. Furthermore, allowing low-quality ads on premium placements on the SERP can dilute the overall quality of search results, leading to suboptimal user experiences and reduced revenue potential. Low-quality ads, characterized by poor engagement metrics such as low click-through rates (CTR), can undermine the effectiveness of sponsored search results.</p>
      <p>In this work, we introduce an innovative ad performance filter designed to remove low-performing ads that negatively impact the buyer experience. The contributions of our proposed work can be summarized as follows:
• We construct ad cohorts to group similar ads, enabling comparative analysis of ad performance within each cohort based on buyer engagement metrics.
• We formulate a novel approach to quantify the sufficient opportunities provided to each ad in a cohort and identify eligible ads with measurable performance. We build on the definition of eligible ads to propose our novel approach for identifying and filtering low-performing ads in each cohort.
• We demonstrate the efficacy of our approach by performing online experiments using the A/B test framework of a major e-commerce platform; the results indicate a significant improvement in CTR from filtering ad impressions that do not receive clicks.</p>
      <p>The rest of the paper is organized as follows. We present a survey of state-of-the-art prior works in Section 2, followed by details of the proposed method in Section 3 and experimental results in Section 4. Finally, we present discussions and future work in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Improving ad selection with ad quality filters is not a well-studied problem, since most of the prior work is primarily based on three different approaches: optimizing query-keyword matching [
        <xref ref-type="bibr" rid="ref4">4, 5, 6, 7</xref>
        ], early-stage ranking with multi-task frameworks [8, 9, 10], and reinforcement learning to improve the ad selection policy [
        <xref ref-type="bibr" rid="ref1">1, 11, 12</xref>
        ]. Although our proposed work does not directly fit into these approaches, it has some overlap with multi-task frameworks that learn ad quality signals to improve buyer engagement.
      </p>
      <p>
        Prior works based on multi-task learning frameworks have incorporated different kinds of ad quality signals to select the best set of ads for the final ranking stage [
        <xref ref-type="bibr" rid="ref3">3, 13, 14</xref>
        ]. These signals are based on explicit buyer feedback or derived from buyer feedback using feature engineering to measure ad quality. For instance, in [14] a multi-task learning framework is developed to learn two different quality signals that measure the ad dismiss rate and the post-ad-click experience respectively. These signals contribute towards the improvement of the final CTR prediction task. Another prior work [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] learned two ad quality events, namely the cross-out rate, which measures the number of times a user explicitly declines to see an ad, and survey assessments, which record user ad ratings, with a higher rating indicating better ad quality. Both prior works used explicit buyer feedback to create the ad quality labels. On the other hand, the work in [13] focused on predicting the time interval after which an ad should be discontinued from user exposure, where the labels for the time interval were generated heuristically from the user feedback data.
      </p>
      <p>[Figure 2: The two-stage ad performance filter pipeline. The first-stage Beta-Binomial Bayesian filter serves as a guardrail to filter out low-quality ads across all cohorts. Ads that pass this initial filter proceed to the second stage, where they are evaluated for eligibility using the ad eligibility filter. Eligible ads are then further assessed using the low-performing ad filter. Ads deemed ineligible in the second stage may still be allowed to appear on the search results page (SERP), provided they have satisfied the criteria of the first-stage filter.]</p>
      <p>The aforementioned prior works deliver promising results with state-of-the-art deep-learning-based multi-task methods. While effective, these methods are difficult to scale in ad selection systems given the limited latency budget and model capacity of the final ranking stage. We posit that, similar to these works, the valuable buyer engagement feedback from impressed ads can be harnessed to develop meaningful ad quality signals that improve the selection process with simple and efficient approaches. To the best of our knowledge, this is the first work to propose a novel lightweight two-stage filter with an intuitive definition for quantifying ad performance from low buyer engagement.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Method</title>
      <p>In this section we describe in detail our two-stage ad performance filter, with the Beta-Binomial Bayesian model as the first stage and the low-performance ad model as the second stage. Figure 2 shows the high-level approach of the two-stage filter pipeline. The first-stage filter guards the buyer experience from ads which do not meet the minimum quality criteria, while the second-stage filter identifies ads with measurable performance on the SERP and filters poor quality ads. As part of the second-stage filter we describe our approach for defining an ad cohort and two label generation methods. The first label generation method develops an ad eligibility criterion to identify ads with sufficient impressions on the SERP. These ads are considered to have received sufficient exposure and buyer interaction to reliably estimate their performance. The second label generation approach determines whether an eligible listing is low-performing based on its performance within its respective cohort.</p>
      <sec id="sec-3-1">
        <title>3.1. Leveraging Beta-binomial Bayesian Model</title>
        <p>We utilize the well-known Beta-Binomial Bayesian model to estimate posterior mean scores of Click-Through-Rate and Purchase-Through-Rate for each ad listing [15]. These models have been widely used for measuring an item's performance by smoothing its quality scores with priors to address the cold-start problem [16, 17, 18]. The posterior CTR of an ad listing i, given its exposure of n_i impressions and k_i clicks in country c, can be estimated from the following beta distribution:</p>
        <p>CTR_i ∼ Beta(α_c + k_i, β_c + n_i − k_i) (1)</p>
        <p>The beta-binomial priors are estimated for each country using the well-known Method-of-Moments (MoM) and Maximum Likelihood Estimation (MLE) approaches. The priors are initialized using the estimates produced by MoM and refined iteratively using MLE until the difference in prior values between successive iterations is less than 1e-3. We apply a threshold on each of the smoothed CTR and PTR scores to filter listings that do not meet the minimum quality bar. All listings which pass this first-stage filter are passed on to the next steps.</p>
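        <p>As a rough illustration of this first stage, the sketch below estimates MoM priors from seasoned listings and smooths CTRs with them. It is a minimal sketch under stated assumptions: the MLE refinement is omitted, the quality-bar threshold is made up, and all names and numbers are illustrative rather than the production implementation.</p>
        <p>
```python
# Illustrative Beta-Binomial smoothing: MoM priors from seasoned listings,
# posterior-mean CTR for every listing (MLE refinement omitted for brevity).
import numpy as np

def mom_beta_priors(clicks, imps):
    """Method-of-Moments estimate of Beta(alpha, beta) priors from per-listing CTRs."""
    ctr = clicks / np.maximum(imps, 1)            # guard against zero impressions
    mean, var = ctr.mean(), max(ctr.var(), 1e-9)  # guard against zero variance
    common = mean * (1.0 - mean) / var - 1.0      # MoM estimate of alpha + beta
    return mean * common, (1.0 - mean) * common   # alpha, beta

def smoothed_ctr(clicks, imps, alpha, beta):
    """Posterior mean of Beta(alpha + k, beta + n - k): the smoothed CTR."""
    return (alpha + clicks) / (alpha + beta + imps)

# toy data: two seasoned listings and one brand-new listing (cold start)
clicks = np.array([30.0, 2.0, 0.0])
imps = np.array([1000.0, 900.0, 0.0])
a, b = mom_beta_priors(clicks[:2], imps[:2])      # fit priors on seasoned listings
scores = smoothed_ctr(clicks, imps, a, b)         # new listing falls back to the prior mean
passes = np.greater_equal(scores, 0.001)          # illustrative minimum-CTR quality bar
```
</p>
        <p>Note that the cold-start listing receives the prior mean a / (a + b) as its score, which is the smoothing behavior the priors are designed to provide.</p>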
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Identifying Ad Cohort</title>
        <p>Our approach groups similar ad listings into cohorts to accurately measure ad performance. This is important because ads in different cohorts can display vastly different performance metrics. For example, although an ad for the search query iphone case might have a higher CTR than one for sectional couch, it could still be considered low-performing when compared to other ads within its own category. Therefore, we evaluate ad eligibility and click-through-rate performance within the context of their specific cohorts, allowing for more meaningful comparisons.</p>
        <p>Major e-commerce platforms provide an option for sellers to list an item under a suitable business vertical and category based on the item's functionality. This creates a natural grouping of similar items and provides a clear organization of the inventory. Informally, an ad cohort is defined as a group of ads with a sufficient listing count that exhibit a certain level of similarity in terms of semantics and functionality.</p>
        <p>We consider each combination of country and listing category as a cohort to group similar ads. This combination can be extended to include different levels of granularity such as price, item aspects, and query. However, higher granularity increases data sparsity, which results in fewer ads in each cohort. In this work, we define an ad cohort by the (country, category) combination, as it provides a reasonable trade-off between data sparsity and ad group similarity. Next, we measure the click-through-rate performance of each ad cohort to establish the ad eligibility criteria. Specifically, we calculate the impressions and clicks for each cohort using a rank-discounted exponential decay function.</p>
        <p>Formally, we denote C = {c_j | j = 1, . . . , M} as the set of M distinct cohorts, and the CTR score of each cohort is stored in P = {p_j | j = 1, . . . , M}.</p>
        <p>where α_c and β_c are the priors of the beta distribution for country c, k_i is the total click count, and n_i is the total impression count for ad listing i. Similarly, we also compute the Purchase-Through-Rate (PTR) score of listing i by estimating beta-binomial priors using the total sale count s_i and the total impression count n_i. The priors provide smoothed scores for new ad listings that do not have any exposure on the SERP, thereby addressing the cold-start problem. Finally, the smoothed CTR and PTR scores of an ad listing are estimated as the means of their corresponding beta distributions:</p>
        <p>CTR_i = (α_c + k_i) / (α_c + β_c + n_i) (2)</p>
        <p>PTR_i = (α′_c + s_i) / (α′_c + β′_c + n_i) (3)</p>
        <p>The cohort CTR aggregates the decayed counts of its listings, and the decayed counts themselves are updated recursively over consecutive timestamps:</p>
        <p>p_j = ∑_{i ∈ c_j} k̃_i / ∑_{i ∈ c_j} ñ_i (4)</p>
        <p>k̃_{t_b} = γ^{Δt_{a,b}} · k̃_{t_a} + k_{t_b} (5)</p>
        <p>ñ_{t_b} = γ^{Δt_{a,b}} · ñ_{t_a} + n_{t_b} (6)</p>
        <p>• k̃ and ñ refer to the decayed click count and decayed impression count with rank discount at the current timestamp t, observed across the T_k and T_n series of timestamps respectively, and γ is the decay factor. For simplicity, we use the notation k̃ and ñ for the decayed click count and impression count respectively.</p>
        <p>• The timestamps t_a, t_b ∈ T_n represent the consecutive timestamps at which the listing received impressions, and Δt_{a,b} represents the time elapsed between the t_a and t_b timestamps. Similarly, t_a, t_b ∈ T_k with t_a &lt; t_b represent the timestamps with consecutive clicks.</p>
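        <p>The decayed-count bookkeeping described above can be sketched as follows. This is a minimal sketch under stated assumptions: the recursive update form (decay the running total by γ^Δt, then add the new count) is our reading of the definitions, and the rank discount is assumed to be folded into the raw counts beforehand.</p>
        <p>
```python
# Sketch of exponentially decayed counts and the cohort CTR p_j built from them.
# Assumed update: decayed(t_b) = gamma**dt * decayed(t_a) + count(t_b).
def decayed_count(events, gamma=0.95):
    """events: chronological list of (timestamp_in_days, raw_count) pairs."""
    total, last_t = 0.0, None
    for t, c in events:
        total = c if last_t is None else gamma ** (t - last_t) * total + c
        last_t = t
    return total

def cohort_ctr(listings, gamma=0.95):
    """listings: dict id -> (impression_events, click_events); returns p_j."""
    imp = sum(decayed_count(imps, gamma) for imps, _ in listings.values())
    clk = sum(decayed_count(clks, gamma) for _, clks in listings.values())
    return clk / max(imp, 1e-9)  # guard against an empty cohort

# e.g. one listing: 100 impressions on day 0, 50 more on day 1, 3 clicks on day 1
p = cohort_ctr({"ad1": ([(0, 100), (1, 50)], [(1, 3)])}, gamma=0.5)
```
</p>
        <p>With γ = 0.5, the day-0 impressions are halved after one day, so the decayed impression count at day 1 is 0.5 × 100 + 50 = 100 and the cohort CTR is 3/100.</p>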
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Defining Ad Eligibility</title>
        <p>Once the ad cohorts are established, we assess an ad's eligibility to be categorized as low-performing by evaluating its visibility on search engine results pages (SERPs), which is defined in terms of its impression count. Each ad impression shown on the SERP belongs to one of the cohorts c_j ∈ C, and we determine that an ad is eligible if it has accumulated a sufficient impression count under the given cohort c_j. Let A = {a_ij | i = 1, . . . , N; j = 1, . . . , M} denote the set of ads such that the i-th ad listing a_ij has received at least one impression under cohort c_j. Correspondingly, I = {ñ_ij | i = 1, . . . , N; j = 1, . . . , M} contains the rank-discounted exponentially decayed impression counts of the ad listings in their cohorts. Finally, the criterion to determine whether an ad is eligible under a given cohort is as follows:
ñ_ij &gt; τ_j (7)</p>
        <p>τ_j = 1 / p_j (8)
where τ_j is the inverse of p_j and measures the average number of impressions per click for cohort c_j. The cohort's click-through-rate score p_j is used to estimate a threshold for the number of impressions each ad should be provided before its performance can be reliably judged. Ads with fewer than τ_j impressions are not considered eligible, as they have not received sufficient exposure to buyers on the SERP. Such ads, also referred to as non-eligible ads, are allowed to pass through the second-stage filter and have the opportunity to surface as impressions as long as they pass the first-stage filter. The non-eligible group of ads comprises newly listed ads with no historical data, as well as existing listings that have not received any recent exposure on the SERP within a defined time window. As a result, our label generation process inherently handles the cold-start problem by treating new ad listings as not eligible for filtering by our two-stage filter. We define the time window and additional experiment details in Section 4.1.</p>
        <p>Intuitively, the ad eligibility criterion grants τ_j opportunities, in the form of impressions, to each ad in a given cohort before its click-through-rate performance is considered. Note that the same item can belong to more than one cohort; it can be labeled as eligible and low-performing in one cohort but not in another, or it can be eligible and low-performing in all cohorts.</p>
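        <p>The eligibility rule above (an ad becomes eligible once its decayed impression count exceeds τ_j = 1 / p_j, the cohort's average impressions per click) can be sketched as follows; the function names and the small guard on zero CTR are our own illustrative choices.</p>
        <p>
```python
# Minimal sketch of the ad-eligibility rule: tau_j = 1 / p_j impressions
# must be accumulated before an ad's performance is judged.
import operator

def eligibility_threshold(cohort_ctr_pj):
    """tau_j = 1 / p_j: the cohort's average number of impressions per click."""
    return 1.0 / max(cohort_ctr_pj, 1e-9)  # guard against a zero-CTR cohort

def is_eligible(decayed_impressions, cohort_ctr_pj):
    """True once the ad has accumulated more than tau_j decayed impressions."""
    return operator.gt(decayed_impressions, eligibility_threshold(cohort_ctr_pj))
```
</p>
        <p>For example, a cohort CTR of 2% yields τ_j = 50, so an ad with 60 decayed impressions is eligible while one with 25 is not.</p>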
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Detecting Low-performing Ads</title>
        <p>In this step, we consider all eligible ads and determine whether they are low-performing based on their quality scores. Consider the set of ads Ã ⊆ A which have received a sufficient impression count with respect to their cohort, and let P̃ ⊆ P denote their quality scores. An ad is considered low-performing in a cohort if its quality score is lower than a threshold computed from the quality score distribution of the ads in the cohort. For a given cohort c_j, we calculate the lower q-th percentile of the quality score distribution over all eligible ads and set it as the threshold for identifying low-performing ads. An ad with a quality score lower than this threshold is labeled as low-performing, and the rest are labeled as not-low-performing. Below we present the equations for calculating this threshold p̃_j from the quality scores of all eligible items P̃_j in cohort c_j.</p>
        <p>P̃_j = { p_i | p_i ∈ P̃, ñ_ij &gt; τ_j } (9)</p>
        <p>p̃_j = Percentile(P̃_j, q) (10)</p>
        <p>Finally, we formulate the condition for identifying an ad a_ij as low-performing as follows:</p>
        <p>(ñ_ij &gt; τ_j) and (p_i &lt; p̃_j) (11)</p>
        <p>An ad in cohort c_j with at least τ_j decayed impressions and a quality score lower than p̃_j will be labeled as an eligible and low-performing ad, since it has received sufficient exposure to buyers on SERPs, and, with buyer feedback incorporated into its quality score, it has been observed to be among the worst-performing listings in its cohort.</p>
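        <p>The percentile-based labeling step can be sketched as follows; the percentile value q and the score data are illustrative, not the values used in the paper.</p>
        <p>
```python
# Sketch of low-performing labeling: among a cohort's eligible ads, those whose
# quality score falls below the lower q-th percentile are labeled low-performing.
import numpy as np

def low_performing_labels(quality_scores, q=10):
    """quality_scores: smoothed CTRs of the cohort's eligible ads only."""
    scores = np.asarray(quality_scores, dtype=float)
    threshold = np.percentile(scores, q)          # the cohort threshold p~_j
    return np.less(scores, threshold), threshold  # True = low-performing

labels, thr = low_performing_labels([0.001, 0.012, 0.015, 0.02, 0.03], q=20)
```
</p>
        <p>With these toy scores, only the 0.001 listing falls below the 20th-percentile threshold and is labeled low-performing.</p>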
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Model Training</title>
        <p>
          We train two classification models to predict ad eligibility and low-performing ads respectively. The response variable in the ad eligibility prediction model is binary valued, where a value of 1 indicates the ad is eligible and 0 indicates an ineligible ad. Similarly, for the low-performing ad model, the target variable is also binary valued, with a value of 1 indicating a low-performing ad and 0 indicating the ad is not-low-performing. The predictor set for both models includes a combination of content-based and historical features. We trained both classification models using the XGBoost algorithm [19] with the logistic loss function, varying the number of trees in the model in the range [1, 50].
        </p>
        <p>The eligible ad model was trained by adding sample weights to the loss function. The sample weights were set to the rank-discounted decayed impression count of each ad listing, thereby penalizing the model if it incorrectly predicts ad listings with high impression counts. No such sample weights were applied for training the low-performance ad model.</p>
        <p>Loss(y, ŷ) = −(1/N) ∑_{i=1}^{N} w_i [ y_i log(ŷ_i) + (1 − y_i) log(1 − ŷ_i) ] (12)</p>
        <p>where w_i = ñ_i is the rank-weighted decayed impression count of the i-th ad listing.</p>
        <p>The output of the ad eligibility prediction model is used to determine whether an ad should be further examined for low performance or whether it should be allowed more opportunities. As shown in Figure 2, if an ad has a low probability of being eligible it will have an opportunity to appear as an impression on the SERP, whereas an ad with a high probability of being eligible will receive another prediction score from the low-performing ad model, and the ad will be filtered if its probability score of being low-performing is higher than a threshold.</p>
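        <p>The sample-weighted logistic loss described above can be sketched in numpy. This is an illustrative re-implementation (function and variable names are our own, and the toy data is made up), not the XGBoost internals; with unit weights it reduces to the ordinary mean logistic loss.</p>
        <p>
```python
# Sketch of the impression-weighted logistic loss: w_i is the ad's
# rank-discounted decayed impression count, so mistakes on heavily
# impressed ads cost more.
import numpy as np

def weighted_log_loss(y, y_hat, w):
    y, y_hat, w = (np.asarray(a, dtype=float) for a in (y, y_hat, w))
    y_hat = np.clip(y_hat, 1e-12, 1 - 1e-12)  # numerical safety near 0 and 1
    per_example = w * (y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    return -per_example.mean()                 # the -(1/N) * sum term

# toy batch: a heavily impressed ad, a lightly impressed ad, a mid-weight ad
loss = weighted_log_loss([1, 0, 1], [0.9, 0.2, 0.6], [120.0, 5.0, 40.0])
```
</p>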
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>In this section, we evaluate our proposed two-stage filter by describing the offline experiment setup and demonstrate its effectiveness with experiments on real-world traffic using an A/B test framework.</p>
      <sec id="sec-4-1">
        <title>4.1. Offline experiments</title>
        <p>We sampled logs of sponsored ad listings on the SERP of a major e-commerce platform over a period of three months. The dataset comprised 3.5 billion impressions, and each ad listing was labeled as eligible and low-performing based on a look-back period of one month. The training and validation datasets were generated by splitting the data by time, where all ad listings with impressions before timestamp t were included in the training set and those after timestamp t were used for the validation set.</p>
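        <p>A minimal sketch of this time-based split, under the assumption that each example carries its impression timestamp; the function and field names are illustrative.</p>
        <p>
```python
# Hypothetical time-based train/validation split: examples before the cutoff
# timestamp T go to training, the rest to validation.
import operator

def time_split(rows, cutoff):
    """rows: iterable of (timestamp, example) pairs; cutoff: the timestamp T."""
    train = [r for r in rows if operator.lt(r[0], cutoff)]
    valid = [r for r in rows if operator.ge(r[0], cutoff)]
    return train, valid

rows = [(1, "a"), (5, "b"), (9, "c")]
train, valid = time_split(rows, cutoff=5)
```
</p>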
        <p>We evaluate the effectiveness of the eligible ad model in distinguishing between eligible and non-eligible ads. The eligible category encompasses both low-performing and not-low-performing ads. Our analysis focuses on the change in item filter rate for low-performing ads, not-low-performing ads, and non-eligible ads as we vary the thresholds used by the eligible ad filter and the low-performance ad filter.</p>
        <p>[Figure 3: Item filter rate (%) against the low-performance ad probability threshold for low-performing, not-low-performing, and non-eligible ads, with eligible ad thresholds of (a) 0, (b) 0.3, and (c) 0.6.]</p>
        <p>In Figure 3a, we calculate item filter rates for the three types of ads subject to a threshold of zero to pass the eligible ad filter, thereby applying only the low-performance ad filter. Simulations with varying thresholds on the low-performance prediction score show that a significant portion of non-eligible ads gets filtered alongside low-performing ads. For instance, with a threshold of 0.6, around 40% of all non-eligible ads get filtered, which would lead to great dissatisfaction among advertisers. The eligible ad model prevents this by passing only eligible ads to the next stage. This is evident from Figures 3b-3c, where only ads with an eligibility score greater than 0.3 and 0.6 respectively are subjected to the low-performance filter. However, this comes at the cost of also reducing the total fraction of low-performing ads that can be filtered from 100% to less than 50%. Therefore, carefully tuning the threshold for the eligible ad model is a trade-off between precision and recall, as a higher threshold produces more precise results but lowers the fraction of low-performing ads that can be filtered.</p>
        <p>As briefly discussed in Section 3.3, our ad eligibility label generation mechanism mitigates the cold-start problem. We found that the ad eligibility model only predicted an ad as eligible for receiving a low-performing ad prediction score once it had accumulated, on average, at least 25 impressions. This behavior underscores the model's effectiveness in handling the cold-start challenge.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Online A/B Test Results</title>
        <p>We performed an online experiment for two weeks using the A/B test framework to evaluate the effectiveness of the two-stage low-performance ad filter across four different channels: desktop, mobile web, iOS, and Android. The A/B test traffic was equally distributed between the control and treatment groups, with users randomly assigned to each group. The results of the A/B test experiment are presented in Table 1.</p>
        <p>The experimental results demonstrate the effectiveness of the two-stage filter in reducing the number of ad impressions which do not receive a click on the SERP. In particular, we observe a statistically significant reduction of 0.55% in the total ad impression count without affecting the total click count, while the total sale count trended slightly positive at +0.47%. As a result, there is a significant increase of +0.51% in Click-Through-Rate and an increase of +1.03% in Sale-Through-Rate with a confidence interval of [-0.1%, +2.16%]. Statistical significance was measured by a two-sided t-test at a p-value of 0.05. We also observed that the fraction of impressions from low-performing ads dropped by 19.6% compared to the control group. By reducing the number of impressions without losing clicks or sales, the two-stage ad performance filter was able to improve the buyer experience.</p>
        <p>These promising results support the design of an efficient two-stage filter that does not require the substantial infrastructure investment of deep learning methods. The findings validate the approach of creating an innovative ad selection filter that emphasizes meaningful ad quality signals to improve the sponsored search buyer experience.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussions and Future Work</title>
      <p>In this work, we developed a simple, intuitive, and novel approach for identifying ads with a poor click-through-rate on the SERP. The proposed two-stage filter quantifies the opportunities given to each ad and presents two label generation strategies for classifiers to learn the patterns of eligible and low-performing ads. The approach is evaluated on real-world traffic with an A/B test to illustrate the efficacy of the two-stage filter in removing impressions that do not lead to a click.</p>
      <p>As part of our future work, we plan to improve this approach in a few different ways. The proposed label generation strategies do not take advantage of the entire inventory, as the ad quality signals are
measured only for the impressed ad listings. To address this drawback of selection bias, we plan to
improve the approach by generating pseudo-labels for non-impressed ads so they can be included in
generating ad quality signals as well as model training. For instance, pseudo-labels for non-impressed
inventory can be obtained from the final CTR ranker. We also plan to refine the approach for grouping
ads based on their ad cohort by including additional information such as embedding similarity scores
of ads. The embeddings can be generated by including several additional signals such as seller id, price,
image and aspects. Lastly, we plan to develop a similar model for ads with low conversion rates to
further improve buyer and advertiser experience.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author(s) used ChatGPT for grammar and spelling checks, paraphrasing, and rewording. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the publication's content.</p>
      <p>[5] H. Wang, Y. Liang, L. Fu, G.-R. Xue, Y. Yu, Efficient query expansion for advertisement search, in: Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009, pp. 51–58.</p>
      <p>[6] A. Broder, P. Ciccolo, E. Gabrilovich, V. Josifovski, D. Metzler, L. Riedel, J. Yuan, Online expansion of rare queries for sponsored search, in: Proceedings of the 18th International Conference on World Wide Web, 2009, pp. 511–520.</p>
      <p>[7] Y. Choi, M. Fontoura, E. Gabrilovich, V. Josifovski, M. Mediano, B. Pang, Using landing pages for sponsored search ad selection, in: Proceedings of the 19th International Conference on World Wide Web, 2010, pp. 251–260.</p>
      <p>[8] J. Ma, Z. Zhao, X. Yi, J. Chen, L. Hong, E. H. Chi, Modeling task relationships in multi-task learning with multi-gate mixture-of-experts, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining, 2018, pp. 1930–1939.</p>
      <p>[9] H. Tang, J. Liu, M. Zhao, X. Gong, Progressive layered extraction (PLE): A novel multi-task learning (MTL) model for personalized recommendations, in: Proceedings of the 14th ACM Conference on Recommender Systems, 2020, pp. 269–278.</p>
      <p>[10] Z. Zhao, L. Hong, L. Wei, J. Chen, A. Nath, S. Andrews, A. Kumthekar, M. Sathiamoorthy, X. Yi, E. Chi, Recommending what video to watch next: a multitask ranking system, in: Proceedings of the 13th ACM Conference on Recommender Systems, 2019, pp. 43–51.</p>
      <p>[11] S. Han, R. Lakritz, H. Wu, Augmented two-stage bandit framework: Practical approaches for improved online ad selection (2024).</p>
      <p>[12] Q. Shi, F. Xiao, D. Pickard, I. Chen, L. Chen, Deep neural network with LinUCB: A contextual bandit approach for personalized recommendation, in: Companion Proceedings of the ACM Web Conference 2023, 2023, pp. 778–782.</p>
      <p>[13] S. Kitada, H. Iyatomi, Y. Seki, Ad creative discontinuation prediction with multi-modal multi-task neural survival networks, Applied Sciences 12 (2022). URL: https://www.mdpi.com/2076-3417/12/7/3594. doi:10.3390/app12073594.</p>
      <p>[14] N. Ma, M. Ispir, Y. Li, Y. Yang, Z. Chen, D. Z. Cheng, L. Nie, K. Barman, An online multi-task learning framework for Google feed ads auction models, in: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022, pp. 3477–3485.</p>
      <p>[15] G. Casella, An introduction to empirical Bayes data analysis, The American Statistician 39 (1985) 83–87.</p>
      <p>[16] C. Han, P. Castells, P. Gupta, X. Xu, V. Salaka, Addressing cold start in product search via empirical Bayes, in: Proceedings of the 31st ACM International Conference on Information &amp; Knowledge Management, 2022, pp. 3141–3151.</p>
      <p>[17] Q. Liu, A. Singh, J. Liu, C. Mu, Z. Yan, J. Pedersen, Long or short or both? An exploration on lookback time windows of behavioral features in product search ranking, in: Proceedings of the ACM SIGIR Workshop on eCommerce (SIGIR eCom'24), 2024.</p>
      <p>[18] P. Gupta, T. Dreossi, J. Bakus, Y.-H. Lin, V. Salaka, Treating cold start in product search by priors, in: Companion Proceedings of the Web Conference 2020, 2020, pp. 77–78.</p>
      <p>[19] T. Chen, C. Guestrin, XGBoost: A scalable tree boosting system, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 785–794.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Pei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Qiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Tao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Guan</surname>
          </string-name>
          , et al.,
          <article-title>Optimizing ad pruning of sponsored search with reinforcement learning</article-title>
          ,
          <source>in: Companion Proceedings of the Web Conference</source>
          <year>2021</year>
          ,
          <year>2021</year>
          , pp.
          <fpage>123</fpage>
          -
          <lpage>127</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>X.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-C.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <article-title>Towards a better tradeoff between effectiveness and efficiency in pre-ranking: A learnable feature selection based approach</article-title>
          ,
          <source>in: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>2036</fpage>
          -
          <lpage>2040</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Wen</surname>
          </string-name>
          , et al.,
          <article-title>Towards the better ranking consistency: A multi-task learning framework for early stage ads ranking</article-title>
          ,
          <source>arXiv preprint arXiv:2307.11096</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.-S.</given-names>
            <surname>Bai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.-Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Global optimization for advertisement selection in sponsored search</article-title>
          ,
          <source>Journal of Computer Science and Technology</source>
          <volume>30</volume>
          (
          <year>2015</year>
          )
          <fpage>295</fpage>
          -
          <lpage>310</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>