<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Online Advertising Auctions: Robust Click-Through-Rate Prediction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ryohei Emori</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shinya Suzumura</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nobuyuki Shimizu</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Takahiro Hoshino</string-name>
          <email>hoshino@econ.keio.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Instrumental Variables, Omitted Variable Bias, Robustness, Cold-start Problem, Click-Through-Rate, Online Advertising Auction</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Keio University</institution>
          ,
          <addr-line>2-15-45, Mita, Minato-ku, Tokyo</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LY Corporation</institution>
          ,
          <addr-line>Kioi Tower 1-3 Kioicho, Chiyoda-ku, Tokyo</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Riken AIP center</institution>
          ,
          <addr-line>1-4-1 Nihonbashi, Chuo-ku, Tokyo</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <fpage>25</fpage>
      <lpage>29</lpage>
      <abstract>
        <p>Predicting the click-through rate (CTR) in online ad auctions is essential for calculating bid amounts and forming rankings. However, predicting CTR from historical data faces several difficulties, one of which is the cold-start problem. Our research uses the instrumental variables (IVs) framework to address the cold-start problem and selection bias, validating robust CTR prediction in online advertising auctions. Although identifying IVs is generally challenging across applications, their potential use is not limited to CTR prediction; they can also address practical issues and research questions in advertising auctions in general. We put forth bid amounts as IVs, discussing their validity and testing the robustness of IV-based predictions in both simulated and real data scenarios. Moreover, we enhance our methodology by integrating explicit interactions between bid amounts and other features, demonstrating that accounting for heterogeneity in IVs significantly improves prediction accuracy on real data. Our proposal of IVs and the refined CTR prediction approach enriches the research fields of causal inference, robustness, and invariant prediction.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Online advertising, an essential backbone of the digital
economy, relies heavily on accurate prediction models to allocate
ads effectively and enhance the user experience. Crucially,
the accuracy of click-through rate (CTR) prediction plays a
pivotal role in determining the success, in terms of welfare,
of online advertising auctions, while potential biases that may
skew results hover over it [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
      </p>
      <p>
        In addition to the problem of bias that lurks in some
online ad auctions and is often the subject of research, the
cold-start problem arises when we must make predictions
for new advertisements or infrequent users, leading to
decreased predictive accuracy. Against the backdrop of
problems arising from those various factors, causal methods of
predicting user behavior that capture invariant user
behavior have risen as a subject of high research interest [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>
        ].
Among them, prior research [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] has highlighted that one
of those causal methods, the instrumental variables (IVs)
method, has the potential to contribute to solving the
cold-start problem. Reference [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] provided a methodology for IVs using
neural networks, but specific IVs always need to be
identified in each research domain. Reference [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] uses the user’s search
query as an instrumental variable; their use of IVs is
limited to search advertising and may not satisfy one of the
conditions for IVs, the exclusion restriction.
      </p>
      <p>In this paper, we identify bid amounts as IVs in online ad
auction settings and demonstrate that click prediction using
the IVs method is robust both in overall prediction and in
the cold-start problem.</p>
      <p>AdKDD’24 30th ACM SIGKDD Conference on Knowledge Discovery and
∗Corresponding author.</p>
      <p>
        Although IVs are generally considered difficult to identify,
they have the potential to: 1) maximize the use of data,
including impressions of ads with low historical win rates; 2)
not require random impressions of ads; 3) avoid assumptions
that often lead to erroneous predictions due to the
unrealistic absence of unobserved confounding factors between
treatment and outcome relationships [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]; and 4) potentially
infer the causal effect of impressions on conversions as well
as clicks.
      </p>
      <p>
        Furthermore, we demonstrate that the explicit use of
first-stage heterogeneity in the IVs method can be strongly
recommended in online ad auctions [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ]. First-stage
heterogeneity in the IVs method has been relatively overlooked
compared to heterogeneity in the second stage, namely, user
response. However, we find that strengthening the association
between IVs and impression probability yields robust
predictions both overall and in the cold-start problem.
      </p>
      <p>The contributions of this paper are threefold:</p>
      <list list-type="order">
        <list-item>
          <p>We identify and propose valid IVs tailored to online advertising auctions. The IVs suit broad advertising auction contexts, including display and search advertising. Furthermore, the IVs method is expected to have further applications, such as causal inference of the medium- and long-term effects of ad impressions on conversions, and is not limited to causal effects on user click behavior in online ad auctions.</p>
        </list-item>
        <list-item>
          <p>There have been few empirical examples in which the IVs method has been shown to be capable of making invariant behavioral predictions. We identify valid IVs for further application in the setting of online ad auctions, a setting in which the research field has broadened, and demonstrate in our experiments the robustness of the IVs method's prediction accuracy both for the overall forecast and for the cold-start scenario.</p>
        </list-item>
        <list-item>
          <p>Notably, our research advances the concept of utilizing first-stage heterogeneity in the IVs method in the context of prediction. By considering heterogeneity in the strength of IVs with respect to impression probability, our method shows markedly more robust prediction performance both overall and in the cold-start scenario.</p>
        </list-item>
      </list>
    </sec>
    <sec id="sec-2">
      <title>2. Identification of Instrumental Variables in Ad Auctions</title>
      <p>2.1. Ad Auctions and Biases</p>
      <p>Before explaining why bid amounts serve as IVs, we describe
the ad auction setting, because examining the actual flow of
data generation is essential to ascertaining the IVs.</p>
      <p>The notation used to describe the auction mechanism
is as follows. The total number of auctions is $N$, the number
of auctioneers participating in auction $i \in \{1, \cdots, N\}$ is $J_i$,
and the auctioneer's advertisement is $j_i \in \{1, \cdots, J_i\}$. Let
$b_{j_i}$ be the bid amount that the auctioneer spends on ad $j_i$,
$\mathrm{pCTR}_{j_i}$ be the predictive click-through rate, and $j_i^{*}$ be
the ad that wins an impression to the user in auction $i$.
Also, $y_{j_i}$ is the outcome that is 1 if ad $j_i$ is clicked and 0 if
not, and $x_{j_i}$ is a vector of variables used to target ads and users in
ad $j_i$. To simplify complex effects such as position bias, we
assume a setting where only one ad wins an
impression. Therefore, let $d_{j_i}$ be a binary dummy that is 1
when $j_i = j_i^{*}$ and 0 otherwise. Also, let $y_{j_i^{*}}$ be the outcome
that is 1 if the ad $j_i^{*}$ is clicked and 0 otherwise.
Here, $\mathrm{pCTR}_{j_i}$ is as follows:
        <disp-formula>$$ \mathrm{pCTR}_{j_i} = P(y_{j_i} = 1 \mid d_{j_i} = 1, x_{j_i}), $$</disp-formula>
where $\mathrm{pCTR}_{j_i}$ is the probability that ad $j_i$ will be
clicked given a winning impression, the targeting variables, and other
variables.</p>
      <p>
        In ad auctions, there can be various methods for
determining auction scores. Here, for instance, the auction score
$s_{j_i}$ of ad $j_i$ is calculated as follows:
        <disp-formula>$$ s_{j_i} = b_{j_i} \times \mathrm{pCTR}_{j_i}. $$</disp-formula>
This determination scheme, which takes into account the bid
amount and the predictive CTR in the auction score, has been
studied under the name “weighted GSP” [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ]. When the
bid amount is a manual bid by the auctioneer, it is generated
from the distribution of bid amounts conditional on the
targeting variables of the ad set by the auctioneer. Alternatively,
when the bid amount is an automated bid by the platform,
it is generated from, for example, the predictive
conversion rate (pCVR) and the target CPA. In this case, the bid amount $b_{j_i}$
is a function of the variables $x_{j_i}$. That is, bid amounts are generated from
some distribution conditioned on the targeting variables of
the ad set by the auctioneer or other variables used by the
platform. Thus,
        <disp-formula>$$ b_{j_i} \sim G(x_{j_i}), $$</disp-formula>
where $G(\cdot)$ is the generating distribution of bid amounts.
      </p>
      <p>
        As summarized by [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], bias in the recommendation
system is a looping process. Figure 1 depicts the loop of
several interdependent biases, focusing on the ad auction setting.
In particular, the auction score will be biased
if the platform's pCTR prediction is a biased estimator.
The same is true for pCVR and the adjustment term. The assignment
of impressions by the biased auction score is as follows:
        <disp-formula>$$ j_i^{*} = \arg\max_{j_i \in \{1, \cdots, J_i\}} s^{\mathrm{biased}}_{j_i}, $$</disp-formula>
where $s^{\mathrm{biased}}_{j_i}$ denotes the biased auction score of ad $j_i$.
      </p>
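      <p>The score-based assignment described above can be sketched in a few lines of Python. This is a minimal illustration under the single-slot assumption; the Bid class, the ad identifiers, and all numbers are hypothetical and are not taken from the paper's data.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class Bid:
    ad_id: str    # hypothetical identifier, for illustration only
    bid: float    # bid amount of the ad
    pctr: float   # platform's predicted CTR for the ad


def auction_score(b: Bid) -> float:
    """Auction score = bid amount x predicted CTR."""
    return b.bid * b.pctr


def pick_winner(bids: list[Bid]) -> Bid:
    """Return the ad with the highest auction score (one impression per auction)."""
    return max(bids, key=auction_score)


bids = [
    Bid("ad_a", bid=2.0, pctr=0.010),
    Bid("ad_b", bid=1.0, pctr=0.030),
    Bid("ad_c", bid=5.0, pctr=0.004),
]
winner = pick_winner(bids)

# Impression dummy: 1 for the winning ad, 0 for every other ad in the auction.
impressions = {b.ad_id: int(b is winner) for b in bids}
```
]]></preformat>
      <p>If the pctr values fed into the score are biased, the same argmax simply selects winners from biased scores, which is how the bias propagates into the impression data.</p>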
      <sec id="sec-2-1">
        <title>2.2. Causal View of Online Ad Auctions</title>
        <p>The unobserved heterogeneity of click behavior, $u_{j_i}$,
correlates with some or all of $x_{j_i}$, which consists of user and
ad features, but cannot be observed; it is known as the omitted
variable. $f^{*}(\cdot)$ is a function that returns a predictive probability
when $d_{j_i} = 1$.</p>
        <p>Treatments are determined in the auction system together
with predicted values such as pCTR and pCVR, which are
conditioned on the user and ad features involved in ad
auctions, and the advertiser's bid amount. At this point,
pCTR and pCVR are not conditioned on the omitted variables
$u_{j_i}$, which generates a bias in the estimates of the predictive
outcome. Since the bid amount is determined from the
predictions with this bias and an auction is formed, there is a
strong suspicion that the impressions $d_{j_i}$ are endogenous
variables, that is, variables correlated with the error term,
with the omitted variable bias amplified through the auction.
We consider the assumption that no omitted variables
exist to be a type of inductive bias, a convenient assumption
for the pCTR model.</p>
        <p>Unconfoundedness, i.e., a situation where no omitted
variables exist, is a somewhat severe assumption for
real-world data. Therefore, IVs methods that do not require the
assumption of unconfoundedness can be compelling and
valuable.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.3. Validating Bid Amounts as IVs</title>
        <p>There are three conditions that valid IVs satisfy. The first is
the relevance of the IVs to the treatment variable. The second
is the exclusion restriction: the IVs do not directly
affect the outcome but affect it only through
the treatment variable. The third is the independence of the
IVs with respect to the treatment and the outcome. Denoting the
IVs vector in ad $j_i$ as $z_{j_i}$ and combining these conditions, we
can write them as follows, all conditional on $x_{j_i}$:</p>
        <p>Relevance:
          <disp-formula>$$ z_{j_i} \not\perp d_{j_i}, $$</disp-formula>
Exclusion restriction:
          <disp-formula>$$ y_{j_i} \mid d_{j_i} \perp z_{j_i}, $$</disp-formula>
Independence:
          <disp-formula>$$ \{ y_{j_i}(d), d_{j_i}(z) \} \perp z_{j_i}, $$</disp-formula>
where $d_{j_i}(z)$ and $y_{j_i}(d)$ denote the potential treatment and the potential outcome.</p>
        <p>We argue that bid amounts are valid IVs in ad auctions,
for the following reasons. Regarding
impressions, the relevance is explicitly acknowledged by the
fact that the main item in the auction score is the bid amount.
Concerning the exclusion restriction, the bid amount
influences impressions only through the auction score.
Therefore, the bid amount does not directly influence the user's click
behavior. Conditional on the variables used by advertisers
and platforms to set bid amounts, bid amounts are valid
instruments.</p>
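        <p>To see why a valid instrument helps when an omitted variable is present, the following toy simulation may be useful. It is a sketch under strong simplifying assumptions (a linear model with a continuous treatment and a single instrument, estimated with the ratio-of-covariances IV estimator), not the neural-network method proposed in this paper; all coefficients and variable names are illustrative.</p>
        <preformat><![CDATA[
```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


rng = random.Random(0)
n = 20_000
true_effect = 1.0  # assumed causal effect of the treatment d on the outcome y

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument, independent of u
u = [rng.gauss(0, 1) for _ in range(n)]  # omitted variable, never observed

# Treatment depends on the instrument (relevance) and on the omitted variable.
d = [0.8 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]

# Outcome depends on z only through d (exclusion), and on u directly.
y = [true_effect * di + 2.0 * ui + rng.gauss(0, 1) for di, ui in zip(d, u)]

# Naive regression slope of y on d absorbs the effect of u and is biased.
naive = cov(d, y) / cov(d, d)

# IV estimator uses only the variation in d that is induced by z.
iv = cov(z, y) / cov(z, d)
```
]]></preformat>
        <p>In this setup the naive slope overstates the effect because d and u move together, whereas the IV estimate stays close to the assumed true effect.</p>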
      </sec>
      <sec id="sec-2-3">
        <title>2.4. Reasons Other Variables are Not Valid</title>
        <p>Here, we explain why other variables, such as the bid times
used for targeting, do not meet the conditions of an
instrumental variable in ad auctions.</p>
        <p>Relevance: Take targeting variables as an example.
From the perspective of relevance, advertisers determine
bid amounts based on the users they target, so targeting variables
should relate to the probability of assignment. However, bid amounts influence
the auction score directly, ensuring stronger relevance
than targeting variables, which have only an
“indirect” relevance to the auction score.</p>
        <p>Conditional Independence: The more crucial
condition, however, is that targeting variables do not satisfy
independence from the unobserved factors affecting the
user's probability of clicking. For instance, consider bid
times as one of the targeting variables. The time when a
user requests an advertisement, that is, the user's visitation
process, and the probability of clicking the ad can be
related. Users visiting at 10 AM may have a higher or lower
probability of clicking an ad, and even after conditioning on
other targeting variables, the presence of unobserved
factors makes it impossible to guarantee the independence of
bid times from the click probability. On the other hand, the
probability that a user will click is considered independent
of the bid amount, conditional on the targeting variables,
since the user cannot know how much was paid for the
specific advertisement at the time of the click.</p>
        <p>Exclusion Restriction: From the perspective of the
exclusion restriction, targeting variables affect the probability
of a user's click, and there is no guarantee that their influence on the
click probability is exerted solely through the assignment
of impressions.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Click Prediction with First-stage IVs Heterogeneity</title>
      <p>In the methodology section, we propose several variants of
the IVs method to examine the following questions:</p>
      <list list-type="bullet">
        <list-item>
          <p>Q.1 Do prediction methods using simple neural networks with IVs perform well in the online ad auction setting?</p>
        </list-item>
        <list-item>
          <p>Q.2 Is IVs heterogeneity strongly present in online ad auction settings, and is explicitly addressing it effective in prediction?</p>
        </list-item>
        <list-item>
          <p>Q.3 Heterogeneity in treatment effects is widely known, but how much improvement does it yield relative to accounting for heterogeneity in IVs?</p>
        </list-item>
      </list>
      <p>To introduce models that respond to those questions, the
methodology section is organized as follows. For Q.1, we
first introduce the basic structure of the nonparametric IVs
method and highlight its heterogeneous relevance to the
probability of winning impressions in ad auctions. Next, for
Q.2, we present a method based on an attention network
that explicitly considers interactions between IVs and
other features. Finally, for Q.3, we explicitly incorporate
heterogeneity in click probabilities by employing an interaction
structure similar to that used for the heterogeneity of instrumental
variables. Figure 3 summarizes our proposed final IVs method.</p>
      <p>For simplicity in subscripting the training data, the subscript $i$
corresponds to the record number in this section.</p>
      <p>The first-stage function $\pi(\cdot)$ is a function of multiple IVs, and we assume that $y_i$ depends on
$z_i$ only through $\pi(z_i, x_i)$; we call this the first stage. $f^{*}$ is a function
that returns a predictive probability of the event $y_i = 1$,
which we call the second stage. In ad auctions, $\pi(z_i, x_i)$
is the predicted impression probability, henceforth pIMP.
The two stages form a multi-task learning framework in which pIMP can be trained
in one step together with pCTR. Using neural networks,
a layer structure that follows this simplified
IVs scheme can be used, which we henceforth refer to as the IV-BS
approach.</p>
      <p>Although there can be several approaches to incorporating
interactions between features and IVs, we use an attention
network, because it suffices for validating
the idea of bid amount heterogeneity.</p>
      <sec id="sec-4-2">
        <title>3.2. Leveraging First-Stage IVs by Interactions</title>
        <p>Given a dataset, let the input feature matrix be represented
as $H$ after passing through an input layer where all units are
fully connected, including units from the IVs
and the features.</p>
        <p>Let $B$ denote the batch size and $K$ represent the number of
units in the input layer, so that $H$
has dimensions
$B \times K$. The instrumental variable, represented as matrix
$Z$, has dimensions $B \times 1$. To align with the shape of $H$,
matrix $Z_{\mathrm{iv}}$ is formed by performing a tiling operation on $Z$.
Specifically, each row of $Z$ is replicated across the
$K$ columns of $H$. Furthermore, the weight matrix for
the IVs interaction is denoted as $W_{\mathrm{iv}}$ and has dimensions $K \times K$.
Using these matrices, the attention score $A_{\mathrm{iv}}$ is calculated
as:
          <disp-formula>$$ A_{\mathrm{iv}} = \sigma\left( W_{\mathrm{iv}} (Z_{\mathrm{iv}} \odot H) + b_{\mathrm{iv}} \right), $$</disp-formula>
where $\sigma(\cdot)$ denotes the activation function and $b_{\mathrm{iv}}$ is a bias term.</p>
        <p>Here, we use the swish function as the activation function for
the weight matrix $W_{\mathrm{iv}}$ so as to represent the non-linear
strength in the heterogeneity of bid amounts.
We feed the
element-wise products as interactions into the fully
connected layer with the softmax function as the activation
function to generate the attention score $A_{\mathrm{iv}}$. Then, we
obtain the representation $g_{\mathrm{iv}}$ by the element-wise product of
the input layer $H$ and the generated attention scores $A_{\mathrm{iv}}$:
          <disp-formula>$$ g_{\mathrm{iv}} = A_{\mathrm{iv}} \odot H. $$</disp-formula>
We combine the representation $g_{\mathrm{iv}}$ obtained by the attention
layer and the features input in a fully connected neural
network to form the hidden layer.</p>
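        <p>The interaction layer described above might be sketched as follows in plain Python. This is one possible reading, not the authors' implementation: it assumes the swish function is applied to the affine transform of the element-wise products and that a row-wise softmax then normalizes the result into attention scores. The function name iv_attention, the argument names, and the example values are hypothetical.</p>
        <preformat><![CDATA[
```python
import math


def swish(v):
    """Swish activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def iv_attention(H, z, W, bias):
    """Return (attention scores, representation), both of shape B x K.

    H: B x K input-layer activations, z: length-B instrument values,
    W: K x K interaction weights, bias: length-K bias vector.
    """
    K = len(bias)
    scores, g = [], []
    for h_row, z_b in zip(H, z):
        # Tiling z across the K columns and multiplying element-wise with H.
        inter = [z_b * h for h in h_row]
        # Affine transform of the interactions.
        pre = [sum(inter[k] * W[k][j] for k in range(K)) + bias[j] for j in range(K)]
        # Swish non-linearity, then softmax to obtain attention scores.
        a_row = softmax([swish(p) for p in pre])
        scores.append(a_row)
        # Attention-weighted representation of the input layer.
        g.append([a * h for a, h in zip(a_row, h_row)])
    return scores, g


H = [[1.0, 2.0], [3.0, 4.0]]
z = [0.0, 1.5]
W = [[0.5, -0.2], [0.1, 0.3]]
bias = [0.0, 0.0]
scores, g = iv_attention(H, z, W, bias)
```
]]></preformat>
        <p>With a zero instrument value and zero bias, the interactions vanish and the attention scores fall back to a uniform weighting, so any departure from uniform weights in this sketch comes from the instrument interacting with the features.</p>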
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Second-stage Heterogeneity</title>
        <p>In the second stage, namely on the pCTR
side, heterogeneity in the effect of impressions evidently
exists when conditioning on user and
advertisement features.
Just as we took the element-wise product of bid amounts
and feature units in the input layer in the first stage, we
symmetrically use the same structure in the second stage. The input
layer consists of fully connected units from pIMP and
features. The structure of the entire network, including pIMP
and pCTR, is summarized in Figure 3.</p>
      </sec>
      <sec id="sec-4-4">
        <title>3.4. Loss Function for Multi-task Learning</title>
        <p>In the multi-task learning framework for pIMP and pCTR,
we adjust the loss function for pCTR by applying sample
weights through an indicator function, $\mathbb{1}\{d_i = 1\}$:
          <disp-formula>$$ \mathcal{L}^{\mathrm{weighted}}_{\mathrm{pCTR}} = \mathcal{L}_{\mathrm{pCTR}} \times \mathbb{1}\{d_i = 1\}. $$</disp-formula>
This function ensures that the pCTR loss
is computed only
for data points with impressions, that is, when $d_i = 1$, so that
instances without impressions do not affect the pCTR loss
calculation. This approach allows us to concentrate on the
performance of the model in predicting CTR.</p>
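        <p>The masking of the pCTR loss can be illustrated with a small Python sketch. It assumes a log-loss objective for both tasks and sums over records; how the two task losses are combined and normalized is not specified here, so the function returns them separately. The function names and example values are hypothetical.</p>
        <preformat><![CDATA[
```python
import math


def log_loss(y, p, eps=1e-12):
    """Binary cross-entropy for one record, with clipping for stability."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def multitask_losses(d, y, p_imp, p_ctr):
    """Return (pIMP loss over all records, pCTR loss over impressed records).

    d: impression indicators, y: click outcomes,
    p_imp: predicted impression probabilities, p_ctr: predicted CTRs.
    """
    imp_loss = sum(log_loss(di, pi) for di, pi in zip(d, p_imp))
    # The indicator d_i zeroes out the pCTR loss of records without impressions.
    ctr_loss = sum(di * log_loss(yi, pc) for di, yi, pc in zip(d, y, p_ctr))
    return imp_loss, ctr_loss


d = [1, 0, 1]
y = [1, 0, 0]
p_imp = [0.9, 0.2, 0.6]
p_ctr = [0.7, 0.99, 0.1]
imp_loss, ctr_loss = multitask_losses(d, y, p_imp, p_ctr)

# Changing the pCTR prediction of the non-impressed record leaves ctr_loss unchanged.
_, ctr_loss_alt = multitask_losses(d, y, p_imp, [0.7, 0.01, 0.1])
```
]]></preformat>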
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Experiments</title>
      <p>The experimental section is divided into two parts:
simulation and evaluation in scenarios approximating the
cold-start problem with real datasets. The code for replication
is available at the following link:
https://github.com/ryoheiemori/NPIV-pCTR. Please note that the repository excludes
sections related to private data.</p>
      <p>The notation is consistent with that used in Section 3.</p>
      <sec id="sec-5-1">
        <title>4.1. Simulated Datasets</title>
        <p>Algorithm 1 Simulating auction data and validating
baselines</p>
        <preformat>1. Initializing parameters:
  Set parameters ($\alpha$, $\beta$, $\gamma$)
  $n := 0$
  while $n$ &lt; 5,000 do
    Generate $x$ and $u$
    $d \sim \mathrm{Bernoulli}(p_d)$, where $p_d = \mathrm{Logistic}(x'\alpha + u)$
    if $d = 1$ then
      $y \sim \mathrm{Bernoulli}(p_y)$, where $p_y = \mathrm{Logistic}(x'\beta + u)$
      $n := n + 1$
    end if
  end while
  Train pCTR: $P(y = 1 \mid d = 1) := f(x)$
2. Generating historical auction data:
  for each auction $i$ in 5,000 do
    $J_i = 20$
    Generate $x$ and $u$
    $b \sim \mathrm{Beta}(\mu, 2)$ by [<xref ref-type="bibr" rid="ref14">14</xref>], where $\mu := \mathrm{Logistic}(x'\gamma)$
    …</preformat>
        <p>… a specific distribution: Uniform[−5, 5] for $k \in \{1, \cdots, 10\}$,
… from a normal distribution with a mean of 0.1 and variance
… of conditional independence between the treatment $d$ and …</p>
        <p>4.2.2. Test data</p>
        <p>The prediction baselines are evaluated on test data from the day after
the 7 days of training data. The test dataset
consists of all independently displayed records conditional
on ads’ targeting variables.</p>
        <p>To evaluate the model’s performance in cold-start
scenarios, the test data was divided based on previous ad
impressions. Specifically, the data was split into 20 subsets at every
5% quantile, with each subset containing the data points below
the respective quantile. To ensure a sufficient sample size,
the test data included 2,000,000 records. Predicting clicks
for ads with more past impressions is generally easier, even with a
simple baseline.</p>
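        <p>The cumulative quantile split described above could be implemented along the following lines. This sketch assumes the split is made by rank, so that the k-th subset holds the lowest 5k percent of records by previous impressions; a split by value thresholds would treat ties differently. The function name and the toy input are hypothetical.</p>
        <preformat><![CDATA[
```python
def cumulative_quantile_subsets(prev_impressions, n_bins=20):
    """Map each quantile level (in percent) to the indices of records at or below it."""
    # Record indices sorted by the number of previous impressions, ascending.
    order = sorted(range(len(prev_impressions)), key=lambda i: prev_impressions[i])
    n = len(order)
    subsets = {}
    for k in range(1, n_bins + 1):
        cutoff = (n * k) // n_bins          # size of the lowest k/n_bins share
        subsets[100 * k // n_bins] = order[:cutoff]
    return subsets


# Toy input: 100 records whose previous impression counts run from 100 down to 1.
prev = list(range(100, 0, -1))
subsets = cumulative_quantile_subsets(prev)
rarest = sorted(prev[i] for i in subsets[5])   # the 5% of records with fewest impressions
```
]]></preformat>
        <p>Because the subsets are nested, the smallest ones isolate the ads closest to a cold start, while the last subset coincides with the full test data.</p>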
      </sec>
      <sec id="sec-5-2">
        <title>4.3. Evaluation Score</title>
        <p>We used log loss, known as a standard evaluation metric for
pCTR, and the area under the curve (AUC) scores. AUC is a
proper metric for evaluating rankings, as it assesses the ability
to predict the correct position in auction rankings. For the
simulation data, we employ the actual scores and relative
scores to compare improvements. For our real dataset, we
present relative evaluation scores due to confidentiality. The
relative scores are defined as follows:
          <disp-formula>$$ \text{Relative LogLoss} = \frac{\text{Naive LogLoss} - \text{Compared LogLoss}}{\text{Naive LogLoss}} \times 100, $$</disp-formula>
          <disp-formula>$$ \text{Relative AUC} = \left( \frac{\text{Compared AUC} - 0.5}{\text{Naive AUC} - 0.5} - 1 \right) \times 100. $$</disp-formula>
        </p>
        <p>To evaluate our proposed methods with instrumental
variables, we took a naive benchmark and comparative
baselines.</p>
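        <p>As a quick check of the two relative scores, the definitions can be written as small Python functions. The input values in the example are made up for illustration and are not results from the paper.</p>
        <preformat><![CDATA[
```python
def relative_logloss(naive, compared):
    """Percentage reduction in log loss relative to the Naive benchmark."""
    return (naive - compared) / naive * 100.0


def relative_auc(naive, compared):
    """Percentage gain in AUC above the 0.5 chance level, relative to Naive."""
    return ((compared - 0.5) / (naive - 0.5) - 1.0) * 100.0


# Illustrative inputs only: a model that lowers log loss from 0.50 to 0.45
# and raises AUC from 0.70 to 0.75.
rel_ll = relative_logloss(naive=0.50, compared=0.45)   # roughly 10 percent
rel_auc = relative_auc(naive=0.70, compared=0.75)      # roughly 25 percent
```
]]></preformat>
        <p>Both scores are positive when the compared method improves on Naive, and subtracting 0.5 in the AUC score measures the gain relative to a random ranking.</p>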
        <list list-type="order">
          <list-item>
            <p>Naive: Naive has three hidden layers between the input layer of features and the sigmoid function, building a pCTR model. Each of these hidden layers consists of 256 units. The first layer uses the swish activation function, while the second and third layers use the ReLU activation function.</p>
          </list-item>
          <list-item>
            <p>IV-BS: The baseline is described in Section 3.1. Its pCTR model has the same network structure as Naive, including pIMP in the input layer.</p>
          </list-item>
          <list-item>
            <p>IV-FS: The baseline is described in Section 3.2. On the pCTR side, it has the same network structure as IV-BS.</p>
          </list-item>
          <list-item>
            <p>IV-SSFS: The baseline on the pCTR side is described in Section 3.3, while its network has the same structure as IV-FS on the pIMP side.</p>
          </list-item>
          <list-item>
            <p>
              UBIPS: It consists of pIMP times pCTR for the unbiased inverse propensity weighting estimator [
              <xref ref-type="bibr" rid="ref15">15</xref>
              ]. Its network structure is consistent with that of IV-BS.
            </p>
          </list-item>
        </list>
      </sec>
      <sec id="sec-5-3">
        <title>4.5. Comparing Baselines</title>
        <p>[Figure: LogLoss of Naive, IV-BS, and UBIPS on the simulated data.]</p>
        <p>… performance even with omitted variables. IV-BS remains
stable and robust, especially on the left side of the figure.
Notably, omitted variable bias cannot
be ignored even in the weighted GSP impression assignment
algorithm, and in this regard, IV-BS demonstrates superior
performance.</p>
        <p>An evaluation of our proposed methods on the real dataset
is shown in Figure 5. Naive is expected to perform
relatively well since the training data includes many ads with
numerous impressions. However, our proposed methods,
IV-BS, IV-FS, and IV-SSFS, show significant improvement
in relative AUC, particularly for ads with few previous
impressions. The improvement of UBIPS over Naive, unlike
in the simulation experiment, is likely attributable to the
confounder being associated with the variables observed in
the actual data.</p>
        <p>Improvement for ads with few impressions matches that
for ads with many, likely due to the infrequent inclusion
of rare ads in the training data, which causes popularity bias.
Notably, the increasing improvement of the IVs methods over the
0–20 quantile range of previous impressions demonstrates their
robustness in predicting rare ads.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusion</title>
      <p>This paper argues that the bid amount is a valid
instrumental variable under the assumption of conditional
independence, and tests its validity by applying it to predictive
CTR. Our experiment on a real dataset showed that explicitly
accounting for heterogeneity in the strength of IVs allows
for efficient and robust predictions. For greater
extensibility, incorporating complex interactions between IVs and
other features with more developed approaches such as graph
neural networks is recommended. Additionally, addressing
other looping biases and validating prediction methods in
repeated auctions would be valuable.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V.</given-names>
            <surname>Marotta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , A. Acquisti,
          <article-title>The welfare impact of targeted advertising technologies</article-title>
          ,
          <source>Information Systems Research</source>
          <volume>33</volume>
          (
          <year>2022</year>
          )
          <fpage>131</fpage>
          -
          <lpage>151</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1287/isre.2021.1024</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <article-title>Bias and debias in recommender system: A survey and future directions</article-title>
          ,
          <source>ACM Transactions on Information Systems</source>
          <volume>41</volume>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bühlmann</surname>
          </string-name>
          ,
          <article-title>Invariance, causality and robustness</article-title>
          ,
          <source>Statistical Science</source>
          <volume>35</volume>
          (
          <year>2020</year>
          )
          <fpage>404</fpage>
          -
          <lpage>426</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <article-title>CausPref: Causal preference learning for out-of-distribution recommendation</article-title>
          ,
          <source>in: Proceedings of the ACM Web Conference 2022</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>410</fpage>
          -
          <lpage>421</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Feder</surname>
          </string-name>
          , G. Horowitz,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Reichart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rosenfeld</surname>
          </string-name>
          ,
          <article-title>In the eye of the beholder: Robust prediction with causal user modeling</article-title>
          ,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>35</volume>
          (
          <year>2022</year>
          )
          <fpage>14419</fpage>
          -
          <lpage>14433</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hartford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Leyton-Brown</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Taddy</surname>
          </string-name>
          ,
          <article-title>Deep IV: A flexible approach for counterfactual prediction</article-title>
          ,
          <source>in: International Conference on Machine Learning, PMLR</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>1414</fpage>
          -
          <lpage>1423</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Si</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-R.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <article-title>A model-agnostic causal learning framework for recommendation using search data</article-title>
          ,
          <source>in: Proceedings of the ACM Web Conference 2022</source>
          , WWW '22, Association for Computing Machinery, New York, NY, USA,
          <year>2022</year>
          , pp.
          <fpage>224</fpage>
          -
          <lpage>233</lpage>
          . URL: https://doi.org/10.1145/3485447.3511951. doi:10.1145/3485447.3511951.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G. W.</given-names>
            <surname>Imbens</surname>
          </string-name>
          ,
          <article-title>Instrumental variables: An econometrician's perspective</article-title>
          ,
          <source>Statistical Science</source>
          <volume>29</volume>
          (
          <year>2014</year>
          )
          <fpage>323</fpage>
          -
          <lpage>358</lpage>
          . URL: http://www.jstor.org/stable/43288511.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Belloni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chernozhukov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <article-title>Sparse models and methods for optimal instruments with an application to eminent domain</article-title>
          ,
          <source>Econometrica</source>
          <volume>80</volume>
          (
          <year>2012</year>
          )
          <fpage>2369</fpage>
          -
          <lpage>2429</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Abadie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <article-title>Instrumental variable estimation with first-stage heterogeneity</article-title>
          ,
          <source>Journal of Econometrics</source>
          (
          <year>2023</year>
          )
          <fpage>105425</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Leyton-Brown</surname>
          </string-name>
          ,
          <article-title>Revenue optimization in the generalized second-price auction</article-title>
          ,
          <source>in: Proceedings of the Fourteenth ACM Conference on Electronic Commerce</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>837</fpage>
          -
          <lpage>852</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <article-title>Optimal reserve prices in weighted GSP auctions</article-title>
          ,
          <source>Electronic Commerce Research and Applications</source>
          <volume>13</volume>
          (
          <year>2014</year>
          )
          <fpage>178</fpage>
          -
          <lpage>187</lpage>
          . URL: https://www.sciencedirect.com/science/article/pii/S1567422314000106. doi:https://doi.org/10.1016/j.elerap.2014.02.003.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Frölich</surname>
          </string-name>
          ,
          <article-title>Nonparametric IV estimation of local average treatment effects with covariates</article-title>
          ,
          <source>Journal of Econometrics</source>
          <volume>139</volume>
          (
          <year>2007</year>
          )
          <fpage>35</fpage>
          -
          <lpage>75</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ferrari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Cribari-Neto</surname>
          </string-name>
          ,
          <article-title>Beta regression for modelling rates and proportions</article-title>
          ,
          <source>Journal of Applied Statistics</source>
          <volume>31</volume>
          (
          <year>2004</year>
          )
          <fpage>799</fpage>
          -
          <lpage>815</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Saito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yaginuma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nishino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Sakata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Nakata</surname>
          </string-name>
          ,
          <article-title>Unbiased recommender learning from missing-not-at-random implicit feedback</article-title>
          ,
          <source>in: Proceedings of the 13th International Conference on Web Search and Data Mining</source>
          , WSDM '20,
          Association for Computing Machinery, New York, NY, USA,
          <year>2020</year>
          , pp.
          <fpage>501</fpage>
          -
          <lpage>509</lpage>
          . URL: https://doi.org/10.1145/3336191.3371783. doi:10.1145/3336191.3371783.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>