<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>ACM SIGIR Workshop on eCommerce, July</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Long or Short or Both? An Exploration on Lookback Time Windows of Behavioral Features in Product Search Ranking</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Qi Liu</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Atul Singh</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jingbo Liu</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cun Mu</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zheng Yan</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Pedersen</string-name>
        </contrib>
        <aff>Walmart Global Tech, Hoboken, U.S.A.</aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>18</volume>
      <issue>2024</issue>
      <abstract>
        <p>Customer shopping behavioral features are core to product search ranking models in eCommerce. In this paper, we investigate the effect of lookback time windows when aggregating these features at the (query, product) level over history. By studying the pros and cons of using long and short time windows, we propose a novel approach to integrating historical behavioral features over different time windows. In particular, we address the criticality of using query-level vertical signals in ranking models to effectively aggregate all information from different behavioral features. Anecdotal evidence for the proposed approach is also provided using live product search traffic on Walmart.com.</p>
      </abstract>
      <kwd-group>
        <kwd>Online shopping</kwd>
        <kwd>product search ranking</kwd>
        <kwd>learning to rank</kwd>
        <kwd>feature engineering</kwd>
        <kwd>behavioral features</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Online shopping has become an indispensable part of people’s daily lives due to its convenience,
wide selection, cost-effectiveness, and mobile accessibility. With an ever-increasing catalog
size, the product search ranking system [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4 ref5 ref6">1, 2, 3, 4, 5, 6</xref>
        ] has been playing a pivotal role in serving
customers by ranking relevant products at the top of their search results.
      </p>
      <p>
        At the heart of every modern eCommerce product search ranking system lies a
machine-learned ranking model. For example, LambdaRank/MART [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ] leverages gradient boosting
machines [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], and neural ranker [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] employs deep learning techniques. These models evaluate
and assign scores to each product based on a wide range of input signals derived from diverse
sources, including user behaviors, query intents, product attributes, seller reputations, and
sophisticated interactions among them.
      </p>
      <p>
        Out of these many hundreds or even thousands of signals, behavioral features hold significant
importance as they are generated through direct interactions between customers and products,
encompassing actions like impressions, clicks, add-to-carts (ATCs), purchases, and others.
Several studies [
        <xref ref-type="bibr" rid="ref1 ref11 ref12 ref13 ref14">1, 11, 12, 13, 14</xref>
        ] have emphasized the pivotal role of such implicit relevance
feedback [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] in product ranking. In the eCommerce context, customers are the ultimate
authorities in determining the relevance of products for a given query, particularly when their
judgment is backed by their purchasing decisions. Moreover, such logged customer feedback is
abundant and cheap to obtain in today’s operational systems. Hence, it is very natural that
(query, product)-level behavioral features are the ones that ranking models rely on the most
when ranking products.
      </p>
      <p>In spite of the rich and growing literature on leveraging (query, product)-level behavioral
features and their variants in product search ranking, one largely unaddressed problem is what
lookback time window should be used to aggregate customer engagement at the (query,
product) level. This is a critical design question for all practitioners applying these
essential behavioral features in their ranking systems. In this paper, we share empirical
insights from our first-hand industrial experience. In particular, we explore behavioral features
aggregated over different lookback time windows and study their respective effects on product
search ranking. Based upon the pros and cons of using long and short time windows, we
propose a principled approach to integrating both sets of behavioral features into the model.
The effectiveness of this hybrid model is justified on real product search traffic at Walmart.com
through online A/B tests.</p>
      <p>The remainder of the paper is organized as follows. Section 2 discusses the pros and cons
of using behavioral features with long and/or short windows. Section 3 details the proposed
enhancement to achieve the best integration of both long and short windows. Section 4 describes
the comprehensive online A/B experiment conducted to evaluate the proposed ranking model.
Finally, Section 5 summarizes our findings and draws conclusions.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Long or Short or Both?</title>
      <p>In this section, we will explore the effect of different lookback time window lengths when
leveraging behavioral features in product search ranking models. Three types of (query,
product)-level user engagement are considered: click rate, add-to-cart (ATC) rate, and order
rate. To compute these rates, for a given query (q) and product (p) pair (q, p), we employ the
Beta-Binomial Bayesian model and derive the behavioral feature value b_{q,p} as the posterior
mean of the following Beta distribution,</p>
      <p>Beta( Σ_{t∈T} c_{q,p}(t) + α,  Σ_{t∈T} e_{q,p}(t) − Σ_{t∈T} c_{q,p}(t) + β ),   (1)</p>
      <p>where α and β specify the prior Beta distribution, c_{q,p}(t) is the raw count of the behavior
(clicks, ATCs, or orders) for (q, p) on day t, e_{q,p}(t) is the raw count of customer examinations
of (q, p) on day t, and T is the collection of lookback dates we use to aggregate the engagement
data. In particular, the following behavioral features are output to our ranking model,</p>
      <p>b_{q,p} = ( Σ_{t∈T} c_{q,p}(t) + α ) / ( Σ_{t∈T} e_{q,p}(t) + α + β ),   (2)</p>
      <p>
        which is quite similar to the behavioral features defined in [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] but smoothed with a prior in order to better address the cold start problem [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
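      <p>For concreteness, the smoothed feature of Equation (2) can be computed as below (a minimal sketch: the prior values α = 1, β = 10 and the daily counts are hypothetical, not our production settings).</p>

```python
def behavioral_feature(daily_counts, daily_examines, alpha=1.0, beta=10.0):
    """Posterior mean of the Beta-Binomial model in Equation (2).

    daily_counts / daily_examines: raw per-day engagement counts for a
    (query, product) pair over the lookback window T; alpha and beta are
    hypothetical prior parameters of the Beta distribution.
    """
    c = sum(daily_counts)      # total engagements (clicks, ATCs, or orders)
    e = sum(daily_examines)    # total customer examinations
    return (c + alpha) / (e + alpha + beta)

# A (query, product) pair with engagement over a 3-day lookback window:
rate = behavioral_feature([2, 0, 1], [10, 5, 8])   # (3 + 1) / (23 + 1 + 10)

# A cold-start pair with no history falls back to the prior mean:
cold = behavioral_feature([], [])                  # alpha / (alpha + beta)
```

      <p>With no engagement data, the feature reduces to the prior mean, which is how the Bayesian smoothing mitigates the cold start problem noted above.</p>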
      <p>As shown in Equation (2), one critical factor influencing the values and interpretation of
behavioral features is the lookback time window length |T| used to aggregate engagements.
Utilizing a longer time window captures long-term customer engagement patterns but may
overlook recent trends.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption><p>Pros and cons of long and short lookback time windows for behavioral features.</p></caption>
        <table>
          <thead>
            <tr><th/><th>Pros</th><th>Cons</th></tr>
          </thead>
          <tbody>
            <tr>
              <td>Long</td>
              <td>rich in historical engagement data; robust to noise</td>
              <td>insensitive to recent behavioral changes from customers; frictional to new products</td>
            </tr>
            <tr>
              <td>Short</td>
              <td>good at capturing recent behavioral changes from customers; friendly to new products</td>
              <td>sparse in historical engagement data; prone to noise</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-3-4">
        <p>Conversely, a short time window highlights more short-term behaviors but may not
accurately reflect enduring customer interests. Both long and short time windows for
aggregating behavioral features thus present distinct advantages and disadvantages, as outlined
in Table 1.</p>
        <p>To investigate the impacts of long- and short-term behavioral features, we define 2 years
(|T| = 730 days) as the long lookback time window and 1 month (|T| = 30 days) as the short one,
and we specify the ranking model with only 2-year behavioral features as the baseline. Three
distinct ranking models with different designs of window lengths are proposed below.
• Baseline Model: model with only 2-year behavioral features.
• Model A: model with only 1-month behavioral features.</p>
        <p>• Model B: model with both 2-year and 1-month behavioral features.</p>
        <p>
          Our search ranking models are trained using XGBoost [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] with the Learning-to-Rank (LTR)
framework [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] very similar to [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] by utilizing data from a truncated historical period of online
customer search traffic on Walmart.com for model training.
        </p>
        <p>
          To explore the best usage of lookback time windows, we conducted multiple interleaving
tests [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], each comparing one proposed model against the baseline model. These tests were
performed on a substantial volume of online customer traffic to compare customers’ reactions to
different ranking models. Specifically, for each test, we compare the percentage of searches that
result in customer engagements between the Control and Variant groups using their respective
ranking models. The results are further segmented by different verticals, i.e., specific business
niches tailored to particular shopping needs. We currently categorize our search queries into
six verticals: Food, Consumables, Home, Hardlines, Fashion, and ETS (Electronics, Toys, and
Seasonal), with the latter four collectively categorized as General Merchandise (GM).
        </p>
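        <p>The per-vertical readout of such a test boils down to comparing the share of searches with engagement between the two groups. A toy sketch follows; the counts and the two-proportion z-score are illustrative of the style of analysis, not our internal tooling.</p>

```python
from math import sqrt

def engagement_change(ctrl_engaged, ctrl_total, var_engaged, var_total):
    """Relative change in the share of searches with customer engagement
    (Variant vs. Control), plus a two-proportion z-score as a rough
    significance signal."""
    p_c = ctrl_engaged / ctrl_total
    p_v = var_engaged / var_total
    pooled = (ctrl_engaged + var_engaged) / (ctrl_total + var_total)
    se = sqrt(pooled * (1 - pooled) * (1 / ctrl_total + 1 / var_total))
    return (p_v - p_c) / p_c, (p_v - p_c) / se

# Hypothetical counts of engaged searches for one vertical:
change, z = engagement_change(52_000, 100_000, 53_000, 100_000)
```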
        <sec id="sec-3-4-1">
          <title>2.1. Only Long / Short Window</title>
          <p>The first interleaving test is configured as follows:
• Control: Baseline Model (2-year behavioral features only),
• Variant: Model A (1-month behavioral features only).</p>
        </sec>
      </sec>
      <sec id="sec-3-5">
        <p>[Table 2: change in customer engagement for Model A vs. the Baseline Model, overall and
per vertical: −0.63%∗, −0.34%, +3.79%∗, +1.51%, +1.06%]</p>
        <p>The purpose of this test is to separately examine and compare the individual impact of
2-year and 1-month behavioral features on search ranking models.</p>
        <p>
          The test result is presented in Table 2. Although Model A demonstrates an overall insignificant
decline compared to the baseline, zooming into each business vertical reveals very
interesting patterns. Model A exhibits a significant decline in Food and a trending decline in
Consumables. Conversely, it demonstrates a significant lift in the ETS and, more generally,
positive changes across most General Merchandise verticals. This corroborates that short-term
behavioral features are more informative in an environment that is more dynamic in terms of
both inventory assortment and customers’ shopping behaviors. In contrast, long-term features
are more advantageous for business units Food and Consumables, which typically display more
stable and enduring shopping patterns. Therefore, it is very tempting to employ both types of
features in the ranking model to leverage their combined strengths. Similar ideas of combining
session-level and historical customer search behaviors for each customer have also been investigated in
web search personalization [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], but to the best of our knowledge, our work is the first to
explore combining (query, product)-level historical behavioral features over different lookback
time windows in product search ranking.
        </p>
        <sec id="sec-3-5-1">
          <title>2.2. Both Long &amp; Short Windows</title>
          <p>With the insight from the previous subsection, we set up the second interleaving test as follows:
• Control: Baseline Model (2-year behavioral features only),
• Variant: Model B (2-year and 1-month behavioral features),
with the purpose of examining the combined impact of using both 2-year and 1-month behavioral
features in ranking.</p>
          <p>The test result is presented in Table 3. To our surprise, Model B performs quite sub-optimally
overall, with degradation in the Food, Consumables, and ETS verticals. This suggests that
combining both long- and short-term behavioral features in this vanilla manner not only
fails to provide gains in ranking performance but also leads to further declines. One possible
reason for this negative result is the lack of flexibility in our ranking model to leverage different
behavioral features accordingly. For instance, the Food vertical should ideally leverage the
2-year behavioral features as extensively as possible. However, adding 1-month features dilutes
the positive effect of the 2-year features, negatively interfering with the overall contribution of
behavioral features. Conversely, in verticals such as ETS and Hardlines, where 1-month features
are more advantageous, the inclusion of 2-year features can similarly impair performance.</p>
        </sec>
      </sec>
      <sec id="sec-3-6">
        <p>[Table 3: change in customer engagement for Model B vs. the Baseline Model, overall and
per vertical: −0.46%∗, +0.29%, −0.46%, −1.07%, +1.72%]</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. How to Integrate Both?</title>
      <p>Different verticals exhibit different patterns of trending effects in customer behaviors. For
instance, Fashion queries such as “clothes” may be significantly influenced by recent trends affecting
product popularity and customer interactions. In contrast, Food and Consumables queries such as “milk”
and “toilet paper” tend to show more stability over time and are predominantly shaped by
long-term engagement patterns.</p>
      <p>Based on observations from the tests in Sections 2.1 and 2.2, to improve model performance
with combined behavioral features of both long and short windows, we consider making our
ranking model more query context-aware by incorporating one-hot encoded query vertical
signals (predicted from the upstream query understanding model) into the model. These
query-level vertical signals would better guide our ranking model to leverage behavioral features of
different time windows according to different queries. Thus, we propose the fourth ranking
model below.</p>
      <p>• Model C: model with both 2-year and 1-month features, and query-level vertical features.</p>
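      <p>A minimal sketch of a Model C input row (the six-vertical vocabulary comes from Section 2; the feature values and the helper name are hypothetical):</p>

```python
VERTICALS = ["Food", "Consumables", "Home", "Hardlines", "Fashion", "ETS"]

def model_c_features(feats_2y, feats_1m, vertical):
    """Concatenate 2-year and 1-month behavioral features with a one-hot
    encoding of the query's predicted vertical."""
    one_hot = [1.0 if v == vertical else 0.0 for v in VERTICALS]
    return list(feats_2y) + list(feats_1m) + one_hot

# Hypothetical click/ATC/order rates over 2 years and 1 month, for a Fashion query:
row = model_c_features([0.12, 0.04, 0.01], [0.20, 0.07, 0.02], "Fashion")
```

      <p>A tree model can then split on the vertical indicators first and route each query to whichever window’s features are more informative.</p>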
      <sec id="sec-4-1">
        <title>3.1. Both Long &amp; Short Windows with Verticals</title>
        <p>The third interleaving test is configured as follows.</p>
        <p>• Control: Baseline Model (2-year behavioral features only),
• Variant: Model C (2-year and 1-month behavioral features with the vertical features),
with the purpose of examining whether adding query-level vertical features helps better integrate
2-year and 1-month behavioral features in ranking.</p>
        <p>The test result, detailed in Table 4, shows that guided by vertical information, behavioral
features are more effectively utilized by the ranking model, leading to significant uplifts across
all General Merchandise verticals while rectifying the previous degradation in the Food and
Consumables verticals. Model C also shows an overall significant increase of 0.22% in customer
engagement and proves to be the best candidate ranking model among all tested. This demonstrates
that incorporating vertical features can indeed enhance the integration of multi-window behavioral
features, allowing each to play to its strengths and mitigate its weaknesses.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Why Does Long &amp; Short &amp; Verticals Work?</title>
        <p>The guiding effect that vertical information has on the ranking model in using different
behavioral features can also be validated in the model structure: our ranking model inherently
employs a tree structure, where adjacent tree nodes tend to be functionally related.</p>
        <sec id="sec-4-2-1">
          <p>[Table 4: change in customer engagement for Model C vs. the Baseline Model, overall and
per vertical: −0.01%, +0.40%∗, +0.78%∗, +1.58%∗, +0.73%∗]</p>
          <p>In this tree structure, if the nodes corresponding to one feature frequently precede those of
another specific feature, it suggests that the former, i.e., the upper-level feature, exerts a certain
degree of influence/control over the latter, i.e., the lower-level feature, determining when it will
activate to impact the model’s predictions.</p>
          <p>Across all splitting nodes from the trees in Model C, we summarize the distribution of
different behavioral nodes under the vertical nodes in Figure 1, taking Fashion and Consumables
as examples. The results show that 1-month behavioral features are more influential for Fashion
queries, since they more prevalently occupy the immediate lower level when the current vertical
node is Fashion, whereas 2-year behavioral features are more prevalent for Consumables queries.
This observation aligns with our interpretation of the test result in Section 3.1, given the
characteristics of different verticals, and it evinces that introducing query-level vertical signals
can help guide our ranking model to better ensemble long- and short-term behavioral features,
in the sense that different behavioral features can contribute accordingly with respect to
different search queries.</p>
          <p>[Table 5: A/B test results for Model C vs. the baseline production model: +0.12%; Marketplace
GMV +0.64%∗ / +0.21%∗; Sessions with ATC +0.22%∗; Session Abandonment −0.16%∗]</p>
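          <p>This structural check can be sketched by walking the trained trees and counting which feature is split on immediately beneath a vertical-signal node. The tiny hand-written tree and the feature names below are hypothetical; in practice the trees would be exported from the trained Model C.</p>

```python
from collections import Counter

# A tree node: the split feature plus child nodes (leaves have no children).
toy_tree = {
    "feature": "vertical_is_fashion",
    "children": [
        {"feature": "click_rate_1m",
         "children": [{"feature": "order_rate_2y", "children": []},
                      {"feature": "click_rate_1m", "children": []}]},
        {"feature": "order_rate_2y", "children": []},
    ],
}

def child_feature_counts(tree, parent_feature, counts=None):
    """Count the features appearing immediately below nodes that split on
    `parent_feature`, across the whole tree."""
    if counts is None:
        counts = Counter()
    for child in tree.get("children", []):
        if tree["feature"] == parent_feature:
            counts[child["feature"]] += 1
        child_feature_counts(child, parent_feature, counts)
    return counts

counts = child_feature_counts(toy_tree, "vertical_is_fashion")
# -> Counter({'click_rate_1m': 1, 'order_rate_2y': 1})
```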
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. A/B Test</title>
      <p>After the series of interleaving tests in Sections 2 and 3, we decided to move forward to A/B test
with the most promising candidate, Model C, which incorporates both long- and short-term
behavioral features along with the query-level vertical signals. Specifically, we conducted a
comprehensive A/B test on Walmart.com for two weeks to compare Model C against the baseline
production model.</p>
      <p>The result, detailed in Table 5, highlights substantial improvements in key search-related
metrics. This A/B test observation confirms our hypothesis that a vertical-aware ranking model
incorporating a hybrid of behavioral features across both long and short time windows can
enhance the customer experience for a diverse range of online shopping needs. In addition, the
positive movement in marketplace GMV clearly indicates that we are also able to better address
cold-start problems when introducing short-term behavioral features into the system.</p>
      <p>We also present a qualitative example in Figure 2 illustrating the comparison of Model C versus
the baseline in terms of user experience from the search ranking. It is clearly demonstrated
that utilizing behavioral features from both long and short time windows, along with vertical
information, results in a ranking model that prioritizes products with high recent popularity,
especially in the General Merchandise categories. This approach ensures that customers are
provided with options that are more closely aligned with their current shopping needs.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusion</title>
      <p>In this paper, we propose a novel product search ranking model that incorporates a hybrid
of behavioral features over both long and short lookback time windows with vertical-specific
insights. The multi-window design aims to capture customer engagement patterns over varying
durations, and the vertical features are intended to tailor behavioral features more effectively
to different online shopping contexts. This approach allows long-term behavioral features to
reflect enduring patterns, supporting routine customer journeys, while short-term features
capture immediate, trending patterns to enhance discovery customer experiences.</p>
      <p>Through comprehensive online testing, we demonstrate that the proposed model significantly
outperforms the baseline, which solely utilizes singular time-window behavioral features, by
achieving substantial improvements in key evaluation metrics across various verticals, catering
to distinct online shopping needs. As a result, the integration of multi-window behavioral
features and search context awareness adeptly navigates the complex dynamics of different
shopping categories, thereby enhancing customer engagement across all verticals. Consequently,
the proposed model not only fulfills the diverse needs of contemporary eCommerce online
shopping but also lays a scalable foundation for future enhancements in search ranking systems.</p>
      <p>For future work, we intend to expand the feature scope of the search ranking model by
incorporating behavioral features from additional time windows, such as 1 week and 1 year.
This extension will enable the model to capture a broader spectrum of trending effects, further
enhancing its predictive accuracy. Additionally, we plan to introduce more granular query-level
signals (e.g., categorical signals, product type signals) to allow for more nuanced guidance
of behavioral features, improving the ranking model’s contextual capability and enriching the online
shopping experience for customers.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Sorokina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Cantu-Paz</surname>
          </string-name>
          ,
          <article-title>Amazon search: The joy of ranking products</article-title>
          ,
          <source>in: SIGIR</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>459</fpage>
          -
          <lpage>460</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Trotman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Degenhardt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kallumadi</surname>
          </string-name>
          ,
          <article-title>The architecture of ebay search</article-title>
          , in: SIGIR eCom, volume
          <volume>2311</volume>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E. P.</given-names>
            <surname>Brenner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kutiyanawala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <article-title>End-to-end neural ranking for ecommerce product search</article-title>
          , in: SIGIR eCom,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Tsagkias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. H.</given-names>
            <surname>King</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kallumadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Murdock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rijke</surname>
          </string-name>
          ,
          <article-title>Challenges and research opportunities in ecommerce search and recommendations</article-title>
          ,
          <source>in: ACM SIGIR Forum</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Eletreby</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          ,
          <article-title>Machine learning based methods and apparatus for automatically generating item rankings</article-title>
          ,
          <year>2022</year>
          . US Patent App.
          <volume>17</volume>
          /246,
          <fpage>179</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Magnani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chaidaroon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Puthenputhussery</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Liao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Fang</surname>
          </string-name>
          ,
          <article-title>A multi-task learning framework for product ranking with bert</article-title>
          ,
          <source>in: WWW</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>493</fpage>
          -
          <lpage>501</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Burges</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ragno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <article-title>Learning to rank with nonsmooth cost functions</article-title>
          ,
          <source>NeurIPS</source>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Burges</surname>
          </string-name>
          ,
          <article-title>From ranknet to lambdarank to lambdamart: an overview</article-title>
          ,
          <source>Learning</source>
          <volume>11</volume>
          (
          <year>2010</year>
          )
          <fpage>81</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Friedman</surname>
          </string-name>
          ,
          <article-title>Greedy function approximation: a gradient boosting machine</article-title>
          ,
          <source>Annals of statistics</source>
          (
          <year>2001</year>
          )
          <fpage>1189</fpage>
          -
          <lpage>1232</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Ai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zamani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. B.</given-names>
            <surname>Croft</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <article-title>A deep look into neural ranking models for information retrieval</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>57</volume>
          (
          <year>2020</year>
          )
          <fpage>102067</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>O.</given-names>
            <surname>Chapelle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <article-title>Yahoo! learning to rank challenge overview</article-title>
          , in: PMLR,
          <year>2011</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Dreossi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bakus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Salaka</surname>
          </string-name>
          ,
          <article-title>Treating cold start in product search by priors</article-title>
          ,
          <source>in: WWW Companion</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Castells</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Salaka</surname>
          </string-name>
          ,
          <article-title>Addressing cold start in product search via empirical bayes</article-title>
          ,
          <source>in: CIKM</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hendriksen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kuiper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nauts</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schelter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>de Rijke</surname>
          </string-name>
          ,
          <article-title>Analyzing and predicting purchase intent in e-commerce: Anonymous vs. identified customers</article-title>
          ,
          <source>in: SIGIR eCom</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Rocchio</surname>
          </string-name>
          ,
          <article-title>Relevance feedback in information retrieval</article-title>
          ,
          <source>The SMART Retrieval System: Experiments in Automatic Document Processing</source>
          (
          <year>1971</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>S. K. K.</given-names>
            <surname>Santu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sondhi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <article-title>On application of learning to rank for e-commerce search</article-title>
          ,
          <source>in: SIGIR</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>475</fpage>
          -
          <lpage>484</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>Xgboost: A scalable tree boosting system</article-title>
          ,
          <source>in: KDD</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>785</fpage>
          -
          <lpage>794</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>T. Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Learning to rank for information retrieval</article-title>
          ,
          <source>Foundations and Trends® in Information Retrieval</source>
          <volume>3</volume>
          (
          <year>2009</year>
          )
          <fpage>225</fpage>
          -
          <lpage>331</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>O.</given-names>
            <surname>Chapelle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Joachims</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Radlinski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yue</surname>
          </string-name>
          ,
          <article-title>Large-scale validation and analysis of interleaved search evaluation</article-title>
          ,
          <source>ACM Transactions on Information Systems</source>
          <volume>30</volume>
          (
          <year>2012</year>
          )
          <fpage>1</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>P.</given-names>
            <surname>Bennett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>White</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dumais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bailey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Borisyuk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <article-title>Modeling the impact of short- and long-term behavior on search personalization</article-title>
          ,
          <source>in: SIGIR</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>185</fpage>
          -
          <lpage>194</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>