<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>IIR</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-categories>
        <subj-group>
          <subject>Discussion Paper</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Evaluating Recommendations in a User Interface With Multiple Carousels</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maurizio Ferrari Dacrema</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicolò Felicioni</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Cremonesi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ContentWise</institution>
          ,
          <addr-line>Via Simone Schiaffino 11, Milano, 20158, Milano</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Politecnico di Milano</institution>
          ,
          <addr-line>Piazza Leonardo da Vinci 32, 20133 Milano</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>12</volume>
      <fpage>0000</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>Many video-on-demand and music streaming services provide the user with a page consisting of several recommendation lists, i.e., widgets or swipeable carousels, each built according to a specific criterion (e.g., most recent, TV series, etc.). Finding efficient strategies to select which carousels to display is an active research topic of great industrial interest. In this setting, the overall quality of the recommendations of a new algorithm cannot be assessed by measuring solely its individual recommendation quality. Rather, it should be evaluated in a context where other recommendation lists are already available, to account for how they complement each other. The traditional offline evaluation protocol, however, does not take this into account. To address this limitation, we propose an offline evaluation protocol for a carousel setting in which the recommendation quality of a model is measured by how much it improves upon that of an already available set of carousels. Our results indicate that under a carousel setting the ranking of the algorithms changes, sometimes significantly. This work is an extended abstract of [1].</p>
      </abstract>
      <kwd-group>
        <kwd>Recommender Systems</kwd>
        <kwd>User Interface</kwd>
        <kwd>Evaluation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The general goal of a recommendation system is to help the users navigate the large number
of options at their disposal by suggesting a limited number of relevant results. Traditionally,
the focus of newly developed recommendation systems is to generate the best possible ranked
list of results, see [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]. A common assumption in almost all research works is that the
recommendations will be provided to the user as a single list which will be explored following
its order from the first element to the last. However, many industrial applications provide
users with a two-dimensional layout of recommendations. Examples are video on demand (e.g.,
Netflix, Amazon Prime Video) and music streaming services (e.g., Spotify). In these scenarios the
user is provided with an interface composed of multiple rows, each row containing thematically
consistent recommendations, e.g., most recent, most popular, editorially curated, see [
        <xref ref-type="bibr" rid="ref5 ref6 ref7 ref8 ref9">5, 6, 7, 8, 9</xref>
        ].
These rows are referred to as widgets, shelves, or carousels. In a carousel interface scenario the
user satisfaction depends both on the entire set of carousels shown to the user, rather than on
a single list, and on their relative positions. Finding appropriate combinations of algorithms
and ranking them to provide the user with a personalized page is an active research topic of
significant industrial interest [
        <xref ref-type="bibr" rid="ref10 ref5 ref9">9, 10, 5</xref>
        ], but the community lacks a standardized evaluation
procedure to represent this scenario. Frequently, the recommendation quality of the carousel
interface is measured by flattening all recommendation lists into a single one, but this is not a
realistic evaluation process. In paper [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] we propose: (i) a procedure for the offline evaluation
of recommendations under a carousel layout; (ii) an extension to the NDCG that accounts for
how users navigate a two-dimensional interface and perform actions to reveal hidden parts of
the interface; (iii) two simple strategies to rank carousels in a page. Several recommendation
models are evaluated both independently and as the last carousel in a page containing other
recommendation lists. In this scenario the recommendation quality of a model is based on how
many new correct recommendations it provides compared to those already present in the page.
Results indicate that the two evaluation procedures sometimes lead to very different results,
highlighting the importance of taking into account how the carousels complement each other.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Characteristics of a Carousel Interface</title>
      <p>The carousel interface layout, and the way it is usually generated by video-on-demand and music
streaming platforms, has important characteristics that distinguish it from a single-list setup,
see [11]. While a carousel layout may seem similar to a traditional merge-list ensemble, where
multiple recommendation lists are combined into one, this is not the case. In a real scenario
multiple constraints play a role and must be taken into account:</p>
      <p>Layout Structure: The two-dimensional user interface of almost all devices is organized as
multiple horizontal carousels, where each carousel is generated according to a certain (often
explainable) criterion, e.g., most recent, most popular, because you watched, editorially curated.</p>
      <p>Recommendation Lists: The lists shown to the users within each carousel are generated
with different algorithms or by different providers and independently from each other (i.e.,
each algorithm or provider is not aware of the existence of the other lists or of their content).
Consider for example content aggregators, which combine carousels from different providers, e.g.,
Sky, YouTube, Netflix, Prime Video, etc. Due to either business constraints or strict real-time
requirements, no single post-processing step is applied, e.g., to remove items duplicated across
different carousels. As a result, while each individual recommendation list does not contain
duplicates, the same item may be recommended in multiple carousels.</p>
      <p>User Behavior: Users focus on the top-left triangle of the screen, usually called the golden
triangle, rather than exploring the carousels sequentially. Furthermore, they explore
the recommendations in different ways according to which actions they need to perform in
order to reveal them. Users usually navigate more easily with simple swipes than with
repeated mouse clicks, hence their behavior is known to change according to the device
(e.g., personal computer, smartphone, tablet, Smart TV).</p>
    </sec>
    <sec id="sec-3">
      <title>3. Experimental Methodology</title>
      <p>While the traditional evaluation assesses the recommendation quality of a single
recommendation model, in a carousel scenario the goal is to assess the recommendation quality of a
certain layout composed of recommendation lists. Once it is possible to evaluate the overall
recommendation quality of a single layout it is possible to compare different layouts in order
to select the best one. For example, one may wish to select the optimal carousel ranking or to
choose which recommendation model should generate a specific carousel.</p>
      <p>Layout generation: The layout will contain a fixed number of carousels, or recommendation
lists, of a given length. If some of the carousels are generated with recommendation models, the
first step is to ensure that all models are adequately optimized. Since the specific layout structure
that will be shown is, in general, not known in advance, each recommendation model should
be optimized independently. The recommendations that will be shown to the users are the
sequence of all the recommendation lists in the layout, without any centralized postprocessing.</p>
      <p>Evaluation metrics: The recommendations provided to the user will be displayed in a
two-dimensional pattern. A frequent simplification is to concatenate all lists into a single one and
remove duplicate recommendations. While this allows the use of traditional metrics (e.g., NDCG,
MAP), it makes assumptions that are not consistent with a carousel layout: (i) the user explores
the lists sequentially; (ii) the recommendation lists are centrally collected and postprocessed. In
a real scenario we must ensure that a correct recommendation that appears multiple times in
different positions is counted only once and with the correct ranking. The correct recommendation
should be counted where the user would see it first, according to the user’s navigation pattern,
which requires defining a new ranking discount function to compute ranking metrics.<sup>1</sup></p>
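      <p>The idea of counting each correct recommendation only once, at the position where the user would see it first, can be sketched as follows. This is a minimal illustration and not the metric defined in the full paper [1]: the function names and the discount 1/log2(row + column + 2), which simply decays with the distance from the top-left corner, are assumptions made for this example. A flattened, de-duplicated baseline is included for comparison.</p>
      <preformat><![CDATA[
```python
import math


def cell_discount(row, col):
    # Hypothetical two-dimensional discount (an assumption of this sketch):
    # cells farther from the top-left corner contribute less.
    return 1.0 / math.log2(row + col + 2)


def carousel_dcg(carousels, relevant):
    """Discounted gain of a page, crediting each relevant item only once."""
    best = {}  # item -> largest discount over all cells where it appears
    for row, items in enumerate(carousels):
        for col, item in enumerate(items):
            if item in relevant:
                best[item] = max(best.get(item, 0.0), cell_discount(row, col))
    return sum(best.values())


def flattened_dcg(carousels, relevant):
    """Baseline: concatenate the lists, drop duplicates, use a 1D discount."""
    seen, flat = set(), []
    for items in carousels:
        for item in items:
            if item not in seen:
                seen.add(item)
                flat.append(item)
    return sum(1.0 / math.log2(pos + 2)
               for pos, item in enumerate(flat) if item in relevant)


# Toy page with two carousels; item "a" is recommended in both of them.
page = [["a", "b", "c"], ["d", "a", "e"]]
relevant = {"a", "e"}
print(carousel_dcg(page, relevant), flattened_dcg(page, relevant))
```
]]></preformat>
      <p>On the toy page, item a appears in both carousels but is credited only at its top-left occurrence, which is the cell with the largest assumed discount.</p>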
    </sec>
    <sec id="sec-4">
      <title>4. Results and Discussion</title>
      <p>
        In a realistic carousel scenario, several recommendation lists (or carousels) are available,
generated with different algorithms or editorial rules. In order to mimic this setting we include
in the evaluation 16 algorithms that are simple, well-known, and competitive [12]. The
algorithms are: TopPopular, Global Effects, SLIM ElasticNet and SLIM BPR [13], EASE [14],
P3α [15] and RP3β [16], PureSVD [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], FunkSVD [12], Non-negative matrix factorization [17],
MF BPR [18] and IALS [19], ItemKNN content-based and ItemKNN CF-CBF [20]. The evaluation
includes three datasets: MovieLens 20M [21], Netflix Prize [22] and ContentWise Impressions
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The data is split into 80% training, 10% validation and 10% test with a random holdout. The
hyperparameter search is conducted as in [12] with Bayesian Search by exploring 50 cases.
      </p>
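      <p>A random holdout split with these proportions could look like the sketch below. It assumes that the split is applied to individual user-item interactions and uses a hypothetical helper, random_holdout_split, with a fixed seed; it is not the splitting code used for the experiments.</p>
      <preformat><![CDATA[
```python
import random


def random_holdout_split(interactions, train_pct=80, val_pct=10, seed=42):
    """Shuffle the interactions and cut them into train, validation and test."""
    shuffled = list(interactions)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains, 10% here
    return train, validation, test


# Toy data: 10 users x 10 items = 100 (user, item) interactions.
interactions = [(user, item) for user in range(10) for item in range(10)]
train_set, val_set, test_set = random_holdout_split(interactions)
print(len(train_set), len(val_set), len(test_set))
```
]]></preformat>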
      <p>
        Due to space constraints, we describe in particular one experiment comparing the model
ranking for MovieLens 20M. The goal is to choose which model to add as the last carousel in an
interface that contains an increasing number of carousels: TopPopular, ItemKNN CF and, for
the MovieLens 20M dataset, ItemKNN CBF. The models are first evaluated individually with
the traditional evaluation protocol and then with the proposed carousel evaluation protocol.
        <fn id="fn1">
          <label>1</label>
          <p>See the full paper for a detailed description of how to compute the evaluation metrics [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].</p>
        </fn>
      </p>
      <p>All recommendation lists have a length of 10 and are evaluated with NDCG. As a general
trend, the relative effectiveness of the models differs between the two evaluation modes, resulting
in changes to the ranking of the algorithms, see Figure 1. Some models, such
as Global Effects and PureSVD, are always ranked in the same position. Others, in this case
all other matrix factorization algorithms, gain 2 or 3 positions. On the other hand, item-based
machine learning models tend to consistently lose some positions, with EASE being the worst
affected and losing 4 positions. As a result, while in the individual evaluation the best algorithms
are SLIM ElasticNet, UserKNN CF, SLIM BPR and EASE, in the carousel evaluation the best
algorithms are UserKNN CF, SLIM ElasticNet, IALS and FunkSVD. Since the recommendation
lists generated by all algorithms are identical for both evaluation procedures, the difference
in the ranking lies in how those recommendations intersect. Algorithms that tend to
recommend popular items are penalized in this carousel evaluation because popular items
are already present in the TopPopular carousel, whereas algorithms that provide accurate
recommendations involving less popular items are advantaged. These results are similar
for the Netflix Prize and ContentWise Impressions datasets, although the affected models change.
For example, on the Netflix Prize dataset a carousel layout with TopPopular and ItemKNN CF
causes two sharp changes in ranking: EASE falls by 6 positions while FunkSVD gains 7.</p>
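      <p>The effect described above can be illustrated with a toy example. The sketch below is a simplification that counts correct recommendations instead of computing NDCG, and the item identifiers, model names and helper functions are hypothetical. Each candidate is scored once on its own list and once only on the correct items that the existing TopPopular carousel does not already show.</p>
      <preformat><![CDATA[
```python
def hits(recommendations, relevant):
    """Number of correct recommendations in a single list."""
    return len(set(recommendations) & relevant)


def new_hits(candidate, existing_carousels, relevant):
    """Correct recommendations of the candidate not already shown in the page."""
    already_shown = set().union(*existing_carousels)
    return len((set(candidate) - already_shown) & relevant)


# Toy data (hypothetical): items the user actually interacted with.
relevant = {"i1", "i2", "i3", "i4", "i7"}
top_popular = ["i1", "i2", "i5"]  # carousel already present in the page
candidates = {
    "popularity_biased": ["i1", "i2", "i3"],
    "long_tail": ["i4", "i7", "i6"],
}

# Ranking of the candidates when each list is evaluated on its own.
individual_ranking = sorted(
    candidates, key=lambda m: hits(candidates[m], relevant), reverse=True)

# Ranking when each candidate is added as the last carousel of the page.
carousel_ranking = sorted(
    candidates,
    key=lambda m: new_hits(candidates[m], [top_popular], relevant),
    reverse=True)

print(individual_ranking, carousel_ranking)
```
]]></preformat>
      <p>In this toy case the popularity-biased candidate has more correct items on its own, but most of them are already covered by TopPopular, so the long-tail candidate ranks first once the page is taken into account.</p>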
      <p>The results indicate that accounting for how multiple recommendation lists complement
each other can produce substantially different results compared to evaluating each model
independently; it is therefore an aspect that should be taken into account when developing
recommendation models aimed at domains that use carousel interfaces. The carousel evaluation
protocol also opens new research directions, such as how to combine the strengths of various
models to provide the user with ever more accurate and interesting recommendations.
      </p>
      <ref-list>
        <ref>
          <mixed-citation>[10] W. Ding, D. Govindaraj, S. V. N. Vishwanathan, Whole page optimization with global constraints, in: A. Teredesai, V. Kumar, Y. Li, R. Rosales, E. Terzi, G. Karypis (Eds.), Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining, KDD 2019, Anchorage, AK, USA, August 4-8, 2019, ACM, 2019, pp. 3153–3161. URL: https://doi.org/10.1145/3292500.3330675. doi:10.1145/3292500.3330675.</mixed-citation>
        </ref>
        <ref id="ref11">
          <mixed-citation>[11] N. Felicioni, M. Ferrari Dacrema, P. Cremonesi, A methodology for the offline evaluation of recommender systems in a user interface with multiple carousels, in: J. Masthoff, E. Herder, N. Tintarev, M. Tkalcic (Eds.), Adjunct Publication of the 29th ACM Conference on User Modeling, Adaptation and Personalization, UMAP 2021, Utrecht, The Netherlands, June 21-25, 2021, ACM, 2021, pp. 10–15. URL: https://doi.org/10.1145/3450614.3461680. doi:10.1145/3450614.3461680.</mixed-citation>
        </ref>
        <ref id="ref12">
          <mixed-citation>[12] M. Ferrari Dacrema, S. Boglio, P. Cremonesi, D. Jannach, A troubling analysis of reproducibility and progress in recommender systems research, ACM Trans. Inf. Syst. 39 (2021). URL: https://doi.org/10.1145/3434185. doi:10.1145/3434185.</mixed-citation>
        </ref>
        <ref id="ref13">
          <mixed-citation>[13] X. Ning, G. Karypis, SLIM: Sparse linear methods for top-n recommender systems, in: Proceedings of the 11th IEEE International Conference on Data Mining (ICDM ’11), 2011, pp. 497–506.</mixed-citation>
        </ref>
        <ref id="ref14">
          <mixed-citation>[14] H. Steck, Embarrassingly shallow autoencoders for sparse data, in: L. Liu, R. W. White, A. Mantrach, F. Silvestri, J. J. McAuley, R. Baeza-Yates, L. Zia (Eds.), The World Wide Web Conference, WWW 2019, San Francisco, CA, USA, May 13-17, 2019, ACM, 2019, pp. 3251–3257. URL: https://doi.org/10.1145/3308558.3313710. doi:10.1145/3308558.3313710.</mixed-citation>
        </ref>
        <ref id="ref15">
          <mixed-citation>[15] C. Cooper, S. Lee, T. Radzik, Y. Siantos, Random walks in recommender systems: exact computation and simulations, in: C. Chung, A. Z. Broder, K. Shim, T. Suel (Eds.), 23rd International World Wide Web Conference, WWW ’14, Seoul, Republic of Korea, April 7-11, 2014, Companion Volume, ACM, 2014, pp. 811–816. URL: https://doi.org/10.1145/2567948.2579244. doi:10.1145/2567948.2579244.</mixed-citation>
        </ref>
        <ref id="ref16">
          <mixed-citation>[16] B. Paudel, F. Christoffel, C. Newell, A. Bernstein, Updatable, accurate, diverse, and scalable recommendations for interactive applications, ACM Trans. Interact. Intell. Syst. 7 (2017) 1:1–1:34. URL: https://doi.org/10.1145/2955101. doi:10.1145/2955101.</mixed-citation>
        </ref>
        <ref id="ref17">
          <mixed-citation>[17] A. Cichocki, A. H. Phan, Fast local algorithms for large scale nonnegative matrix and tensor factorizations, IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 92-A (2009) 708–721. URL: https://doi.org/10.1587/transfun.E92.A.708. doi:10.1587/transfun.E92.A.708.</mixed-citation>
        </ref>
        <ref id="ref18">
          <mixed-citation>[18] S. Rendle, C. Freudenthaler, Z. Gantner, L. Schmidt-Thieme, BPR: bayesian personalized ranking from implicit feedback, in: J. A. Bilmes, A. Y. Ng (Eds.), UAI 2009, Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, Montreal, QC, Canada, June 18-21, 2009, AUAI Press, 2009, pp. 452–461. URL: https://dslpitt.org/uai/displayArticleDetails.jsp?mmnu=1&amp;smnu=2&amp;article_id=1630&amp;proceeding_id=25.</mixed-citation>
        </ref>
        <ref id="ref19">
          <mixed-citation>[19] Y. Hu, Y. Koren, C. Volinsky, Collaborative filtering for implicit feedback datasets, in: Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008), December 15-19, 2008, Pisa, Italy, IEEE Computer Society, 2008, pp. 263–272. URL: https://doi.org/10.1109/ICDM.2008.22. doi:10.1109/ICDM.2008.22.</mixed-citation>
        </ref>
        <ref id="ref20">
          <mixed-citation>[20] B. Mobasher, X. Jin, Y. Zhou, Semantically enhanced collaborative filtering on the web, in: B. Berendt, A. Hotho, D. Mladenic, M. van Someren, M. Spiliopoulou, G. Stumme (Eds.), Web Mining: From Web to Semantic Web, First European Web Mining Forum, EMWF 2003, Cavtat-Dubrovnik, Croatia, September 22, 2003, Revised Selected and Invited Papers, volume 3209 of Lecture Notes in Computer Science, Springer, 2003, pp. 57–76. URL: https://doi.org/10.1007/978-3-540-30123-3_4. doi:10.1007/978-3-540-30123-3_4.</mixed-citation>
        </ref>
        <ref id="ref21">
          <mixed-citation>[21] F. M. Harper, J. A. Konstan, The movielens datasets: History and context, ACM Trans. Interact. Intell. Syst. 5 (2016) 19:1–19:19. URL: https://doi.org/10.1145/2827872. doi:10.1145/2827872.</mixed-citation>
        </ref>
        <ref id="ref22">
          <mixed-citation>[22] J. Bennett, S. Lanning, et al., The netflix prize, in: Proceedings of KDD cup and workshop, volume 2007, New York, 2007, p. 35.</mixed-citation>
        </ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ferrari Dacrema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Felicioni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cremonesi</surname>
          </string-name>
          ,
          <article-title>Offline evaluation of recommender systems in a user interface with multiple carousels</article-title>
          ,
          <source>Frontiers Big Data</source>
          <volume>5</volume>
          (
          <year>2022</year>
          )
          910030. URL: https://doi.org/10.3389/fdata.2022.910030. doi:10.3389/fdata.2022.910030.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Herlocker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Konstan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Terveen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Riedl</surname>
          </string-name>
          ,
          <article-title>Evaluating collaborative filtering recommender systems</article-title>
          ,
          <source>ACM Trans. Inf. Syst</source>
          .
          <volume>22</volume>
          (
          <year>2004</year>
          )
          <fpage>5</fpage>
          -
          <lpage>53</lpage>
          . URL: https://doi.org/10.1145/963770.963772. doi:10.1145/963770.963772.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Cremonesi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Koren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Turrin</surname>
          </string-name>
          ,
          <article-title>Performance of recommender algorithms on top-n recommendation tasks</article-title>
          , in: X. Amatriain, M. Torrens, P. Resnick, M. Zanker (Eds.),
          <source>Proceedings of the 2010 ACM Conference on Recommender Systems, RecSys 2010</source>
          , Barcelona, Spain, September 26-30, 2010, ACM,
          <year>2010</year>
          , pp.
          <fpage>39</fpage>
          -
          <lpage>46</lpage>
          . URL: https://doi.org/10.1145/1864708.1864721. doi:10.1145/1864708.1864721.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sanderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. B.</given-names>
            <surname>Croft</surname>
          </string-name>
          ,
          <article-title>The history of information retrieval research</article-title>
          ,
          <source>Proc. IEEE</source>
          <volume>100</volume>
          (
          <year>2012</year>
          )
          <fpage>1444</fpage>
          -
          <lpage>1451</lpage>
          . URL: https://doi.org/10.1109/JPROC.2012.2189916. doi:10.1109/JPROC.2012.2189916.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. V.</given-names>
            <surname>Alvino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Smola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Basilico</surname>
          </string-name>
          ,
          <article-title>Using navigation to improve recommendations in real-time</article-title>
          , in: S. Sen,
          <string-name>
            <given-names>W.</given-names>
            <surname>Geyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Freyne</surname>
          </string-name>
          , P. Castells (Eds.),
          <source>Proceedings of the 10th ACM Conference on Recommender Systems</source>
          , Boston, MA, USA, September 15-19,
          <year>2016</year>
          , ACM,
          <year>2016</year>
          , pp.
          <fpage>341</fpage>
          -
          <lpage>348</lpage>
          . URL: https://doi.org/10.1145/2959100.2959174. doi:10.1145/2959100.2959174.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E.</given-names>
            <surname>Elahi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chandrashekar</surname>
          </string-name>
          ,
          <article-title>Learning representations of hierarchical slates in collaborative filtering</article-title>
          , in: R. L. T. Santos, L. B. Marinho, E. M. Daly, L. Chen, K. Falk, N. Koenigstein, E. S. de Moura (Eds.),
          <source>RecSys 2020: Fourteenth ACM Conference on Recommender Systems</source>
          , Virtual Event, Brazil, September 22-26, 2020, ACM,
          <year>2020</year>
          , pp.
          <fpage>703</fpage>
          -
          <lpage>707</lpage>
          . URL: https://doi.org/10.1145/3383313.3418484. doi:10.1145/3383313.3418484.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F. B.</given-names>
            <surname>Pérez Maurera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ferrari Dacrema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Saule</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Scriminaci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cremonesi</surname>
          </string-name>
          ,
          <article-title>Contentwise impressions: An industrial dataset with impressions included</article-title>
          , in: M. d'Aquin, S. Dietze, C. Hauff, E. Curry, P. Cudré-Mauroux (Eds.),
          <source>CIKM '20: The 29th ACM International Conference on Information and Knowledge Management</source>
          , Virtual Event, Ireland, October 19-23, 2020, ACM,
          <year>2020</year>
          , pp.
          <fpage>3093</fpage>
          -
          <lpage>3100</lpage>
          . URL: https://doi.org/10.1145/3340531.3412774. doi:10.1145/3340531.3412774.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gruson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Chandar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Charbuillet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>McInerney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tardieu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Carterette</surname>
          </string-name>
          ,
          <article-title>Offline evaluation to make decisions about playlist recommendation algorithms</article-title>
          , in: J. S. Culpepper, A. Moffat, P. N. Bennett, K. Lerman (Eds.),
          <source>Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, WSDM 2019</source>
          , Melbourne, VIC, Australia, February 11-15, 2019, ACM,
          <year>2019</year>
          , pp.
          <fpage>420</fpage>
          -
          <lpage>428</lpage>
          . URL: https://doi.org/10.1145/3289600.3291027. doi:10.1145/3289600.3291027.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>W.</given-names>
            <surname>Bendada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Salha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Bontempelli</surname>
          </string-name>
          ,
          <article-title>Carousel personalization in music streaming apps with contextual bandits</article-title>
          , in: R. L. T. Santos, L. B. Marinho, E. M. Daly, L. Chen, K. Falk, N. Koenigstein, E. S. de Moura (Eds.),
          <source>RecSys 2020: Fourteenth ACM Conference on Recommender Systems</source>
          , Virtual Event, Brazil, September 22-26, 2020, ACM,
          <year>2020</year>
          , pp.
          <fpage>420</fpage>
          -
          <lpage>425</lpage>
          . URL: https://doi.org/10.1145/3383313.3412217. doi:10.1145/3383313.3412217.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>W.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Govindaraj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. V. N.</given-names>
            <surname>Vishwanathan</surname>
          </string-name>
          ,
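          <article-title>Whole page optimization with global constraints</article-title>
          , in: A. Teredesai, V. Kumar, Y. Li, R. Rosales, E. Terzi, G. Karypis (Eds.),
          <source>Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining, KDD 2019</source>
          , Anchorage, AK, USA, August 4-8, 2019, ACM,
          <year>2019</year>
          , pp.
          <fpage>3153</fpage>
          -
          <lpage>3161</lpage>
          . URL: https://doi.org/10.1145/3292500.3330675. doi:10.1145/3292500.3330675.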
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>