<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Task in Recommender Systems Research between Traditional and Deep Learning Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vito Walter Anelli</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alejandro Bellogín</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonio Ferrara</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniele Malitesta</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Felice Antonio Merra</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudio Pomo</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Maria Donini</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eugenio Di Sciascio</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tommaso Di Noia</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Amazon Science Berlin</institution>
          ,
          <addr-line>Invalidenstraße 75, 10557 Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Politecnico di Bari</institution>
          ,
          <addr-line>via Orabona, 4, 70125 Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Universidad Autónoma de Madrid</institution>
          ,
          <addr-line>Ciudad Universitaria de Cantoblanco, 28049 Madrid</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Università degli Studi della Tuscia</institution>
          ,
          <addr-line>via Santa Maria in Gradi, 4, 01100 Viterbo</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Recommender Systems have proven to be a useful tool for reducing over-choice and providing accurate, personalized suggestions. However, the large variety of available recommendation algorithms, splitting techniques, assessment protocols, metrics, and tasks has made thorough experimental evaluation extremely difficult. Elliot is a comprehensive recommendation framework whose goal is to run and reproduce an entire experimental pipeline from a single configuration file. The framework provides a variety of strategies to load, filter, and split data. Elliot optimizes hyper-parameters for a variety of recommendation algorithms, selects the best models, compares them against baselines, computes metrics ranging from accuracy to beyond-accuracy, bias, and fairness, and performs statistical analysis. The aim is to provide researchers with a tool that eases all phases of the experimental evaluation (and makes them reproducible), from data reading to results collection. Elliot is freely available on GitHub at https://github.com/sisinflab/elliot.</p>
      </abstract>
      <kwd-group>
        <kwd>Recommender Systems</kwd>
        <kwd>Reproducibility</kwd>
        <kwd>Adversarial Learning</kwd>
        <kwd>Visual Recommenders</kwd>
        <kwd>Knowledge Graphs</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In the last decade, Recommender Systems (RSs) have gained momentum as the pivotal choice
for personalized decision-support systems. Recommendation is essentially a retrieval task where
a catalog of items is ranked in a personalized way and the top-scoring items are presented to the
user [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Once the RSs’ ability to provide personalized items to clients had been demonstrated,
both academia and industry began to devote attention to them [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This collective effort resulted
in an impressive number of recommendation algorithms, ranging from memory-based [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] to
latent factor-based [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ], as well as deep learning-based methods [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. At the same time, the RS
research community realized that focusing only on the accuracy of results could be detrimental,
and started exploring beyond-accuracy evaluation [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. As accuracy was recognized as insufficient
to guarantee users’ satisfaction [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], novelty and diversity [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ] came into play as new dimensions
to be analyzed when comparing algorithms. However, this was only the first step in the direction
of a more comprehensive evaluation. Indeed, more recently, the presence of biased [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and unfair
recommendations towards user groups and item categories [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] has been widely investigated.
The abundance of possible choices has generated confusion around choosing the correct
baselines, conducting the hyperparameter optimization and the experimental evaluation [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], and
reporting the details of the adopted procedure. Consequently, two major concerns have arisen:
unreproducible evaluation and unfair comparisons [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        The advent of various frameworks over the last decade has improved the research process, and
the RS community has gradually embraced the emergence of recommendation, assessment, and
even hyperparameter tweaking frameworks. Starting from 2011, Mymedialite [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], LensKit [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ],
LightFM [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], RankSys [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], and Surprise [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], have formed the basic software for rapid
prototyping and testing of recommendation models, thanks to an easy-to-use model execution and the
implementation of standard accuracy, and beyond-accuracy, evaluation measures and splitting
techniques. However, the outstanding success and the community interest in Deep Learning
(DL) recommendation models, raised the need for novel instruments. LibRec [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], Spotlight 1,
and OpenRec [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] are the first open-source projects that made DL-based recommenders available
– with fewer than a dozen available models but, unfortunately, without filtering, splitting, and
hyper-parameter tuning strategies. An important step towards a more exhaustive and up-to-date
set of model implementations was taken with RecQ [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], DeepRec [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], and Cornac [23]
frameworks. However, they do not provide a general tool for extensive experiments covering both the
pre-processing and the evaluation of a dataset. Indeed, after the reproducibility hype [24, 25],
DaisyRec [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and RecBole [26] raised the bar of framework capabilities, making available both a
large set of models and data filtering/splitting operations and, above all, hyper-parameter tuning
features. Unfortunately, even though these frameworks are a great help to researchers, facilitating
reproducibility or extending the provided functionality typically depends on writing
bash scripts or programming in whatever language each framework is written in.
      </p>
      <p>This is where Elliot comes to the stage. It is a novel kind of recommendation framework,
aimed to overcome these obstacles by proposing a fully declarative approach (by means of a
configuration file) to the set-up of an experimental setting. It analyzes the recommendation
problem from the researcher’s perspective as it implements the whole experimental pipeline,
from dataset loading to results gathering in a principled way. The main idea behind Elliot is to
keep an entire experiment reproducible and put the user (in our case, a researcher or RS developer)
in control of the framework. To date, according to the recommendation model, Elliot allows for
choosing among 27 similarity metrics, defining multiple neural architectures, and choosing among
51 combined hyperparameter-tuning approaches, unleashing the full potential of the HyperOpt
library [27]. To enable evaluation for the diverse tasks and domains, Elliot supplies 36 metrics
(including Accuracy, Error-based, Coverage, Novelty, Diversity, Bias, and Fairness metrics), 13
splitting strategies, and 8 prefiltering policies.</p>
      <p>1 https://github.com/maciejkula/spotlight</p>
      <p>Figure 1 (overview): Elliot’s architecture comprises the Data modules (Loading of ratings and side information; Prefiltering via filter-by-rating and k-core; Splitting via temporal, random, and fix strategies), the Run module (training and restoring models, including external models), the Evaluation modules (Metrics: accuracy, error, coverage, novelty, diversity, bias, and fairness; Statistical Tests: paired t-test and Wilcoxon), and the Output module (performance tables, model weights, and recommendation lists).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Framework</title>
      <p>Elliot is an extendable framework made up of eight functional modules, each of which is in charge
of a different phase of the experimental recommendation process. The user is only required to provide
high-level experimental flow information via a customizable configuration file, so what happens
under the hood (Figure 1) is transparent to them. As a result, Elliot constructs the whole
pipeline. What follows presents each of Elliot’s modules and how to create a configuration file.</p>
      <sec id="sec-2-1">
        <title>2.1. Data Preparation</title>
        <p>The Data modules are in charge of handling and organizing the experiment’s input, as well as
providing a variety of supplementary data, such as item characteristics, visual embeddings, and
pictures. The input data is taken over by the Prefiltering and Splitting modules after being loaded
by the Loading module, whose techniques are described in Sections 2.1.2 and 2.1.3, respectively.</p>
        <sec id="sec-2-1-0">
          <title>2.1.1. Loading</title>
          <p>Different data sources, such as user-item feedback or side information (e.g., item visual
aspects), may be required for RS investigations. Elliot comes with a variety of Loading module
implementations to meet these requirements. Furthermore, the user may save the results of computationally
intensive prefiltering and splitting operations and reload them later to save time.
Additional data, such as visual characteristics and semantic features generated from knowledge
graphs, can be handled through data-driven extensions. When a side-information-aware Loading
module is selected, it filters out items that lack the needed information, to ensure a fair comparison.</p>
        </sec>
        <sec id="sec-2-1-1">
          <title>2.1.2. Prefiltering</title>
          <p>After data loading, Elliot provides data filtering procedures based on two different techniques.
Filter-by-rating, the first method implemented in the Prefiltering module, removes a
user-item interaction if the preference score falls below a certain threshold. The threshold can be a Numerical value, such
as 3.5, a Distributional value, such as the global rating average, or a user-based
distributional (User Dist.) value, such as the user’s average rating. The k-core prefiltering
approach eliminates users, items, or both if they have fewer than k recorded interactions. The
k-core technique can be applied to both users and items repeatedly (Iterative k-core) until the k-core
filtering requirement is fulfilled, i.e., all users and items have at least k recorded interactions. Since
reaching such a condition might be intractable, Elliot allows specifying the maximum number of
iterations (Iter-k-rounds). Finally, the Cold-Users filtering feature allows retaining cold users only.</p>
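          <p>The iterative k-core procedure can be sketched as follows (a minimal Python illustration under simplifying assumptions, not Elliot’s actual implementation):</p>

```python
from collections import Counter

def iterative_k_core(interactions, k, max_rounds=10):
    """Repeatedly drop users and items with fewer than k interactions,
    until the k-core condition holds or max_rounds is reached."""
    data = list(interactions)  # (user, item) pairs
    for _ in range(max_rounds):
        user_counts = Counter(u for u, _ in data)
        item_counts = Counter(i for _, i in data)
        kept = [(u, i) for u, i in data
                if user_counts[u] >= k and item_counts[i] >= k]
        if len(kept) == len(data):  # k-core condition already satisfied
            return kept
        data = kept
    return data
```

          <p>Dropping a sparse item can push a user below k interactions, which is why the filter must be re-applied until a fixed point is reached or the iteration budget is exhausted.</p>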
        </sec>
        <sec id="sec-2-1-2">
          <title>2.1.3. Splitting</title>
          <p>Elliot implements three splitting strategies: (i) Temporal, (ii) Random, and (iii) Fix. The
Temporal method divides user-item interactions depending on the transaction timestamp, either
by setting the timestamp, selecting the best one [28, 29], or using a hold-out (HO) mechanism.
Hold-out (HO), K-repeated hold-out (K-HO), and cross-validation (CV) are all part of the Random
methods. Finally, the Fix approach leverages an already split dataset.</p>
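          <p>For instance, a per-user temporal hold-out can be sketched as follows (an illustrative snippet, not Elliot’s code; the tuple layout is hypothetical):</p>

```python
def temporal_holdout(interactions, test_ratio=0.2):
    """Per-user temporal hold-out: each user's most recent interactions
    form the test set, the earlier ones the training set."""
    by_user = {}
    for user, item, timestamp in interactions:
        by_user.setdefault(user, []).append((timestamp, item))
    train, test = [], []
    for user, events in by_user.items():
        events.sort()  # chronological order, oldest first
        n_test = max(1, int(len(events) * test_ratio))
        cut = len(events) - n_test
        train += [(user, it, ts) for ts, it in events[:cut]]
        test += [(user, it, ts) for ts, it in events[cut:]]
    return train, test
```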
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Recommendation Models</title>
        <p>After data loading and pre-elaborations, the Recommendation module (Figure 1) provides the
functionalities to train (and restore) both Elliot’s state-of-the-art recommendation models and
custom user-implemented models, with the possibility to find the best hyper-parameter setting.</p>
        <sec id="sec-2-2-1">
          <title>2.2.1. Implemented Models</title>
          <p>
            To date, Elliot integrates around 50 recommendation models grouped into two sets: (i) popular
models implemented in at least two of the other reviewed frameworks, and (ii) other well-known
state-of-the-art recommendation models which are less common in the reviewed
frameworks, such as autoencoder-based, e.g., [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ], graph-based, e.g., [30], visually-aware [31], e.g., [32],
adversarially-robust, e.g., [33], and content-aware, e.g., [34, 35].
          </p>
        </sec>
        <sec id="sec-2-2-2">
          <title>2.2.2. Hyper-parameter Tuning</title>
          <p>According to Rendle et al. [25] and Anelli et al. [36], hyper-parameter optimization has a significant
impact on performance. Elliot supplies Grid Search, Simulated Annealing, Bayesian Optimization,
and Random Search, supporting four different traversal techniques in the search space. Grid
Search is automatically inferred when the user specifies the available hyper-parameters.</p>
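          <p>Grid search amounts to an exhaustive traversal of the Cartesian product of the declared hyper-parameter values; a minimal sketch in plain Python, independent of Elliot’s internals:</p>

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Evaluate every combination in the grid and return the best one
    according to the (higher-is-better) evaluate callback."""
    names = sorted(param_grid)
    best_cfg, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        cfg = dict(zip(names, values))
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

          <p>For example, with a grid of two neighborhood sizes and two similarity functions, the evaluation callback is invoked four times, once per combination.</p>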
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Performance Evaluation</title>
        <p>After the training phase, Elliot proceeds to evaluate the recommendations. Figure 1
indicates this phase with two distinct evaluation modules: Metrics and Statistical Tests.</p>
        <sec id="sec-2-3-0">
          <title>2.3.1. Metrics</title>
          <p>Elliot provides a set of 36 evaluation metrics, partitioned into seven families: Accuracy, Error,
Coverage, Novelty, Diversity, Bias, and Fairness. It is worth mentioning that Elliot exposes
the largest number of metrics and is the only framework considering bias and fairness
measures. Moreover, the practitioner can choose any metric to drive the model selection and the tuning.</p>
        </sec>
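        <p>As an example of an Accuracy-family metric, a common binary-relevance formulation of nDCG@k can be written as follows (a generic sketch; Elliot’s exact definition may differ in its relevance handling):</p>

```python
import math

def ndcg_at_k(ranked_items, relevant, k=10):
    """nDCG@k with binary relevance: DCG of the recommended list
    divided by the DCG of an ideal ranking of the relevant items."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked_items[:k])
              if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg else 0.0
```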
        <sec id="sec-2-3-1">
          <title>2.3.2. Statistical Tests</title>
          <p>None of the other cited frameworks supports statistical hypothesis tests, probably due to the need
for computing fine-grained (e.g., per-user or per-partition) results and retaining them for each
recommendation model. Conversely, Elliot enables computing two statistical hypothesis tests,
i.e., Wilcoxon and paired t-test, with a flag in the configuration file.</p>
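          <p>Given the per-user results of two models over the same users, the same two tests can be reproduced with SciPy (the scores below are made-up illustrative numbers, not results from the paper):</p>

```python
from scipy import stats

# Per-user nDCG scores of two recommenders over the same eight users
# (illustrative values only).
model_a = [0.62, 0.55, 0.71, 0.48, 0.66, 0.59, 0.73, 0.51]
model_b = [0.58, 0.52, 0.69, 0.45, 0.60, 0.57, 0.70, 0.49]

t_stat, t_p = stats.ttest_rel(model_a, model_b)   # paired t-test
w_stat, w_p = stats.wilcoxon(model_a, model_b)    # Wilcoxon signed-rank

print(f"paired t-test p={t_p:.4f}, Wilcoxon p={w_p:.4f}")
```

          <p>Both tests require the fine-grained per-user scores mentioned above; aggregate metrics alone are not enough.</p>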
        </sec>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Framework Outcomes</title>
        <p>When the training of recommenders is over, Elliot uses the Output module to gather the results.
Three types of output files can be generated: (i) Performance Tables, (ii) Model Weights, and (iii)
Recommendation Lists. Performance Tables come in the form of spreadsheets, including all the
metric values generated on the test set for each recommendation model given in the configuration
file. Cut-off-specific and model-specific tables are included in a final report (i.e., considering each
combination of the explored parameters). Statistical hypothesis tests are also reported in the
tables, along with a JSON file that summarizes the optimal model parameters. Optionally, Elliot
stores the model weights for the sake of future re-training.</p>
      </sec>
      <sec id="sec-2-5">
        <title>2.5. Preparation of the Experiment</title>
        <p>Elliot is triggered by a single configuration file written in YAML (e.g., refer to the toy
example sample_hello_world.yml). The first section details the data loading, filtering, and
splitting information defined in Section 2.1. The models section represents the recommendation
models’ configuration, e.g., Item-kNN. Here, the model-specific hyperparameter optimization
strategies are specified, e.g., the grid-search. The evaluation section details the evaluation
strategy with the desired metrics, e.g., nDCG in the toy example. Finally, save_recs and top_k
keys detail, for example, the Output module abilities described in Section 2.4.</p>
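        <p>A configuration along these lines drives the whole pipeline (the keys below follow the style of the toy example and are illustrative; the exact schema should be checked against the repository documentation):</p>

```yaml
experiment:
  dataset: movielens_1m
  data_config:
    strategy: dataset
    dataset_path: ../data/movielens_1m/dataset.tsv
  splitting:
    test_splitting:
      strategy: random_subsampling
      test_ratio: 0.2
  models:
    ItemKNN:                  # Item-kNN recommender
      meta:
        hyper_opt_alg: grid   # grid search inferred from the value lists
        save_recs: True
      neighbors: [50, 100]
      similarity: cosine
  evaluation:
    simple_metrics: [nDCG]
  top_k: 10
```

        <p>Supplying lists of values (e.g., for neighbors) is what triggers the automatically inferred grid search described in Section 2.2.2.</p>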
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Conclusion and Future Work</title>
      <p>Elliot is a framework that looks at the recommendation process from the eyes of an RS
researcher. To undertake a thorough and repeatable experimental assessment, the user only has
to generate a flexible configuration file. Several loading, prefiltering, splitting, hyperparameter
optimization, recommendation models, and statistical hypothesis testing are included in the
framework. Elliot reports may be evaluated and used directly in research papers. We
reviewed the RS assessment literature, putting Elliot in the context of the other frameworks and
highlighting its benefits and drawbacks. Following that, we looked at the framework’s design
and how to create a functional (and repeatable) experimental benchmark. Elliot is the only
recommendation framework we are aware of that supports a full multi-recommender
experimental pipeline from a single configuration file. We intend to expand the framework in the near
future to incorporate sequential recommendation scenarios, adversarial attacks, reinforcement
learning-based recommendation systems, differential privacy facilities, sampling assessment,
and distributed recommendation, among other things.</p>
      <p>[23] A. Salah, Q. Truong, H. W. Lauw, Cornac: A comparative framework for
multimodal recommender systems, J. Mach. Learn. Res. 21 (2020) 95:1–95:5. URL:
http://jmlr.org/papers/v21/19-805.html.
[24] M. F. Dacrema, P. Cremonesi, D. Jannach, Are we really making much progress? A worrying
analysis of recent neural recommendation approaches, in: RecSys, ACM, 2019, pp. 101–109.
[25] S. Rendle, W. Krichene, L. Zhang, J. R. Anderson, Neural collaborative filtering vs. matrix
factorization revisited, in: RecSys, ACM, 2020, pp. 240–248.
[26] W. X. Zhao, S. Mu, Y. Hou, Z. Lin, K. Li, Y. Chen, Y. Lu, H. Wang, C. Tian, X. Pan, Y. Min,
Z. Feng, X. Fan, X. Chen, P. Wang, W. Ji, Y. Li, X. Wang, J. Wen, Recbole: Towards a
unified, comprehensive and efficient framework for recommendation algorithms, CoRR
abs/2011.01731 (2020). URL: https://arxiv.org/abs/2011.01731. arXiv:2011.01731.
[27] J. Bergstra, D. Yamins, D. D. Cox, Making a science of model search: Hyperparameter
optimization in hundreds of dimensions for vision architectures, in: Proceedings of the
30th International Conference on Machine Learning, ICML 2013, Atlanta, GA, USA, 16-21
June 2013, volume 28 of JMLR Workshop and Conference Proceedings, JMLR.org, 2013, pp.
115–123. URL: http://proceedings.mlr.press/v28/bergstra13.html.
[28] V. W. Anelli, T. D. Noia, E. D. Sciascio, A. Ragone, J. Trotta, Local popularity and time in
top-n recommendation, in: ECIR (1), volume 11437 of Lecture Notes in Computer Science,
Springer, 2019, pp. 861–868.
[29] A. Bellogín, P. Sánchez, Revisiting neighbourhood-based recommenders for temporal
scenarios, in: RecTemp@RecSys, volume 1922 of CEUR Workshop Proceedings, CEUR-WS.org,
2017, pp. 40–44.
[30] X. Wang, X. He, M. Wang, F. Feng, T. Chua, Neural graph collaborative filtering, in:
B. Piwowarski, M. Chevalier, É. Gaussier, Y. Maarek, J. Nie, F. Scholer (Eds.), SIGIR 2019,
ACM, 2019, pp. 165–174. doi:10.1145/3331184.3331267.
[31] V. W. Anelli, A. Bellogín, A. Ferrara, D. Malitesta, F. A. Merra, C. Pomo, F. M. Donini, T. D. Noia, V-elliot: Design, evaluate and tune visual recommender systems, in: RecSys, ACM,
2021, pp. 768–771.
[32] R. He, J. J. McAuley, VBPR: visual bayesian personalized ranking from implicit feedback,
in: D. Schuurmans, M. P. Wellman (Eds.), Proceedings of the Thirtieth AAAI Conference
on Artificial Intelligence, February 12-17, 2016, Phoenix, Arizona, USA, AAAI Press, 2016,
pp. 144–150. URL: http://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/11914.
[33] J. Tang, X. Du, X. He, F. Yuan, Q. Tian, T. Chua, Adversarial training towards robust
multimedia recommender system, IEEE Trans. Knowl. Data Eng. 32 (2020) 855–867.
doi:10.1109/TKDE.2019.2893638.
[34] V. W. Anelli, T. D. Noia, E. D. Sciascio, A. Ragone, J. Trotta, How to make latent factors
interpretable by feeding factorization machines with knowledge graphs, in: ISWC (1),
volume 11778 of Lecture Notes in Computer Science, Springer, 2019, pp. 38–56.
[35] V. W. Anelli, T. D. Noia, E. D. Sciascio, A. Ferrara, A. C. M. Mancino, Sparse feature
factorization for recommender systems with knowledge graphs, in: RecSys, ACM, 2021, pp. 154–165.
[36] V. W. Anelli, T. D. Noia, E. D. Sciascio, C. Pomo, A. Ragone, On the discriminative power
of hyper-parameters in cross-validation and how to choose them, in: RecSys, ACM, 2019,
pp. 447–451.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>W.</given-names>
            <surname>Krichene</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rendle</surname>
          </string-name>
          ,
          <article-title>On sampled metrics for item recommendation</article-title>
          , in: R. Gupta,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. A.</given-names>
            <surname>Prakash</surname>
          </string-name>
          (Eds.),
          <source>KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining</source>
          , Virtual Event, CA, USA,
          August 23-27,
          <year>2020</year>
          , ACM,
          <year>2020</year>
          , pp.
          <fpage>1748</fpage>
          -
          <lpage>1757</lpage>
          . URL: https://dl.acm.org/doi/10.1145/3394486.3403226.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bennett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lanning</surname>
          </string-name>
          ,
          <article-title>The netflix prize</article-title>
          ,
          <source>in: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , San Jose, California, USA,
          August 12-15,
          <year>2007</year>
          , ACM,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B. M.</given-names>
            <surname>Sarwar</surname>
          </string-name>
          , G. Karypis,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Konstan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Riedl</surname>
          </string-name>
          ,
          <article-title>Item-based collaborative filtering recommendation algorithms</article-title>
          , in: V. Y.
          <string-name>
            <surname>Shen</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Saito</surname>
            ,
            <given-names>M. R.</given-names>
          </string-name>
          <string-name>
            <surname>Lyu</surname>
          </string-name>
          , M. E. Zurko (Eds.),
          <source>WWW</source>
          <year>2001</year>
          , ACM,
          <year>2001</year>
          , pp.
          <fpage>285</fpage>
          -
          <lpage>295</lpage>
          . doi:10.1145/371920.372071.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Koren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Bell</surname>
          </string-name>
          ,
          <article-title>Advances in collaborative filtering</article-title>
          , in: F.
          <string-name>
            <surname>Ricci</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Rokach</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          Shapira (Eds.),
          <source>Recommender Systems Handbook</source>
          , Springer,
          <year>2015</year>
          , pp.
          <fpage>77</fpage>
          -
          <lpage>118</lpage>
          . doi:10.1007/978-1-4899-7637-6_3.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Rendle</surname>
          </string-name>
          ,
          <article-title>Factorization machines</article-title>
          , in: G. I.
          <string-name>
            <surname>Webb</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Gunopulos</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          Wu (Eds.),
          <source>ICDM 2010, The 10th IEEE International Conference on Data Mining</source>
          , Sydney, Australia,
          14-17 December
          <year>2010</year>
          , IEEE Computer Society,
          <year>2010</year>
          , pp.
          <fpage>995</fpage>
          -
          <lpage>1000</lpage>
          . doi:10.1109/ICDM.2010.127.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. G.</given-names>
            <surname>Krishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Hoffman</surname>
          </string-name>
          , T. Jebara,
          <article-title>Variational autoencoders for collaborative filtering</article-title>
          , in: P.
          <string-name>
            <surname>Champin</surname>
            ,
            <given-names>F. L.</given-names>
          </string-name>
          <string-name>
            <surname>Gandon</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Lalmas</surname>
          </string-name>
          , P. G. Ipeirotis (Eds.),
          <source>WWW</source>
          <year>2018</year>
          , ACM,
          <year>2018</year>
          , pp.
          <fpage>689</fpage>
          -
          <lpage>698</lpage>
          . doi:10.1145/3178876.3186150.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vargas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Castells</surname>
          </string-name>
          ,
          <article-title>Rank and relevance in novelty and diversity metrics for recommender systems</article-title>
          , in: B.
          <string-name>
            <surname>Mobasher</surname>
            ,
            <given-names>R. D.</given-names>
          </string-name>
          <string-name>
            <surname>Burke</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Jannach</surname>
          </string-name>
          , G. Adomavicius (Eds.),
          <source>RecSys</source>
          <year>2011</year>
          , ACM,
          <year>2011</year>
          , pp.
          <fpage>109</fpage>
          -
          <lpage>116</lpage>
          . URL: https://dl.acm.org/citation.cfm?id=2043955.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>McNee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Riedl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Konstan</surname>
          </string-name>
          ,
          <article-title>Being accurate is not enough: how accuracy metrics have hurt recommender systems</article-title>
          , in:
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Olson</surname>
          </string-name>
          , R. Jeffries (Eds.),
          <source>Extended Abstracts Proceedings of the 2006 Conference on Human Factors in Computing Systems, CHI 2006, Montréal, Québec, Canada, April 22-27, 2006</source>
          , ACM,
          <year>2006</year>
          , pp.
          <fpage>1097</fpage>
          -
          <lpage>1101</lpage>
          . doi:10.1145/1125451.1125659.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vargas</surname>
          </string-name>
          ,
          <article-title>Novelty and diversity enhancement and evaluation in recommender systems and information retrieval</article-title>
          , in: S. Geva,
          <string-name>
            <given-names>A.</given-names>
            <surname>Trotman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bruza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L. A.</given-names>
            <surname>Clarke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Järvelin</surname>
          </string-name>
          (Eds.),
          <source>SIGIR 2014</source>
          , ACM,
          <year>2014</year>
          , p.
          <fpage>1281</fpage>
          . doi:10.1145/2600428.2610382.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Castells</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. J.</given-names>
            <surname>Hurley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vargas</surname>
          </string-name>
          ,
          <article-title>Novelty and diversity in recommender systems</article-title>
          , in: F.
          <string-name>
            <surname>Ricci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rokach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Shapira</surname>
          </string-name>
          (Eds.),
          <source>Recommender Systems Handbook</source>
          , Springer,
          <year>2015</year>
          , pp.
          <fpage>881</fpage>
          -
          <lpage>918</lpage>
          . URL: https://doi.org/10.1007/978-1-4899-7637-6_26. doi:10.1007/978-1-4899-7637-6_26.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Caverlee</surname>
          </string-name>
          ,
          <article-title>Popularity-opportunity bias in collaborative filtering</article-title>
          , in:
          <source>WSDM 2021</source>
          , ACM,
          <year>2021</year>
          . doi:10.1145/3437963.3441820.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Deldjoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. W.</given-names>
            <surname>Anelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zamani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bellogin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. Di</given-names>
            <surname>Noia</surname>
          </string-name>
          ,
          <article-title>A flexible framework for evaluating user and item fairness in recommender systems</article-title>
          ,
          <source>User Modeling and User-Adapted Interaction</source>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>47</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Said</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bellogín</surname>
          </string-name>
          ,
          <article-title>Comparative recommender system evaluation: benchmarking recommendation frameworks</article-title>
          , in: A.
          <string-name>
            <surname>Kobsa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ester</surname>
          </string-name>
          , Y. Koren (Eds.),
          <source>RecSys 2014</source>
          , ACM,
          <year>2014</year>
          , pp.
          <fpage>129</fpage>
          -
          <lpage>136</lpage>
          . doi:10.1145/2645710.2645746.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Fang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Qu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , C. Geng,
          <article-title>Are we evaluating rigorously? Benchmarking recommendation for reproducible evaluation and fair comparison</article-title>
          , in:
          <string-name>
            <given-names>R. L. T.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. B.</given-names>
            <surname>Marinho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Daly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Falk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Koenigstein</surname>
          </string-name>
          , E. S. de Moura (Eds.),
          <source>RecSys 2020</source>
          , ACM,
          <year>2020</year>
          , pp.
          <fpage>23</fpage>
          -
          <lpage>32</lpage>
          . doi:10.1145/3383313.3412489.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gantner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rendle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Freudenthaler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Schmidt-Thieme</surname>
          </string-name>
          ,
          <article-title>MyMediaLite: a free recommender system library</article-title>
          , in: B.
          <string-name>
            <surname>Mobasher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. D.</given-names>
            <surname>Burke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          , G. Adomavicius (Eds.),
          <source>RecSys 2011</source>
          , ACM,
          <year>2011</year>
          , pp.
          <fpage>305</fpage>
          -
          <lpage>308</lpage>
          . doi:10.1145/2043932.2043989.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Ekstrand</surname>
          </string-name>
          ,
          <article-title>LensKit for Python: Next-generation software for recommender systems experiments</article-title>
          , in:
          <string-name>
            <given-names>M.</given-names>
            <surname>d'Aquin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dietze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hauff</surname>
          </string-name>
          , E. Curry,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cudré-Mauroux</surname>
          </string-name>
          (Eds.),
          <source>CIKM 2020</source>
          , ACM,
          <year>2020</year>
          , pp.
          <fpage>2999</fpage>
          -
          <lpage>3006</lpage>
          . doi:10.1145/3340531.3412778.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kula</surname>
          </string-name>
          ,
          <article-title>Metadata embeddings for user and item cold-start recommendations</article-title>
          , in: T. Bogers, M. Koolen (Eds.),
          <source>Proceedings of the 2nd Workshop on New Trends on Content-Based Recommender Systems co-located with 9th ACM Conference on Recommender Systems (RecSys 2015), Vienna, Austria, September 16-20, 2015</source>
          , volume
          <volume>1448</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2015</year>
          , pp.
          <fpage>14</fpage>
          -
          <lpage>21</lpage>
          . URL: http://ceur-ws.org/Vol-1448/paper4.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>N.</given-names>
            <surname>Hug</surname>
          </string-name>
          ,
          <article-title>Surprise: A Python library for recommender systems</article-title>
          ,
          <source>J. Open Source Softw.</source>
          <volume>5</volume>
          (
          <year>2020</year>
          )
          <fpage>2174</fpage>
          . doi:10.21105/joss.02174.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>G.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Yorke-Smith</surname>
          </string-name>
          ,
          <article-title>LibRec: A Java library for recommender systems</article-title>
          , in: A. I. Cristea,
          <string-name>
            <given-names>J.</given-names>
            <surname>Masthoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Said</surname>
          </string-name>
          , N. Tintarev (Eds.),
          <source>Posters, Demos, Late-breaking Results and Workshop Proceedings of the 23rd Conference on User Modeling, Adaptation, and Personalization (UMAP 2015), Dublin, Ireland, June 29 - July 3, 2015</source>
          , volume
          <volume>1388</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>L.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Bagdasaryan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gruenstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hsieh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Estrin</surname>
          </string-name>
          ,
          <article-title>OpenRec: A modular framework for extensible and adaptable recommendation algorithms</article-title>
          , in: Y.
          <string-name>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          , Y. Maarek (Eds.),
          <source>WSDM 2018</source>
          , ACM,
          <year>2018</year>
          , pp.
          <fpage>664</fpage>
          -
          <lpage>672</lpage>
          . doi:10.1145/3159652.3159681.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>J.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Generating reliable friends via adversarial training to improve social recommendation</article-title>
          , in: J.
          <string-name>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Shim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          (Eds.),
          <source>2019 IEEE International Conference on Data Mining, ICDM 2019, Beijing, China, November 8-11, 2019</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>768</fpage>
          -
          <lpage>777</lpage>
          . doi:10.1109/ICDM.2019.00087.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>U.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hsia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Saraph</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Reagen</surname>
          </string-name>
          , G. Wei,
          <string-name>
            <given-names>H. S.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Brooks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>DeepRecSys: A system for optimizing end-to-end at-scale neural recommendation inference</article-title>
          , in:
          <source>47th ACM/IEEE Annual International Symposium on Computer Architecture, ISCA 2020, Valencia, Spain, May 30 - June 3, 2020</source>
          , IEEE,
          <year>2020</year>
          , pp.
          <fpage>982</fpage>
          -
          <lpage>995</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>