<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Metrics⋆</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Discussion Paper</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vincenzo Paparella</string-name>
          <email>vincenzo.paparella@poliba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dario Di Palma</string-name>
          <email>d.dipalma2@phd.poliba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vito Walter Anelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro De Bellis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tommaso Di Noia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Recommender System, Multi-Objective Evaluation, Pareto optimality</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Politecnico di Bari</institution>
          ,
          <addr-line>via Orabona, 4, 70125 Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Current recommender systems (RSs) prioritize accuracy, often neglecting aspects like diversity and fairness. This single-metric approach overlooks valuable trade-offs between different qualities. We propose a multi-objective evaluation using Pareto optimality and Quality Indicators (QI) of Pareto frontiers to consider all model configurations simultaneously across multiple perspectives. This approach reveals a more comprehensive picture of RS performance, potentially leading to a reevaluation of existing methods. Code and data are available at https://github.com/sisinflab/RecMOE.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        The success of Recommender Systems (RSs) is often measured by their ability to accurately
predict a user’s preferences and suggest relevant items. However, beyond-accuracy metrics like
diversity [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], novelty [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], and fairness [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ] have been proposed. While beyond-accuracy
metrics have gained momentum, accuracy is still prioritized [
        <xref ref-type="bibr" rid="ref7 ref8 ref9">7, 8, 9</xref>
        ]. Figure 1 shows the
normalized performance of baselines on the Goodreads dataset, selecting the best hyper-parameters
for each metric. Selecting the best model solely based on accuracy limits consideration of
beyond-accuracy performance. A configuration is Pareto-optimal when no other configuration improves
at least one objective without worsening another; such configurations form the Pareto frontier [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ]. We propose introducing Quality
Indicators (QIs) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] to RSs, providing a quantitative evaluation of Pareto frontiers from
different perspectives [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Our contributions are: (i) showing the negative impact of prioritizing
accuracy and motivating multi-objective evaluation; (ii) computing Pareto frontiers for
hyper-parameter settings of models on public datasets in multi-objective scenarios; (iii) enhancing
multi-objective evaluation by using QIs to comprehensively analyze recommendation models.
      </p>
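      <p>The notion of Pareto optimality used above can be illustrated with a minimal sketch. It assumes that every objective is to be maximized and that each hyper-parameter setting is summarized by a tuple of metric values; the function names and the example scores are hypothetical and are not taken from the released code.</p>
      <preformat><![CDATA[
```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on every objective and strictly better on one.

    All objectives are assumed to be maximized.
    """
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (simple quadratic scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (nDCG, APLT) scores of four hyper-parameter settings.
configs = [(0.30, 0.10), (0.25, 0.20), (0.20, 0.15), (0.10, 0.30)]
frontier = pareto_frontier(configs)
```
]]></preformat>
      <p>In this toy example the third setting is dominated by the second one, so only the remaining three settings lie on the frontier. A metric for which lower is better would need to be negated before applying the same check.</p>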
    </sec>
    <sec id="sec-3">
      <title>2. Quality Indicators</title>
      <p>
        In this Section, we present the Quality Indicators (QIs) to assess the Pareto frontiers
corresponding to an RS model. They can be classified according to the quality they assess.
⋆Extended version [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] published at the 17th ACM Conference on Recommender Systems (RecSys 2023).
      </p>
      <p>[Figure 1: panels (a) UserKNN, (b) RP3, (c) EASE; axes Nov, Div, Bias; legend: models chosen for the best values of Accuracy/Novelty, Diversity, Bias.]</p>
      <sec id="sec-3-4">
        <p>
          Spread QI. For our study, we use the Maximum Spread (MS) [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. Specifically, this spread indicator
measures the range of a Pareto frontier by considering the maximum extent of each objective.
The higher the value, the more extensive the frontier.
        </p>
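        <p>As an illustration of the spread indicator described above, the following sketch uses one common formulation of the Maximum Spread, namely the square root of the summed squared per-objective ranges of the frontier. This formulation, the function name, and the example points are assumptions made for illustration; the released code may compute the indicator differently.</p>
        <preformat><![CDATA[
```python
import math
from typing import List, Sequence


def maximum_spread(frontier: List[Sequence[float]]) -> float:
    """Square root of the summed squared per-objective ranges of the frontier."""
    # zip(*frontier) regroups the points into one column per objective.
    ranges = [max(column) - min(column) for column in zip(*frontier)]
    return math.sqrt(sum(r * r for r in ranges))


# Hypothetical two-objective frontier (values are illustrative only).
example_frontier = [(0.0, 4.0), (1.0, 2.0), (3.0, 0.0)]
ms = maximum_spread(example_frontier)
```
]]></preformat>
        <p>Here the first objective spans a range of 3 and the second a range of 4, so the indicator equals 5. Because the value depends on the scale of each objective, metrics are usually normalized to a common range before frontiers are compared.</p>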
        <p>
          Uniformity QI. The uniformity of a Pareto frontier provides information about the distribution
of the solutions. A higher uniformity of the curve denotes that the solutions are less dispersed,
while a low uniformity indicates more diversity within the set. Specifically, we employ the
Spacing metric (SP) [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], which measures the variation in the Manhattan distances between the
Pareto-optimal solutions. The lower the value, the more concentrated the solutions are on the
Pareto frontier. However, <inline-formula><tex-math>SP = 0</tex-math></inline-formula> indicates that all the solutions could be equidistant.
        </p>
        <p>
          Cardinality QI. Given <inline-formula><tex-math>N</tex-math></inline-formula> generic solutions belonging to the set <inline-formula><tex-math>S</tex-math></inline-formula>, the QIs for cardinality
determine the proportion of Pareto-optimal solutions in this set. Specifically, the Error Ratio
(ER) [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] is defined as
          <disp-formula><tex-math>ER(S) = \frac{\sum_{s \in S} e(s)}{N},</tex-math></disp-formula>
with <inline-formula><tex-math>e(s) = 1</tex-math></inline-formula> if <inline-formula><tex-math>s</tex-math></inline-formula> is a Pareto-optimal solution, and <inline-formula><tex-math>e(s) = 0</tex-math></inline-formula>
otherwise. A higher ER value indicates a greater proportion of Pareto-optimal solutions in the set <inline-formula><tex-math>S</tex-math></inline-formula>.
        </p>
        <p>
          All quality aspects QI. The QIs included in this category provide insights into the spread,
uniformity, and cardinality of the Pareto frontiers simultaneously. Among them, the
Hypervolume (HV) [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] is a volume-based QI that measures the volume of the objective function space
dominated by the Pareto frontier. The larger the hypervolume, the better the solution set is.
        </p>
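        <p>The uniformity and cardinality indicators discussed above can be sketched as follows. The spacing function follows the usual Schott-style definition (sample standard deviation of each solution's Manhattan distance to its nearest neighbour), and the error ratio follows the convention stated in this section, where a Pareto-optimal solution counts as 1. All objectives are assumed to be maximized, and the function names and example values are hypothetical.</p>
        <preformat><![CDATA[
```python
import math
from typing import List, Sequence

Point = Sequence[float]


def spacing(frontier: List[Point]) -> float:
    """Deviation of nearest-neighbour Manhattan distances along the frontier."""
    n = len(frontier)
    if n < 2:
        return 0.0
    nearest = []
    for i, p in enumerate(frontier):
        dists = [sum(abs(a - b) for a, b in zip(p, q))
                 for j, q in enumerate(frontier) if j != i]
        nearest.append(min(dists))
    mean = sum(nearest) / n
    return math.sqrt(sum((mean - d) ** 2 for d in nearest) / (n - 1))


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def error_ratio(solutions: List[Point]) -> float:
    """Share of solutions that are Pareto-optimal within the given set."""
    optimal = [s for s in solutions
               if not any(dominates(t, s) for t in solutions)]
    return len(optimal) / len(solutions)


# Hypothetical examples: an evenly spaced frontier and an uneven one.
sp_even = spacing([(0.0, 2.0), (1.0, 1.0), (2.0, 0.0)])
sp_uneven = spacing([(0.0, 3.0), (1.0, 2.0), (3.0, 0.0)])
er = error_ratio([(0.0, 2.0), (1.0, 1.0), (2.0, 0.0), (0.5, 0.5)])
```
]]></preformat>
        <p>The evenly spaced frontier yields a spacing of zero, while the uneven one yields a positive value. In the last call, three of the four solutions are non-dominated, so the ratio is 0.75.</p>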
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Experiments</title>
      <p>We aim to answer two research questions. RQ1: To what extent can the models provide
Pareto-optimal configurations? Are these configurations uniformly distributed, or are they dispersed,
offering diverse solutions to the trade-off? RQ2: Which model has the Pareto frontier that
simultaneously offers better solutions on multiple metrics?</p>
      <p>
        Datasets. We select three different datasets to cover several domains. Specifically, we use
Amazon Music (music), Goodreads [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] (books), and Movielens1M [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] (movies).
      </p>
      <p>
        Baselines and Hyper-parameter Settings Exploration. We train five recommendation
algorithms, i.e., EASE [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], MultiVAE [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], LightGCN [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], RP3 [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], and UserKNN [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. We
train 32 hyper-parameter combinations for each model using Elliot [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ].
      </p>
      <p>
        Metrics. We assess the baselines’ performance from several perspectives. We compute nDCG,
Precision, and Recall for the accuracy of recommendations. From the end user’s point of view,
we evaluate diversity (with the Gini index [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] and Item Coverage) and novelty (with EPC and
EFD [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]). Finally, we measure the popularity bias of the recommendations with APLT [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] –
the greater, the better – and ARP [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] – the lower, the better. All these metrics refer to a cutoff of 10.
      </p>
      <p>Multi-Objective Evaluation Methodology. We obtain Pareto frontiers for each recommender
system (RS) baseline using the metrics described in Section 2. Each hyper-parameter setting
represents a solution in the objective space. We identify the Pareto-optimal configurations
for each baseline, forming their respective Pareto frontiers. We evaluate these frontiers using
QIs under two scenarios: 1) user-centered (accuracy, diversity, novelty) and 2) accuracy vs.
algorithmic bias. Figure 2 shows the resulting Pareto frontiers.</p>
      <p>3.1. Results and Discussion</p>
      <p>While EASE and UserKNN provide the most accurate recommendations, beyond-accuracy
metrics paint a different picture. As Figure 2 shows, UserKNN exhibits better diversity than
EASE. Finally, RP3 consistently outperforms its competitors in addressing the popularity bias.
We then delve into a multi-objective evaluation using QIs on Pareto frontiers. Here, we examine the
distribution of Pareto-optimal configurations and performance on all quality metrics.</p>
      <p>Distribution of Pareto-optimal configurations. The Error Ratio (ER), Maximum Spread
(MS), and Spacing metric (SP) values in Table 1 unveil interesting insights into the distribution of
Pareto-optimal configurations for each model. In the nDCG/APLT scenario for the Movielens1M
dataset, for instance: 1) UserKNN exhibits a wide range of solutions with good dispersion across
the Pareto frontier, indicating its ability to offer various well-balanced trade-offs between
accuracy and algorithmic bias; 2) EASE, while offering a high number of solutions on the
frontier, tends to concentrate them in a limited area, suggesting a lack of diversity in the
achievable trade-offs; 3) RP3 strikes a good balance between the number of solutions, their
dispersion, and the ability to provide various trade-offs between accuracy and bias. This is
reflected in its high ER, MS, and SP values. Similar trends are observed for the other datasets
(see Figures 2d and 2e).
      </p>
      <p>[Figure 2: Pareto frontiers of RP3, EASE, UserKNN, LightGCN, and MultiVAE. Panels: (a) Amazon, nDCG/Gini/EPC; (b) Goodreads, nDCG/Gini/EPC; (c) ML1M, nDCG/Gini/EPC; (d) Amazon, nDCG/APLT; (e) Goodreads, nDCG/APLT; (f) ML1M, nDCG/APLT.]</p>
      <sec id="sec-4-4">
        <p>When examining the user-centric scenario (nDCG/Gini/EPC), UserKNN
again excels, offering well-diversified solutions across all datasets (see Figures 2a - 2c).
        </p>
        <p>Performance on all quality metrics. In response to RQ2, we can utilize the Hypervolume (HV)
measure. HV evaluates the performance of models from multiple objectives simultaneously, as
shown in Table 1. By considering the cardinality and dispersion of the Pareto-optimal solutions
and the dominance among the Pareto frontiers, HV provides us with valuable insights. The
higher the volume or area under the frontier, the greater the HV. The results show that UserKNN
outperforms the other models by achieving the best or second-best values of HV for all datasets
and scenarios. This result indicates that UserKNN generates an extensive and diversified Pareto
frontier while performing well across all metrics. While EASE has the highest value of HV for
the Amazon Music dataset in the user-centered scenario, it neither dominates nor is dominated
in the remaining cases. This result highlights the model’s limited reliance on accounting for
multiple metrics. LightGCN shows no distinctive trends, while MultiVAE’s HV decreases when
dealing with sparser datasets. RP3 confirms its capability in managing the nDCG/APLT
trade-off by achieving the highest values of HV and visual dominance of its Pareto frontiers against
the others in Figures 2d, 2e, and 2f.</p>
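        <p>To make the hypervolume comparison above concrete, the following sketch computes the indicator for a two-objective frontier as the area dominated by the frontier with respect to a reference point. It assumes that both objectives are maximized and already normalized; the reference point, the function name, and the two example frontiers are hypothetical and do not reproduce the values in Table 1.</p>
        <preformat><![CDATA[
```python
from typing import List, Tuple

Point2D = Tuple[float, float]


def hypervolume_2d(frontier: List[Point2D], reference: Point2D) -> float:
    """Area dominated by the frontier and bounded below by the reference point.

    Both objectives are assumed to be maximized.
    """
    ref_x, ref_y = reference
    # Keep only points that are strictly better than the reference on both axes.
    candidates = [(x, y) for x, y in frontier if x > ref_x and y > ref_y]
    area = 0.0
    best_y = ref_y
    # Sweep from the largest first objective to the smallest, adding the
    # horizontal strip that each point contributes above the previous best.
    for x, y in sorted(candidates, reverse=True):
        if y > best_y:
            area += (x - ref_x) * (y - best_y)
            best_y = y
    return area


# Two hypothetical frontiers compared against the same reference point.
frontier_a = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
frontier_b = [(2.0, 1.0), (1.0, 2.0)]
hv_a = hypervolume_2d(frontier_a, (0.0, 0.0))
hv_b = hypervolume_2d(frontier_b, (0.0, 0.0))
```
]]></preformat>
        <p>The first frontier covers an area of 6 and the second an area of 3, so the first would be preferred under this indicator. With three objectives, as in the user-centered scenario, the same idea extends to a dominated volume, for which a dedicated multi-objective optimization library is typically used.</p>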
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Conclusion and Future Work</title>
      <p>Our multi-objective evaluation with Quality Indicators reveals new insights into recommender
systems (RSs). While EASE exhibits high accuracy, UserKNN emerges as a strong contender,
offering diverse solutions across multiple objectives. Additionally, RP3 proved to be highly
effective in the accuracy/algorithmic bias scenario.</p>
      <p>Acknowledgements. The authors acknowledge partial support of the following projects: OVS:
Fashion Retail Reloaded, Lutech Digitale 4.0, Secure Safe Apulia, Patti Territoriali WP1, BIO-D,
and MOST - Centro Nazionale per la Mobilità Sostenibile. We also gratefully acknowledge the
CINECA award under the ISCRA initiative, for the availability of HPC resources and support.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>V.</given-names>
            <surname>Paparella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. Di</given-names>
            <surname>Palma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. W.</given-names>
            <surname>Anelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. D.</given-names>
            <surname>Noia</surname>
          </string-name>
          ,
          <article-title>Broadening the scope: Evaluating the potential of recommender systems beyond prioritizing accuracy</article-title>
          , in: J.
          <string-name>
            <surname>Zhang</surname>
            , L. Chen,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Berkovsky</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Zhang</surname>
          </string-name>
          , T. D.
          <string-name>
            <surname>Noia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Basilico</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Pizzato</surname>
          </string-name>
          , Y. Song (Eds.),
          <source>Proceedings of the 17th ACM Conference on Recommender Systems, RecSys</source>
          <year>2023</year>
          , Singapore, Singapore,
          <source>September 18-22</source>
          ,
          <year>2023</year>
          , ACM,
          <year>2023</year>
          , pp.
          <fpage>1139</fpage>
          -
          <lpage>1145</lpage>
          . URL: https://doi.org/10.1145/3604915.3610649. doi:10.1145/3604915.3610649.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>V.</given-names>
            <surname>Paparella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. W.</given-names>
            <surname>Anelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Boratto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. D.</given-names>
            <surname>Noia</surname>
          </string-name>
          ,
          <article-title>Reproducibility of multi-objective reinforcement learning recommendation: Interplay between effectiveness and beyond-accuracy perspectives</article-title>
          , in: J.
          <string-name>
            <surname>Zhang</surname>
            , L. Chen,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Berkovsky</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Zhang</surname>
          </string-name>
          , T. D.
          <string-name>
            <surname>Noia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Basilico</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Pizzato</surname>
          </string-name>
          , Y. Song (Eds.),
          <source>Proceedings of the 17th ACM Conference on Recommender Systems, RecSys</source>
          <year>2023</year>
          , Singapore, Singapore,
          <source>September 18-22</source>
          ,
          <year>2023</year>
          , ACM,
          <year>2023</year>
          , pp.
          <fpage>467</fpage>
          -
          <lpage>478</lpage>
          . URL: https://doi.org/10.1145/3604915.3609493. doi:10.1145/3604915.3609493.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vargas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Castells</surname>
          </string-name>
          ,
          <article-title>Rank and relevance in novelty and diversity metrics for recommender systems</article-title>
          , in: B.
          <string-name>
            <surname>Mobasher</surname>
            ,
            <given-names>R. D.</given-names>
          </string-name>
          <string-name>
            <surname>Burke</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Jannach</surname>
          </string-name>
          , G. Adomavicius (Eds.),
          <source>Proceedings of the 2011 ACM Conference on Recommender Systems, RecSys</source>
          <year>2011</year>
          , Chicago, IL, USA, October
          <volume>23</volume>
          -
          <issue>27</issue>
          ,
          <year>2011</year>
          , ACM,
          <year>2011</year>
          , pp.
          <fpage>109</fpage>
          -
          <lpage>116</lpage>
          . URL: https://dl.acm.org/citation.cfm?id=2043955.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Di</surname>
          </string-name>
          <string-name>
            <surname>Palma</surname>
          </string-name>
          ,
          <article-title>Retrieval-augmented recommender system: Enhancing recommender systems with large language models</article-title>
          , in: RecSys, ACM,
          <year>2023</year>
          , pp.
          <fpage>1369</fpage>
          -
          <lpage>1373</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Boratto</surname>
          </string-name>
          , G. Fenu,
          <string-name>
            <given-names>M.</given-names>
            <surname>Marras</surname>
          </string-name>
          ,
          <article-title>Interplay between upsampling and regularization for provider fairness in recommender systems</article-title>
          ,
          <source>User Model. User Adapt. Interact.</source>
          <volume>31</volume>
          (
          <year>2021</year>
          )
          <fpage>421</fpage>
          -
          <lpage>455</lpage>
          . URL: https://doi.org/10.1007/s11257-021-09294-8. doi:10.1007/s11257-021-09294-8.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Di Palma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. W.</given-names>
            <surname>Anelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Malitesta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Paparella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Deldjoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. D.</given-names>
            <surname>Noia</surname>
          </string-name>
          ,
          <article-title>Examining fairness in graph-based collaborative filtering: A consumer and producer perspective</article-title>
          ,
          <source>in: IIR</source>
          , volume
          <volume>3448</volume>
          <source>of CEUR Workshop Proceedings, CEUR-WS.org</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>79</fpage>
          -
          <lpage>84</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>V. W.</given-names>
            <surname>Anelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. D.</given-names>
            <surname>Noia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Sciascio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ragone</surname>
          </string-name>
          ,
          <article-title>On the discriminative power of hyper-parameters in cross-validation and how to choose them</article-title>
          , in: T.
          <string-name>
            <surname>Bogers</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Said</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Brusilovsky</surname>
          </string-name>
          , D. Tikk (Eds.),
          <source>Proceedings of the 13th ACM Conference on Recommender Systems, RecSys</source>
          <year>2019</year>
          , Copenhagen, Denmark,
          <source>September 16-20</source>
          ,
          <year>2019</year>
          , ACM,
          <year>2019</year>
          , pp.
          <fpage>447</fpage>
          -
          <lpage>451</lpage>
          . URL: https://doi.org/10.1145/3298689.3347010. doi:10.1145/3298689.3347010.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>V. W.</given-names>
            <surname>Anelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bellogín</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. D.</given-names>
            <surname>Noia</surname>
          </string-name>
          ,
          <string-name>
            <surname>C.</surname>
          </string-name>
          <article-title>Pomo, Reenvisioning the comparison between neural collaborative filtering and matrix factorization</article-title>
          , in: H.
          <string-name>
            <surname>J. C. Pampín</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          <string-name>
            <surname>Larson</surname>
            ,
            <given-names>M. C.</given-names>
          </string-name>
          <string-name>
            <surname>Willemsen</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          <string-name>
            <surname>Konstan</surname>
            ,
            <given-names>J. J.</given-names>
          </string-name>
          <string-name>
            <surname>McAuley</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Garcia-Gathright</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Huurnink</surname>
          </string-name>
          , E. Oldridge (Eds.),
          <source>RecSys '21: Fifteenth ACM Conference on Recommender Systems, Amsterdam, The Netherlands, 27 September 2021 - 1 October</source>
          <year>2021</year>
          , ACM,
          <year>2021</year>
          , pp.
          <fpage>521</fpage>
          -
          <lpage>529</lpage>
          . URL: https://doi.org/10.1145/3460231.3475944. doi:10.1145/3460231.3475944.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Di Palma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Biancofiore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. W.</given-names>
            <surname>Anelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Narducci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. D.</given-names>
            <surname>Noia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Sciascio</surname>
          </string-name>
          ,
          <article-title>Evaluating chatgpt as a recommender system: A rigorous approach</article-title>
          ,
          <source>CoRR</source>
          abs/2309.03613 (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Marler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Arora</surname>
          </string-name>
          ,
          <article-title>Survey of multi-objective optimization methods for engineering</article-title>
          ,
          <source>Structural and Multidisciplinary Optimization</source>
          <volume>26</volume>
          (
          <year>2004</year>
          )
          <fpage>369</fpage>
          -
          <lpage>395</lpage>
          . doi:10.1007/s00158-003-0368-6.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>V.</given-names>
            <surname>Paparella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. W.</given-names>
            <surname>Anelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Nardini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Perego</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. D.</given-names>
            <surname>Noia</surname>
          </string-name>
          ,
          <article-title>Post-hoc selection of pareto-optimal solutions in search and recommendation</article-title>
          , in: I.
          <string-name>
            <surname>Frommholz</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Hopfgartner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Oakes</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Lalmas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Zhang</surname>
          </string-name>
          , R. L. T. Santos (Eds.),
          <source>Proceedings of the 32nd ACM International Conference on Information and Knowledge Management</source>
          ,
          <string-name>
            <surname>CIKM</surname>
          </string-name>
          <year>2023</year>
          , Birmingham, United Kingdom,
          <source>October 21-25</source>
          ,
          <year>2023</year>
          , ACM,
          <year>2023</year>
          , pp.
          <fpage>2013</fpage>
          -
          <lpage>2023</lpage>
          . URL: https://doi.org/10.1145/3583780.3615010. doi:10.1145/3583780.3615010.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <article-title>Quality evaluation of solution sets in multiobjective optimisation: A survey</article-title>, <source>ACM Computing Surveys (CSUR)</source> <volume>52</volume> (
          <year>2019</year>
          )
          <fpage>1</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>V.</given-names>
            <surname>Paparella</surname>
          </string-name>
          ,
          <article-title>Pursuing optimal trade-off solutions in multi-objective recommender systems</article-title>
          , in: J. Golbeck, F. M. Harper, V. Murdock, M. D. Ekstrand, B. Shapira, J. Basilico, K. T. Lundgaard, E. Oldridge (Eds.),
          <source>RecSys '22: Sixteenth ACM Conference on Recommender Systems</source>
          , Seattle, WA, USA, September 18-23
          ,
          <year>2022</year>
          , ACM,
          <year>2022</year>
          , pp.
          <fpage>727</fpage>
          -
          <lpage>729</lpage>
          . URL: https://doi.org/10.1145/3523227.3547425. doi: 10.1145/3523227.3547425.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>E.</given-names>
            <surname>Zitzler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Deb</surname>
          </string-name>
          , L. Thiele,
          <article-title>Comparison of multiobjective evolutionary algorithms: Empirical results</article-title>
          ,
          <source>Evolutionary Computation</source>
          <volume>8</volume>
          (
          <year>2000</year>
          )
          <fpage>173</fpage>
          -
          <lpage>195</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Schott</surname>
          </string-name>
          ,
          <article-title>Fault tolerant design using single and multicriteria genetic algorithm optimization</article-title>
          ,
          <source>Technical Report, Air Force Institute of Technology, Wright-Patterson AFB, OH</source>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Van Veldhuizen</surname>
          </string-name>
          ,
          <article-title>Multiobjective evolutionary algorithms: classifications, analyses, and new innovations</article-title>
          ,
          <source>Air Force Institute of Technology</source>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>E.</given-names>
            <surname>Zitzler</surname>
          </string-name>
          , L. Thiele,
          <article-title>Multiobjective optimization using evolutionary algorithms - a comparative case study</article-title>
          , in:
          <source>Parallel Problem Solving from Nature - PPSN V: 5th International Conference</source>
          , Amsterdam, The Netherlands, September 27-30,
          <year>1998</year>
          , Proceedings 5, Springer,
          <year>1998</year>
          , pp.
          <fpage>292</fpage>
          -
          <lpage>301</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Wan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Misra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nakashole</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>McAuley</surname>
          </string-name>
          ,
          <article-title>Fine-grained spoiler detection from large-scale review corpora</article-title>
          , in: A. Korhonen, D. R. Traum, L. Màrquez (Eds.),
          <source>Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019</source>
          , Florence, Italy, July 28-August 2, 2019, Volume 1: Long Papers, Association for Computational Linguistics,
          <year>2019</year>
          , pp.
          <fpage>2605</fpage>
          -
          <lpage>2610</lpage>
          . URL: https://doi.org/10.18653/v1/p19-1248. doi: 10.18653/v1/p19-1248.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Harper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Konstan</surname>
          </string-name>
          ,
          <article-title>The MovieLens datasets: History and context</article-title>
          ,
          <source>ACM Trans. Interact. Intell. Syst.</source>
          <volume>5</volume>
          (
          <year>2016</year>
          )
          <fpage>19:1</fpage>
          -
          <lpage>19:19</lpage>
          . URL: https://doi.org/10.1145/2827872. doi: 10.1145/2827872.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>H.</given-names>
            <surname>Steck</surname>
          </string-name>
          ,
          <article-title>Embarrassingly shallow autoencoders for sparse data</article-title>
          , in: L. Liu,
          <string-name>
            <given-names>R. W.</given-names>
            <surname>White</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mantrach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Silvestri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>McAuley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Baeza-Yates</surname>
          </string-name>
          , L. Zia (Eds.),
          <source>The World Wide Web Conference, WWW 2019</source>
          , San Francisco, CA, USA, May 13-17, 2019, ACM,
          <year>2019</year>
          , pp.
          <fpage>3251</fpage>
          -
          <lpage>3257</lpage>
          . URL: https://doi.org/10.1145/3308558.3313710. doi: 10.1145/3308558.3313710.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>D.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. G.</given-names>
            <surname>Krishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Hoffman</surname>
          </string-name>
          , T. Jebara,
          <article-title>Variational autoencoders for collaborative filtering</article-title>
          , in: P. Champin, F. Gandon, M. Lalmas, P. G. Ipeirotis (Eds.),
          <source>Proceedings of the 2018 World Wide Web Conference on World Wide Web, WWW 2018</source>
          , Lyon, France, April 23-27, 2018, ACM,
          <year>2018</year>
          , pp.
          <fpage>689</fpage>
          -
          <lpage>698</lpage>
          . URL: https://doi.org/10.1145/3178876.3186150. doi: 10.1145/3178876.3186150.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>LightGCN: Simplifying and powering graph convolution network for recommendation</article-title>
          , in: J. X. Huang, Y. Chang, X. Cheng, J. Kamps, V. Murdock, J. Wen, Y. Liu (Eds.),
          <source>Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, SIGIR 2020</source>
          , Virtual Event, China, July 25-30, 2020, ACM,
          <year>2020</year>
          , pp.
          <fpage>639</fpage>
          -
          <lpage>648</lpage>
          . URL: https://doi.org/10.1145/3397271.3401063. doi: 10.1145/3397271.3401063.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>B.</given-names>
            <surname>Paudel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Christoffel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Newell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          ,
          <article-title>Updatable, accurate, diverse, and scalable recommendations for interactive applications</article-title>
          ,
          <source>ACM Trans. Interact. Intell. Syst.</source>
          <volume>7</volume>
          (
          <year>2017</year>
          )
          <fpage>1:1</fpage>
          -
          <lpage>1:34</lpage>
          . URL: https://doi.org/10.1145/2955101. doi: 10.1145/2955101.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>P.</given-names>
            <surname>Resnick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Iacovou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Suchak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bergstrom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Riedl</surname>
          </string-name>
          ,
          <article-title>GroupLens: An open architecture for collaborative filtering of netnews</article-title>
          , in: J. B. Smith, F. D. Smith, T. W. Malone (Eds.),
          <source>CSCW '94, Proceedings of the Conference on Computer Supported Cooperative Work</source>
          , Chapel Hill, NC, USA, October 22-26, 1994, ACM,
          <year>1994</year>
          , pp.
          <fpage>175</fpage>
          -
          <lpage>186</lpage>
          . URL: https://doi.org/10.1145/192844.192905. doi: 10.1145/192844.192905.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>V. W.</given-names>
            <surname>Anelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bellogín</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferrara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Malitesta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Merra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Donini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Di Noia</surname>
          </string-name>
          ,
          <article-title>Elliot: A comprehensive and rigorous framework for reproducible recommender systems evaluation</article-title>
          , in: F. Diaz,
          <string-name>
            <given-names>C.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Suel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Castells</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jones</surname>
          </string-name>
          , T. Sakai (Eds.),
          <source>SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , Virtual Event, Canada, July 11-15, 2021, ACM,
          <year>2021</year>
          , pp.
          <fpage>2405</fpage>
          -
          <lpage>2414</lpage>
          . URL: https://doi.org/10.1145/3404835.3463245. doi: 10.1145/3404835.3463245.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lerche</surname>
          </string-name>
          , I. Kamehkhosh,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jugovac</surname>
          </string-name>
          ,
          <article-title>What recommenders recommend: an analysis of recommendation biases and possible countermeasures</article-title>
          ,
          <source>User Model. User Adapt. Interact.</source>
          <volume>25</volume>
          (
          <year>2015</year>
          )
          <fpage>427</fpage>
          -
          <lpage>491</lpage>
          . URL: https://doi.org/10.1007/s11257-015-9165-3. doi: 10.1007/s11257-015-9165-3.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>H.</given-names>
            <surname>Abdollahpouri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Burke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mobasher</surname>
          </string-name>
          ,
          <article-title>Managing popularity bias in recommender systems with personalized re-ranking</article-title>
          , in: R. Barták, K. W. Brawner (Eds.),
          <source>Proceedings of the Thirty-Second International Florida Artificial Intelligence Research Society Conference</source>
          , Sarasota, Florida, USA, May 19-22, 2019, AAAI Press,
          <year>2019</year>
          , pp.
          <fpage>413</fpage>
          -
          <lpage>418</lpage>
          . URL: https://aaai.org/ocs/index.php/FLAIRS/FLAIRS19/paper/view/18199.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>