<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>RADio-: a simplified codebase for evaluating normative diversity in recommender systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sanne Vrijenhoek</string-name>
          <email>s.vrijenhoek@uva.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>AI, Media and Democracy Lab</institution>
          ,
          <addr-line>Amsterdam</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>INRA'24: News Recommendation and Analytics Workshop</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Institute for Information Law, University of Amsterdam</institution>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Diversity is one of the core beyond-accuracy objectives considered in the development of news recommender systems. However, there is a clear gap between its technical conceptualization, typically as an intra-list distance, and a more normative interpretation, which touches upon the role the recommender system plays in society. Vrijenhoek et al. [1] proposed to instead use rank-aware divergence metrics to express normative diversity in news recommendations. This work describes a repository that allows for easy implementation of these metrics by making the different diversity aspects and tactics configurable. It also contains an example implementation and analysis of the results. With its modular setup, the repository thus allows for conceptualizations of diversity that can be tailored to the news domain they need to be applied in.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In practice, RADio implemented the diversity metrics (DART) outlined in Vrijenhoek et al. [18]: Calibration,
Fragmentation, Activation, Representation and Alternative Voices, which are inspired by democratic
theory. The metrics were prototyped with news recommenders trained on the Microsoft News Dataset
(MIND) [19].</p>
      <p>In order to do justice to the normative underpinnings of the DART metrics, the RADio metrics needed
metadata that was not included in MIND, such as which political viewpoints are expressed in an article,
whether the article is written in a neutral or subjective tone, or whether the article mentions people
from a minority background. This type of information is notoriously hard to extract from text alone,
and RADio often needed to rely on proxies that were known not to be exactly right, but were necessary
to prototype how the framework could theoretically function.</p>
      <p>
        Despite the fact that they were simplifications, the data preprocessing and augmentation steps
to identify these proxies were already quite elaborate. For example, political opinions would be
approximated by the mention of political actors in the text. These actors would be identified by 1)
scraping the article bodies; 2) running Named Entity Recognition on the full texts; 3) attempting to match
entities of type Person to their entry on Wikidata; 4) checking whether this person was a politician,
and for which party. Without a gold standard it was not possible to evaluate the performance of this
approach, but even just looking at the procedure makes it quite clear that there are many ways in
which this process can fail. Alterations in the spelling of a name (Barack Obama vs. President Obama)
could leave a political actor unidentified, and new elections or party compositions would render past
results invalid. This approach, or even one based on regular expressions and/or string matching, would
probably work well enough in a contained experiment over a limited period of time where the relevant
actors are already known, such as in Michiels et al. [20] and Einarsson et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. An implementation
that monitors an algorithm in real-time would probably benefit from a more sophisticated approach to
viewpoint diversity, such as in Draws et al. [21]. Lastly, RADio’s implementation of the DART metrics
also detracts from the finding of Vrijenhoek et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] that diversity can be conceptualized
in many ways, depending on the domain’s requirements.
      </p>
      <p>This work describes a repository that allows for easy implementation of the divergence-based
metrics by making the different diversity aspects and tactics configurable. The code can be found
on GitHub.1 This paper works under the assumption that whoever implements the framework has a
data preprocessing or annotation pipeline that produces the required metadata for the metrics to work.
While the repository still includes the DART metrics as examples of metric configuration, the
framework can also accommodate domains beyond news recommendation. Keep in mind that the
repository does not provide plug-and-play metrics, and that conceptualizing diversity within a news
recommender system is still very much a matter of discussion with stakeholders from outside technical
teams [22, 23].</p>
    </sec>
    <sec id="sec-2">
      <title>2. The repository</title>
      <p>The repository consists of three primary components: a Jupyter notebook which showcases how metrics
could be configured, a class for building the rank-aware distributions, and a class for calculating the
divergence scores.</p>
      <sec id="sec-2-1">
        <title>2.1. Building distributions</title>
        <p>In this part of the framework, we aim to build the distributions for the recommendation and context
respectively. In order to do this, we pass the framework the list of relevant articles (either in the
recommendation or in the context), and tell it which feature to look for. When building the distribution,
the framework can optionally account for the rank of an article in the recommendation. It will then
weigh articles that appear higher up in the list more strongly than those that appear lower, using the
harmonic number 2 of the length of the list.</p>
        <sec id="sec-2-1-1">
          <title>1. https://github.com/svrijenhoek/radio-/</title>
          <p>Making a distribution rank-aware only makes sense
when there is some sort of meaning in the ordering of the articles; for example, in a recommendation
ranked by predicted relevance, or in a reading history when the most recently read articles are listed first.
It does not make sense in cases where such a meaning cannot be found; for example, when considering
all articles that have been published over the last few days. The framework can accommodate both
categorical and numerical data. Categorical data can have both single and multiple values per article. In
case of numerical data, the values will need to be discretized into bins. The number of bins to be used
can be set, but defaults to 10. With this approach, we lose a lot of important information. For example,
we will not know that certain bins may be closer to each other than others. Future work may look into
alternative ways of calculating divergence for numerical data.</p>
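          <p>To make the rank-aware weighting concrete, the sketch below builds such a distribution from a list of categorical feature values, weighting position i by 1/i so that the normalizer is the harmonic number of the list length. The function name and shape are illustrative, not the repository's actual API.</p>

```python
from collections import defaultdict

def build_distribution(features, rank_aware=False):
    """Build a probability distribution over categorical feature values.

    `features` is assumed to be ordered by rank when rank_aware=True:
    position i (1-indexed) contributes weight 1/i, so the total weight is
    the harmonic number H_n of the list length, which normalizes the result.
    """
    weights = defaultdict(float)
    for i, value in enumerate(features, start=1):
        weights[value] += 1.0 / i if rank_aware else 1.0
    total = sum(weights.values())
    return {value: w / total for value, w in weights.items()}

# The top-ranked article counts most: 'politics' at ranks 1 and 3
# receives weight 1 + 1/3 out of a total H_3 = 1 + 1/2 + 1/3.
dist = build_distribution(['politics', 'sports', 'politics'], rank_aware=True)
```

Without rank-awareness the same call reduces to plain relative frequencies, which is the appropriate choice for unordered contexts such as the pool of all recently published articles.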
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Calculating divergence</title>
        <p>
          Within RADio, diversity is conceptualized as a rank-aware divergence score between a recommendation
and a context:
          <disp-formula id="eq1">
            <label>(1)</label>
            <tex-math><![CDATA[\mathrm{DIV}^{*}_{t}(R, C) = \sum_{x} C^{*}_{t}(x)\, f\!\left(\frac{R^{*}_{t}(x)}{C^{*}_{t}(x)}\right)]]></tex-math>
          </disp-formula>
where t refers to the relevant feature to consider; R to the recommendation; C to the context; and f to the
divergence function. As explained in the previous section, both the recommendation R and the context C can
be set up to be rank-aware. For more details regarding the justification of setting up diversity as a
rank-aware divergence score, see Vrijenhoek et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. Within this framework, we can calculate the divergence using both the
Kullback-Leibler (KL) divergence and the Jensen-Shannon (JS) divergence [24]. While the KL divergence is
more commonly known, the JS divergence has the added benefit of being 1) symmetric and 2) bounded between 0
and 1, and is thus the default option within the framework.
        </p>
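        <p>As a sketch of the two divergence options, the snippet below implements the KL and JS divergences over distributions represented as value-to-probability dictionaries. This representation is an assumption for illustration, not the repository's internal one.</p>

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence (log base 2) between value -> probability dicts.

    A small epsilon guards against values missing from q; terms with zero
    probability in p contribute nothing.
    """
    eps = 1e-12
    return sum(pv * math.log2(pv / (q.get(k, 0.0) + eps))
               for k, pv in p.items() if pv > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and, in log base 2, bounded in [0, 1]."""
    keys = set(p) | set(q)
    # m is the pointwise average of the two distributions
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Toy distributions over article categories for a recommendation and a context.
recommendation = {'politics': 0.6, 'sports': 0.4}
context = {'politics': 0.5, 'sports': 0.3, 'finance': 0.2}
score = js_divergence(recommendation, context)
```

Note that the JS divergence handles values that occur in only one of the two distributions gracefully, whereas the KL divergence blows up when the second distribution assigns them zero probability; this is another practical reason for the JS default.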
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Configuring the metrics</title>
        <p>While the repository contains instructions for configuring all the original RADio metrics, for this paper
we will discuss the configuration and output of the Calibration metric in more detail as an example.
While Calibration is not the most interesting metric from a normative perspective, it relies on data that
is supplied in MIND itself, and therefore does not require complicated data augmentation to show
meaningful results.</p>
        <p>The goal of Calibration is to measure to what extent a recommendation is tailored to a user’s
preferences. Thus, we want this score to show low divergence, meaning that there is actually
a large overlap between the recommendation and what a user wants to see. In this setup, we
approximate a user’s preferences by looking at the categories of articles they have consumed in
the past: their reading history. Note that this is just an example implementation, and that there
are likely many better ways to express a user’s interests than through categories in past reading behavior.</p>
        <sec id="sec-2-3-1">
          <title>In summary, we configure the metric in the following way:</title>
          <p>2. https://en.wikipedia.org/wiki/Harmonic_number
history of that user; and 5) one or more generated recommendations, corresponding to different
recommendation algorithms. We assume that an apply-method will be called to calculate the diversity
metrics for each line, and thus for each of the different algorithms. We first configure a Metric:

Calibration = DiversityMetric(
    feature_type='cat',
    rank_aware_recommendation=True,
    rank_aware_context=True,
    divergence='JSD',
    context='dynamic'
)</p>
          <p>Here, ‘feature_type’, ‘rank_aware_recommendation’, ‘rank_aware_context’ and ‘divergence’
correspond to the information summarized in Table 1. The ‘context’ parameter is there for efficiency. If the
context is ‘dynamic’, it will need to be calculated for every line. This is the case here, as we are looking at
the users’ reading history, which is of course different for every user. The context can also be ‘static’, or
the same for all users. This is the case when, for example, looking at all articles published, or when
considering an external distribution. Next, we write a calculate_calibration function to pass the right
recommendation and context to the framework:

def calculate_calibration(recommendations, history):
    scores = []
    context_features = get_features(history, 'category')
    for recommendation in recommendations:
        recommendation_features = get_features(recommendation, 'category')
        if context_features and recommendation_features:
            calibration = Calibration.compute(context_features,
                                              recommendation_features)
            scores.append(calibration)
        else:
            scores.append(None)
    return scores</p>
          <p>We expect ‘recommendations’ to be a list where each entry corresponds to a different
algorithm. Each entry again consists of a list of article IDs. We also expect that these are ordered
by which article is going to be recommended first according to that algorithm. Next, we tell the
framework to retrieve the ‘category’ feature for each article in both the recommendations and the
reading history. The resulting lists of features are given to the framework to, under the hood, build the
corresponding distributions and calculate the divergence. The resulting ‘scores’ is a list of scores, each
entry corresponding to one of the recommendation algorithms.</p>
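          <p>These expected input and output shapes can be illustrated with a self-contained sketch. The category lookup, the get_features helper, and the DummyMetric class are hypothetical stand-ins for the repository's data pipeline and the configured Calibration metric, chosen only so the example runs on its own.</p>

```python
# Hypothetical stand-ins: a lookup from article ID to its category, and a
# metric object that mimics the compute() interface of a configured metric.
CATEGORIES = {'N1': 'sports', 'N2': 'news', 'N3': 'news', 'N4': 'sports',
              'N5': 'news', 'N6': 'finance'}

def get_features(article_ids, feature):
    # In the real pipeline this would look up `feature` in the article metadata.
    return [CATEGORIES[a] for a in article_ids if a in CATEGORIES]

class DummyMetric:
    """Toy stand-in for a configured DiversityMetric: 0 = full category overlap."""
    def compute(self, context_features, recommendation_features):
        overlap = sum(f in context_features for f in recommendation_features)
        return 1.0 - overlap / len(recommendation_features)

Calibration = DummyMetric()

def calculate_calibration(recommendations, history):
    # One score per algorithm; None when features are missing on either side.
    scores = []
    context_features = get_features(history, 'category')
    for recommendation in recommendations:
        recommendation_features = get_features(recommendation, 'category')
        if context_features and recommendation_features:
            scores.append(Calibration.compute(context_features,
                                              recommendation_features))
        else:
            scores.append(None)
    return scores

# Two algorithms: one recommending familiar categories, one unfamiliar ones.
scores = calculate_calibration([['N4', 'N5'], ['N6']], history=['N1', 'N2'])
```

Called once per line of the dataset, for example via a DataFrame apply, this yields one list of per-algorithm scores per user impression.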
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Output</title>
      <p>We run the configured metric on the news articles and recommendations of the ‘MINDsmall_dev’ dataset,
which can be obtained from the Microsoft website 3. We compare the recommendations generated by
the LSTUR [25] and NRMS [26] algorithms, trained using the code supplied by Microsoft 4, to those from
two simple baseline algorithms: a random selection, and a selection based on article popularity.</p>
      <sec id="sec-3-1">
        <title>3. https://msnews.github.io/ 4. https://github.com/recommenders-team/recommenders/tree/main</title>
        <p>For the most-popular baseline, the popularity of an item is derived from the clicks recorded in the dataset.
However, there are many articles with zero recorded clicks, and in case of a tie in the number of clicks
the recommender makes a random selection.</p>
        <p>It is quite hard to pinpoint what exactly a ‘good’ divergence score would be. However, when we
compare the algorithm we are interested in to a baseline algorithm, we can draw some conclusions on
how that algorithm impacts the behavior of the metric. In this example, the first difference in metric
outcomes can already be observed from calculating basic statistics on the outcomes, visualized in Table
2. At each point, the divergence of the neural recommenders is lower than that of the baseline
recommenders. As expected, the neural recommenders are more tailored to the users’ preferences
than the baselines. Note that this does not mean that neural recommenders are generally more diverse
than baseline ones; it just means that in this conception of diversity, and in this setting, the neural
recommenders show more of the desired behavior than the baselines do.</p>
        <p>Figures 1 and 2, which can also be found in the repository, provide more insight into the behavior
of the metric. In Figure 1 we see that the neural recommenders show similar patterns, and that the
baseline recommenders also behave similarly to each other. It also shows the effect of the time of
day; there may have been meaningful events that influence the type of articles a recommender system
can choose from, and thus make the algorithm choose articles that diverge from the users’ personal
preferences. Note that this is not necessarily bad, if the primary goal of the recommender is to inform
readers about important events happening in the world. In Figure 2, the neural recommenders have
distinctly lower divergence, which means that the recommendations they generate are closer to the
users’ reading history. Given that there are clear differences between the baseline recommenders in
this figure but not in Figure 1, something meaningful happens when aggregating scores per
user; the most-popular recommender may generate more Calibrated recommendations for some users
than for others.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion</title>
      <p>The previous sections explain how to technically implement the RADio- framework to measure normative diversity
in recommendations. The example metrics are tailored towards news recommendation, but the
framework can be adapted to suit a wide range of applications. Yet, this does not answer the question
of how one should go about conceptualizing diversity for their application. This can be exceptionally
challenging for technical teams that, while they are the ones that need to implement the metric, often
do not have all the domain knowledge necessary for making such decisions. As such, it is important
that all relevant stakeholders are brought to the table. In the case of news recommendation, these would
include editorial, but also strategic and business roles [22]. Readers themselves also bring a different
perspective on what they value in their news, and why they would choose to read certain items but
not others [27, 28, 29, 30]. Lastly, one should not underestimate the effects of interface design on users’
reading behavior. Even a perfectly built and diverse algorithm may not accomplish what it is intended
to do due to position bias or simply differences between users [31].</p>
      <p>
        Vrijenhoek et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] interviewed professionals in the media sector, and noted all the different ways
they spoke about diversity. The taxonomy that resulted from this, which is split into goals, aspects
and contexts of diversity, could serve as a starting point for other implementations; at the very least,
it should facilitate discussion and ease the identification of domain-specific needs and requirements.
Furthermore, one could take inspiration from literature beyond the technological domain. For example,
those working on news recommendation could look into how social scientists conceptualize diversity,
and draw inspiration from democratic theory and the role news plays in society [
        <xref ref-type="bibr" rid="ref6">6, 18</xref>
        ]. However, while
democratic theory is directly relevant to news, it should not be blindly applied to other domains. Rather,
we would encourage those from other domains to invest time choosing or building their own normative
framework [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>Without a doubt, conceptualizing and implementing diversity in any kind of recommender system is
a complicated process, and it is unlikely that a perfect (or even a good) solution will be attained in a
single iteration. One could argue that aiming for one would only prevent any progress from happening.
Rather, perhaps we should aim for imperfect solutions; ones that we fully understand, and where we
can exactly pinpoint what the metric does and does not do. As such, we would also urge readers
not to resort to opaque solutions such as off-the-shelf Large Language Models, which may be easy to
implement but are not under the full control and understanding of your organization. Solutions that
we know are simplified, perhaps even ‘stupid’, can be discussed and criticized, and thus be improved
upon. It is our hope that the RADio- codebase will make at least the technical part of the process more
straightforward.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work builds upon the code of the original RADio framework, which was a collaboration between
the author of this work, Gabriel Bénédict and Mateo Gutierrez Granada. I would like to thank Savvina
Daniil for testing and reviewing the code, Johannes Kruse for making the repository’s very first pull
request, and Lucien Heitz and Alain Starke for proofreading.</p>
      <p>[13] Á. M. Einarsson, R. Helles, S. Lomborg, Algorithmic agenda-setting: the subtle effects of news
recommender systems on political agendas in the Danish 2022 general election, Information,
Communication &amp; Society (2024) 1–21.
[14] B. Huebner, T. E. Kolb, J. Neidhardt, Evaluating group fairness in news recommendations: A
comparative study of algorithms and metrics, in: Adjunct Proceedings of the 32nd ACM Conference
on User Modeling, Adaptation and Personalization, UMAP Adjunct ’24, Association for Computing
Machinery, New York, NY, USA, 2024, p. 337–346. URL: https://doi.org/10.1145/3631700.3664897.
doi:10.1145/3631700.3664897.
[15] L. Heitz, J. A. Lischka, R. Abdullah, L. Laugwitz, H. Meyer, A. Bernstein, Deliberative diversity
for news recommendations: Operationalization and experimental user study, in: Proceedings
of the 17th ACM Conference on Recommender Systems, RecSys ’23, Association for Computing
Machinery, New York, NY, USA, 2023, p. 813–819. URL: https://doi.org/10.1145/3604915.3608834.
doi:10.1145/3604915.3608834.
[16] L. A. Møller, Recommended for you: how newspapers normalise algorithmic news recommendation
to fit their gatekeeping role, Journalism Studies 23 (2022) 800–817.
[17] S. Blassnig, E. Strikovic, E. Mitova, A. Urman, A. Hannák, C. de Vreese, F. Esser, A balancing act:
How media professionals perceive the implementation of news recommender systems, Digital
Journalism (2023) 1–23.
[18] S. Vrijenhoek, M. Kaya, N. Metoui, J. Möller, D. Odijk, N. Helberger, Recommenders with a mission:
assessing diversity in news recommendations, in: Proceedings of the 2021 conference on human
information interaction and retrieval, 2021, pp. 173–183.
[19] F. Wu, Y. Qiao, J.-H. Chen, C. Wu, T. Qi, J. Lian, D. Liu, X. Xie, J. Gao, W. Wu, et al., Mind: A
large-scale dataset for news recommendation, in: Proceedings of the 58th annual meeting of the
association for computational linguistics, 2020, pp. 3597–3606.
[20] L. Michiels, J. Vannieuwenhuyze, J. Leysen, R. Verachtert, A. Smets, B. Goethals, How should we
measure filter bubbles? a regression model and evidence for online news, in: Proceedings of the
17th ACM Conference on Recommender Systems, 2023, pp. 640–651.
[21] T. Draws, N. Roy, O. Inel, A. Rieger, R. Hada, M. O. Yalcin, B. Timmermans, N. Tintarev, Viewpoint
diversity in search results, in: European Conference on Information Retrieval, Springer, 2023, pp.
279–297.
[22] A. Smets, J. Hendrickx, P. Ballon, We’re in this together: a multi-stakeholder approach for news
recommenders, Digital Journalism 10 (2022) 1813–1831.
[23] N. Helberger, M. van Drunen, J. Moeller, S. Vrijenhoek, S. Eskens, Towards a normative perspective
on journalistic ai: Embracing the messy reality of normative ideals, 2022.
[24] M. L. Menéndez, J. Pardo, L. Pardo, M. Pardo, The jensen-shannon divergence, Journal of the</p>
      <p>Franklin Institute 334 (1997) 307–318.
[25] M. An, F. Wu, C. Wu, K. Zhang, Z. Liu, X. Xie, Neural news recommendation with long-and
short-term user representations, in: Proceedings of the 57th annual meeting of the association for
computational linguistics, 2019, pp. 336–345.
[26] C. Wu, F. Wu, S. Ge, T. Qi, Y. Huang, X. Xie, Neural news recommendation with multi-head
self-attention, in: Proceedings of the 2019 conference on empirical methods in natural language
processing and the 9th international joint conference on natural language processing
(EMNLP-IJCNLP), 2019, pp. 6389–6394.
[27] A. Starke, A. S. Bremnes, E. Knudsen, D. Trilling, C. Trattner, Perception versus reality: Evaluating
user awareness of political selective exposure in news recommender systems, in: Adjunct
Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization, 2024,
pp. 286–291.
[28] J. Moeller, F. Löecherbach, J. Möller, N. Helberger, Out of control?: Using interactive testing to
understand user agency in news recommendation systems, in: News Quality in the Digital Age,
Routledge, 2023, pp. 117–133.
[29] L. Van den Bogaert, D. Geerts, J. Harambam, Putting a human face on the algorithm: co-designing
recommender personae to democratize news recommender systems, Digital Journalism (2022)
1–21.
[30] F. Loecherbach, K. Welbers, J. Moeller, D. Trilling, W. Van Atteveldt, Is this a click towards
diversity? explaining when and why news users make diverse choices, in: Proceedings of the
13th ACM Web Science Conference 2021, 2021, pp. 282–290.
[31] N. Mattis, T. Groot Kormelink, P. K. Masur, J. Moeller, W. van Atteveldt, Nudging news readers: A
mixed-methods approach to understanding when and how interface nudges affect news selection,
Digital Journalism (2024) 1–21.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vrijenhoek</surname>
          </string-name>
          , G. Bénédict,
          <string-name>
            <given-names>M. Gutierrez</given-names>
            <surname>Granada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Odijk</surname>
          </string-name>
          , M. De Rijke,
          <article-title>RADio-Rank-Aware Divergence Metrics to Measure Normative Diversity in News Recommendations</article-title>
          ,
          <source>in: Proceedings of the 16th ACM Conference on Recommender Systems</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>208</fpage>
          -
          <lpage>219</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kunaver</surname>
          </string-name>
          , T. Požrl,
          <article-title>Diversity in recommender systems-a survey</article-title>
          ,
          <source>Knowledge-based systems 123</source>
          (
          <year>2017</year>
          )
          <fpage>154</fpage>
          -
          <lpage>162</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bagchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. A.</given-names>
            <surname>Hundogan</surname>
          </string-name>
          ,
          <string-name>
            <surname>K. van Es</surname>
          </string-name>
          ,
          <article-title>Where are the values? a systematic literature review on news recommender systems</article-title>
          ,
          <source>ACM Transactions on Recommender Systems</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vargas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Castells</surname>
          </string-name>
          ,
          <article-title>Rank and relevance in novelty and diversity metrics for recommender systems</article-title>
          ,
          <source>in: Proceedings of the fifth ACM conference on Recommender systems</source>
          ,
          <year>2011</year>
          , pp.
          <fpage>109</fpage>
          -
          <lpage>116</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Helberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Karppinen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>D'Acunto</surname>
          </string-name>
          ,
          <article-title>Exposure diversity as a design principle for recommender systems</article-title>
          , Information, communication &amp; society
          <volume>21</volume>
          (
          <year>2018</year>
          )
          <fpage>191</fpage>
          -
          <lpage>207</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>N.</given-names>
            <surname>Helberger</surname>
          </string-name>
          ,
          <article-title>On the democratic role of news recommenders</article-title>
          , in: Algorithms, Automation, and
          <string-name>
            <surname>News</surname>
          </string-name>
          , Routledge,
          <year>2021</year>
          , pp.
          <fpage>14</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vrijenhoek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Michiels</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kruse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Starke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tintarev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Viader</given-names>
            <surname>Guerrero</surname>
          </string-name>
          ,
          <article-title>Normalize: The first workshop on normative design and evaluation of recommender systems</article-title>
          ,
          <source>in: Proceedings of the 17th ACM Conference on Recommender Systems</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1252</fpage>
          -
          <lpage>1254</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>W. B.</given-names>
            <surname>Gallie</surname>
          </string-name>
          ,
          <source>Essentially contested concepts</source>
          ,
          <source>Aristotelian Society</source>
          ,
          <year>1956</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vrijenhoek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Daniil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sandel</surname>
          </string-name>
          , L. Hollink,
          <article-title>Diversity of what? on the different conceptualizations of diversity in recommender systems</article-title>
          ,
          <source>in: The 2024 ACM Conference on Fairness, Accountability, and Transparency</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>573</fpage>
          -
          <lpage>584</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Haim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Graefe</surname>
          </string-name>
          , H.
          <string-name>
            <surname>-B. Brosius</surname>
          </string-name>
          ,
          <article-title>Burst of the filter bubble? effects of personalization on the diversity of google news</article-title>
          ,
          <source>Digital journalism 6</source>
          (
          <year>2018</year>
          )
          <fpage>330</fpage>
          -
          <lpage>343</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Baker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Moorosi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Denton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hutchinson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hanna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gebru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Morgenstern</surname>
          </string-name>
          ,
          <article-title>Diversity and inclusion metrics in subset selection</article-title>
          ,
          <source>in: Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>117</fpage>
          -
          <lpage>123</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Trilling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Helberger</surname>
          </string-name>
          ,
          <string-name>
            <surname>B. van Es</surname>
          </string-name>
          ,
          <article-title>Do not blame it on the algorithm: an empirical assessment of multiple recommender systems and their impact on content diversity, in: Digital media, political polarization and challenges to democracy</article-title>
          , Routledge,
          <year>2020</year>
          , pp.
          <fpage>45</fpage>
          -
          <lpage>63</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Á. M. Einarsson</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Helles</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Lomborg</surname>
          </string-name>
          ,
          <article-title>Algorithmic agenda-setting: the subtle effects of news</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>