<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">NPR: a News Portal Recommendations dataset</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Joel</forename><forename type="middle">Pinho</forename><surname>Lucas</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Grupo Globo</orgName>
								<address>
									<addrLine>Av. das Américas</addrLine>
									<postCode>1650</postCode>
									<settlement>Rio de Janeiro</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">João</forename><forename type="middle">Felipe</forename><surname>Guedes Da Silva</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Grupo Globo</orgName>
								<address>
									<addrLine>Av. das Américas</addrLine>
									<postCode>1650</postCode>
									<settlement>Rio de Janeiro</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="institution">Universidade Federal do Rio de Janeiro</orgName>
								<address>
									<addrLine>Av Athos da Silveira Ramos, 149 - Cidade Universitária - CT Bloco H sala 220</addrLine>
									<settlement>Rio de Janeiro</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Letícia</forename><forename type="middle">Freire</forename><surname>Figueiredo</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Grupo Globo</orgName>
								<address>
									<addrLine>Av. das Américas</addrLine>
									<postCode>1650</postCode>
									<settlement>Rio de Janeiro</settlement>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
							<affiliation key="aff2">
								<orgName type="institution" key="instit1">Universidade Federal Fluminense</orgName>
								<address>
									<addrLine>Av Gal. Milton Tavares de Souza, s/nº - São Domingos</addrLine>
									<settlement>Niterói</settlement>
									<region>RJ</region>
									<country key="BR">Brazil</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">NPR: a News Portal Recommendations dataset</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">0804F57110573BCB95E39D5FCA180BF2</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T16:44+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Public dataset</term>
					<term>News recommendations</term>
					<term>Normative diversity</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Recommender systems have become key applications for news websites, filtering relevant articles for users from an ever-growing catalog. However, building such applications has raised challenges that remain unsolved, such as filter bubbles and the lack of diversity. Publicly available datasets play a central role in addressing these problems, since they give academic and industrial researchers a common ground for proposing new solutions. Yet news recommendation datasets are scarce, and most of them lack the content needed for research on news diversity. In this paper, we introduce the News Portal Recommendations (NPR) dataset for news recommendation. NPR improves on a previously published dataset, which lacked the information needed for normative diversity analysis. We use the RADio framework to calculate diversity metrics on the dataset. Unlike other publicly available data, such as the MIND dataset, this work focuses on data tracked from frequent user interactions with hard news (i.e., users with more interactions with the portal). The NPR dataset is available in a Kaggle repository.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>News portals currently provide content to millions of users on topics ranging from sports to politics. With such a wide range of themes and a massive number of articles available, news recommender systems play a central role in filtering which items are best suited to a specific user at a given time <ref type="bibr" target="#b0">[1]</ref>. However, several challenges still need to be overcome when building such systems, in both the societal and technical domains.</p><p>The definition of what is suited to a user is somewhat relative. Some may design news recommender systems to optimize for user engagement, which should lead to higher click rates or reading time on a platform <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref>. However, in this scenario, news recommenders may fail to keep users informed about relevant aspects of society beyond those they are already inclined to consume, which raises concerns about the democratic role of these systems <ref type="bibr" target="#b3">[4]</ref>.</p><p>This algorithmic influence is amplified because readers tend to engage more with content that confirms their own worldview <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6]</ref>. This phenomenon prompts recommenders to limit the diversity of suggested items, potentially leading to user segregation and biased opinions <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8]</ref>.</p><p>Beyond these societal issues, technical challenges still need to be addressed in news recommenders. As new items with fresh information are added every minute, old items are deactivated for recommendation, yielding a short item shelf life <ref type="bibr" target="#b8">[9]</ref>. 
As a consequence, the traditional user-item matrices used by algorithms are commonly very sparse, which makes it harder to model users' preferences <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref>. This scenario is aggravated by anonymous users, who usually have few past interactions in the system <ref type="bibr" target="#b9">[10]</ref>.</p><p>Besides these sparsity challenges, news recommenders rely heavily on rich feature engineering to represent items and model users' consumption from previous behaviors <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b1">2]</ref>. Although simpler forms of metadata can be used, such as news article categories, representing items by their textual content requires applying complex natural language processing techniques to the article's title or body content <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b12">13]</ref>.</p><p>Creating solutions to these issues in news recommender systems requires contributions from both industrial and academic players. In this scenario, proper datasets need to be publicly available for researchers to discuss and explore solutions on common ground. Several benchmarks have been proposed so far, and each of them has contributed to research development.</p><p>However, most datasets so far lack the proper information structure for research on news recommenders to be properly conducted <ref type="bibr" target="#b13">[14]</ref>. To the best of our knowledge, the most suitable one released so far is the MIND <ref type="bibr" target="#b10">[11]</ref> dataset, which has enabled several works to be published on technical challenges <ref type="bibr" target="#b14">[15,</ref><ref type="bibr" target="#b15">16,</ref><ref type="bibr" target="#b16">17]</ref>. 
Nonetheless, even this benchmark has its own limitations when it comes to normative diversity research <ref type="bibr" target="#b0">[1]</ref>.</p><p>In order to fill the gaps in previously published datasets, this paper introduces the News Portal Recommendations dataset, an improved version of a past dataset that aims to provide the necessary information for research on normative diversity in news recommender systems. The remainder of the paper is structured as follows.</p><p>Section 2 revisits related work on public datasets used for research on news recommender systems. Section 3 describes how the proposed dataset was constructed, as well as the main characteristics that help bridge the gaps of past datasets. Section 4 then presents normative diversity metrics computed on the proposed dataset. Finally, Section 5 discusses the results and concludes.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Works</head><p>In the past years, a few datasets have been made public to foster research in news recommenders. A dataset from Globo.com <ref type="bibr" target="#b17">[18]</ref> was built by sampling user interactions from G1<ref type="foot" target="#foot_0">1</ref>, a Brazilian news portal. It contains 3M records covering 46k news articles and 314k users, extracted from October 1 to October 16, 2017. However, instead of text information from the articles (such as title or body content), it contains article word embeddings generated by a neural model trained on classification tasks <ref type="bibr" target="#b17">[18]</ref>, which considerably limits the use of recent natural language processing tools and other types of diversity-oriented explorations.</p><p>The MIcrosoft News Dataset (MIND) <ref type="bibr" target="#b10">[11]</ref> was later published, providing textual information as metadata for 161k news items. MIND contains 24.1M logs for 1M randomly sampled users from Microsoft News<ref type="foot" target="#foot_1">2</ref> who had at least 5 clicks in the period between October 12 and November 22, 2019. In addition, the dataset is associated with a public competition<ref type="foot" target="#foot_2">3</ref> in which the goal was to predict the click scores of candidate news based on user interests.</p><p>A few other datasets were published before the aforementioned ones <ref type="bibr" target="#b13">[14]</ref>. Plista <ref type="bibr" target="#b18">[19]</ref> contains activity logs from 13 German news portals, recorded in June 2013, with ≈ 1M sampled records on ≈ 70k items. Adressa <ref type="bibr" target="#b19">[20]</ref> includes 27M click interactions from 3M users on 48k news articles, extracted over a ten-week period from Adresseavisen<ref type="foot" target="#foot_3">4</ref>. 
However, each of these datasets has its own limitations, such as size or lack of metadata, as thoroughly described in the original MIND article <ref type="bibr" target="#b10">[11]</ref>.</p><p>Among these datasets, MIND became a reference benchmark due to its size and textual components. Nonetheless, despite its contributions, some of its limitations regarding recommender diversity have been brought to light.</p><p>Firstly, as it contains a considerable amount of soft news, it may compromise research on normative diversity metrics, which are more tailored towards so-called "hard" news <ref type="bibr" target="#b0">[1]</ref>. Secondly, the dataset is split into training, validation, and test sets, in which the validation set only contains data from November 15, 2019. In this case, it is difficult to observe how the users, and the recommender's behavior towards those users, change over time <ref type="bibr" target="#b20">[21]</ref>. Finally, nearly half of the anonymous user IDs have only a single visit, which makes it difficult to model the long-term effects of recommender diversity on user consumption.</p><p>In order to improve the research possibilities on news recommendations towards normative diversity, this article proposes the News Portal Recommendations (NPR) dataset, a restructured version of a previous dataset that aims to provide qualified data for research purposes. In particular, we list the following main contributions compared to the MIND dataset:</p><p>• Focus on hard news • Ranked recommendation lists • Distinction between logged and returning users from anonymous ones • Longer periods of data</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">News Portal Recommendations Dataset</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Dataset Construction</head><p>The NPR dataset was built by sampling users from G1<ref type="foot" target="#foot_4">5</ref>, the largest Brazilian news portal, maintained by the Globo media company. It contains 1,162,802 randomly sampled users who received recommendations in the period between January 3rd, 2023, and May 1st, 2023, of whom nearly 73% are non-logged users. All users were anonymized in order to protect data privacy. NPR was developed following the same structure as the MIND <ref type="bibr" target="#b10">[11]</ref> dataset. Therefore, it is composed of the following files: behaviors, articles, and impressions.</p><p>The behaviors file contains 1,402,576 impression logs recording which sequence of items was recommended at a given time and which items users had consumed before receiving such recommendations. Unlike the previous dataset, NPR also includes statistics on user behavior on the articles' pages, such as the number of clicks, time spent on the page, and scroll percentage.</p><p>The articles file contains metadata for 148,099 news items that were either consumed by or recommended to users in the behaviors file. It contains news URLs, their title text, and a list of topics associated with each article, assigned by a specialized editorial board. The complete schema for the articles file is displayed in the dataset's repository on Kaggle.</p><p>Finally, the recommendations data contains three files with 92,700 randomly sampled recommendations generated by the following algorithmic approaches: Collaborative Filtering, the most Recent publications, and the Top consumed articles. Each file refers to recommendations provided by one specific algorithm, but all files share the same impression IDs and, as a consequence, refer to the same users.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Dataset Analysis</head><p>Since textual information plays a central role in news datasets, Figure <ref type="figure" target="#fig_0">1</ref> displays the distributions of some of the articles' features.</p><p>The top left and top right plots already show a language distinction between the NPR and MIND datasets. In terms of the number of words in article titles, NPR has an average of 14.9 words while its counterpart has 11.52 <ref type="bibr" target="#b10">[11]</ref>. However, the biggest difference is in the articles' body length. While NPR has a unimodal, skewed distribution, with article bodies averaging 471.7 words, MIND's body length is multimodal, with averages around 20 and 80 <ref type="bibr" target="#b10">[11]</ref>. This indicates that NPR provides much richer textual information to be assessed with natural language processing techniques.</p><p>In addition, most news articles are associated with a topic assigned manually by the editorial board, which consists of multiple teams spread across different geographic regions of the country. Such a scenario potentially results in non-uniform categorization of news articles. Solutions to address this challenge are further discussed in the Future Work section. From a total of 94 topics (bottom left plot in Figure <ref type="figure" target="#fig_0">1</ref>), most of the news is related to sp, mg, and rj, which are acronyms for the Brazilian states of São Paulo, Minas Gerais, and Rio de Janeiro. Since these are some of the most populous states in Brazil, this indicates the predominance of regional content. In fact, the topic distribution is so unequal that it reaches a Gini index of 74% <ref type="bibr" target="#b21">[22]</ref>. Other generic themes like "mundo" (world), "política" (politics), and "economia" (economy) also have a significant share of news articles.</p><p>Based on these topics, news articles can be classified as hard or soft news. 
As explored by Vrijenhoek <ref type="bibr" target="#b0">[1]</ref>, the MIND dataset has a higher share of soft news, which may not be the best scenario for research on normative diversity. In order to evaluate the differences in those news types, Table <ref type="table" target="#tab_0">1</ref> compares the NPR and MIND datasets.</p><p>Considering all articles in the catalog, NPR presents 91.0% of hard news items while MIND has 36.3%. This distinction extends to other aspects of the datasets, such as the user's historical consumption (82.2% on NPR against 34.8% on MIND), clicked items (86.2% against 26.9%), and consumption candidate lists, i.e., recommendations (84.6% against 30.2%).</p><p>Finally, news in the G1 ecosystem has a short survival time. This can be observed in Figure <ref type="figure" target="#fig_0">1</ref>, where news consumption is concentrated in the first days after publication; that is, more recent news is consumed more.</p><p>Despite having a higher share of hard news, items in NPR seem to last longer than the ones in MIND. The bottom right plot of Figure <ref type="figure" target="#fig_0">1</ref> shows the cumulative distribution function (CDF) of news survival time (the number of days between an article's publication date and its last click). It can be seen that 85% of items receive their last click within 7.95 days for hard news and 9.55 days for soft news, which reinforces the tendency of soft news to last longer.</p></div>
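<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Gini index quoted for the topic distribution can be computed from per-topic article counts. The following is a minimal sketch under stated assumptions: the function name gini_index and the sample counts are hypothetical and are not taken from the NPR dataset, and the sorted-rank formula is one common formulation that may differ from the one used to obtain the 74% figure.</p>
<p>
```python
def gini_index(counts):
    """Gini index of non-negative counts: 0 means perfectly equal shares,
    values close to 1 mean a few categories hold almost everything."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Sorted-rank formulation: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    # Hypothetical article counts per topic (illustrative only, not NPR data).
    topic_counts = [500, 300, 120, 40, 25, 10, 5]
    print(f"Gini index: {gini_index(topic_counts):.2f}")
```
</p>
<p>With equal counts for every topic the function returns 0, and the value grows as articles concentrate in a few topics, which is the sense in which the text describes the topic distribution as unequal.</p></div>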
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Normative Diversity</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Theoretical Background</head><p>As mentioned earlier, news recommender systems play a central democratic role in keeping users informed by unlocking the diversity of online information <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b7">8]</ref>. However, the definition of diversity is plural, especially when contrasting the fields of computer science and normative literature. For instance, while technical metrics such as intra-list distance of recommended items <ref type="bibr" target="#b22">[23,</ref><ref type="bibr" target="#b23">24]</ref> or the Gini index <ref type="bibr" target="#b24">[25,</ref><ref type="bibr" target="#b25">26]</ref> may be a proxy for diversity in computer science, normative literature might lean towards concepts of democracy, freedom of expression, and cultural inclusion <ref type="bibr" target="#b26">[27,</ref><ref type="bibr" target="#b3">4]</ref>.</p><p>To bridge the gap between technical and normative literature, a framework named RADio (Rank-Aware Divergence Metrics) <ref type="bibr" target="#b27">[28]</ref> has been proposed to translate normative goals into a set of quantifiable metrics grounded in democratic theory. The framework relies on five metrics, which are summarized as follows (for a thorough description of the metrics, refer to <ref type="bibr" target="#b27">[28,</ref><ref type="bibr" target="#b26">27]</ref>):</p><p>• Calibration: assesses the degree to which the issued recommendations align with the user's preferences. The further from 0, the greater the deviation from the user's preferences. • Fragmentation: quantifies the level of overlap among recommendations presented to distinct users. The closer to 0, the greater the overlap. • Activation: gauges the extent to which the issued recommendations aim to motivate users into action. The closer to 0, the more neutral the content. 
• Representation: indicates how different opinions or perspectives are expressed. The closer to 0, the more balanced the content, whereas higher scores indicate larger discrepancies. • Alternative Voices: measures the extent to which minority groups are represented in the content. The closer to 0, the lower the presence of minority voices.</p><p>Based on the values extracted from a news recommender for these five metrics, the recommender can be assigned to one of four democratic models described by Helberger <ref type="bibr" target="#b3">[4]</ref>: liberal, participatory, deliberative, and critical. A reference table with an overview of each model is documented in <ref type="bibr" target="#b27">[28,</ref><ref type="bibr" target="#b26">27]</ref>.</p></div>
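<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the calibration metric more concrete, the sketch below compares the topic distribution of a user's reading history with the rank-weighted topic distribution of a recommendation list. It is a simplified illustration and not the RADio implementation: the use of the Jensen-Shannon divergence, the logarithmic rank discount, and all function names are assumptions of this sketch, and the topic labels in the usage example are hypothetical.</p>
<p>
```python
import math
from collections import Counter, defaultdict


def history_distribution(topics):
    """Relative frequency of each topic in a non-empty reading history."""
    counts = Counter(topics)
    total = sum(counts.values())
    return {topic: count / total for topic, count in counts.items()}


def rank_aware_distribution(topics):
    """Topic distribution of a non-empty ranked list.

    Lower positions weigh less; the 1 / log2(rank + 1) discount is an
    assumption of this sketch.
    """
    weights = defaultdict(float)
    for rank, topic in enumerate(topics, start=1):
        weights[topic] += 1.0 / math.log2(rank + 1)
    total = sum(weights.values())
    return {topic: weight / total for topic, weight in weights.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in base 2, bounded between 0 and 1."""
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mix(dist):
        return sum(v * math.log2(v / mix[k]) for k, v in dist.items() if v > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)


def calibration(history_topics, recommended_topics):
    """0 means the list mirrors the history; larger values mean more deviation."""
    return js_divergence(
        history_distribution(history_topics),
        rank_aware_distribution(recommended_topics),
    )


if __name__ == "__main__":
    history = ["politica", "politica", "economia", "mundo"]  # hypothetical
    recommended = ["politica", "sp", "economia", "rj", "mundo"]  # hypothetical
    print(f"calibration: {calibration(history, recommended):.3f}")
```
</p>
<p>Under these assumptions, a list whose topics match the history yields a value of 0, and a list with no topic in common with the history yields 1, which matches the reading given above that values further from 0 indicate greater deviation from the user's preferences.</p></div>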
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Experimental RADio Metrics</head><p>All five aforementioned RADio metrics were applied to the MIND dataset for 6 different algorithms <ref type="bibr" target="#b27">[28]</ref>. However, given that some of these metrics rely on natural language processing techniques to extract aspects such as entity recognition of minority voices or content neutrality, we focus the analysis on the calibration metric.</p><p>The NPR dataset contains recommendations from three different algorithms. The first is an Alternating Least Squares ("ALS") strategy, which is a classical recommendation algorithm based on the factorization of the user-item matrix <ref type="bibr" target="#b28">[29]</ref>. The other two algorithms are the "Top" algorithm, which recommends the most consumed news articles from the past 48 hours, and "Recents", which recommends the most recent news articles published by the editorial board. Both of these latter algorithms are non-personalized, meaning that all users who access the news portal at a given time receive the same recommendation.</p><p>Considering these three algorithms, Figure <ref type="figure" target="#fig_1">2</ref> provides two plots generated after extracting the calibration metric in different recommendation scenarios. The left plot shows how calibration is distributed among the three algorithms after recommending 5 items, whereas the right plot shows the average calibration considering multiple recommendation list sizes. At first glance, no major differences can be observed between algorithms. A first hypothesis for such an overlap is that the "ALS" algorithm may converge to the "Top" or "Recents" strategy, especially when considering hard news with a short life span. Additionally, the dynamic behavior of hard news might make it more difficult to model user preferences, since users are likely to change their interests rapidly. 
For instance, users may consume distinct categories of hard news simply because they are breaking news, yielding a more general consumption profile. This scenario can be addressed by more robust algorithms better suited to news recommendation, which are already in place in Globo portals. We discuss their use in the scope of this dataset in the Future Work section.</p><p>However, the plot shows a noticeable difference when comparing recommendations for logged versus non-logged users. Recall that as calibration approaches 0, recommendations are more tailored to the user's preferences. Since logged users tend to have more historical data, it is reasonable to see lower calibration values for them than for non-logged users, which can be seen by comparing the distributions' quantiles.</p><p>Expanding the analysis to list sizes other than 5, the right plot in Figure <ref type="figure" target="#fig_1">2</ref> shows how the average calibration changes with the recommendation list size. For longer lists, it is more probable to find items tailored to the users' preferences. Therefore, it is reasonable to observe decreasing calibration as recommendation lists get larger. Since NPR contains up to 10 items in the recommendation lists, a lower limit of calibration can be observed around 78.9%.</p><p>Based on this lower limit, we can establish a calibration comparison between MIND and NPR using the results reported for the "top" algorithm in <ref type="bibr" target="#b27">[28]</ref> (referred to there as "most popular"). When recommending a list of 10 items to the users, a calibration of 65.3% was reported on MIND using news article topics. 
Therefore, we can observe that even a non-personalized algorithm may present significantly different calibration results depending on the dataset, which reinforces the need for several datasets to be employed as benchmarks when analyzing the diversity capabilities of a recommendation algorithm.</p><p>Besides the calibration analysis, Figure <ref type="figure" target="#fig_2">3</ref> shows preliminary results on the representation (left plot) and fragmentation (right plot) metrics. It can also be seen from Figure <ref type="figure" target="#fig_2">3</ref> that no clear difference is observed among algorithms, which reinforces the aforementioned hypothesis that the "ALS" algorithm converges to the "Top" approach. Notice that, for all algorithms, the average representation is close to 50%, whereas average fragmentation approaches 85%. A hypothesis for such behavior is the lack of proper natural language processing tools for Portuguese texts, which would better indicate how different opinions or perspectives are expressed.</p></div>
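<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two non-personalized strategies described in this section can be sketched in a few lines. This is an illustrative approximation and not Globo's production logic: the function names, the input layout (pairs of article ID and timestamp), and the default list size are assumptions of this sketch; only the 48-hour window for the "Top" strategy comes from the text.</p>
<p>
```python
from collections import Counter
from datetime import datetime, timedelta


def top_consumed(clicks, now, k=5, window_hours=48):
    """Return the k most clicked article IDs within the trailing window.

    clicks: iterable of (article_id, clicked_at) pairs (assumed layout).
    """
    window_start = now - timedelta(hours=window_hours)
    counts = Counter(
        article_id
        for article_id, clicked_at in clicks
        if clicked_at >= window_start and now >= clicked_at
    )
    return [article_id for article_id, _ in counts.most_common(k)]


def most_recent(articles, now, k=5):
    """Return the k most recently published article IDs, newest first.

    articles: iterable of (article_id, published_at) pairs (assumed layout).
    """
    published = [(published_at, article_id)
                 for article_id, published_at in articles
                 if now >= published_at]
    published.sort(reverse=True)
    return [article_id for _, article_id in published[:k]]


if __name__ == "__main__":
    now = datetime(2023, 1, 10, 12, 0)
    # Hypothetical click log: article "c" was only clicked outside the window.
    clicks = [
        ("a", now - timedelta(hours=1)),
        ("a", now - timedelta(hours=2)),
        ("b", now - timedelta(hours=3)),
        ("c", now - timedelta(hours=60)),
    ]
    print(top_consumed(clicks, now, k=2))
```
</p>
<p>Because neither function takes a user as input, every user querying at the same moment receives the same list, which is what makes these strategies non-personalized baselines for the calibration comparison.</p></div>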
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>Taking into account the need for proper datasets to be publicly available for both academic and industry researchers to discuss and explore solutions on common ground, this paper introduced the News Portal Recommendations (NPR) dataset. The dataset provides data on recommendation impressions, user behavior (consumption history), and metadata about published articles.</p><p>Section 3 analyzed specific characteristics of the dataset and compared it with the MIND dataset from Microsoft News <ref type="bibr" target="#b10">[11]</ref>. Other related datasets were also described. Besides providing much richer textual information than MIND, NPR has a considerably greater proportion of hard news consumption. Subsequently, Section 4 showed that the RADio framework <ref type="bibr" target="#b27">[28]</ref>, which translates normative goals into quantifiable metrics, can be applied to the NPR dataset.</p><p>The first version of the dataset is already publicly available, opening the way for continuous updates and improvements based on feedback from the community. Some improvements are already planned, as described in the Future Work section.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Future Work</head><p>When looking at news categories, the NPR dataset also presents acronyms for Brazilian states, indicating the predominance of regional content. As stated in Section 3.2, the need for manual tagging potentially results in non-uniform categories. In this sense, we are currently developing automatic extraction of semantic metadata from news articles, which will enrich the categories already in place in the dataset. In this context, we also aim to explore content-representation techniques in order to remove any possible differences resulting from the Portuguese language.</p><p>Finally, in addition to the ALS algorithm, we will also incorporate recommendation impressions resulting from other, more advanced personalization algorithms. Although other algorithms are already being employed to provide recommendations to the final user, engineering efforts are needed to extract multiple algorithms' outputs for the same users due to Globo's AB platform. Since in Globo any information delivered to the final user is subject to an AB test, the recommendation algorithm employed, as well as its resulting impressions, will vary depending on the AB testing alternative that has been assigned to that specific user.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: (top left) Title number of words. (top right) Body number of words. (bottom left) Top 15 out of 94 most common articles' categories. (bottom right) Cumulative Distribution Function (CDF) of hard/soft news survival time in days and their 85% quantile.</figDesc><graphic coords="5,89.29,84.19,416.70,274.65" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: (left) Calibration distribution per user type for a 5-item recommendation. Dashed lines on the violin plots represent the 25%, 50%, and 75% quantiles. (right) Average calibration considering multiple recommendation list sizes.</figDesc><graphic coords="7,89.29,204.54,416.70,181.82" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Representation (left) and Fragmentation (right) distributions per user type for a 5-item recommendation. Dashed lines on the violin plots represent the 25%, 50%, and 75% quantiles.</figDesc><graphic coords="8,89.29,257.89,416.70,170.75" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Share of news types between NPR and MIND datasets (MIND numbers extracted from <ref type="bibr" target="#b0">[1]</ref>).</figDesc><table><row><cell>Dataset</cell><cell>News Type</cell><cell>All</cell><cell>History</cell><cell>Clicked</cell><cell>Candidate</cell></row><row><cell>MIND</cell><cell>Soft News</cell><cell>63.6%</cell><cell>62.2%</cell><cell>73.0%</cell><cell>69.8%</cell></row><row><cell>MIND</cell><cell>Hard News</cell><cell>36.3%</cell><cell>34.8%</cell><cell>26.9%</cell><cell>30.2%</cell></row><row><cell>NPR</cell><cell>Soft News</cell><cell>9.0%</cell><cell>17.8%</cell><cell>13.8%</cell><cell>15.4%</cell></row><row><cell>NPR</cell><cell>Hard News</cell><cell>91.0%</cell><cell>82.2%</cell><cell>86.2%</cell><cell>84.6%</cell></row><row><cell cols="6">has 36.3%. This distinction expands to other aspects of the datasets such as the user's historical</cell></row><row><cell cols="6">consumption (82.2% on NPR against 34.8% on MIND), clicked items (86.2% against 26.9%) and</cell></row><row><cell cols="6">consumption candidate lists, i.e., recommendations (84.6% against 30.2%).</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://g1.globo.com/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://microsoftnews.msn.com</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://msnews.github.io/competition.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://reclab.idi.ntnu.no/dataset/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">https://g1.globo.com/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>Thanks to Mateo Gutierrez Granada, Johannes Kruse, and Gabriel Benedict for implementing the code that made it possible to run the RADio metrics on Globo's dataset. We would also like to acknowledge Globo for providing this dataset for the academic community, especially to the Recommendation team for preparing the original dataset from the G1 Portal.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Vrijenhoek</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2304.08253</idno>
		<idno type="arXiv">arXiv:2304.08253</idno>
		<title level="m">Do you MIND? Reflections on the MIND dataset for research on diversity in news recommendations</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Personalized news recommendation based on click behavior</title>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dolan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">R</forename><surname>Pedersen</surname></persName>
		</author>
		<idno type="DOI">10.1145/1719970.1719976</idno>
		<ptr target="https://doi.org/10.1145/1719970.1719976" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th International Conference on Intelligent User Interfaces, IUI &apos;10</title>
				<meeting>the 15th International Conference on Intelligent User Interfaces, IUI &apos;10<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="31" to="40" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Modeling and broadening temporal user interest in personalized news recommendation</title>
		<author>
			<persName><forename type="first">L</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Li</surname></persName>
		</author>
		<idno type="DOI">10.1016/j.eswa.2013.11.020</idno>
		<ptr target="https://doi.org/10.1016/j.eswa.2013.11.020" />
	</analytic>
	<monogr>
		<title level="j">Expert Systems with Applications</title>
		<imprint>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="page" from="3168" to="3177" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">On the democratic role of news recommenders</title>
		<author>
			<persName><forename type="first">N</forename><surname>Helberger</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:197796153" />
	</analytic>
	<monogr>
		<title level="j">Digital Journalism</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="993" to="1012" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">The dual echo chamber: Modeling social media polarization for interventional recommending</title>
		<author>
			<persName><forename type="first">T</forename><surname>Donkers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ziegler</surname></persName>
		</author>
		<idno type="DOI">10.1145/3460231.3474261</idno>
		<ptr target="https://doi.org/10.1145/3460231.3474261" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th ACM Conference on Recommender Systems, RecSys &apos;21</title>
				<meeting>the 15th ACM Conference on Recommender Systems, RecSys &apos;21<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="12" to="22" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Recent research on selective exposure to information</title>
		<author>
			<persName><forename type="first">D</forename><surname>Frey</surname></persName>
		</author>
		<idno type="DOI">10.1016/S0065-2601(08)60212-9</idno>
		<ptr target="https://doi.org/10.1016/S0065-2601(08)60212-9" />
	</analytic>
	<monogr>
		<title level="m">Advances in Experimental Social Psychology</title>
				<imprint>
			<publisher>Academic Press</publisher>
			<date type="published" when="1986">1986</date>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="41" to="80" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">I want to break free! Recommending friends from outside the echo chamber</title>
		<author>
			<persName><forename type="first">A</forename><surname>Tommasel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Rodriguez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Godoy</surname></persName>
		</author>
		<idno type="DOI">10.1145/3460231.3474270</idno>
		<ptr target="https://doi.org/10.1145/3460231.3474270" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th ACM Conference on Recommender Systems, RecSys &apos;21</title>
				<meeting>the 15th ACM Conference on Recommender Systems, RecSys &apos;21<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="23" to="33" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Exploring the filter bubble: The effect of using recommender systems on content diversity</title>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">T</forename><surname>Nguyen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P.-M</forename><surname>Hui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">M</forename><surname>Harper</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Terveen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Konstan</surname></persName>
		</author>
		<idno type="DOI">10.1145/2566486.2568012</idno>
		<ptr target="https://doi.org/10.1145/2566486.2568012" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 23rd International Conference on World Wide Web, WWW &apos;14</title>
				<meeting>the 23rd International Conference on World Wide Web, WWW &apos;14<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="677" to="686" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">A survey on challenges and methods in news recommendation</title>
		<author>
			<persName><forename type="first">Ö</forename><surname>Özgöbek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Gulla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">C</forename><surname>Erdur</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:19984721" />
	</analytic>
	<monogr>
		<title level="m">International Conference on Web Information Systems and Technologies</title>
				<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<title level="m" type="main">CHAMELEON: A deep learning meta-architecture for news recommender systems</title>
		<author>
			<persName><forename type="first">G</forename><surname>De Souza Pereira Moreira</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2001.04831</idno>
		<ptr target="https://arxiv.org/abs/2001.04831" />
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note>Ph.D. thesis</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">MIND: A large-scale dataset for news recommendation</title>
		<author>
			<persName><forename type="first">F</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Qiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-H</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Lian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zhou</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2020.acl-main.331</idno>
		<ptr target="https://aclanthology.org/2020.acl-main.331" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 58th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="3597" to="3606" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Neural news recommendation with long- and short-term user representations</title>
		<author>
			<persName><forename type="first">M</forename><surname>An</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xie</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/P19-1033</idno>
		<ptr target="https://aclanthology.org/P19-1033" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 57th Annual Meeting of the Association for Computational Linguistics<address><addrLine>Florence, Italy</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="336" to="345" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Neural news recommendation with multi-head self-attention</title>
		<author>
			<persName><forename type="first">C</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ge</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xie</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/D19-1671</idno>
		<ptr target="https://aclanthology.org/D19-1671" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</title>
				<meeting>the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)<address><addrLine>Hong Kong, China</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="6389" to="6394" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xie</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2106.08934</idno>
		<idno type="arXiv">arXiv:2106.08934</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2106.08934" />
		<title level="m">Personalized news recommendation: Methods and challenges</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Huang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2104.07404</idno>
		<title level="m">Two birds with one stone: Unified model learning for both recall and ranking in news recommendation</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Empowering news recommendation with pre-trained language models</title>
		<author>
			<persName><forename type="first">C</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Huang</surname></persName>
		</author>
		<idno type="DOI">10.1145/3404835.3463069</idno>
		<ptr target="https://doi.org/10.1145/3404835.3463069" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR &apos;21</title>
				<meeting>the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR &apos;21<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="1652" to="1656" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">News recommendation with candidate-aware user modeling</title>
		<author>
			<persName><forename type="first">T</forename><surname>Qi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Huang</surname></persName>
		</author>
		<idno type="DOI">10.1145/3477495.3531778</idno>
		<ptr target="https://doi.org/10.1145/3477495.3531778" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR &apos;22</title>
				<meeting>the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR &apos;22<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1917" to="1921" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">News session-based recommendations using deep neural networks</title>
		<author>
			<persName><forename type="first">G</forename><surname>De Souza Pereira Moreira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ferreira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Da Cunha</surname></persName>
		</author>
		<idno type="DOI">10.1145/3270323.3270328</idno>
		<ptr target="https://doi.org/10.1145/3270323.3270328" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 3rd Workshop on Deep Learning for Recommender Systems</title>
				<meeting>the 3rd Workshop on Deep Learning for Recommender Systems</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">The plista dataset</title>
		<author>
			<persName><forename type="first">B</forename><surname>Kille</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Hopfgartner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Brodt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Heintz</surname></persName>
		</author>
		<idno type="DOI">10.1145/2516641.2516643</idno>
		<ptr target="https://doi.org/10.1145/2516641.2516643" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2013 International News Recommender Systems Workshop and Challenge, NRS &apos;13</title>
				<meeting>the 2013 International News Recommender Systems Workshop and Challenge, NRS &apos;13<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="16" to="23" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">The Adressa dataset for news recommendation</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>Gulla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ö</forename><surname>Özgöbek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Su</surname></persName>
		</author>
		<idno type="DOI">10.1145/3106426.3109436</idno>
		<ptr target="https://doi.org/10.1145/3106426.3109436" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Conference on Web Intelligence, WI &apos;17</title>
				<meeting>the International Conference on Web Intelligence, WI &apos;17<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1042" to="1048" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">What are filter bubbles really? A review of the conceptual and empirical work</title>
		<author>
			<persName><forename type="first">L</forename><surname>Michiels</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Leysen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Smets</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Goethals</surname></persName>
		</author>
		<idno type="DOI">10.1145/3511047.3538028</idno>
		<ptr target="https://doi.org/10.1145/3511047.3538028" />
	</analytic>
	<monogr>
		<title level="m">Adjunct Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization, UMAP &apos;22 Adjunct</title>
				<meeting><address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="274" to="279" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">The measurement of the inequality of incomes</title>
		<author>
			<persName><forename type="first">H</forename><surname>Dalton</surname></persName>
		</author>
		<ptr target="http://www.jstor.org/stable/2223525" />
	</analytic>
	<monogr>
		<title level="j">The Economic Journal</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page" from="348" to="361" />
			<date type="published" when="1920">1920</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Novelty and diversity in recommender systems</title>
		<author>
			<persName><forename type="first">P</forename><surname>Castells</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">J</forename><surname>Hurley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Vargas</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:45086523" />
	</analytic>
	<monogr>
		<title level="m">Recommender Systems Handbook</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Rank and relevance in novelty and diversity metrics for recommender systems</title>
		<author>
			<persName><forename type="first">S</forename><surname>Vargas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Castells</surname></persName>
		</author>
		<idno type="DOI">10.1145/2043932.2043955</idno>
		<ptr target="https://doi.org/10.1145/2043932.2043955" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fifth ACM Conference on Recommender Systems, RecSys &apos;11</title>
				<meeting>the Fifth ACM Conference on Recommender Systems, RecSys &apos;11<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="109" to="116" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Debiasing the human-recommender system feedback loop in collaborative filtering</title>
		<author>
			<persName><forename type="first">W</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Khenissi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Nasraoui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Shafto</surname></persName>
		</author>
		<idno type="DOI">10.1145/3308560.3317303</idno>
		<ptr target="https://doi.org/10.1145/3308560.3317303" />
	</analytic>
	<monogr>
		<title level="m">Companion Proceedings of The 2019 World Wide Web Conference, WWW &apos;19</title>
				<meeting><address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="645" to="651" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Accuracy meets diversity in a news recommender system</title>
		<author>
			<persName><forename type="first">S</forename><surname>Raza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">R</forename><surname>Bashir</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Naseem</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2022.coling-1.332" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 29th International Conference on Computational Linguistics</title>
				<meeting>the 29th International Conference on Computational Linguistics<address><addrLine>Gyeongju, Republic of Korea</addrLine></address></meeting>
		<imprint>
			<publisher>International Committee on Computational Linguistics</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="3778" to="3787" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Recommenders with a mission: Assessing diversity in news recommendations</title>
		<author>
			<persName><forename type="first">S</forename><surname>Vrijenhoek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kaya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Metoui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Möller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Odijk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Helberger</surname></persName>
		</author>
		<idno type="DOI">10.1145/3406522.3446019</idno>
		<ptr target="https://doi.org/10.1145/3406522.3446019" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 Conference on Human Information Interaction and Retrieval, CHIIR &apos;21</title>
				<meeting>the 2021 Conference on Human Information Interaction and Retrieval, CHIIR &apos;21<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="173" to="183" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">RADio - rank-aware divergence metrics to measure normative diversity in news recommendations</title>
		<author>
			<persName><forename type="first">S</forename><surname>Vrijenhoek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Bénédict</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gutierrez Granada</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Odijk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>De Rijke</surname></persName>
		</author>
		<idno type="DOI">10.1145/3523227.3546780</idno>
		<ptr target="https://doi.org/10.1145/3523227.3546780" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 16th ACM Conference on Recommender Systems, RecSys &apos;22</title>
				<meeting>the 16th ACM Conference on Recommender Systems, RecSys &apos;22<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="208" to="219" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Matrix factorization techniques for recommender systems</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Koren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Bell</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Volinsky</surname></persName>
		</author>
		<idno type="DOI">10.1109/MC.2009.263</idno>
		<ptr target="https://doi.org/10.1109/MC.2009.263" />
	</analytic>
	<monogr>
		<title level="j">Computer</title>
		<imprint>
			<biblScope unit="volume">42</biblScope>
			<biblScope unit="page" from="30" to="37" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
