<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Longitudinal Evaluation of Two Similarity-based Approaches in a News Recommender System</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gloria A.B. Kasangu</string-name>
          <email>gloria.kasangu@student.uib.no</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alain D. Starke</string-name>
          <email>alain.starke@uib.no</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christoph Trattner</string-name>
          <email>christoph.trattner@uib.no</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ASCoR, University of Amsterdam</institution>
          ,
          <country country="NL">Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>MediaFutures, University of Bergen</institution>
          ,
          <addr-line>Vestland</addr-line>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>2</volume>
      <fpage>2</fpage>
      <lpage>26</lpage>
      <abstract>
        <p>Similarity-based personalization is generally assumed to boost engagement in recommender systems. However, is this also true beyond a single session in a news recommender? Amid concerns about filter bubbles and preference volatility, we propose an empirical evaluation of both short-term and longer-term effects of a news recommender system with two phases of data collection: Initial preference elicitation and evaluation (Phase 1), a 48-hour interval, and a personalized follow-up (Phase 2). We compared two recommendation strategies in a preliminary longitudinal experiment (N = 166): An 'Aligned' feed that included articles that met a ≥ 70% cosine-similarity threshold, and a 'Disaligned' feed with only a 30% similarity threshold. We collected behavioral metrics (article clicks, time on feed) and evaluative metrics (self-reported familiarity, perceived recommendation quality, choice satisfaction, topic preferences) in both phases. The Aligned feed was perceived to have more familiar content, while perceived diversity did not differ between recommendation strategies. Users clicked on significantly fewer articles in Phase 2, particularly in the Disaligned condition. We also explored the volatility of topic preferences, but did not observe significant differences across phases. These findings suggest that short-term increases in feed-profile similarity can enhance familiarity and maintain behavioral engagement (i.e., clicks). In contrast, they do not lead to higher levels of perceived quality and choice satisfaction, which raises doubts about the relationship between the similarity of preference-based articles and user satisfaction.</p>
      </abstract>
      <kwd-group>
        <kwd>News</kwd>
        <kwd>Recommender Systems</kwd>
        <kwd>Similarity</kwd>
        <kwd>News Aggregator</kwd>
        <kwd>Longitudinal Evaluation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        A large number of news platforms rely on recommender systems to provide digital news [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This has
fundamentally reshaped the way audiences consume information [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. By tailoring content based on
individual preferences and past behavior, news recommenders aim to enhance user engagement and
satisfaction, yet the dominance of similarity-based personalization raises unresolved questions about
its implications for both individual users and the broader information ecosystem. While short-term
evaluations of recommender systems are common, longitudinal assessments remain rare in the news
domain [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>In this preliminary study, we conduct a field experiment at two time points to compare two
recommendation strategies that differ in their degree of personalization. Our goal is not to settle the long-term
debate but to offer an initial look at how user satisfaction and engagement evolve when exposed to
higher versus lower content similarity over a short interval. We compare two conditions: One condition
(“Disaligned”) in which users receive more generic than personalized content, and another condition
(“Aligned”) in which users receive more personalized than generic content. For the latter, users
are presented only with articles that are at least 70% similar (by cosine similarity) to their past click history,
reflecting a typical personalization threshold.</p>
      <p>Proceedings of the 13th International Workshop on News Recommendation and Analytics (INRA 2025), co-located with the 19th
</p>
      <p>
        This design is motivated by concerns about filter bubbles and echo chambers [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ], in which highly
personalized consumption can reinforce ideological entrapment and reduce exposure to diverse
viewpoints [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Although some work suggests that personalization alone does not always produce these
effects [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], the larger impact on public discourse continues to be debated [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. At the same time,
recommender systems face the challenge of preference volatility, as user interests change over time and
algorithms often struggle to detect or adapt to these changes [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>By examining user behavior and perceptions in two phases, our study provides an exploratory
window into (1) whether short-term preference shifts occur under different personalization regimes
and (2) how alignment with past behavior influences satisfaction with chosen articles. We address the
following research questions:</p>
      <list list-type="bullet">
        <list-item><p>RQ1: To what extent does presenting more aligned news recommender content (i.e., based on
user-item similarity) positively affect choice satisfaction over time?</p></list-item>
        <list-item><p>RQ2: To what extent does user-item similarity affect a user’s perceived recommendation quality
and clicking behavior in a news recommender system?</p></list-item>
      </list>
      <p>To answer these questions, we collected behavioral measures (article clicks) and self-reported percent
familiarity (as a proxy for cosine similarity), alongside subjective ratings of choice satisfaction and
perceived quality. Although our two-timepoint design does not capture long-term dynamics, it offers a
critical first step toward understanding how brief exposures to different levels of personalization shape
the user’s experience.</p>
      <p>Our contributions are threefold: (1) we provide early evidence on how short-term exposure to
high versus low similarity news feeds affects satisfaction and engagement; (2) we demonstrate the
utility of self-reported familiarity as a practical manipulation check; and (3) we highlight directions
for future longitudinal work on adaptive recommendation strategies that balance personalization with
informational diversity.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        We discuss literature in the context of news similarity and diversity and its related effects. For example,
content-based approaches that strongly optimize for similarity may lead to ‘more of the same’ content
that is less diverse [
        <xref ref-type="bibr" rid="ref1 ref8">1, 8</xref>
        ]. Therefore, we will discuss filter bubbles and echo chamber effects in the
context of news recommenders, as well as research that included longitudinal evaluation components.
      </p>
      <sec id="sec-2-1">
        <title>2.1. Filter Bubbles and Echo Chamber Effects in Recommender Systems</title>
        <p>
          A lack of recommended diversity over a longer time period can be described as a filter bubble. Although
definitions vary [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], filter bubbles and echo chambers can be defined as the tendency of personalization
algorithms to enclose users within a narrow band of similar content, potentially compromising
exposure to diverse viewpoints. Pariser’s influential work introduced the term filter bubble to warn that
algorithmic curation can invisibly tailor information streams around a user’s past behavior, reinforcing
existing beliefs rather than challenging them [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Subsequent empirical studies have confirmed that
personalization can increase ideological segregation. Flaxman et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] demonstrated that search result
personalization led users toward more politically extreme news sources compared to non-personalized
search. Nguyen et al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] showed that collaborative filtering methods tend to prioritize popular or
similar items at the expense of topic diversity, thereby fostering echo chamber effects. In the news
domain, such narrowing raises grave concerns for democratic discourse, since access to a plurality of
perspectives is essential [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Recent scholarship calls for critical reflection on how recommender design
choices, including similarity thresholds, feedback loops and diversification strategies, can exacerbate or
mitigate enclosure effects. Researchers also urge longitudinal studies to assess the real-world impacts
of these design decisions over time [
          <xref ref-type="bibr" rid="ref3 ref4">4, 3</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. News Personalization and Engagement</title>
        <p>
          Personalized news recommender systems aim to increase user engagement by tailoring content
presentation to individual users, often leveraging stored user attributes and past user behavior [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Previous
studies have shown that personalization can significantly improve short-term engagement metrics such
as click-through rates and next-item prediction accuracy [
          <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
          ]. By reducing information overload
and presenting content that aligns with user preferences, personalized systems often improve perceived
relevance and user satisfaction [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
        <p>
          Nonetheless, the relationship between personalization and long-term engagement is more
complicated. Previous research has shown that overly narrow recommendation strategies can lead to a
decrease in information diversity [
          <xref ref-type="bibr" rid="ref15 ref6">6, 15</xref>
          ], as well as user fatigue [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. Moreover, high similarity between
recommended and previously consumed content can boost engagement in the short term, while limiting
average individual exposure to diverse viewpoints [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. In the context of news, the trade-of between
personalization and exposure diversity is especially important as democratic deliberation relies on
access to varied perspectives [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. Some studies suggest that moderate personalization can maintain
diversification while mitigating negative effects like the filter bubble. For instance, Gao et al. [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] found
that moderate diversification retains recommendation accuracy while promoting exposure to a more
varied set of topics. Additionally, others have proposed approaches that combine personalization with
editorial or novel content to maintain reader interest without reinforcing filter bubble effects [19, 20].
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Longitudinal Experiments in Recommender Systems</title>
        <p>Recent work has begun to explore the long-term effects of recommender systems through simulations and
longitudinal field experiments. Simulation-based methods, such as agent-based modelling, have long been a
staple research methodology in fields such as sociology and managerial science [21, 22]. Recently,
Zhang et al. [23] introduced an agent-based simulation framework to analyze the longitudinal effects
of recommender systems. Their findings revealed a phenomenon called the performance paradox, in
which user interaction with recommendation algorithms can paradoxically degrade overall system
performance over time. They also emphasize the risk of overpersonalization leading to decreased
user satisfaction, a finding that was also explored in a follow-up work using extended modeling
techniques. Similarly, Ferraro et al. [24], using a simulation-based framework tailored to session-based
recommender systems, found that repeated interactions can reinforce popularity bias and reduce item
diversity over time.</p>
        <p>To empirically validate the long-term effects of recommender systems on content exposure, some
longitudinal field experiments have been conducted. For example, Lee and Hosanagar [25] ran a
randomized field experiment in multiple product categories and demonstrated that personalized recommender
systems led to a decrease in overall sales diversity, particularly when using collaborative filtering
methods. Furthermore, Fleder and Hosanagar [26] found that recommendations, in the long term, can lead to
a concentrated consumption of a small group of popular items. In response to these observed
concentration effects, various studies have proposed mitigation strategies such as hybrid recommendation
models that blend personalization with popularity-neutral signals [23] and ranking-based diversification
techniques [24, 27].</p>
        <p>Based on these findings, our study presents a longitudinal user experiment in the domain of news
recommendation. Unlike previous work that focuses primarily on e-commerce, we examine how varying
degrees of personalization affect user satisfaction, engagement, and content diversity over time. By
collecting both behavioral and subjective data at two time points, our work provides empirical insights
into how users respond to different recommendation strategies and how recommender systems might
better respond to evolving user preferences.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Research Design</title>
        <p>This study employed a between-subjects longitudinal experimental design with two waves of data
collection, separated by a mandatory 48 h waiting period. The two experimental conditions differed in
the personalization strategy used to generate news recommendations in the second phase of the study.
Participants were randomly assigned to one of two conditions:</p>
        <p>Condition 1: Alignment. Participants received a feed of articles that were mostly topically aligned
with their established preferences.</p>
        <p>Condition 2: Disalignment. Participants received a feed of articles that were mostly dissimilar to
their preferences, encouraging exploration of new topics.</p>
        <p>Outcome measures, including self-reported satisfaction, perceived quality, and behavioral click data,
were collected at both timepoints to evaluate the effect of each recommendation strategy.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Participants</title>
        <p>Two hundred English-fluent adults (95–100% approval rate) were recruited via Prolific and randomly
assigned to Alignment or Disalignment (n = 100 each). As 34 participants dropped out between both
phases (i.e., attrition), a sample of N = 166 participants remained for analysis (Alignment: n = 81;
Disalignment: n = 85). Participant ages ranged from 18 to 65 (M = 34.8, SD = 8.9). Of the participants,
46.4% identified as female, 52.4% as male, and 1.2% declined to specify (see Table 1). Participants were
paid £9.00/hr for Phase 1 and £16.00/hr for Phase 2.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Materials &amp; Algorithms</title>
        <p>We sourced news articles in real time via the NewsCatcher API<sup>1</sup>, restricted to 15 reputable
English-language outlets.<sup>2</sup></p>
        <p>The NewsCatcher API provides real-time access to news articles from a wide range of publishers.
We configured it to return JSON-formatted metadata (headline, summary, publication date, and URL)
along with thumbnail images when available. Queries were limited to English-language content and
filtered to include only articles published within the past 24 hours, ensuring both recency and relevance.
Articles were displayed in a uniform grid (title, image, short description). We removed all publisher
logos and other branding elements to control for any presentation-based biases. Participants browsed
this feed via our web interface (Figure 1), which supported bookmarking and article previews, and enforced
a minimum of five saves before proceeding.</p>
        <p>In Phase 2, a content-based filtering algorithm generated each user’s personalized feed. We built a
TF–IDF profile vector from that user’s Phase 1 bookmarks, then computed each candidate article’s final
relevance score:
<disp-formula><tex-math><![CDATA[\mathrm{score} = 0.60 \times \cos\left(\text{TF-IDF}_{\text{user}},\, \text{TF-IDF}_{\text{article}}\right) + 0.40 \times \mathrm{freshness\_bonus}(\text{publication\_date})]]></tex-math></disp-formula></p>
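        <p>The weighted score above can be illustrated with a minimal Python sketch. Several details are assumptions made for illustration only and are not specified in the text: the user profile is built by concatenating the bookmarked texts into one document, the IDF is smoothed, and the hypothetical <monospace>freshness_bonus</monospace> decays linearly from 1 to 0 over the 24-hour recency window. Only the 0.60 and 0.40 weights are taken from the formula.</p>
        <preformat><![CDATA[
```python
import math
from collections import Counter
from datetime import datetime, timedelta, timezone


def tokenize(text):
    """Lowercase whitespace tokenizer (illustrative; no stop-word removal)."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per document, using a smoothed IDF."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def freshness_bonus(published, now, window_hours=24.0):
    """ASSUMPTION: linear decay from 1.0 (just published) to 0.0 (window_hours old)."""
    age_hours = (now - published).total_seconds() / 3600.0
    return max(0.0, min(1.0, 1.0 - age_hours / window_hours))


def relevance_score(user_vec, article_vec, published, now):
    """Weights 0.60 and 0.40 follow the formula in the text."""
    return 0.60 * cosine(user_vec, article_vec) + 0.40 * freshness_bonus(published, now)


# Toy data, invented for illustration.
bookmarks = ["parliament votes on climate bill", "new climate policy debate in parliament"]
candidates = ["parliament passes climate policy", "football club wins cup final"]
now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
published = now - timedelta(hours=6)  # both candidates are six hours old

vectors = tfidf_vectors([" ".join(bookmarks)] + candidates)
user_vec, article_vecs = vectors[0], vectors[1:]
scores = [relevance_score(user_vec, vec, published, now) for vec in article_vecs]
```
]]></preformat>
        <p>In this toy example the second candidate shares no terms with the profile, so its score comes entirely from the freshness term, whereas the first candidate scores higher because of its lexical overlap with the bookmarks.</p>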
        <p>Articles with score ≥ 0.5 were labeled familiar, and those with score ≤ 0.4 novel. Under the Alignment
condition, feeds comprised 70% familiar and 30% novel articles; under Disalignment, this ratio was
reversed (30% familiar, 70% novel).</p>
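        <p>The labeling and mixing rule can be sketched as follows. The thresholds (0.5 and 0.4) and the 70/30 ratios come from the description above; the feed size of 10, the choice of the highest-scoring items within each label, and the final shuffle are hypothetical choices made only for this illustration. Scores strictly between 0.4 and 0.5 are left unlabeled, since the text does not assign them to either category.</p>
        <preformat><![CDATA[
```python
import random

FAMILIAR_MIN = 0.5  # score >= 0.5 is labeled "familiar" (from the text)
NOVEL_MAX = 0.4     # score <= 0.4 is labeled "novel" (from the text)
FAMILIAR_SHARE = {"alignment": 0.7, "disalignment": 0.3}


def label(score):
    """Return 'familiar', 'novel', or None for scores the text leaves unlabeled."""
    if score >= FAMILIAR_MIN:
        return "familiar"
    if score <= NOVEL_MAX:
        return "novel"
    return None


def compose_feed(scored_articles, condition, feed_size=10, seed=0):
    """Build a feed from (article_id, score) pairs.

    ASSUMPTIONS: feed_size, within-label ranking, and shuffling are illustrative.
    """
    n_familiar = round(feed_size * FAMILIAR_SHARE[condition])
    n_novel = feed_size - n_familiar
    ranked = sorted(scored_articles, key=lambda pair: pair[1], reverse=True)
    familiar = [aid for aid, score in ranked if label(score) == "familiar"]
    novel = [aid for aid, score in ranked if label(score) == "novel"]
    if len(familiar) < n_familiar or len(novel) < n_novel:
        raise ValueError("candidate pool too small for the requested mix")
    feed = familiar[:n_familiar] + novel[:n_novel]
    random.Random(seed).shuffle(feed)  # avoid grouping familiar items first
    return feed


# Toy candidate pool with evenly spread scores, invented for illustration.
pool = [(f"article_{i}", i / 19) for i in range(20)]
aligned_feed = compose_feed(pool, "alignment")
disaligned_feed = compose_feed(pool, "disalignment")
```
]]></preformat>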
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Procedure</title>
        <p>The study consisted of two phases, separated by a 48 h interval. The study procedure is shown in Figure
2.</p>
        <p>Phase 1: Preference Elicitation &amp; Baseline. After completing an informed consent form and a
survey on the participant’s demographics and media habits, participants indicated “like”/“dislike” for 12
news topics. An initial feed of up to 30 de‐duplicated articles was generated and a persistent banner
(“Please save at least five articles before continuing”) enforced a minimum of five bookmarks before
they could advance. These bookmarks formed their profile, and they then completed the evaluation
survey as a baseline.</p>
        <p>Phase 2: Recommendation &amp; Evaluation. Participants were invited back exactly 48 hours after
Phase 1 to again complete the topic‐preferences survey. A new pool of 40 candidate articles was fetched;
the algorithm scored and selected items per condition to form the Phase 2 feed. After interacting with
this feed, they completed the evaluation survey again, concluding their participation.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Measures</title>
        <p>We operationalized a set of behavioral and subjective measures to evaluate user interactions with the
news recommendation system.</p>
        <sec id="sec-3-5-1">
          <title>3.5.1. Behavioral Measures</title>
          <p><sup>1</sup> Reuters, Associated Press, BBC, The New York Times, The Wall Street Journal, The Washington Post, NPR, PBS, The Guardian,
The Times (UK), Financial Times, The Independent, Al Jazeera, The Economist, CBS News.</p>
          <p><sup>2</sup> Prototype code and data are available at https://anonymous.4open.science/r/news-diversification-study-disalignment-1722/
and https://anonymous.4open.science/r/news-diversification-study-45FB/.</p>
          <p>We logged three objective metrics at each phase, complemented by one self-reported familiarity rating:</p>
          <p>Article Clicks. The total number of recommended articles a participant clicked. This metric serves
as a direct indicator of engagement: more clicks suggest greater interest in the feed content, whereas
fewer clicks may reflect disengagement or irrelevance of the recommendations.</p>
          <p>Article Similarity. The mean cosine similarity between each recommended article and the set of
articles clicked in the previous phase. By quantifying how closely new recommendations match past
behavior, this measure operationalizes the degree of “alignment” versus “disalignment” in the feed and
allows us to link algorithmic similarity to downstream outcomes.</p>
          <p>Total Time on Feed. The total time in seconds spent viewing the news feed. Total dwell time captures
sustained attention beyond the point of click, reflecting the extent to which participants explored the
feed and consumed article content even when they did not click through.</p>
          <p>Percent Familiarity. The self-reported percentage of recommended articles with which participants
felt they were already familiar. Although subjective, this rating serves as a practical proxy for the
underlying cosine similarity between new recommendations and each participant’s prior click history.
Higher values indicate stronger alignment between the feed and past interests.</p>
        </sec>
        <sec id="sec-3-5-2">
          <title>3.5.2. Subjective Measures</title>
          <p>All subjective items were rated on a 5-point Likert scale (1 = Strongly disagree, 5 = Strongly agree). We
assessed three constructs:</p>
          <p>Choice Satisfaction. Adapted from Knijnenburg et al. [28], this construct comprised two positively
phrased statements: “I like the articles I’ve chosen” and “I was/am looking forward to reading the
chosen articles.” Responses to these items were averaged to form a single Choice Satisfaction score.
An exploratory factor analysis (principal-axis factoring with varimax rotation) on Phase 1 responses
supported a clean two-factor solution, with the two satisfaction items loading strongly on one factor
(loadings of 0.94 and 0.62) that accounted for 38% of the variance. Cronbach’s α = 0.87 indicated good
internal consistency.</p>
          <p>Perceived Recommendation Quality. Based on the advice-solicitation scale of Starke et al. [29],
we used three items: “I found the recommended articles to be interesting,” “The recommended articles
fitted my preferences,” and “The recommended articles were relevant to me.” Responses were averaged
into a single perceived quality score. An exploratory factor analysis (principal-axis factoring with
varimax rotation) on Phase 1 data showed that all three items loaded on one factor (loadings of 0.74
for “interesting,” 0.88 for “relevant,” and 0.51 for “fit preferences”), which accounted for 40% of the
variance. Cronbach’s α = 0.87 for this Quality factor indicated good internal consistency.</p>
          <p>Perceived Diversity. To assess recommendation variety, we developed a three-item scale: “The
recommended articles were similar to each other” (reverse-scored), “The recommended articles differed
in terms of their topics,” and “The diversity in the recommended list of articles was high.” An exploratory
factor analysis (principal-axis factoring, no rotation) on Phase 1 responses revealed that the “similar
to each other” item loaded weakly and negatively (loading of −0.13), whereas the “differed in topics” and
“diversity high” items loaded strongly (loadings of 0.96 and 0.58, respectively). Initial internal consistency
across all three items was poor (Cronbach’s α = 0.21). After removing the reverse-scored “similar
to each other” item, reliability for the remaining two items improved to Cronbach’s α = 0.72, supporting
the use of this two-item composite diversity score in subsequent analyses.</p>
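          <p>For readers who wish to reproduce reliability checks of this kind, the following minimal sketch computes Cronbach’s α from item-level responses using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the summed score), together with reverse-scoring on a 5-point scale. The response data are invented purely for illustration and are not the study data.</p>
          <preformat><![CDATA[
```python
from statistics import variance


def reverse_score(rating, scale_min=1, scale_max=5):
    """Reverse-score a Likert rating (on a 1-5 scale, 1 becomes 5 and 4 becomes 2)."""
    return scale_max + scale_min - rating


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of per-participant ratings."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    totals = [sum(responses) for responses in zip(*items)]  # summed score per participant
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1.0 - item_variance_sum / variance(totals))


# Invented ratings from five hypothetical participants on two items.
item_a = [4, 5, 3, 4, 2]
item_b = [4, 4, 3, 5, 2]
alpha = cronbach_alpha([item_a, item_b])
```
]]></preformat>
          <p>With only two items, as in the final Choice Satisfaction and Perceived Diversity composites, α depends solely on how strongly the two items covary, which is why dropping a poorly fitting third item can raise it substantially.</p>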
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>We confirmed internal consistency with Cronbach’s α (quality: α = .87; satisfaction: α = .87) and
assessed temporal stability via intraclass correlations over the 48 h interval (quality: ICC<sub>1,2</sub> = .54;
satisfaction: ICC<sub>1,2</sub> = .43). A manipulation check on self-reported familiarity in Phase 2 showed
that Alignment participants reported greater familiarity (M = 71.5%, SD = 29.6) than Disalignment
participants (M = 36.5%, SD = 34.2), t(162.5) = 7.05, p &lt; .001. Behavioral outcomes were evaluated
with a repeated-measures ANOVA on total article clicks. Changes in topic preferences between phases
were examined using paired-sample t-tests. Finally, to address RQ1, we compared Phase 2 Choice
Satisfaction across conditions with an independent-samples t-test.</p>
      <sec id="sec-4-1">
        <title>4.1. Manipulation Check</title>
        <p>To verify that our feed alignment manipulation had its intended effect on subjective familiarity without
inadvertently altering perceived diversity, we ran two Welch’s t-tests on Phase 2 self-reports. First,
participants in the Alignment condition reported a substantially higher percentage of familiar articles (M
= 71.46%, SD = 29.6) than those in the Disalignment condition (M = 36.46%, SD = 34.2): t(162.5) = 7.05,
p &lt; .001; see also Figure 3. This suggests that users who were shown more similar articles indeed reported
being more familiar with them. Second, perceived diversity did not differ between the Alignment (M = 3.80, SD =
1.07) and Disalignment (M = 3.82, SD = 1.03) conditions: t(162.8) = −0.13, p = .897. This shows that
while familiarity tracked the manipulated changes in similarity, users did not perceive the manipulation
as a change in the diversity of the presented content.</p>
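        <p>As a plausibility check, Welch’s t statistic and the Welch–Satterthwaite degrees of freedom can be recomputed from the reported summary statistics alone. The sketch below assumes the group sizes from Section 3.2 (Alignment: n = 81; Disalignment: n = 85); small deviations from the reported values are expected because the published means and standard deviations are rounded.</p>
        <preformat><![CDATA[
```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t and Welch-Satterthwaite df from group means, SDs, and sizes."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Self-reported familiarity in Phase 2 (Alignment vs. Disalignment).
t_fam, df_fam = welch_t(71.46, 29.6, 81, 36.46, 34.2, 85)

# Perceived diversity in Phase 2 (Alignment vs. Disalignment).
t_div, df_div = welch_t(3.80, 1.07, 81, 3.82, 1.03, 85)
```
]]></preformat>
        <p>Under these assumptions, the familiarity comparison yields t ≈ 7.06 with df ≈ 162.5, and the diversity comparison yields t ≈ −0.12, both in line with the values reported above.</p>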
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Behavioral Outcomes</title>
        <p>A repeated-measures ANOVA on total article clicks showed no significant main effect of Condition, F(1, 164) = 2.59, p = .109, η² = .011, but a significant main effect of Phase,
F(1, 164) = 9.34, p = .003, η² = .018, indicating an overall change in click behavior over time. The
Condition × Phase interaction was small but reached significance at the α = .05 level, F(1, 164) = 4.08, p = .045, η² = .008, suggesting
that the change in clicks between phases differed somewhat across groups. The time course of clicks is visualized in Figure 4.</p>
        <p>[Table: Mean (SD) total article clicks by condition and phase.]</p>
        <p>[Table: Repeated-measures ANOVA on the total number of article clicks by a user.]</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Topic-Preference Shifts</title>
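        <p>To explore whether exposure to Aligned versus Disaligned feeds induced any shifts in topical interests,
we conducted paired-sample t-tests on the difference scores (Phase 2 minus Phase 1) for each of the 12 topics.
As shown in Table 4, all mean changes were small and not significant at the α = .05 level. Figure 5
visualizes the distribution of these change scores (Phase 2 minus Phase 1).</p>
        <table-wrap id="tab4">
          <label>Table 4</label>
          <caption>
            <p>Mean changes in topic preferences (Phase 2 – Phase 1) for each topic, using paired t-tests. All p values exceeded
.05, indicating that topic preferences did not change significantly.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Topic</th><th>Δ</th><th>t</th><th>df</th><th>p</th></tr>
            </thead>
            <tbody>
              <tr><td>Sport</td><td>−0.08</td><td>−1.61</td><td>165</td><td>.109</td></tr>
              <tr><td>Politics</td><td>−0.05</td><td>−0.94</td><td>165</td><td>.347</td></tr>
              <tr><td>Food &amp; Drink</td><td>−0.06</td><td>−1.15</td><td>165</td><td>.253</td></tr>
              <tr><td>Climate &amp; Environment</td><td>−0.01</td><td>−0.20</td><td>165</td><td>.842</td></tr>
              <tr><td>Lifestyle &amp; Health</td><td>−0.04</td><td>−0.73</td><td>165</td><td>.469</td></tr>
              <tr><td>Health &amp; Research</td><td>−0.08</td><td>−1.22</td><td>165</td><td>.224</td></tr>
              <tr><td>Society &amp; Work</td><td>−0.08</td><td>−1.22</td><td>165</td><td>.224</td></tr>
              <tr><td>Economy &amp; Business</td><td>−0.05</td><td>−0.89</td><td>165</td><td>.373</td></tr>
              <tr><td>Technology &amp; Science</td><td>−0.04</td><td>−0.73</td><td>165</td><td>.469</td></tr>
              <tr><td>Crime &amp; Legal</td><td>−0.06</td><td>−0.93</td><td>165</td><td>.355</td></tr>
              <tr><td>Entertainment &amp; Celebrities</td><td>+0.02</td><td>+0.43</td><td>165</td><td>.671</td></tr>
              <tr><td>International &amp; Global Conflicts</td><td>−0.02</td><td>−0.34</td><td>165</td><td>.733</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>[Figure 5: Distribution of topic-preference change scores (Phase 2 minus Phase 1); the horizontal dashed line indicates zero change.]</p>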
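        <p>The per-topic test reduces to a one-sample t-test on the within-participant difference scores. A minimal sketch is given below; the ratings are invented for illustration (five hypothetical participants on one topic), whereas the study used N = 166 and therefore df = 165.</p>
        <preformat><![CDATA[
```python
import math
from statistics import mean, stdev


def paired_t(phase1, phase2):
    """Paired-sample t-test on difference scores (Phase 2 minus Phase 1).

    Returns the mean difference, the t statistic, and the degrees of freedom.
    """
    diffs = [after - before for before, after in zip(phase1, phase2)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0.0:
        raise ValueError("all difference scores are identical; t is undefined")
    mean_diff = mean(diffs)
    t = mean_diff / (sd / math.sqrt(n))
    return mean_diff, t, n - 1


# Invented topic ratings for five hypothetical participants.
phase1_ratings = [4, 3, 5, 2, 4]
phase2_ratings = [3, 3, 4, 2, 5]
delta, t_stat, df = paired_t(phase1_ratings, phase2_ratings)
```
]]></preformat>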
      </sec>
      <sec id="sec-4-4">
        <title>4.4. RQ1: Choice Satisfaction by Strategy</title>
        <p>An independent-samples t-test on Phase 2 Choice Satisfaction revealed no significant difference between
Alignment (M = 4.15, SD = 0.89) and Disalignment (M = 4.34, SD = 0.74): t(155.96) = −1.47, p = .144.</p>
        <p>As shown in Figure 6, the distributions of satisfaction scores largely overlapped, indicating comparable
levels of satisfaction between the recommendation strategies.</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.5. RQ2: Familiarity Effects</title>
        <p>Next, we examined whether Phase 2 percent familiarity predicted perceived recommendation quality
and engagement (article clicks). As shown in Table 5, the percentage of perceived familiarity did not
significantly predict perceived quality: b = −0.003, SE = 0.002, t = −1.65, p = .10 (R² = .016). Likewise,
the familiarity percentage did not predict article clicks (cf. Table 6): b = 0.011, SE = 0.011, t = 0.99,
p = .32 (R² = .006).</p>
        <p>[Figure caption fragment: … individual scores, horizontal lines represent medians and boxes span the interquartile range.]</p>
        <p>[Table 5: Phase 2 Regression: Familiarity Predicting Perceived Quality. Rows: Intercept, Percent Familiar, R².]</p>
        <p>[Table 6: Phase 2 Regression: Familiarity Predicting Article Clicks. Rows: Intercept, Percent Familiar, R².]</p>
        <p>In general, both conditions supported engagement and satisfaction. Participants remained
consistently satisfied with their choices and maintained stable topic interests across phases, and
higher reported familiarity had no adverse effects on perceived quality or clicking behavior.</p>
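        <p>The regressions in this section have a single predictor (percent familiarity), so their slope, standard error, t statistic, and R² can be obtained in closed form. The following sketch shows the computation on invented data; the variable names and values are hypothetical and do not reproduce the study’s estimates.</p>
        <preformat><![CDATA[
```python
import math
from statistics import mean


def simple_ols(x, y):
    """Ordinary least squares with one predictor: y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    se_slope = math.sqrt(sse / (n - 2) / sxx)  # residual variance uses n - 2 df
    return {
        "intercept": intercept,
        "slope": slope,
        "se": se_slope,
        "t": slope / se_slope,
        "r2": 1.0 - sse / sst,
    }


# Invented data: percent familiarity and mean perceived quality (1-5 scale).
percent_familiar = [0, 25, 50, 75, 100]
perceived_quality = [4.0, 4.5, 3.5, 4.0, 3.5]
fit = simple_ols(percent_familiar, perceived_quality)
```
]]></preformat>
        <p>Because the predictor is expressed in percentage points, the slope is the expected change in the outcome per one-point increase in reported familiarity, which explains why coefficients on this scale are numerically small.</p>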
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>This study investigated how the algorithmic alignment of a news recommendation feed influences
both subjective and behavioral user outcomes. Using a two-phase, between-subjects design, we
manipulated whether recommended articles were more or less “aligned” with participants’ elicited topic
preferences. In addition, we measured (1) self-reported familiarity, (2) perceived recommendation
quality, (3) choice satisfaction, (4) the total number of article clicks, and (5) changes in topic preferences.
Our manipulation successfully induced a higher percentage of familiarity in the Alignment condition
compared to the Disalignment condition. However, this did not translate into clear differences in perceived quality,
satisfaction, or engagement: we observed no statistically significant difference in choice satisfaction across
conditions, and our repeated-measures ANOVA on click behavior revealed no main effect of condition and
only a small Condition × Phase interaction (p = .045). Furthermore,
participants’ topic interests remained stable over the 48 h interval,
and percent familiarity did not predict either quality perceptions or clicking behavior.</p>
      <p>
        Regarding RQ1, we hypothesized that presenting articles with high similarity would increase user
satisfaction relative to a more diverse feed. Instead, satisfaction scores did not differ significantly
across the Alignment and Disalignment conditions. This suggests that once recommendations reach a
certain relevance threshold, further increases in similarity do not lead to additional benefit, mirroring
previous studies that reported diminishing personalization returns on engagement [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Users may
value novelty or variety just as much as pure similarity, and a feed that is too narrowly focused may
not improve, and might even reduce, perceived choice satisfaction in the long run.
      </p>
      <p>As for RQ2, neither regression reached significance: familiarity did not predict perceived
quality or clicking behavior. This aligns with simulation studies suggesting that similarity alone cannot
sustain engagement over time and that moderate diversification may be equally effective, or even
necessary, to prevent user fatigue [23, 24]. Our results might imply that in a real-world news context,
users do not simply click more or give higher quality ratings when they recognize content as familiar, which
may underscore the need for hybrid strategies that balance relevance with serendipity.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Limitations and Future Work</title>
      <p>Our study is subject to a few limitations. First, our study only considers two time points separated by
48 hours. This relatively short time frame constrains our ability to detect longer-term changes in topic
interests or the cumulative effects of personalization. Future work should include additional follow-up
waves over weeks or months to capture preference volatility and longer-term behavioral adaptation.</p>
      <p>Second, we relied on self-reported familiarity and topic preference ratings, which are subject to recall
biases. Incorporating objective measures, such as feed-level cosine similarity scores or diversity indices
calculated on the full recommendation list, might strengthen the robustness of future findings.</p>
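      <p>As an illustration of such objective measures, the sketch below computes a feed-level alignment score (mean cosine similarity between a user's topic profile and each recommended article) and an intra-list diversity index (mean pairwise cosine distance within the feed). The function names, the three-dimensional topic vectors, and the example feeds are hypothetical; they are not the representation used in our system.</p>
      <preformat><![CDATA[
```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def feed_alignment(profile, feed):
    """Mean cosine similarity between a topic profile and every article in the feed."""
    return sum(cosine(profile, item) for item in feed) / len(feed)


def intra_list_diversity(feed):
    """Mean pairwise cosine distance (1 - similarity) over all article pairs."""
    pairs = [(i, j) for i in range(len(feed)) for j in range(i + 1, len(feed))]
    if not pairs:
        return 0.0
    return sum(1.0 - cosine(feed[i], feed[j]) for i, j in pairs) / len(pairs)


# Hypothetical topic vectors over three topics (e.g. politics, sports, culture).
profile = [1.0, 0.0, 0.0]
aligned_feed = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
mixed_feed = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

print(feed_alignment(profile, aligned_feed), intra_list_diversity(aligned_feed))
print(feed_alignment(profile, mixed_feed), intra_list_diversity(mixed_feed))
```
]]></preformat>
      <p>Logging both scores per participant and phase would allow self-reported familiarity to be checked against the similarity actually delivered by the feed.</p>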
      <p>Third, our participant pool was drawn from Prolific, a relatively inexpensive crowdsourcing platform.
While it provides rapid data collection, it does not necessarily represent the full diversity of news
consumers. The demographics and engagement patterns of Prolific users may differ from those of
general audiences, potentially limiting external validity. Future studies should sample from multiple
platforms and demographic strata.</p>
      <p>Fourth, our behavioral engagement metrics were limited to article clicks and time on feed within the
experimental interface. These metrics do not capture longer-term news consumption behaviors, such as
sharing, commenting, or return visits. Expanding engagement measures to include social interactions
might yield a more comprehensive picture of user response.</p>
      <p>Finally, our diversity manipulation was implemented using a single “high diversity” indicator. This
may not fully capture the multifaceted nature of content variety or ideological breadth. More
sophisticated diversification strategies, such as topic-aware re-ranking or hybrid filtering, could produce
different patterns of user response and merit exploration in future work.</p>
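      <p>One way such a topic-aware re-ranking could look is a greedy Maximal Marginal Relevance (MMR) selection, which trades off similarity to the user's topic profile against redundancy with articles already selected. The sketch below is an illustrative assumption, not the diversification used in our study; the function names, the trade-off parameter, and the topic vectors are hypothetical.</p>
      <preformat><![CDATA[
```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mmr_rerank(profile, candidates, k, trade_off=0.5):
    """Greedily pick up to k article ids by Maximal Marginal Relevance.

    candidates: dict mapping article id to topic vector.
    trade_off: weight on relevance; 1.0 reduces to a pure similarity ranking.
    """
    remaining = dict(candidates)
    selected = []
    while remaining and len(selected) < k:

        def score(article_id):
            relevance = cosine(profile, remaining[article_id])
            # Redundancy is the highest similarity to any article already chosen.
            redundancy = max(
                (cosine(remaining[article_id], candidates[s]) for s in selected),
                default=0.0,
            )
            return trade_off * relevance - (1.0 - trade_off) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


# Hypothetical candidates: "a" and "b" cover nearly the same topic, "c" a different one.
profile = [0.8, 0.6, 0.0]
candidates = {"a": [1.0, 0.0, 0.0], "b": [1.0, 0.1, 0.0], "c": [0.0, 1.0, 0.0]}

similarity_only = mmr_rerank(profile, candidates, k=2, trade_off=1.0)
diversified = mmr_rerank(profile, candidates, k=2, trade_off=0.5)
print(similarity_only, diversified)
```
]]></preformat>
      <p>Varying the trade-off weight yields a graded diversity manipulation rather than a single high-diversity indicator.</p>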
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work was supported by the Research Council of Norway with funding to MediaFutures: Research
Centre for Responsible Media Technology and Innovation, through the Centre for Research-based
Innovation scheme, project number 309339.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>We confirm that all original text in this paper was written by the authors. An AI-based writing assistant,
WriteFull in Overleaf, was used to check grammar and spelling and improve the clarity of the authors’
written text. The content and intellectual contributions remain entirely those of the human authors.</p>
      <p>the filter bubble while maintaining relevance: Targeted diversification with VAE-based recommender systems, in: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2022, pp. 2524–2531.</p>
      <p>[19] F. Lu, A. Dumitrache, D. Graus, Beyond optimizing for clicks: Incorporating editorial values in news recommendation, in: Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization, 2020, pp. 145–153.</p>
      <p>[20] V. W. Anelli, V. Bellini, T. Di Noia, W. La Bruna, P. Tomeo, E. Di Sciascio, An analysis on time- and session-aware diversification in recommender systems, in: Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization, UMAP ’17, Association for Computing Machinery, New York, NY, USA, 2017, pp. 270–274. URL: https://doi.org/10.1145/3079628.3079703. doi:10.1145/3079628.3079703.</p>
      <p>[21] F. Bianchi, F. Squazzoni, Agent-based models in sociology, Wiley Interdisciplinary Reviews: Computational Statistics 7 (2015) 284–306.</p>
      <p>[22] F. Wall, Agent-based modeling in managerial science: an illustrative survey and study, Review of Managerial Science 10 (2016) 135–193.</p>
      <p>[23] J. Zhang, G. Adomavicius, A. Gupta, W. Ketter, Consumption and performance: Understanding longitudinal dynamics of recommender systems via an agent-based simulation framework, Info. Sys. Research 31 (2020) 76–101. URL: https://doi.org/10.1287/isre.2019.0876. doi:10.1287/isre.2019.0876.</p>
      <p>[24] A. Ferraro, D. Jannach, X. Serra, Exploring longitudinal effects of session-based recommendations, in: Proceedings of the 14th ACM Conference on Recommender Systems, 2020, pp. 474–479.</p>
      <p>[25] D. Lee, K. Hosanagar, How do recommender systems affect sales diversity? A cross-category investigation via randomized field experiment, SSRN Electronic Journal (2017). doi:10.2139/ssrn.2603361.</p>
      <p>[26] D. Fleder, K. Hosanagar, Blockbuster culture’s next rise or fall: The impact of recommender systems on sales diversity, Management Science 55 (2009) 697–712.</p>
      <p>[27] G. Adomavicius, Y. Kwon, Improving aggregate recommendation diversity using ranking-based techniques, IEEE Transactions on Knowledge and Data Engineering 24 (2012) 896–911. doi:10.1109/TKDE.2011.15.</p>
      <p>[28] B. P. Knijnenburg, M. C. Willemsen, Z. Gantner, H. Soncu, C. Newell, Explaining the user experience of recommender systems, in: Proceedings of the 2012 ACM Conference on Recommender Systems (RecSys ’12), ACM, New York, NY, USA, 2012, pp. 141–148. doi:10.1145/2365952.2365974.</p>
      <p>[29] A. Starke, The effectiveness of advice solicitation and social peers in an energy recommender system, in: 6th Joint Workshop on Interfaces and Human Decision Making for Recommender Systems, IntRS 2019, CEUR-WS.org, 2019, pp. 65–71.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Karimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jugovac</surname>
          </string-name>
          ,
          <article-title>News recommender systems-survey and roads ahead</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>54</volume>
          (
          <year>2018</year>
          )
          <fpage>1203</fpage>
          -
          <lpage>1227</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N.</given-names>
            <surname>Helberger</surname>
          </string-name>
          ,
          <article-title>On the democratic role of news recommenders</article-title>
          , in: Algorithms, automation, and news, Routledge,
          <year>2021</year>
          , pp.
          <fpage>14</fpage>
          -
          <lpage>33</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Raza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <article-title>News recommender system: a review of recent progress, challenges, and opportunities</article-title>
          ,
          <source>Artificial Intelligence Review</source>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>52</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Dahlgren</surname>
          </string-name>
          ,
          <article-title>A critical review of filter bubbles and a comparison with selective exposure</article-title>
          ,
          <source>Nordicom Review</source>
          <volume>42</volume>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Flaxman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Goel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Rao</surname>
          </string-name>
          ,
          <article-title>Filter bubbles, echo chambers, and online news consumption</article-title>
          ,
          <source>Public Opinion Quarterly</source>
          <volume>80</volume>
          (
          <year>2016</year>
          )
          <fpage>298</fpage>
          -
          <lpage>320</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T. T.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.-M.</given-names>
            <surname>Hui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Harper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Terveen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Konstan</surname>
          </string-name>
          ,
          <article-title>Exploring the filter bubble: the effect of using recommender systems on content diversity</article-title>
          ,
          <source>in: Proceedings of the 23rd international conference on World wide web</source>
          ,
          <year>2014</year>
          , pp.
          <fpage>677</fpage>
          -
          <lpage>686</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Möller</surname>
          </string-name>
          ,
          <article-title>Filter bubbles and digital echo chambers</article-title>
          , in: The Routledge Companion to Media Disinformation and Populism
          , Routledge,
          <year>2021</year>
          , pp.
          <fpage>92</fpage>
          -
          <lpage>100</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Rosnes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Starke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Trattner</surname>
          </string-name>
          ,
          <article-title>Shaping the future of content-based news recommenders: Insights from evaluating feature-specific similarity metrics</article-title>
          ,
          <source>in: Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>201</fpage>
          -
          <lpage>211</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>L.</given-names>
            <surname>Michiels</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leysen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smets</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Goethals</surname>
          </string-name>
          ,
          <article-title>What are filter bubbles really? a review of the conceptual and empirical work</article-title>
          ,
          <source>in: Adjunct proceedings of the 30th ACM conference on user modeling, adaptation and personalization</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>274</fpage>
          -
          <lpage>279</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>E.</given-names>
            <surname>Pariser</surname>
          </string-name>
          ,
          <article-title>The filter bubble: What the Internet is hiding from you</article-title>
          ,
          <source>Penguin UK</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Langford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Schapire</surname>
          </string-name>
          ,
          <article-title>A contextual-bandit approach to personalized news article recommendation</article-title>
          ,
          <source>in: Proceedings of the 19th international conference on World wide web</source>
          ,
          <year>2010</year>
          , pp.
          <fpage>661</fpage>
          -
          <lpage>670</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dolan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. R.</given-names>
            <surname>Pedersen</surname>
          </string-name>
          ,
          <article-title>Personalized news recommendation based on click behavior</article-title>
          ,
          <source>in: Proceedings of the 15th international conference on Intelligent user interfaces</source>
          ,
          <year>2010</year>
          , pp.
          <fpage>31</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>F.</given-names>
            <surname>Garcin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dimitrakakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Faltings</surname>
          </string-name>
          ,
          <article-title>Personalized news recommendation with context trees</article-title>
          ,
          <source>in: Proceedings of the 7th ACM Conference on Recommender Systems</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>105</fpage>
          -
          <lpage>112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jung</surname>
          </string-name>
          ,
          <article-title>The impact of recommendation system on user satisfaction: A moderated mediation approach</article-title>
          ,
          <source>Journal of Theoretical and Applied Electronic Commerce Research</source>
          <volume>19</volume>
          (
          <year>2024</year>
          )
          <fpage>448</fpage>
          -
          <lpage>466</lpage>
          . URL: https://www.mdpi.com/0718-1876/19/1/24.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>N.</given-names>
            <surname>Helberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Karppinen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>D'Acunto</surname>
          </string-name>
          ,
          <article-title>Exposure diversity as a design principle for recommender systems</article-title>
          ,
          <source>Information, Communication &amp; Society</source>
          <volume>21</volume>
          (
          <year>2018</year>
          )
          <fpage>191</fpage>
          -
          <lpage>207</lpage>
          . doi:10.1080/1369118X.2016.1271900.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Sagtani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Jhawar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mehrotra</surname>
          </string-name>
          ,
          <article-title>Quantifying and leveraging user fatigue for interventions in recommender systems</article-title>
          ,
          <source>in: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>2293</fpage>
          -
          <lpage>2297</lpage>
          . doi:10.1145/3539618.3592044.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>D.</given-names>
            <surname>Holtz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Carterette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Chandar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Nazari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Cramer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Aral</surname>
          </string-name>
          ,
          <article-title>The engagement-diversity connection: Evidence from a field experiment on spotify</article-title>
          ,
          <source>in: Proceedings of the 21st ACM Conference on Economics and Computation</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>75</fpage>
          -
          <lpage>76</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Mai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Bouadjenek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Waller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Anderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bodkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sanner</surname>
          </string-name>
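          ,
          <article-title>Mitigating the filter bubble while maintaining relevance: Targeted diversification with VAE-based recommender systems</article-title>
          ,
          <source>in: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>2524</fpage>
          -
          <lpage>2531</lpage>
          .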
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>