<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Topical Preference Trumps Other Features in News Recommendation: A Conjoint Analysis on a Representative Sample from Norway</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Erik Knudsen</string-name>
          <email>Erik.Knudsen@uib.no</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alain D. Starke</string-name>
          <email>a.d.starke@uva.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christoph Trattner</string-name>
          <email>Christoph.Trattner@uib.no</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Amsterdam School of Communication Research (ASCoR), University of Amsterdam</institution>
          ,
          <addr-line>PO Box 15791, 1001 NG</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Amsterdam</institution>
          ,
          <country country="NL">Netherlands</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>MediaFutures, University of Bergen</institution>
          ,
          <addr-line>Lars Hilles gate 30, 5008 Bergen</addr-line>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>A variety of news article features can be used to tailor news content. However, only a few studies have compared the relative importance of different features in predicting news reading behavior in the context of news recommender systems. This study reports the results of a conjoint experiment in which we examined the relative importance of seven features in predicting a user's intention to read: topic headline (Abortion vs Meat Eating), reading time, recency, geographic distance, topical preference match, demographic similarity, and general popularity in a news recommender system. To ensure an externally valid result, the study was distributed among a representative Norwegian sample (N = 1664), where users had to choose their preferred news article profile from four different pairs. We found that a topical preference match was by far the strongest predictor for choosing a news article, while recency and demographic similarity had no impact.</p>
      </abstract>
      <kwd-group>
        <kwd>Norway</kwd>
        <kwd>news</kwd>
        <kwd>recommender systems</kwd>
        <kwd>choice-based conjoint experiment</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        News recommender systems face various domain-specific challenges [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. A user’s interest can
strongly depend on contextual factors [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], such as time of the day, the user’s current location,
or the technology used to read the article [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>
        ]. Moreover, while most recommender systems
use historical data to present content that a user likes or needs [
        <xref ref-type="bibr" rid="ref6 ref7 ref8">6, 7, 8</xref>
        ], this is challenging in
news. There is a fast churn of items, news articles may be updated, and many users do not log
in at all [9, 10, 11].
      </p>
      <p>
        Most news recommender systems face a ‘permanent cold-start problem’ [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Besides showing
the most recent items [12], many news recommender applications therefore focus on
content-based recommendations. Central to such approaches are different news article features [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
These can describe the article’s content, authorship, and contextual factors. The list of possible
features is long, for it can not only include news article content (e.g., title or headline, keywords),
but also a user’s location or time of the day [13]. Such features can be used in various methods
of similarity comparisons, such as TF-IDF and cosine similarity [14, 15].
      </p>
      <p>https://www.uib.no/en/persons/Erik.Knudsen (E. Knudsen); https://christophtrattner.info/ (C. Trattner)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        Only a few of these features may be deemed useful by users. Studies have examined news
recommender scenarios through so-called similar-item recommendations [
        <xref ref-type="bibr" rid="ref1">1, 16</xref>
        ]. This is a
common method of recommendation on news websites, using the rationale to show ‘more like
this’ of a reference news article a user likes [17]. Recommendation approaches typically involve
similarity functions, falling in line with studies on semantic similarity [18].
      </p>
      <p>
        It is unclear how important each news article feature is, relatively speaking. Most studies
involve offline evaluation [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], focusing on training a model using a subset of features with the
goal of improving accuracy, instead of performing a holistic evaluation of multiple features.
One study [17], in the context of semantic similarity, has shown that title-based and body-text
similarity functions seem to represent user similarity judgments better than recency-based or
image-based features do. This suggests that users are more likely to rate recommendations
based on news article text as accurate in a personalization scenario. However, this study did
not include all factors, such as the proximity of a reader to the news event [19, 20], whether the
news article is among the most read articles [21], or whether it is clear why the news article is
recommended to a user [22]. Moreover, such studies have not directly compared the relative
importance of all news article features.
      </p>
      <p>
        It is not entirely clear which recommendation mechanisms users would like a news
recommender system to take into account. For example, while the offline performance of different
recommendation approaches has been examined [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], it is less clear whether users would prefer
recency-based recommendations (i.e., content-based) over crowdsourcing-based
recommendations (i.e., collaborative filtering). Moreover, it is unclear how their importance relates to that of
other factors, such as personalization based on demographics.
      </p>
      <p>
        While numerous studies have looked into algorithmic optimization of news retrieval and
recommendation [
        <xref ref-type="bibr" rid="ref1">1, 23</xref>
        ], less is known about how users evaluate presented recommendations.
Also, the study of Starke et al. [17] only involves a user’s similarity judgment as a dependent
variable, and does not include an evaluative measure that is directly related to a recommendation
scenario.
      </p>
      <sec id="sec-1-1">
        <title>1.1. Research Question and Contribution</title>
        <p>
          In this study, we examine the relative importance of a variety of factors in a news
recommendation scenario. Included are seven features, which have been identified in earlier studies on news
recommendation [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. First, a news article’s topic and whether it aligns with a user’s interests,
because topic modeling and tags are at the core of many news recommender approaches [24],
transparency about whether it aligns with a user’s previous rating (cf. [25, 26]), and whether
the news article is recommended or not. Moreover, we compare the importance of whether
the news article is recent, whether it is among the most popular news articles, whether it is a
short or long article (i.e., reading time), and whether the news article discusses events that are
spatially close to the user (i.e., proximity). We present the following research question:
• RQ: In the context of news personalization, which news article features are the strongest
predictors of a user’s intention to read?
        </p>
        <p>This research question is examined using a method that is novel for the recommendation
domain. Whereas most studies will focus on offline evaluation or some form of online evaluation,
possibly in the context of an evaluation framework [27, 28], this study presents a controlled
experiment. We employ conjoint analysis [29], a type of experiment that is increasingly used in
social science research, which aims to understand multidimensional decision-making, through
for instance a best-to-worst scaling of different factors by asking users to indicate a preferred
alternative from multiple sets of options. Since each of these items contains certain values for
different attributes, the importance of these attributes can be determined [29]. This method is
one of the most notable advancements in social science experimental research in the last decade
[30]. In this study, we present pairs of news articles that users need to choose between in terms
of their preferences.</p>
        <p>Another stand-out aspect of this study is its sample. Our news article vignettes have been
evaluated by a representative sample of the Norwegian population, being part of a survey
administered by the Norwegian Citizen Panel (NCP). Whereas many computer science studies
currently rely on crowdsourcing participants from websites such as Amazon MTurk or Prolific
due to convenience and an improvement in quality compared to university student samples
[31], their demographic characteristics are far from representative of the larger population [32].
This is particularly important in the news recommender domain, where not only predictive
accuracy should be considered, but also democratic and normative values [33, 34].</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Methods</title>
      <p>In a choice-based conjoint experiment with a probability-based representative sample of
Norwegian Internet users (N = 1664), we tested the effects of news headline features and news
recommender system features on users’ intention to click on and read a full news article.</p>
      <sec id="sec-2-1">
        <title>2.1. Participants</title>
        <p>Our data collection was part of the 24th wave of the Norwegian Citizen Panel (NCP) in June 2022.
The NCP is a highly-respected time-sharing online survey panel that collects high-quality survey
data (representative of the Norwegian adult population) three times a year using
probability-based sampling. While costly (i.e., our study cost ≈ 17,000 €), probability sampling is considered
“the gold standard” of survey research [35], as the entire adult population of Norway has an equal
and known probability of being invited. NCP’s time-sharing strategy also ensures that the high
cost of collecting such high-quality data is distributed over a large number of studies. The entire
panel of the NCP’s respondents are gathered through postal recruitment of individuals over 18
years, with regular new recruitment due to well-known issues of panel attrition over time [36].
These individuals were randomly selected for recruitment from Norway’s National Registry:
a list of all individuals who either are or have been a resident of Norway, maintained by the
official Tax Administration. NCP’s data are available free of cost for scholars via the Norwegian
Social Science Data Archive. For more details about response rates or other methodological
matters, please refer to the NCP methodology reports [37].</p>
        <p>A total of 10,160 of NCP’s panel respondents participated in the 24th wave of the NCP. Demographic
information was collected for all panel respondents. A random sub-sample of 1664 participants
in the NCP was assigned to and completed our experiment. In our sub-sample, 49 %
were female, 66 % had a higher education, and the median year of birth fell between 1960 and 1989.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Procedure, Research Design, Materials</title>
        <p>Our research question was addressed using a conjoint experiment. In such experiments,
participants are typically asked to choose between a number of alternatives in terms of a specific
dependent variable, such as favorability or appropriateness [30]. Each alternative, referred to
as a Profile, contained randomly assigned levels of different features. By predicting for each
choice set what alternative was chosen in terms of its feature values, the relative importance of
each feature can be determined.</p>
        <p>The news profile task is depicted in Figure 1. For each choice task, participants were asked to
choose between the two profiles, selecting the news story they would prefer to click on and
read. Each respondent evaluated two profiles of news item recommendations through four
consecutive choice tasks, resulting in a total of 13,311 observations.</p>
        <p>In this study, participants were presented with pairs of news article profiles (i.e., descriptions).
Each profile was composed of seven news article features with two levels. As can be observed
from Figure 1, these features had distinct levels that were randomly assigned to each profile. As
such, more than 2<sup>7</sup> combinations were possible, as it was formally a 2×2×2×2×2×2×2 research
design.</p>
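        <p>As a minimal sketch of the formal 2×2×2×2×2×2×2 design, the following Python snippet enumerates every combination of the seven two-level features. The feature and level names are taken from Table 1; the dictionary layout and the function name <monospace>all_profiles</monospace> are illustrative assumptions and not part of the survey software used in the study. The four example headlines are ignored here, so only the two topic levels are counted.</p>
        <preformat>
```python
from itertools import product

# Feature names and their two levels, as listed in Table 1.
FEATURES = {
    "Topic in Headline": ("Meat prices", "Extension of Abortion Rights"),
    "Reading time": ("2 minutes", "15 minutes"),
    "Recency of Publication": ("2 minutes ago", "5 days ago"),
    "Geographic Distance": ("Close", "Distant"),
    "Topical Preference Match": ("Yes", "No"),
    "Demographic Similarity": ("Read by people like you", "Not read by people like you"),
    "Popularity-based": ("Among Most Read", "Among Least Read"),
}


def all_profiles(features):
    """Return every combination of feature levels as a list of dicts."""
    names = list(features)
    return [dict(zip(names, levels)) for levels in product(*features.values())]


profiles = all_profiles(FEATURES)
print(len(profiles))  # 2 ** 7 = 128 formal profiles
```
        </preformat>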
        <p>
          The seven specific features, what they represented and the involved levels are outlined in
Table 1. For instance, each profile featured a randomly selected headline (out of four headlines
that were based on actual Norwegian news stories from the newspapers “Vårt Land”, “TV2”,
“NRK”, “Dagsavisen”, and “ABC nyheter”) on the topic of either abortion or meat prices (in
response to mitigating climate change)<sup>1</sup>. In addition, each profile featured information on whether
it was recommended, or not, due to, for instance, the demographic similarity, compared to
the respondents, of other users who had read the story. The features were selected based on
relevance in earlier work on news recommender systems (cf. [
          <xref ref-type="bibr" rid="ref1">24, 1</xref>
          ]); more detail can be found
in Section 2.3. As we opt for a statistically efficient conjoint design [38], and because each
feature only varied between two levels, each profile pair always displayed different feature
levels. To illustrate, if Profile 1 displayed a headline on abortion, Profile 2 always displayed a
headline on meat prices, and vice versa.
        </p>
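        <p>The mirrored pairing described above can be sketched as follows. This is an assumption-laden illustration: the exact randomization procedure of the survey platform is not documented here, and the function name <monospace>draw_choice_task</monospace> and the seed are hypothetical. For every feature, the two levels are placed in random order, so that Profile 1 and Profile 2 never share a level.</p>
        <preformat>
```python
import random

# Two levels per feature, as listed in Table 1.
FEATURES = {
    "Topic in Headline": ("Meat prices", "Extension of Abortion Rights"),
    "Reading time": ("2 minutes", "15 minutes"),
    "Recency of Publication": ("2 minutes ago", "5 days ago"),
    "Geographic Distance": ("Close", "Distant"),
    "Topical Preference Match": ("Yes", "No"),
    "Demographic Similarity": ("Read by people like you", "Not read by people like you"),
    "Popularity-based": ("Among Most Read", "Among Least Read"),
}


def draw_choice_task(features, rng):
    """Draw one profile pair in which the profiles get opposite levels per feature."""
    profile_1, profile_2 = {}, {}
    for name, levels in features.items():
        # Sampling both levels without replacement yields them in random order.
        first, second = rng.sample(levels, 2)
        profile_1[name] = first
        profile_2[name] = second
    return profile_1, profile_2


rng = random.Random(2022)  # fixed seed so the sketch is reproducible
for task in range(1, 5):  # four consecutive choice tasks per respondent
    p1, p2 = draw_choice_task(FEATURES, rng)
    print(task, p1["Topic in Headline"], "|", p2["Topic in Headline"])
```
        </preformat>
        <p>Because each feature has exactly two levels, fixing the levels of one profile fully determines the other, which is why a headline on abortion in Profile 1 always implies a headline on meat prices in Profile 2.</p>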
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Measures</title>
        <p>Independent variables. All experimental treatments (i.e., our independent variables) are listed
in Table 1. We provide additional detail on three of these attributes. First, the Topical Preference
factor was generated by asking each respondent, earlier in the survey, to rank a list of five
topics through the question “Below are a number of topics that Norwegian newspapers write
about. Please rank the topics according to which one you would like to know more about. You
should rank the topic that you would most like to read about at the top and the one that you
are least interested in reading at the bottom.” Based on each respondent’s ranking of these five
topics, a script matched each respondent’s ranking of topics with the topics presented in the
experiment. If the respondent had ranked one topic higher than another, the experiment would
present respondents with the following wording: “You have previously indicated that you are
interested in knowing more about the subject of the article. The article has therefore been
suggested to you, based on your ranking of the topic.” Vice versa, respondents were presented
with the following wording: “You have previously indicated that you are less interested in
knowing more about the subject of the article. The article has therefore not been suggested to
you, based on your ranking of the topic.”</p>
        <p><sup>1</sup> Note that although this strictly speaking involved more than two levels for the news headline feature, the two
relevant levels for the research design were the differences in topic, not the specific ‘example’ headline. Variation
was implemented due to the repeated measures per participant, while designating two levels per feature represented
a statistically efficient research design.</p>
        <p>Figure 1: Screenshot of the experiment from the respondents’ point of view.</p>
        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>Type</th><th>Feature</th><th>Levels</th></tr>
            </thead>
            <tbody>
              <tr><td>Article</td><td>Topic in Headline</td><td>Meat prices; Extension of Abortion Rights</td></tr>
              <tr><td>Article</td><td>Reading time</td><td>2 minutes; 15 minutes</td></tr>
              <tr><td>Article</td><td>Recency of Publication</td><td>2 minutes ago; 5 days ago</td></tr>
              <tr><td>RS</td><td>Geographic Distance</td><td>Close; Distant</td></tr>
              <tr><td>RS</td><td>Topical Preference Match</td><td>Yes; No</td></tr>
              <tr><td>RS</td><td>Demographic Similarity</td><td>Read by people like you; Not read by people like you</td></tr>
              <tr><td>RS</td><td>Popularity-based</td><td>Among Most Read; Among Least Read</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Second, the Geographical Distance factor mentioned a geographical place that was either close
or distant to the respondent, based on panel information of each respondent regarding where
they live. This information was exclusively displayed to the respondents and only recorded
as either “close” or “distant” to guarantee the utmost privacy of personal information for each
individual participant.</p>
        <p>Third, the Demographic Similarity feature was created based on the panel information
of each participant. A recommended article featured the following wording: “This article is
widely read among [respondent’s gender] in the [respondent’s age group] age group living in
[respondent’s Region], and is therefore suggested for people like you.” Vice versa, an article that
was not recommended featured the following wording: “This article is not widely read among
[respondent’s gender] in the [respondent’s age group] age group living in [respondent’s Region],
and is therefore suggested for people like you.” The information on gender, age, and region was
exclusively displayed to the respondents and only recorded as either “recommended” or “not
recommended” to guarantee the utmost privacy of personal information for each individual
participant.</p>
        <p>Dependent variable. Our outcome of interest in this paper is the participants’ binary choice
between two news stories. The participants’ task was to select which news item they would
click on and read based on the presented choices through the question “Which of these two
stories are you most likely to click on to read more?”, with the binary choice between “News
story 1” and “News story 2” as the dependent variable. The choice made in the study was
considered a stated intention to read, relative to the other news article presented.</p>
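        <p>The matching rule behind the Topical Preference factor can be illustrated with a short sketch. The two wordings are quoted from the description of the experiment; everything else is an assumption made for illustration: the function names are hypothetical, and the three placeholder topics stand in for ranked topics that are not named in this paper. The sketch also assumes that both experiment topics appear in the ranked list.</p>
        <preformat>
```python
MATCH_TEXT = (
    "You have previously indicated that you are interested in knowing more "
    "about the subject of the article. The article has therefore been "
    "suggested to you, based on your ranking of the topic."
)
MISMATCH_TEXT = (
    "You have previously indicated that you are less interested in knowing "
    "more about the subject of the article. The article has therefore not "
    "been suggested to you, based on your ranking of the topic."
)


def is_topical_match(ranking, topic, other_topic):
    """True when `topic` sits above `other_topic` in the ranking (index 0 = top)."""
    return ranking.index(other_topic) > ranking.index(topic)


def preference_wording(ranking, topic, other_topic):
    """Return the wording shown for `topic`, given the respondent's ranking."""
    if is_topical_match(ranking, topic, other_topic):
        return MATCH_TEXT
    return MISMATCH_TEXT


# Hypothetical ranking: the placeholders stand in for the three unnamed topics.
ranking = [
    "placeholder topic A",
    "abortion",
    "placeholder topic B",
    "meat prices",
    "placeholder topic C",
]
print(preference_wording(ranking, "abortion", "meat prices"))
```
        </preformat>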
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Analysis</title>
        <p>The conjoint design enabled us to analyze the influence of individual news article features. To
compare treatment effects, we estimated the average marginal component effects (AMCE) of
each treatment [29]. This represented the marginal effect of one feature averaged across the
joint distribution of the other factors. While such designs produce a high number of possible
treatment combinations, Hainmueller and colleagues [29] demonstrate that not all specific
combinations are required to estimate the AMCEs of each component.</p>
        <p>We estimated the AMCEs by regressing news story selection on the features. To this end,
logistic regression with within-respondent clustering was used to ensure robust standard errors
and get unbiased estimates of the variance [29].</p>
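        <p>To make the idea of an AMCE concrete, the sketch below uses a simpler estimator than the clustered logistic regression reported in this paper: for a randomized two-level feature, the difference in mean choice rates between the two levels is a common way to estimate its AMCE. The data are simulated, the function names are hypothetical, and the effect of 0.14 is an arbitrary value chosen for illustration; respondent-level clustering of standard errors is not covered.</p>
        <preformat>
```python
import random
from statistics import mean


def simulate_rows(n_tasks, effect, rng):
    """Simulate paired choice tasks for one binary feature.

    Each task shows two profiles with opposite levels. The profile with
    level 1 is chosen with probability 0.5 + effect / 2, so the true
    difference in choice rates between the levels equals `effect`.
    Returns (level, chosen) rows, one per displayed profile.
    """
    rows = []
    for _ in range(n_tasks):
        # random() is uniform on [0, 1), so this holds with probability 0.5 + effect / 2.
        level_1_chosen = rng.random() >= 0.5 - effect / 2
        rows.append((1, int(level_1_chosen)))
        rows.append((0, int(not level_1_chosen)))
    return rows


def amce_difference_in_means(rows):
    """Difference in mean choice rate between level 1 and level 0."""
    chosen_1 = [chosen for level, chosen in rows if level == 1]
    chosen_0 = [chosen for level, chosen in rows if level == 0]
    return mean(chosen_1) - mean(chosen_0)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
rows = simulate_rows(n_tasks=1664 * 4, effect=0.14, rng=rng)
estimate = amce_difference_in_means(rows)
print(round(estimate, 3))
```
        </preformat>
        <p>With 1664 simulated respondents and four tasks each, the estimate should land close to the simulated effect; the exact value depends on the seed.</p>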
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results</title>
      <p>The results of our AMCE-based logistic regression analysis are displayed in Figure 2. The
dots indicate point estimates, bars illustrate 95% confidence intervals, and dots without bars
represent the reference categories. The AMCEs can be interpreted as percentage points
[30]. The coefficients and significance levels are also presented in Table 2 below.</p>
      <table-wrap>
        <label>Table 2</label>
        <table>
          <thead>
            <tr><th>Feature</th><th>Coefficient (S.E.)</th></tr>
          </thead>
          <tbody>
            <tr><td>Topic in Headline</td><td>0.04 (0.09)</td></tr>
            <tr><td>Reading time</td><td>0.05 (0.05)</td></tr>
            <tr><td>Recency of Publication</td><td>-0.01 (0.05)</td></tr>
            <tr><td>Geographical Proximity</td><td>0.14 (0.05)**</td></tr>
            <tr><td>Topical Preference</td><td>0.87 (0.09)***</td></tr>
            <tr><td>Demographic Similarity</td><td>0.07 (0.05)</td></tr>
            <tr><td>Popularity-based</td><td>0.11 (0.05)*</td></tr>
            <tr><td>Observations</td><td>13380</td></tr>
            <tr><td>Pseudo R<sup>2</sup></td><td>0.067</td></tr>
            <tr><td>AIC</td><td>17876.75</td></tr>
            <tr><td>BIC</td><td>17929.27</td></tr>
          </tbody>
        </table>
        <table-wrap-foot>
          <p>Note: *** p &lt; 0.001, ** p &lt; 0.01, * p &lt; 0.05</p>
        </table-wrap-foot>
      </table-wrap>
      <p>Figure 2: Marginal effects (%) per feature level, grouped into article features (Topic in Headline, Reading time, Recency of Publication) and RS features (Geographical Proximity, Topical Preference, Demographic Similarity, Popularity-based).</p>
      <p>We observed no statistically significant treatment effects for any of the
news article features: the specific headline, reading
time, and recency of publication. This suggests that although these features were used in
earlier studies, they were less important in determining user preferences for news articles than
features that emphasize recommender aspects.</p>
      <p>In contrast, the news recommender system features showed substantial effects. Working
from top to bottom in Figure 2 and Table 2, we observed that respondents are 14 percentage
points more likely to click on a news story if a news recommender highlights a location that is
geographically close to the user, compared to an alternative that is more distant. This suggested
that news article preferences are in part determined by local relevance.</p>
      <p>The most substantial effect was found through topical preference recommendation.
Respondents were 87 percentage points more likely to choose a news story if it was recommended to
them due to a highlighted match in their topical interests, compared to a news story with a
topic that was lower ranked (p &lt; 0.001). This effect was much stronger than those induced by
any of the other features, which suggested that transparent topical preference matching was
the most important feature.</p>
      <p>The two remaining features showed mixed results. Highlighting demographic similarity
by indicating ‘people like you’ read this did not lead to significantly more preference choices
than ‘not read by people like you’ (a non-significant difference of 7 percentage points). In
contrast, highlighting the popularity of a news article, compared to unpopularity, led to a small,
significant increase in user preferences by 11 percentage points (p &lt; 0.05).</p>
      <sec id="sec-3-1">
        <title>3.1. Conclusion</title>
        <p>Overall, one feature had the largest impact. Comparing the treatment effects across all features
in our conjoint experiment, we observed that Topical Preference was the most important one (i.e.,
87 percentage points) for predicting news story selection, followed by Geographical Distance
and Popularity-based recommendations.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion</title>
      <p>
        We have presented the results of a user study in the context of news
recommender systems, employing a novel experimental design adopted from the social sciences.
Through a conjoint analysis, we have compared the impact of different news article features
and recommender system features on a user’s intention to read a news story. By doing so, we
have highlighted the relative importance of different news article features and recommender
system features that have been used in the past in news recommender studies [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], either in
offline learning tasks (cf. [13, 39]) or online evaluation studies (cf. [15, 17]).
      </p>
      <p>
        Our study has involved a scenario in which both news article features and recommendation
methods are presented transparently to the user. In this context, we find that Topical Preference
is particularly important for a user’s news story selection. This indicates that news recommender
systems that focus on surveying users’ topic preferences, and recommend stories based on
the answers from such surveys, will likely have a higher success rate in terms of
predicting clicks or reads. While prior work has argued that a topic match between the user
and news content should be more effective than a mismatch [24], our results contribute to
this literature by showing that it is far more effective than other forms of similarity, revealing
large differences with other features. Topical preference seems to trump recommendations based on
demographic similarity, general popularity, and recency, all of which are also often used in
news recommendation [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>Regarding the comparative evaluation of features, our findings are complementary to other
studies that examined multiple features simultaneously. For example, whereas Starke et al. [17]
show that title-based and body-text-based similarity resonate with a user’s similarity judgment,
we show which underlying mechanisms of recommendation are important in building user
preferences. Whereas other work in semantic similarity is focused on validating the ‘correctness’
of algorithmic functions [18], we have examined which features are preferred by users.</p>
      <p>
        These findings are important because they provide direction for designers of news
recommender systems. They illustrate that some features are indeed more important than others
when it comes to predicting what a user may read. This might narrow down which factors and features
should be considered in offline evaluation approaches, which are still the sole method in many
papers [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], to generate an algorithmic approach that actually resonates with users in online
evaluation [17, 18].
      </p>
      <p>
        Another strong point of this study is its research population and sample. Very few studies
in recommender research have tapped into representative samples for their evaluation. One
reason may be that such evaluations are costly for session-based studies, where the behavioral
implications remain unclear [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and therefore the sample is of less importance. Another reason
might be a systemic bias when it comes to how important such demographic characteristics
are. It could be, for example, that some studies with news recommenders have found favorable
results for a specific algorithmic approach because of a relatively young sample. By using a
representative sample from Norway, which is a typical ‘Western-type country’, the chance of
such a confounding bias is strongly reduced.
      </p>
      <p>
        The extent to which our effect sizes hold up in more traditional recommender studies is less
clear. For this study, some features have been subject to two ‘ends of the scale’. For example,
popularity was compared by showing either ‘least popular’ or ‘most popular’, which is unlikely
to happen in a recommender field experiment with news. There, popularity is one of many
possible approaches [
        <xref ref-type="bibr" rid="ref1">24, 1</xref>
        ], which would be juxtaposed to other ‘best guesses’, not a different,
‘intentionally poor’ approach. Thus, we caution that our results should not be used to make
absolute statements on the probability that a recommender system feature influences news use.
At the same time, the goal of this study has been to assess the relative importance of different
features. Taking the ‘whole spectrum’ of a feature, such as by using least and most popular, does
provide a clear assessment of a feature’s importance. While this importance could be reduced
with a different baseline, all features except the headline had such extreme values, making it a
fair comparison.
      </p>
      <sec id="sec-4-1">
        <title>4.1. Limitations &amp; Future Work</title>
        <p>The main limitation of our design is the use of only two news topics, which are both controversial.
Abortion and meat-eating have led to polarized discussions in society [40, 41], which might have
amplified the importance of a match in the topic at hand. Hence, such a strong effect may not
be found for more nuanced topics, as our design is limited to two (controversial) political topics
and might not generalize to other news topics and genres. In addition, the design does not
mimic real news use decisions and thus it is hard to determine the extent to which our findings
would be reproduced outside the experimentally controlled environment in which we conduct
our study.</p>
        <p>
          Another limitation may explain the lack of a recency effect, which has been observed in
previous studies [
          <xref ref-type="bibr" rid="ref1">1, 42</xref>
          ]. Our headlines have not included breaking news elements, being more
“timeless” than many breaking news stories. This can, perhaps, have implications for recency,
for its importance may be amplified if a news story has just broken. Moreover, the fact that this
study has been rather hypothetical in the sense that only the headline, and not the entire news
article, was read, might have exacerbated the lack of a recency effect.
        </p>
        <p>
          Considering all the limitations mentioned, we encourage future studies to employ a more
naturalistic replication of this study. We argue that the conjoint method could still be
used since it provides a fair feature-based comparison, but that the user interaction with the
recommender system should be part of a news website context. This way, the user’s profile
would be more durable, as it would not be formed by a single preference elicitation session.
Moreover, more recent news articles could be used, which would also make the dependent variable
more relevant, in the sense that the presented news articles would be novel to a user.
This way, a user’s intention could also be related to actual reading behavior, which has been
investigated in some news recommender studies [
          <xref ref-type="bibr" rid="ref1">24, 1</xref>
          ]. Finally, assessing the effectiveness
of a conjoint method for preference elicitation would be a valuable line of research. Although
this approach has been used in only a few non-news domains in recommender research [43], it
would be new to the news recommender system domain.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work was supported by the NewsRec Project (project number: 324835) and by industry
partners and the Research Council of Norway with funding to MediaFutures: Research Centre
for Responsible Media Technology and Innovation, through the centers for Research-based
Innovation scheme (project number: 309339). We also thank a Dutch airline for directly
connecting Amsterdam and Bergen, and the inspiring Bergen mountainsides for boosting our
creativity.</p>
      <p>[9] T. Luostarinen, O. Kohonen, Using topic models in content-based news recommender
systems, in: Proceedings of the 19th Nordic conference of computational linguistics
(NODALIDA 2013), 2013, pp. 239–251.
[10] A. S. Das, M. Datar, A. Garg, S. Rajaram, Google news personalization: scalable online
collaborative filtering, in: Proceedings of the 16th international conference on World Wide
Web, 2007, pp. 271–280.
[11] W. Chu, S.-T. Park, T. Beaupre, N. Motgi, A. Phadke, S. Chakraborty, J. Zachariah, A
case study of behavior-driven conjoint analysis on yahoo! front page today module, in:
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery
and data mining, 2009, pp. 1097–1104.
[12] J.-w. Ahn, P. Brusilovsky, J. Grady, D. He, S. Y. Syn, Open user profiles for adaptive news
systems: help or harm?, in: Proceedings of the 16th international conference on World
Wide Web, 2007, pp. 11–20.
[13] L. Feremans, R. Verachtert, B. Goethals, A neighbourhood-based location- and time-aware
recommender system (2022).
[14] R. K. Pon, A. F. Cardenas, D. Buttler, T. Critchlow, Tracking multiple topics for finding
interesting articles, in: Proceedings of the 13th ACM SIGKDD international conference on
Knowledge discovery and data mining, 2007, pp. 560–569.
[15] K. F. Yeung, Y. Yang, A proactive personalized mobile news recommendation system, in:
Proceedings - 3rd International Conference on Developments in eSystems Engineering,
DeSE 2010, 2010.
[16] F. Garcin, K. Zhou, B. Faltings, V. Schickel, Personalized news recommendation based
on collaborative filtering, in: 2012 IEEE/WIC/ACM International Conferences on Web
Intelligence and Intelligent Agent Technology, volume 1, IEEE, 2012, pp. 437–441.
[17] A. D. Starke, S. Øverhaug, C. Trattner, Predicting feature-based similarity in the news
domain using human judgments, in: 15th ACM Conference on Recommender Systems,
RecSys 2021, 2021.
[18] N. Tintarev, J. Masthoff, Similarity for news recommender systems, in: Proceedings of
the AH’06 Workshop on Recommender Systems and Intelligent User Interfaces, Citeseer,
2006.
[19] D. Berkowitz, D. W. Beach, News sources and news context: The effect of routine news,
conflict and proximity, Journalism Quarterly 70 (1993) 4–12.
[20] P. J. Shoemaker, J. H. Lee, G. Han, A. A. Cohen, Proximity and scope as news values, Media
studies: Key issues and debates (2007) 231–248.
[21] F. Garcin, B. Faltings, O. Donatsch, A. Alazzawi, C. Bruttin, A. Huber, Offline and online
evaluation of news recommender systems at swissinfo.ch, in: Proceedings of the 8th ACM
Conference on Recommender systems, 2014, pp. 169–176.
[22] N. Diakopoulos, M. Koliska, Algorithmic transparency in the news media, Digital
journalism 5 (2017) 809–828.
[23] T. Bogers, A. Van Den Bosch, Comparing and evaluating information retrieval algorithms
for news recommendation, in: RecSys’07: Proceedings of the 2007 ACM Conference on
Recommender Systems, 2007. doi:10.1145/1297231.1297256.
[24] C. Feng, M. Khan, A. U. Rahman, A. Ahmad, News recommendation
systems-accomplishments, challenges &amp; future directions, IEEE Access 8 (2020) 16702–16725.
[25] R. Blanco, D. Ceccarelli, C. Lucchese, R. Perego, F. Silvestri, You should read this! let me
explain you why: explaining news recommendations to users, in: Proceedings of the
21st ACM international conference on Information and knowledge management, 2012, pp.
1995–1999.
[26] M. Ter Hoeve, M. Heruer, D. Odijk, A. Schuth, M. de Rijke, Do news consumers want
explanations for personalized news rankings, in: FATREC Workshop on Responsible
Recommendation Proceedings, 2017.
[27] B. P. Knijnenburg, M. C. Willemsen, Evaluating recommender systems with user
experiments, Recommender systems handbook (2015) 309–352.
[28] P. Pu, L. Chen, R. Hu, A user-centric evaluation framework for recommender systems, in:
Proceedings of the fifth ACM conference on Recommender systems, 2011, pp. 157–164.</p>
      <p>[29] J. Hainmueller, D. J. Hopkins, T. Yamamoto, Causal inference in conjoint analysis:
Understanding multidimensional choices via stated preference experiments, Political analysis 22
(2014) 1–30.
[30] E. Knudsen, M. P. Johannesson, Beyond the limits of survey experiments: How
conjoint designs advance causal inference in political communication research, Political
Communication 36 (2019) 259–271.
[31] T. S. Behrend, D. J. Sharek, A. W. Meade, E. N. Wiebe, The viability of crowdsourcing for
survey research, Behavior research methods 43 (2011) 800–813.
[32] N. Stewart, J. Chandler, G. Paolacci, Crowdsourcing samples in cognitive science, Trends
in cognitive sciences 21 (2017) 736–748.
[33] N. Helberger, M. van Drunen, J. Moeller, S. Vrijenhoek, S. Eskens, Towards a normative
perspective on journalistic ai: Embracing the messy reality of normative ideals, 2022.
[34] S. Vrijenhoek, G. Bénédict, M. Gutierrez Granada, D. Odijk, M. De Rijke,
Radio – rank-aware divergence metrics to measure normative diversity in news recommendations, in:
Proceedings of the 16th ACM Conference on Recommender Systems, 2022, pp. 208–219.
[35] E. S. Zack, J. Kennedy, J. S. Long, Can nonprobability samples be used for social science
research? a cautionary tale, in: Survey Research Methods, volume 13, 2019, pp. 215–227.
[36] P. Lynn, Tackling panel attrition, The Palgrave handbook of survey research (2018)
143–153.
[37] Ø. Skjervheim, A. Høgestøl, Norwegian Citizen Panel Methodology Report wave 24,
Technical Report, Ideas 2 Evidence, Bergen, 2022.
[38] B. De la Cuesta, N. Egami, K. Imai, Improving the external validity of conjoint analysis:
The essential role of profile distribution, Political Analysis 30 (2022) 19–45.</p>
      <p>[39] M. Tavakolifard, J. A. Gulla, K. C. Almeroth, J. E. Ingvaldsen, G. Nygreen, E. Berg, Tailored
news in the palm of your hand: a multi-perspective transparent approach to news
recommendation, in: Proceedings of the 22nd international conference on world wide web, 2013,
pp. 305–308.
[40] M. R. Hoffarth, G. Hodson, Green on the outside, red on the inside: Perceived
environmentalist threat as a factor explaining political polarization of climate change, Journal of
Environmental Psychology 45 (2016) 40–49.
[41] T. Mouw, M. E. Sobel, Culture wars and opinion polarization: the case of abortion,
American Journal of Sociology 106 (2001) 913–943.</p>
      <p>[42] L. Li, D.-D. Wang, S.-Z. Zhu, T. Li, Personalized news recommendation: a review and an
experimental investigation, Journal of computer science and technology 26 (2011) 754–766.
[43] B. Loepp, T. Hussein, J. Ziegler, Choice-based preference elicitation for collaborative
filtering recommender systems, in: Proceedings of the SIGCHI Conference on Human
Factors in Computing Systems, 2014, pp. 3085–3094.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Karimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jugovac</surname>
          </string-name>
          ,
          <article-title>News recommender systems-survey and roads ahead</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>54</volume>
          (
          <year>2018</year>
          )
          <fpage>1203</fpage>
          -
          <lpage>1227</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>E.</given-names>
            <surname>Mitova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Blassnig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Strikovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Urman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hannak</surname>
          </string-name>
          ,
          <string-name>
            <surname>C. H. de Vreese</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Esser</surname>
          </string-name>
          ,
          <article-title>News recommender systems: A programmatic research review</article-title>
          ,
          <source>Annals of the International Communication Association</source>
          <volume>47</volume>
          (
          <year>2023</year>
          )
          <fpage>84</fpage>
          -
          <lpage>113</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Campos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Fernández-Tobías</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Cantador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Díez</surname>
          </string-name>
          ,
          <article-title>Context-aware movie recommendations: an empirical comparison of pre-filtering, post-filtering and contextual modeling approaches</article-title>
          , in: E-Commerce and Web Technologies: 14th International Conference, EC-Web
          <year>2013</year>
          , Prague, Czech Republic,
          <source>August 27-28</source>
          ,
          <year>2013</year>
          . Proceedings 14, Springer,
          <year>2013</year>
          , pp.
          <fpage>137</fpage>
          -
          <lpage>149</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B.</given-names>
            <surname>Fortuna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Fortuna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mladenić</surname>
          </string-name>
          ,
          <article-title>Real-time news recommender system</article-title>
          ,
          <source>in: Joint European Conference on Machine Learning and Knowledge Discovery in Databases</source>
          , Springer,
          <year>2010</year>
          , pp.
          <fpage>583</fpage>
          -
          <lpage>586</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Kille</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hopfgartner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Brodt</surname>
          </string-name>
          , T. Heintz,
          <article-title>The plista dataset</article-title>
          ,
          <source>in: Proceedings of the 2013 international news recommender systems workshop and challenge</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>16</fpage>
          -
          <lpage>23</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zanker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Felfernig</surname>
          </string-name>
          , G. Friedrich,
          <source>Recommender systems: an introduction</source>
          , Cambridge University Press, Cambridge, UK,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Starke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Willemsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Snijders</surname>
          </string-name>
          ,
          <article-title>Using explanations as energy-saving frames: A user-centric recommender study</article-title>
          ,
          <source>in: Adjunct Proceedings of the 29th ACM Conference on User Modeling, Adaptation and Personalization</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>229</fpage>
          -
          <lpage>237</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Starke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Unifying recommender systems and conversational user interfaces</article-title>
          ,
          <source>in: Proceedings of the 4th Conference on Conversational User Interfaces</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>