<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Recommendation systems for news articles at the BBC</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maria Panteli</string-name>
          <email>maria.panteli@bbc.co.uk</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro Piscopo</string-name>
          <email>alessandro.piscopo@bbc.co.uk</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adam Harland</string-name>
          <email>adam.harland@bbc.co.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jonathan Tutcher</string-name>
          <email>jon.tutcher@bbc.co.uk</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Felix Mercer Moss</string-name>
          <email>felix.mercermoss@bbc.co.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>British Broadcasting Corporation</institution>
          ,
          <addr-line>Bristol</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>British Broadcasting Corporation</institution>
          ,
          <addr-line>Glasgow</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>British Broadcasting Corporation</institution>
          ,
          <addr-line>London</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>British Broadcasting Corporation</institution>
          ,
          <addr-line>Salford</addr-line>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Personalised user experiences have improved engagement in many industry applications. When it comes to news recommendations, and especially for a public service broadcaster like the BBC, recommendation systems need to be in line with the editorial policy and the business values of the organisation. In this paper we describe how we develop recommendation systems for news articles at the BBC. We present three models and describe how they compare with baseline approaches such as random and popularity. We also discuss the metrics we use, the unique challenges we face and the considerations needed to ensure the recommendations we generate uphold the trust and quality standards of the BBC.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <sec id="sec-1-1">
        <title>Information systems</title>
        <p>Recommender systems</p>
      </sec>
      <sec id="sec-1-2">
        <title>Computing methodologies</title>
        <p>Machine learning approaches</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>INTRODUCTION</title>
      <p>
        The BBC is one of the world’s leading public service
broadcasters. Its services—television, radio, digital—reach more
than 80% of the UK’s adult population every week [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and 279
million people worldwide (World Service [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]). This large
audience has access to a vast and diverse amount of content,
including video, audio and text, spanning topics such as news,
sport, and entertainment. In order to enable its audience to
enjoy the best possible experience, it is crucial for the BBC
to adopt strategies to guide users to the most relevant and
engaging content. The main approach until recently has been
to manually curate content following the guidelines formally
documented in an editorial tome [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. These have been
developed to ensure quality across all products, uphold the BBC
values, and build audience trust. Although manual curation is
an excellent way to surface quality content, it is not tailored
to the user and is hard to scale: the more content there is,
the harder it is for curators to find relevant items for
each type of content. In order to deliver an experience which
is relevant, timely, and contextually useful to every single
user, the BBC combines editorial curation with personalised,
automated approaches. Data-driven recommendations are a
key part of these approaches: they are an important tool to
enhance users’ ability to explore and discover content they
would not be aware of otherwise (see e.g. [
        <xref ref-type="bibr" rid="ref26 ref31 ref36 ref37">26, 31, 36, 37</xref>
        ]) and
have been successfully tested and deployed by several media
providers (e.g. Netflix [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]) and e-commerce companies (e.g.
Amazon [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ]).
      </p>
      <p>Copyright © 2019 for this paper by its authors. Use permitted under
Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        According to the mission of the BBC, the organisation
must “act in the public interest, serving all audiences through
the provision of impartial, high-quality and distinctive output
and services which inform, educate, and entertain” [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
Following this mission, the BBC must be a provider of accurate
and unbiased information and the content it produces and
distributes must aim to engage diverse audiences. Amongst
the diverse types of content produced by the BBC, news is
the product that likely contributes most to its reputation
as a trustworthy and authoritative media outlet. Besides
the UK service BBC News<sup>1</sup>, the BBC produces, broadcasts,
and delivers online news in more than 40 languages. Hence,
it is of utmost importance for automated recommendation
approaches implemented on any BBC news service to be not
only as accurate as possible, but also to conform with the
principles outlined above. This paper reports early results of
the experiments we carried out to that end. In particular, it
describes the development of recommendation systems for
BBC news articles and the challenges in building data-driven
applications for a public service broadcaster. The case study
adopted in the experiment was the application of
recommendation systems for BBC Mundo2, a Spanish-language news
website and part of BBC World Service [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>The structure of this paper is as follows. Section 2 defines
the problem addressed in the current work, and Section 3
discusses prior related work. Section 4 describes the
methodology, including the data, models, and evaluation approaches.
Finally, results are presented and discussed in Sections 5
and 6.</p>
      <p><sup>1</sup> Please note that ‘News’ capitalised refers to the UK channel, whereas
lowercase ‘news’ refers to the type of content.</p>
      <p><sup>2</sup> https://www.bbc.com/mundo</p>
    </sec>
    <sec id="sec-3">
      <title>PROBLEM DEFINITION</title>
      <p>
        Our goal is to build recommendation systems for news
articles. Recommendations in the news domain have been
characterised distinctly in the literature [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ] due to the short
life-cycle of items and the large number of anonymous users.
Considering the reputation of the BBC and the responsibility
it has to deliver trustworthy and authoritative news to its
audience, we highlight the following challenges in achieving
our goal.
      </p>
      <p>Non-signed in users. The majority of users on any BBC
news platform are not signed in. This means that we have
limited information about the user and the items they have
previously interacted with. We typically work with
session-based information, i.e. user-item interactions that occurred
within 30 minutes of each other. This means that our
recommendation models need to achieve high accuracy for
cold-start user scenarios or predict the user’s taste after as
little as one item interaction.</p>
      <p>Many cold-start items. The publication cycle on any news
platform is rapid and unrelenting. BBC News is no different.
Fresh items are regularly uploaded and any recommendation
system we implement should be able to serve an item within
minutes of publication. Additionally, articles may become
obsolete or gain sudden relevance following an event—consider
for example the case of breaking news. Recommendation
approaches must thus be able to take these characteristics
into account, not being based solely on a user’s history, but
considering the content and context of the articles they read.</p>
      <p>Architecture constraints. Because of the popularity of BBC
news, multiple stakeholders (internal and external) rely on
and set the requirements for the news platform. Any changes
to the system architecture that could affect other
stakeholders need to be thoroughly investigated and justified. Our
recommendation models often have to adapt to the
existing architecture which means that our system architecture
choices are somewhat constrained.</p>
      <p>Mistakes are not tolerated. BBC news, and the Mundo
platform in particular, are consumed by millions of users. For
the majority of these users, this is the only BBC platform
they visit. News is also a very sensitive domain, as it is not just
entertainment but also the way in which people inform and
educate themselves. Mistakes in data-driven
recommendations could lead to misinformation or compromise our quality
standards, which would significantly affect our audience.
The bar for the performance of the system is set very high
to limit the risk of unexpected behaviour.</p>
      <p>
        Fairness and impartiality. The BBC has built its trust
after many years of thoughtful manual curation and expert
editorial guidance. It commits to delivering content in a fair,
impartial and honest way and data-driven recommendations
should live up to, and advance, these standards.
Algorithmic fairness and impartiality in recommendation systems
are increasingly discussed in the literature [
        <xref ref-type="bibr" rid="ref19 ref33">19, 33</xref>
        ] but with
no standardised solutions yet. We consider evaluation
metrics that help us track the risk and bias induced by our
recommendation systems.
      </p>
      <p>The above challenges drive the decisions we make around
which models and evaluation strategies to implement. For
example, we place significant focus upon offline evaluation
to avoid unexpected behaviour; we use a variety of
metrics to track the quality of recommendations; we consider
recency-based systems an essential baseline for news
recommendations; and we adopt content-based approaches to tackle
the cold-start scenarios. More details about our choices and
how they relate to these challenges are provided in Sections 3
and 4.</p>
    </sec>
    <sec id="sec-4">
      <title>RELATED WORK</title>
      <p>
        Recommendation systems in the news domain have been
investigated for more than a decade [
        <xref ref-type="bibr" rid="ref27 ref38">27, 38</xref>
        ], following
various approaches. Collaborative filtering [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] relies on past
user behaviour to formulate recommendations based on
commonalities across user preferences. Content-based approaches
rely on item properties (or user profiles constructed by the
properties of the items they consume) to recommend related
items [
        <xref ref-type="bibr" rid="ref10 ref29 ref39">10, 29, 39</xref>
        ]. Rather than considering the long user
history, session-based approaches focus on user-item interactions
that occur within a certain time frame or context [
        <xref ref-type="bibr" rid="ref40 ref43">40, 43</xref>
        ].
Finally, hybrid systems may put together aspects from these
approaches and use a broader range of features, in order to
achieve a more nuanced representation of user activity [
        <xref ref-type="bibr" rid="ref18 ref30">18, 30</xref>
        ].
Content-based, session-based, and hybrid approaches appear
to be the most suitable to address some of the problems we
outlined earlier, namely the large number of anonymous users
and cold-start items (Section 2).
      </p>
      <p>
        Beyond the news domain, recommendation systems have
been investigated in a variety of industrial applications.
Approaches vary between traditional content-based and
collaborative filtering while, more recently, the advent of deep
neural networks has facilitated the development of hybrid
strategies [
        <xref ref-type="bibr" rid="ref45">45</xref>
        ]. These have been applied to the problem of
accommodation search at Airbnb [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], product advertisement
at Criteo [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ], video recommendations at YouTube [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], and
movie recommendations at Netflix [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. Industry approaches
using neural networks are of particular interest to us due
to the scalability of the systems and the domain agnostic
capability of neural networks.
      </p>
      <p>
        Considering the system architecture, some neural
network-based approaches for recommending textual content are
end-to-end (for example [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]), that is, the model takes as input
the text of items related to a user, extracts features for the
items and the user, and ultimately outputs a recommendation.
Other approaches rely on separate modules for extracting
features for the content and the user and for generating
recommendations [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Here, we take the latter approach for a
number of reasons. First, an end-to-end approach was not
compatible with the current architecture of the system, over
which we have limited control (Section 2). Second, separating
content representation from the generation of
recommendations enables further experimentation and increases the
ability of the system to retrieve new items [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>[Figure: panels “Sequence length distribution” and “User visits distribution”; vertical axis: No. of sequences.]</p>
    </sec>
    <sec id="sec-5">
      <title>METHODOLOGY</title>
    </sec>
    <sec id="sec-6">
      <title>Data</title>
      <p>
        The BBC collects detailed user interaction data for its digital
services, providing information about users and the
circumstances of their visits to BBC websites. For the purpose
of this analysis, we used 15 days’ worth of data from BBC
Mundo, spanning from the 6th to 20th April 2019. We define
a sequence, or visit, as any succession of user interactions (i.e.
page views) within 30 minutes from each other. Page views
were aggregated into sequences according to this definition.
In this dataset, the average number of user interactions we
collected per day was in the order of millions. As shown in
Figure 1, most recorded sessions included only a single article
read (i.e., a sequence of length 1) which is a common
observation in news delivery platforms [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Users often visited BBC
Mundo only once over the time-span considered (Figure 2).
      </p>
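      <p>As a minimal sketch of the sequence definition above, the following Python groups the page views of one user into visits, starting a new visit whenever more than 30 minutes pass between consecutive page views. The function name, the (timestamp, article ID) input format, and the example data are illustrative assumptions and do not describe the data pipeline used in this work.</p>
      <preformat><![CDATA[
```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # gap threshold taken from the text


def split_into_sequences(page_views, gap=SESSION_GAP):
    """Group (timestamp, article_id) page views of one user into sequences.

    A new sequence starts whenever the time since the previous page view
    exceeds `gap`. Input order does not matter: views are sorted by time.
    """
    sequences = []
    previous_time = None
    for timestamp, article_id in sorted(page_views):
        if previous_time is None or timestamp - previous_time > gap:
            sequences.append([])  # start a new visit
        sequences[-1].append(article_id)
        previous_time = timestamp
    return sequences


# Hypothetical page views for a single user.
views = [
    (datetime(2019, 4, 6, 9, 0), "a1"),
    (datetime(2019, 4, 6, 9, 10), "a2"),
    (datetime(2019, 4, 6, 11, 0), "a3"),
]
print(split_into_sequences(views))  # [['a1', 'a2'], ['a3']]
```
]]></preformat>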
      <p>As with any statistical learning model, robustly evaluating
recommender system performance requires the data to be
split appropriately. In traditional machine learning
problems where the raw data takes the form of input-output
pairs, this split is relatively straightforward. Assuming there
is enough data, a common split might be 80%, 10%, 10%
into training, validation and test sets respectively. For
recommender systems, the temporal nature of the data makes the
situation a little different. While we still need to perform a
train/validation/test split, referred to from now on as the test
split, we also need to perform an additional split, henceforth
referred to as the query split. The query split describes the
process of transforming a temporal sequence of consumption
logs into one or more feature-target pairs suitable for
ingestion into algorithmic learning models.</p>
      <p>For the test split, our initial thought was to discard the
temporal dimension and sample user sessions according to
pre-determined train/validation/test fractions. While the
simplicity of this approach is attractive, we decided that
maximising the similarity between our offline testing framework
and our online production environment was more important.
The temporal approach we implemented is displayed in
Figure 3, where we choose a thirteen-day period for training, the
next day for validation and the following day for test. As we
have the capacity to train and serve fresh consumer-facing
models every day, we aim for this offline approach to reflect
our production environment sufficiently for inferences in the
former to provide valuable information about the latter.</p>
      <p>For the query split, we take a user session from a given
period defined earlier in the current section and divide it into
the maximum number of trigrams while preserving temporal
order. Then, for each trigram, the first two elements (article
vectors) represent the user profile while the third and final
element is the groundtruth item used as a target for our
models. The length of the user profile was chosen based
upon two factors: (1) our client-side serving infrastructure
is currently limited to providing the current and previous
article; and (2) exploratory analysis indicated that minimal
gains were made from increasing the number of items that
make up the user profile.</p>
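      <p>The query split can be illustrated with a short sketch. It assumes that the trigrams are overlapping sliding windows over the session, which is one reading of “the maximum number of trigrams while preserving temporal order”; the function name and the use of article IDs in place of article vectors are illustrative assumptions.</p>
      <preformat><![CDATA[
```python
def query_split(session, profile_length=2):
    """Slide a window over a session to build (profile, target) pairs.

    Each window holds `profile_length` consecutive items as the user
    profile, followed by the next item as the groundtruth target.
    Sessions shorter than profile_length + 1 yield no pairs.
    """
    window = profile_length + 1
    return [
        (tuple(session[i:i + profile_length]), session[i + profile_length])
        for i in range(len(session) - window + 1)
    ]


# Hypothetical session of four articles, in reading order.
print(query_split(["a", "b", "c", "d"]))
# [(('a', 'b'), 'c'), (('b', 'c'), 'd')]
```
]]></preformat>
      <p>Under this reading, sessions with fewer than three page views contribute no training or test pairs.</p>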
    </sec>
    <sec id="sec-7">
      <title>System architecture and models</title>
      <p>All recommendation models we implemented were constrained
by the need to have compatibility with our current system
architecture. This consists of three main components. The
first is responsible for generating article embeddings. The
second takes user data and article embeddings as input and
produces a user embedding. Finally, the outputs of the first
and the second modules are combined by the third component,
which ranks the recommended articles for a user, based on a
nearest neighbour search in the latent article space (Figure 4).</p>
      <p>
        The content representation module generates article
embeddings. The article embeddings were derived using a Latent
Dirichlet Allocation (LDA) model, which was found to perform well in
related research [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. LDA is an unsupervised topic modelling
approach that represents each document by the probability
of a number of topics. The number of topics is defined in
advance. Prior work from another BBC team found the
optimal number of topics to be 75 for a related dataset of BBC
Mundo articles [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>The user representation module generates user embeddings.
The user embeddings are derived from the article embeddings
and previous user interactions. Our experiments focused
primarily on developing models to derive user embeddings.
We explored neural network approaches that combine both
content and user data as well as models based only on user
interactions (i.e. the cosine-based collaborative filtering model,
Section 4.2.2).</p>
      <p>
        The output of the user representation module is
subsequently processed by the recommendation generation module.
This component takes as input a user embedding and
performs an approximate nearest neighbour search in the article
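latent space, returning as output the <inline-formula><tex-math>k</tex-math></inline-formula> articles with the
smallest distance to the user embedding. The distance is
computed using the angular metric from the Python package
ANNOY [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], defined as <inline-formula><tex-math>\sqrt{2(1 - \cos(u, v))}</tex-math></inline-formula> for a user
embedding <inline-formula><tex-math>u</tex-math></inline-formula> and an article embedding <inline-formula><tex-math>v</tex-math></inline-formula>.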
      </p>
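      <p>To make the distance concrete, the sketch below computes the angular distance, the square root of 2(1 − cos(u, v)), in plain Python and uses it in an exact, brute-force nearest neighbour search. This is an illustrative stand-in: it does not use the ANNOY package, whose search is approximate, and the function names and toy embeddings are assumptions.</p>
      <preformat><![CDATA[
```python
import math


def angular_distance(u, v):
    """Return sqrt(2 * (1 - cos(u, v))) for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = dot / (norm_u * norm_v)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return math.sqrt(2.0 * (1.0 - cosine))


def nearest_articles(user_embedding, article_embeddings, k):
    """Exact search: IDs of the k articles closest to the user embedding."""
    ranked = sorted(
        article_embeddings,
        key=lambda article_id: angular_distance(
            user_embedding, article_embeddings[article_id]
        ),
    )
    return ranked[:k]


# Toy two-dimensional embeddings (the text uses 75 LDA topics).
articles = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [1.0, 1.0]}
print(nearest_articles([1.0, 0.1], articles, k=2))  # ['x', 'z']
```
]]></preformat>
      <p>The distance ranges from 0 for vectors pointing in the same direction to 2 for opposite vectors, and equals the Euclidean distance between the two vectors after normalisation to unit length.</p>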
      <p>We evaluated three different models to derive the user
embeddings: a) a weighted average of item embeddings
(Section 4.2.1), b) a cosine-based collaborative filtering method
(Section 4.2.2), and c) a rank-optimised neural network
(Section 4.2.3). The sections below describe each approach in
detail.</p>
      <p>[Figure: system architecture diagram. Labels: content data; content representation module; article embeddings (LDA); user data; user representation module; user embeddings; recommendation generation module; nearest neighbours; recommendations.]</p>
      <p>4.2.1 Weighted average of item embeddings. The first user
representation model we tested derived the user embeddings
from the weighted average of item embeddings, for all items
consumed by a given user within a session. The most recently
consumed item was weighted by a factor <inline-formula><tex-math>\alpha</tex-math></inline-formula> while the rest of
the items in the user’s session were weighted by <inline-formula><tex-math>1 - \alpha</tex-math></inline-formula>.</p>
      <p>4.2.2 Cosine-based collaborative filtering. The second approach
was a combination of simple user-item collaborative filtering
and a session-based approach. Since users do not need to
log in to view the articles, we had no explicit user profile
and instead treated each session as a user. To generate the
sparse user-item matrix, we took the article IDs for all user
sessions within a given time window. The inputs to the model
at prediction time were the IDs of the articles viewed in the
current user session, and the output was the <inline-formula><tex-math>k</tex-math></inline-formula> highest scored
items based on these interactions. Our metric for scoring the
articles to recommend was the cosine distance between the current
user session and all other user sessions.</p>
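      <p>The two user models of Sections 4.2.1 and 4.2.2 can be sketched as follows. Both functions are simplified interpretations rather than the implementation used in this work. The weighted average assumes that every earlier item receives the same weight, one minus the weight of the most recent item, and that the result is normalised by the sum of weights. The collaborative filtering sketch assumes binary session vectors, cosine similarity between sessions, and item scores obtained by summing the similarities of the past sessions that contain each item. All names are hypothetical.</p>
      <preformat><![CDATA[
```python
import math
from collections import defaultdict


def weighted_average_embedding(session_embeddings, alpha):
    """Weight the most recent item by alpha, each earlier item by 1 - alpha."""
    *earlier, latest = session_embeddings
    if not earlier:
        return list(latest)  # a single item is its own average
    weights = [1.0 - alpha] * len(earlier) + [alpha]
    total = sum(weights)
    return [
        sum(w * vec[d] for w, vec in zip(weights, session_embeddings)) / total
        for d in range(len(latest))
    ]


def session_cosine(a, b):
    """Cosine similarity between two sessions seen as binary item vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def session_cf_recommend(current_session, past_sessions, k):
    """Score unseen items by the similarity of the past sessions holding them."""
    current = set(current_session)
    scores = defaultdict(float)
    for past in past_sessions:
        past_items = set(past)
        similarity = session_cosine(current, past_items)
        if similarity == 0.0:
            continue
        for item in past_items - current:
            scores[item] += similarity
    # Highest score first; ties broken by item ID for determinism.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:k]


# Toy data: previous article first, current article last.
profile = weighted_average_embedding([[0.0, 1.0], [1.0, 0.0]], alpha=0.7)
past = [["a", "b", "c"], ["a", "d"], ["e", "f"]]
print(session_cf_recommend(["a"], past, k=2))  # ['d', 'b']
```
]]></preformat>
      <p>With a profile of two articles, as used in the query split, the weighted average reduces to the weight times the current article plus one minus the weight times the previous article.</p>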
      <p>[Figure: rank-optimised neural network diagram. Labels: serving environment; recommended articles; KNN; training environment; binary cross-entropy loss; sigmoid; dot product; prediction model; user embedding; article embedding; multi-layer perceptron (two); client (current + previous article vector); user profile (current + previous article vector); groundtruth article vector; negative sampled article vector.]</p>
      <p>
        4.2.3 Rank-optimised neural network. Having observed that
a simple linear combination of a user’s current and
previous article representations led to only modest performance
gains over using solely the current article, we sought to
explore non-linear combinations of these vectors. Artificial
neural networks are well suited to fitting such non-linear
functions, and we were encouraged by the results reported
by others who have successfully used deep architectures to
solve information retrieval problems, e.g. [
        <xref ref-type="bibr" rid="ref11 ref14 ref22 ref46">11, 14, 22, 46</xref>
        ].
      </p>
      <p>The challenge we faced was to design a neural network
architecture which learned a latent representation of a user
profile (current and previous article) that minimises the
distance between itself and the latent representation of the
most appropriate article to recommend (in this case, the
subsequently consumed article). One way of framing this
problem is a pointwise architecture that behaves in a way
similar to a regression problem. The model illustrated in
Figure 5 takes as input a user profile (two concatenated 75-length
vectors) and an article (a 75-length vector), and passes
each through its own five-layer perceptron (with 1024, 512, 256, 128
and 75 hidden units, each with rectified linear activation
functions). The model then minimises the binary cross-entropy
between the target and the inner product of the final layers
of the two perceptrons. Batch normalisation placed before
the activation functions of the initial layers was found to
significantly boost performance while also halving convergence
time, facilitating greater experimentation. Training runs
including dropout layers produced no improvement in accuracy,
so dropout was not included in the final model. Negative articles
were randomly over-sampled from the population of positive
articles, whereby each training user profile has one positive
article and five negative articles. Once this model had been
trained, two further models were derived from it for use in
the prediction environment. The first, the user model, took
only the user profile as input and returned the final layer of
the connected five-layer perceptron. The second, the article
model, took only a single article as input and returned the
fifth layer of its own five-layer perceptron. The article model
was then used to transform all of the raw LDA embeddings
into the article model embedding space before being fed into
our vector-based nearest neighbour index.</p>
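      <p>The training objective described above can be sketched in plain Python as a two-tower forward pass followed by a binary cross-entropy loss over one positive and five negative articles. This is a didactic sketch under several assumptions: weights are randomly initialised and no training step is shown, batch normalisation is omitted, a sigmoid is applied to the inner product before the loss (as the labels of the model diagram suggest), and the example uses small layer sizes to keep the run time low. All function names are hypothetical.</p>
      <preformat><![CDATA[
```python
import math
import random

# Layer sizes from the text: input, then 1024, 512, 256, 128 and 75 units.
USER_LAYER_SIZES = [150, 1024, 512, 256, 128, 75]
ARTICLE_LAYER_SIZES = [75, 1024, 512, 256, 128, 75]


def init_mlp(sizes, rng):
    """Randomly initialise one (weights, bias) pair per layer."""
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = math.sqrt(2.0 / n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def tower(x, layers):
    """Forward pass: a dense layer followed by ReLU, for every layer."""
    for weights, bias in layers:
        x = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, bias)]
    return x


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_cross_entropy(label, prob, eps=1e-12):
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def example_loss(user_profile, positive, negatives, user_layers, article_layers):
    """Mean loss over one positive article and its sampled negatives."""
    user_vec = tower(user_profile, user_layers)
    labelled = [(positive, 1.0)] + [(neg, 0.0) for neg in negatives]
    losses = []
    for article, label in labelled:
        article_vec = tower(article, article_layers)
        score = sum(u * a for u, a in zip(user_vec, article_vec))
        losses.append(binary_cross_entropy(label, sigmoid(score)))
    return sum(losses) / len(losses)


# Small toy sizes (3 topics instead of 75) so the example runs quickly.
rng = random.Random(0)
user_layers = init_mlp([6, 8, 3], rng)      # profile: two 3-topic vectors
article_layers = init_mlp([3, 8, 3], rng)
negatives = [[rng.random() for _ in range(3)] for _ in range(5)]
loss = example_loss([0.1] * 6, [0.2] * 3, negatives, user_layers, article_layers)
```
]]></preformat>
      <p>At serving time only the user tower would be evaluated on the client profile, while the article tower would be applied once to every article before indexing, mirroring the user model and article model described above.</p>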
    </sec>
    <sec id="sec-8">
      <title>Evaluation</title>
      <p>
        The aim of our work is not only to increase user
engagement with BBC products, but also to inform, educate, and
entertain, in line with the mission of our organisation. We
build recommendation systems taking into account these
values and develop evaluation strategies that reflect our mission.
This section focuses on offline evaluation metrics and the
baselines we use in our experiments. Online evaluation is also
a big part of our work but goes beyond the scope of this
paper, which focuses on preliminary results.
      </p>
      <p>
        4.3.1 Metrics. When developing recommendation models
offline, we currently monitor and optimise performance with
reference to a suite of six quantitative metrics. For all metrics
(with the exception of inter-list diversity) a value can be
computed for each groundtruth/recommendations list pair.
The overall metric is computed as the mean value over all
groundtruth/recommendations list pairs within the test
period. For each metric, in addition to calculating the overall
value, we also estimate the item-normalised value by first
taking the mean metric value for every unique groundtruth item.
This value provides an insight into the performance of an
algorithm independently of the test set bias towards popular
groundtruth items. All metrics were calculated upon
recommendation lists of length <inline-formula><tex-math>k = 100</tex-math></inline-formula>. We use a relatively large
<inline-formula><tex-math>k</tex-math></inline-formula>, motivated by the finding that deeper cut-offs in offline
experiments provide greater robustness and discriminative
power [
        <xref ref-type="bibr" rid="ref42">42</xref>
        ] as well as by the fact that we have to exclude many
of the recommended items a posteriori due to our extensive
business rules. A brief description of each metric is provided
below (for further details see [
        <xref ref-type="bibr" rid="ref12 ref21 ref34">12, 21, 34</xref>
        ]).
      </p>
      <p>Normalised Discounted Cumulative Gain (NDCG). It
measures the gain of a document based on its ranked position in
the top 100 list, with lower ranks discounted by a logarithmic
factor, and normalises the result by the maximum gain of an
ideal top 100 list.</p>
      <p>Hitrate. A recall-based metric whereby a recommended
list of items is assigned 1 if it contains the groundtruth item,
and 0 otherwise.</p>
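      <p>The two accuracy metrics can be sketched as follows for the setting used here, where each recommendation list is compared with a single groundtruth item. With one relevant item and binary gain, the ideal list places that item first, so the ideal discounted gain is 1 and NDCG reduces to the discount at the rank of the groundtruth item. The item-normalised aggregation assumes that the per-item means are themselves averaged with equal weight. Function names and example data are illustrative assumptions.</p>
      <preformat><![CDATA[
```python
import math
from collections import defaultdict


def hitrate(recommended, groundtruth):
    """1 if the groundtruth item appears in the list, 0 otherwise."""
    return 1.0 if groundtruth in recommended else 0.0


def ndcg_single_target(recommended, groundtruth):
    """NDCG with one relevant item: ideal DCG is 1 (item at rank 1)."""
    if groundtruth not in recommended:
        return 0.0
    rank = recommended.index(groundtruth) + 1  # 1-based position
    return 1.0 / math.log2(rank + 1)


def overall_metric(metric, pairs):
    """Mean over all (recommendation list, groundtruth) pairs."""
    return sum(metric(recs, target) for recs, target in pairs) / len(pairs)


def item_normalised_metric(metric, pairs):
    """Mean per unique groundtruth item first, then mean over items."""
    per_item = defaultdict(list)
    for recs, target in pairs:
        per_item[target].append(metric(recs, target))
    item_means = [sum(values) / len(values) for values in per_item.values()]
    return sum(item_means) / len(item_means)


# Hypothetical test pairs: item "a" is a popular groundtruth item.
pairs = [(["a", "b"], "a"), (["a", "b"], "a"), (["a", "b"], "c")]
print(ndcg_single_target(["a", "b", "c"], "c"))  # 0.5
print(item_normalised_metric(hitrate, pairs))    # 0.5
```
]]></preformat>
      <p>In the toy data the overall hitrate is 2/3 while the item-normalised hitrate is 0.5, illustrating how the latter removes the advantage gained from correctly predicting a popular item many times.</p>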
      <p>Intra-list diversity. It estimates the average distance
between every pair of items in a recommendations list. For
the experiments reported here, distance between two
articles is measured as the ANNOY angular distance (described
formally in Section 4.2) between two article embeddings.</p>
      <p>Inter-list diversity. It measures how diverse the
recommended items across multiple lists are. It compares two lists
of recommendations and computes the ratio of unique items
in these lists over the total number of recommended items
between these lists.</p>
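      <p>The two diversity metrics can be sketched as below. Intra-list diversity is taken as the mean angular distance over all unordered pairs of items in one list, and inter-list diversity as the number of unique items across two lists divided by the combined length of the lists. How pairs of lists are selected and aggregated over a whole test set is not specified here, so the sketch covers a single pair only; function names are assumptions.</p>
      <preformat><![CDATA[
```python
import math
from itertools import combinations


def angular_distance(u, v):
    """Return sqrt(2 * (1 - cos(u, v))) for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norms))
    return math.sqrt(2.0 * (1.0 - cosine))


def intra_list_diversity(embeddings):
    """Mean pairwise angular distance between the items of one list."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        return 0.0  # a list with fewer than two items has no pairs
    return sum(angular_distance(u, v) for u, v in pairs) / len(pairs)


def inter_list_diversity(list_a, list_b):
    """Unique items across both lists over the total number of items."""
    return len(set(list_a) | set(list_b)) / (len(list_a) + len(list_b))


print(inter_list_diversity(["a", "b"], ["b", "c"]))  # 0.75
```
]]></preformat>
      <p>For two lists without internal duplicates, the inter-list value is 0.5 when the lists are identical and 1.0 when they share no items.</p>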
      <p>
        Popularity-based surprisal. It measures how novel or
surprising the items in a list are. It is formally defined as the
log of the inverse popularity of an item (i.e. the probability
of observing an item in the recommendations) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
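      <p>A minimal sketch of popularity-based surprisal follows. It assumes that the popularity of an item is estimated as the fraction of recommendation lists in which it appears, that logarithms are taken in base 2, and that the score of a list is the mean surprisal of its items; none of these choices is fixed by the description above, and the names are hypothetical.</p>
      <preformat><![CDATA[
```python
import math
from collections import Counter


def item_probabilities(recommendation_lists):
    """Fraction of recommendation lists in which each item appears."""
    counts = Counter(
        item for recs in recommendation_lists for item in set(recs)
    )
    total = len(recommendation_lists)
    return {item: count / total for item, count in counts.items()}


def surprisal(recommended, probabilities):
    """Mean of log2(1 / p(item)) over the items of one list."""
    return sum(
        -math.log2(probabilities[item]) for item in recommended
    ) / len(recommended)


# Item "a" is recommended in every list, so it carries no surprisal.
lists = [["a", "b"], ["a", "c"]]
probabilities = item_probabilities(lists)
print(surprisal(["a", "b"], probabilities))  # 0.5
```
]]></preformat>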
      <p>Recency. It measures how recent the recommended items
are. It scores the age of each recommended item at the time of
the recommendation request using
a Gaussian decay function. The mean is set to 1 and the
standard deviation is chosen such that articles 7 days old
or older receive a score of less than 0.5.</p>
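      <p>One way to realise such a decay is sketched below. It assumes a Gaussian centred on an age of zero with a peak score of 1, which may differ from the exact parameterisation used in this work, and it chooses the standard deviation so that the score is exactly 0.5 at an age of 7 days, the limiting case of the stated requirement. The list score is assumed to be the mean over items; names are hypothetical.</p>
      <preformat><![CDATA[
```python
import math

HALF_SCORE_AGE_DAYS = 7.0
# Solve exp(-age^2 / (2 * sigma^2)) = 0.5 for sigma at age = 7 days.
SIGMA_DAYS = HALF_SCORE_AGE_DAYS / math.sqrt(2.0 * math.log(2.0))


def recency_score(age_days, sigma=SIGMA_DAYS):
    """Gaussian decay: 1.0 for a brand-new item, falling with age."""
    return math.exp(-(age_days ** 2) / (2.0 * sigma ** 2))


def list_recency(ages_days):
    """Mean recency score over the items of one recommendation list."""
    return sum(recency_score(age) for age in ages_days) / len(ages_days)


print(round(SIGMA_DAYS, 3))  # 5.945
```
]]></preformat>
      <p>Under these assumptions the standard deviation is roughly 5.9 days, and any article older than 7 days scores below 0.5.</p>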
      <p>The ideal recommendation engine would optimise all these
metrics, providing recommendations that are relevant to the
user, but that are also diverse, recent, and avoid the
popularity bias. In practice this is usually a trade-off, as an algorithm
that provides more accurate results is less likely
to produce diverse ones (and vice versa). In line with our
values and objectives, we sometimes choose algorithms that
favour diverse and recent content at the cost of a certain
degree of accuracy.</p>
      <p>4.3.2 Baselines. We compare our user models to four baseline
approaches and require that each new user model outperforms
the existing ones. We consider the following recommenders
as baselines:</p>
      <list list-type="bullet">
        <list-item><p>Random recommender: Produces <inline-formula><tex-math>k</tex-math></inline-formula> random
recommendations.</p></list-item>
        <list-item><p>Recency-based recommender: Ranks items by recency
and returns the top <inline-formula><tex-math>k</tex-math></inline-formula> most recent items.</p></list-item>
        <list-item><p>Popularity-based recommender: Ranks items by
popularity and returns the top <inline-formula><tex-math>k</tex-math></inline-formula> most popular items.</p></list-item>
        <list-item><p>Content similarity recommender: Finds the <inline-formula><tex-math>k</tex-math></inline-formula> nearest
neighbours of an item (e.g., the last item consumed by
a user) using the ANNOY angular distance between
item embeddings.</p></list-item>
      </list>
      <p>Our offline experiments report results on the four baselines
defined above and the three models defined in Section 4.2.
We use the NDCG metric to comment on the accuracy of the
systems and the remaining metrics defined in Section 4.3.1
to comment on qualitative aspects of the recommendations.</p>
    </sec>
    <sec id="sec-9">
      <title>RESULTS</title>
      <p>The NDCG scores for each recommender system are shown
in Figure 6. The scores from all metrics are summarised in
Table 1.</p>
      <p>Accuracy scores recorded for the baseline models were in
line with expectations. Compared to a random selection of
items (the random model), all other baselines show clear
performance improvements for both overall and item-normalised
NDCG. The popularity and recency recommenders returned
higher values than the content-based similarity (CS) model
for NDCG overall; however, if the most popular items are
factored out by looking at the item-normalised score, the
opposite is true. The recency recommender achieved a
particularly high overall NDCG, which confirms our expectation that
users of a news platform prefer to consume fresh content.</p>
      <p>Of the implemented models, the cosine-based collaborative
filtering (CF) model (Section 4.2.2) outperformed all
baselines and other models by a significant margin, both
for overall and item-normalised NDCG and for hitrate.
However, this advantage in accuracy comes at a
cost to inter-list diversity and surprisal, where both other
models returned higher scores. This effect was not
observed with the intra-list diversity metric, indicating that
individual CF lists contained more diverse content while the
lists of the other models were more distinct from one another.</p>
      <p>The weighted average (WA) model (described in Section 4.2.1,
with its weighting parameter optimised at 0.7) achieved accuracy scores surpassing
all the baselines in item-normalised NDCG, although, as
expected, this was not the case for NDCG overall. This suggests
that the model consistently projects into relevant regions of
the embedding space, and that the nearest neighbours are not
just the most popular candidates. Despite returning marginally
higher NDCG scores, the WA results are salient mainly for
how similar they are, across the board, to those of the CS baseline,
which lacks information from the previous article.</p>
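      <p>To make the role of the weighting parameter concrete, the sketch below combines the embeddings of the current and previous articles into a single query vector. It is an assumption that the weight of 0.7 is applied to the current article and its complement to the previous one; the function name and argument order are illustrative only.</p>

```python
import numpy as np


def weighted_average_query(current_vec, previous_vec, weight=0.7):
    """Blend two article embeddings into one query vector.

    `weight` is applied to the current article and (1 - weight) to the
    previous article; this assignment is an assumption of the sketch.
    """
    current_vec = np.asarray(current_vec, dtype=float)
    previous_vec = np.asarray(previous_vec, dtype=float)
    return weight * current_vec + (1.0 - weight) * previous_vec
```

      <p>With a weight of 1 the query reduces to the current article alone, which corresponds to the content similarity baseline.</p>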
      <p>The rank-optimised neural network (NN) model (Section 4.2.3)
returned accuracy scores that were a clear step up from both
other LDA-based models (CS and WA). This was the case
for both variants of NDCG and particularly so for hitrate,
indicating that the NN model was optimised more for recall
than precision and could possibly benefit from further
reranking procedures. The NN model also distinguished itself from
the CS and WA models in the diversity and surprisal metrics.
The results suggest that the NN model produces more distinct lists
(indicated by higher inter-list diversity) but that those lists
are more topically homogeneous (indicated by lower intra-list
diversity and surprisal metrics).</p>
    </sec>
    <sec id="sec-10">
      <title>DISCUSSION</title>
      <p>The first cycle of research in our journey to find the best news
recommender for BBC Mundo is complete. In Section 2 we
outlined the characteristics of the problem we address: a
majority of non-signed-in users; a large number of cold-start
items; architectural constraints; and high quality demands,
not only in terms of accuracy, but also with regard to the
fairness and impartiality of recommendations.</p>
      <p>
        One of the lessons we learned is that, unsurprisingly,
balancing the different aspects of our problem is hard. One
model may satisfy one of our requirements whilst failing
to fulfil another. A pure collaborative filtering approach is
currently our best option to maximise offline scoring
accuracy, but that comes at the cost of reduced diversity (and
a degree of recency, dependent upon how regularly we
retrain). Moreover, the performance of the CF model was not
entirely unexpected, as it has been shown [
        <xref ref-type="bibr" rid="ref17 ref25 ref32">17, 25, 32</xref>
        ] that
such simple methods typically outperform neural
approaches when only logged user-item interactions are used, and that
neural approaches only start to perform well when the input features contain
additional contextual metadata. However, as with most
collaborative filtering approaches, this model suffers from the
item cold-start problem, so the sparse user-item matrix
would need to be regenerated frequently. Therefore, we
cannot depend upon a solution that is derived purely from
user interactions. We also know from our
experiments that the contribution of previous articles appears to
have a lower impact than expected. Although the
WA model consistently outperformed the CS baseline model
(across validation and test), the gain was always marginal.
Furthermore, our attempts at combining the current and
previous article vectors in a non-linear fashion using a neural
network had an impact that was also weaker than expected.
These counterintuitive results raise further questions that we plan
to explore in the future.
      </p>
      <p>Fundamentally, we believe there is scope to optimise the
NN approach further so that it will perform more
competitively with CF. We have several strategies for achieving
this, which fall into three categories: model architecture, data,
and training improvements.</p>
      <p>
        We know that learning to rank in a pointwise framework
is not optimal. Both pairwise and listwise approaches should,
in theory, achieve better results (see [
        <xref ref-type="bibr" rid="ref13 ref23">13, 23</xref>
        ]). Pairwise loss
functions together with triplet loss architectures have
demonstrated impressive results elsewhere, but our own early
experiments have indicated that they are difficult to train, tending
towards significant underfitting.
      </p>
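      <p>The difference between the two framings can be illustrated with a short sketch. The pointwise loss below scores each (user, item) example independently, whereas the pairwise loss only asks that a consumed item be scored above a non-consumed one by a margin, in the style of a triplet loss. The specific loss functions, names and margin are assumptions for illustration and are not the losses used in our experiments.</p>

```python
import numpy as np


def pointwise_log_loss(scores, labels):
    """Binary cross-entropy on independent (user, item) examples."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    probs = 1.0 / (1.0 + np.exp(-scores))
    losses = labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs)
    return float(-np.mean(losses))


def pairwise_hinge_loss(pos_scores, neg_scores, margin=1.0):
    """Hinge loss on (positive, negative) score pairs for the same user.

    The loss is zero once each positive item outscores its paired negative
    item by at least `margin`; absolute score values do not matter.
    """
    pos_scores = np.asarray(pos_scores, dtype=float)
    neg_scores = np.asarray(neg_scores, dtype=float)
    return float(np.mean(np.maximum(0.0, margin - (pos_scores - neg_scores))))
```

      <p>Because the pairwise loss depends only on score differences, it optimises the ordering of items directly, which is closer to what a ranking metric such as NDCG rewards.</p>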
      <p>
        A key reason for this may be the under-representation
of negative examples in our training set. Adopting a higher
proportion of negative training examples may address this,
but more informed negative sampling techniques
may also be required (such as weighted approximate-rank
pairwise loss [
        <xref ref-type="bibr" rid="ref44">44</xref>
        ]). Even with the current pointwise architecture
there is a 5% difference in train/test performance
(item-normalised NDCG) that should reduce significantly with
appropriate regularisation.
      </p>
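      <p>As a rough illustration of informed negative sampling, the sketch below follows the general idea of the weighted approximate-rank pairwise (WARP) loss: negatives are drawn until one violates the margin, and the number of draws needed is used to estimate the rank of the positive item, which in turn weights the loss. This is a simplified sketch under stated assumptions (sampling without replacement, a single positive item, hypothetical function and parameter names), not a faithful reimplementation of the cited method.</p>

```python
import random


def warp_loss_for_positive(pos_score, neg_scores, margin=1.0, rng=None):
    """WARP-style loss for one positive item against a pool of negatives."""
    if rng is None:
        rng = random.Random(0)
    candidates = list(neg_scores)
    rng.shuffle(candidates)  # visit the negatives in random order
    for trials, neg_score in enumerate(candidates, start=1):
        violation = margin - pos_score + neg_score
        if violation > 0:
            # Few trials before a violation implies many violating negatives,
            # i.e. a poorly ranked positive item and a larger weight.
            estimated_rank = len(candidates) // trials
            weight = sum(1.0 / i for i in range(1, estimated_rank + 1))
            return weight * violation
    return 0.0  # no negative violates the margin: nothing to learn from
```

      <p>Positives that are already ranked well above every negative contribute no loss, so training effort concentrates on the examples the model currently ranks poorly.</p>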
      <p>
        Changes to our training process may also lead to
significant gains. In addition to increasing compute resources for
the exploration of the hyperparameter space, reducing the
training/testing window from the order of days/weeks to the
order of hours may provide greater scope for
experimentation (as has been reported elsewhere [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]). While a smaller
training window does necessitate more regular training of
deployed models, it also means more manageable datasets
where hyperparameter optimisation is more practical.
      </p>
      <p>A further change that may prove fruitful is to expand
the richness of the input to the user profile model. This
may include expanding the size of the user journeys in the
training set beyond 3 (a constraint which, incidentally, did
not apply to the CF model at training), while also introducing
contextual information about the user.</p>
      <p>
        Finally, another direction to be explored in the future
regards content representation. In experiments not reported in
the current work, raw article text has been encoded through
an LDA model. However, our system architecture affords
enough flexibility to replace the current content model with
alternative article embeddings and test different approaches.
In particular, we are interested in taking sub-word
information into consideration [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], enriching text with
semantics [
        <xref ref-type="bibr" rid="ref10 ref24">10, 24</xref>
        ], and augmenting text representations with
multimedia [
        <xref ref-type="bibr" rid="ref35 ref46">35, 46</xref>
        ].
      </p>
      <p>Our results demonstrate the difficulty of achieving all
the desired characteristics of an ideal news recommender.
Ultimately, we expect ensemble approaches may represent
the best solution. Here we may take the cold-start benefits
of the content-based neural approach and combine them with
the less diverse but more accurate list of items generated by
a collaborative filtering model.</p>
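      <p>One simple form such an ensemble could take is sketched below: scores from a collaborative filtering model and a content-based model are blended linearly, and items the CF model has never seen (cold-start items) fall back to the content-based score alone. The blending rule, the default weight and the function names are assumptions for illustration, and the sketch presumes both models emit scores on a comparable scale.</p>

```python
def blend_scores(cf_scores, content_scores, cf_weight=0.5):
    """Blend per-item scores, falling back to content scores for cold-start items."""
    blended = {}
    for item_id, content_score in content_scores.items():
        if item_id in cf_scores:
            blended[item_id] = (
                cf_weight * cf_scores[item_id] + (1.0 - cf_weight) * content_score
            )
        else:
            # Cold-start item: no interaction data yet, so only content is used.
            blended[item_id] = content_score
    return blended


def top_n(scores, n):
    """Return the n highest-scoring item ids, ties broken by item id."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item_id for item_id, _ in ranked[:n]]
```

      <p>In this form a freshly published article can still be recommended on the strength of its content alone, while established articles benefit from the interaction signal.</p>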
    </sec>
    <sec id="sec-11">
      <title>CONCLUSION</title>
      <p>In this paper we evaluated three approaches to providing news
recommendations for the BBC Mundo service. The systems
we have built are compatible with BBC serving
infrastructure, a use case which includes millions of daily users and
new content on the order of several thousand articles per
week. Although our experiment is only the initial step
of a journey that promises to be much longer, our models
outperformed random, popularity-based, recency-based and
content-similarity baselines. It is worth noting, though, that
these results do not reflect current online performance. More
work is needed to ensure these models, when deployed, meet
the quality and editorial standards of the BBC. Future
challenges concern not only achieving higher accuracy, but
also conforming to the principles of algorithmic fairness and
impartiality. We encourage the community to collaborate in
helping us create the way forward towards fair and engaging
recommendations and applications with responsible machine
learning.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Trapit</given-names>
            <surname>Bansal</surname>
          </string-name>
          , David Belanger, and
          <string-name>
            <given-names>Andrew</given-names>
            <surname>McCallum</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Ask the GRU: Multi-task Learning for Deep Text Recommendations</article-title>
          .
          <source>In Proceedings of the 10th ACM Conference on Recommender Systems</source>
          , Boston, MA, USA, September 15-19
          ,
          <year>2016</year>
          .
          <fpage>107</fpage>
          -
          <lpage>114</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>BBC.</surname>
          </string-name>
          <year>2019</year>
          .
          <article-title>The BBC's services in the UK - About the BBC</article-title>
          . https://www.bbc.com/aboutthebbc/whatwedo/publicservices Consulted on 21 June
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>BBC.</surname>
          </string-name>
          <year>2019</year>
          .
          <article-title>Editorial Guidelines</article-title>
          . https://www.bbc.co.uk/editorialguidelines Consulted on 21 June
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>BBC.</surname>
          </string-name>
          <year>2019</year>
          .
          <article-title>Global news services - About the BBC</article-title>
          . https://www.bbc.com/aboutthebbc/whatwedo/worldservice Consulted on 21 June
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>BBC.</surname>
          </string-name>
          <year>2019</year>
          .
          <article-title>Mission, values and public purposes - About the BBC</article-title>
          . https://www.bbc.com/aboutthebbc/governance/mission Consulted on 21 June
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>BBC.</surname>
          </string-name>
          <year>2019</year>
          .
          <article-title>News - Mundo</article-title>
          . https://www.bbc.com/mundo Consulted on 21 June
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E</given-names>
            <surname>Bernhardsson</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>ANNOY: Approximate nearest neighbors in C++/Python optimized for memory usage and loading/saving to disk</article-title>
          . GitHub https://github.com/spotify/annoy (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Piotr</given-names>
            <surname>Bojanowski</surname>
          </string-name>
          , Edouard Grave, Armand Joulin, and
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Enriching Word Vectors with Subword Information</article-title>
          .
          <source>CoRR</source>
          abs/1607.04606 (
          <year>2016</year>
          ). arXiv:1607.04606 http://arxiv.org/abs/1607.04606
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Clara</given-names>
            <surname>Higuera Cabañes</surname>
          </string-name>
          , Michel Schammel, Shirley Ka Kei Yu, and
          <string-name>
            <given-names>Ben</given-names>
            <surname>Fields</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Human-centric Evaluation of Similarity Spaces of News Articles</article-title>
          .
          <source>In 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (NewsIR'19 Third International Workshop on Recent Trends in News Information Retrieval)</source>
          .
          <fpage>51</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Michel</given-names>
            <surname>Capelle</surname>
          </string-name>
          , Flavius Frasincar, Marnix Moerland, and
          <string-name>
            <given-names>Frederik</given-names>
            <surname>Hogenboom</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Semantics-based news recommendation</article-title>
          .
          <source>In 2nd International Conference on Web Intelligence, Mining and Semantics, WIMS '12</source>
          , Craiova, Romania, June 6-8,
          <year>2012</year>
          .
          <fpage>27:1</fpage>
          -
          <lpage>27:9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Hugo</given-names>
            <surname>Caselles-Dupré</surname>
          </string-name>
          , Florian Lesaint, and
          <string-name>
            <given-names>Jimena</given-names>
            <surname>Royo-Letelier</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Word2Vec Applied to Recommendation: Hyperparameters Matter</article-title>
          .
          <source>In Proceedings of the 12th ACM Conference on Recommender Systems (RecSys '18)</source>
          . ACM, New York, NY, USA,
          <fpage>352</fpage>
          -
          <lpage>356</lpage>
          . https://doi.org/10.1145/3240323.3240377
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P</given-names>
            <surname>Castells</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S</given-names>
            <surname>Vargas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Novelty and diversity metrics for recommender systems: choice, discovery and relevance</article-title>
          .
          <source>In International Workshop on Diversity in Document Retrieval (DDR 2011) at the 33rd European Conference on Information Retrieval (ECIR</source>
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Ting</given-names>
            <surname>Chen</surname>
          </string-name>
          , Yizhou Sun,
          <string-name>
            <given-names>Yue</given-names>
            <surname>Shi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Liangjie</given-names>
            <surname>Hong</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>On Sampling Strategies for Neural Network-based Collaborative Filtering</article-title>
          .
          <source>In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , Halifax, NS, Canada, August 13-17,
          <year>2017</year>
          .
          <fpage>767</fpage>
          -
          <lpage>776</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Paul</given-names>
            <surname>Covington</surname>
          </string-name>
          , Jay Adams, and
          <string-name>
            <given-names>Emre</given-names>
            <surname>Sargin</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Deep Neural Networks for YouTube Recommendations</article-title>
          .
          <source>In Proceedings of the 10th ACM Conference on Recommender Systems (RecSys '16)</source>
          . ACM,
          <fpage>191</fpage>
          -
          <lpage>198</lpage>
          . https://doi.org/10.1145/2959100.2959190
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Abhinandan</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mayur</given-names>
            <surname>Datar</surname>
          </string-name>
          , Ashutosh Garg, and
          <string-name>
            <given-names>Shyamsundar</given-names>
            <surname>Rajaram</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Google news personalization: scalable online collaborative filtering</article-title>
          .
          <source>In Proceedings of the 16th International Conference on World Wide Web, WWW</source>
          <year>2007</year>
          , Banff, Alberta, Canada, May 8-12
          ,
          <year>2007</year>
          .
          <fpage>271</fpage>
          -
          <lpage>280</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16] Gabriel de Souza Pereira Moreira.
          <year>2018</year>
          .
          <article-title>CHAMELEON: a deep learning meta-architecture for news recommender systems</article-title>
          .
          <source>In Proceedings of the 12th ACM Conference on Recommender Systems, RecSys</source>
          <year>2018</year>
          , Vancouver, BC, Canada, October 2-7
          ,
          <year>2018</year>
          .
          <fpage>578</fpage>
          -
          <lpage>583</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17] Gabriel de Souza Pereira Moreira, Dietmar Jannach, and Adilson Marques da Cunha
          .
          <year>2019</year>
          .
          <article-title>Contextual Hybrid Session-based News Recommendation with Recurrent Neural Networks</article-title>
          .
          <source>CoRR</source>
          abs/1904.10367 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Elena Viorica</given-names>
            <surname>Epure</surname>
          </string-name>
          , Benjamin Kille, Jon Espen Ingvaldsen, Rébecca Deneckère, Camille Salinesi, and
          <string-name>
            <given-names>Sahin</given-names>
            <surname>Albayrak</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Recommending Personalized News in Short User Sessions</article-title>
          .
          <source>In Proceedings of the Eleventh ACM Conference on Recommender Systems, RecSys</source>
          <year>2017</year>
          , Como, Italy, August 27-31
          ,
          <year>2017</year>
          .
          <fpage>121</fpage>
          -
          <lpage>129</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Sahin Cem</given-names>
            <surname>Geyik</surname>
          </string-name>
          , Stuart Ambler, and
          <string-name>
            <given-names>Krishnaram</given-names>
            <surname>Kenthapadi</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Fairness-Aware Ranking in Search &amp; Recommendation Systems with Application to LinkedIn Talent Search</article-title>
          .
          <source>CoRR</source>
          abs/1905.01989 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Carlos A.</given-names>
            <surname>Gomez-Uribe</surname>
          </string-name>
          and
          <string-name>
            <given-names>Neil</given-names>
            <surname>Hunt</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>The Netflix Recommender System: Algorithms, Business Value, and Innovation</article-title>
          .
          <source>ACM Trans. Management Inf. Syst.</source>
          <volume>6</volume>
          ,
          <issue>4</issue>
          (
          <year>2016</year>
          ),
          <fpage>13:1</fpage>
          -
          <lpage>13:19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Asela</given-names>
            <surname>Gunawardana</surname>
          </string-name>
          and
          <string-name>
            <given-names>Guy</given-names>
            <surname>Shani</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>A Survey of Accuracy Evaluation Metrics of Recommendation Tasks</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          <volume>10</volume>
          (
          <year>2009</year>
          ),
          <fpage>2935</fpage>
          -
          <lpage>2962</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Malay</given-names>
            <surname>Haldar</surname>
          </string-name>
          , Mustafa Abdool, Prashant Ramanathan, Tao Xu,
          <string-name>
            <given-names>Shulin</given-names>
            <surname>Yang</surname>
          </string-name>
          , Huizhong Duan, Qing Zhang, Nick Barrow-Williams, Bradley C. Turnbull, Brendan M. Collins, and
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Legrand</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Applying Deep Learning To Airbnb Search</article-title>
          .
          <source>CoRR</source>
          abs/1810.09591 (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23] Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, and
          <string-name>
            <given-names>Domonkos</given-names>
            <surname>Tikk</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Session-based Recommendations with Recurrent Neural Networks</article-title>
          .
          <source>In 4th International Conference on Learning Representations, ICLR</source>
          <year>2016</year>
          , San Juan, Puerto Rico, May 2-4
          ,
          <year>2016</year>
          , Conference Track Proceedings.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Wouter</given-names>
            <surname>IJntema</surname>
          </string-name>
          , Frank Goossen, Flavius Frasincar, and
          <string-name>
            <given-names>Frederik</given-names>
            <surname>Hogenboom</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Ontology-based news recommendation</article-title>
          .
          <source>In EDBT/ICDT Workshops (ACM International Conference Proceeding Series)</source>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Dietmar</given-names>
            <surname>Jannach</surname>
          </string-name>
          and
          <string-name>
            <given-names>Malte</given-names>
            <surname>Ludewig</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>When Recurrent Neural Networks meet the Neighborhood for Session-Based Recommendation</article-title>
          .
          <source>In Proceedings of the Eleventh ACM Conference on Recommender Systems, RecSys</source>
          <year>2017</year>
          , Como, Italy, August 27-31
          ,
          <year>2017</year>
          .
          <fpage>306</fpage>
          -
          <lpage>310</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Tomonari</given-names>
            <surname>Kamba</surname>
          </string-name>
          , Krishna Bharat, and
          <string-name>
            <given-names>Michael C.</given-names>
            <surname>Albers</surname>
          </string-name>
          .
          <year>1996</year>
          .
          <article-title>The Krakatoa Chronicle: An Interactive Personalized Newspaper on the Web</article-title>
          .
          <source>World Wide Web Journal</source>
          <volume>1</volume>
          ,
          <issue>1</issue>
          (
          <year>1996</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Mozhgan</given-names>
            <surname>Karimi</surname>
          </string-name>
          , Dietmar Jannach, and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Jugovac</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>News recommender systems - Survey and roads ahead</article-title>
          .
          <source>Inf. Process. Manage</source>
          .
          <volume>54</volume>
          ,
          <issue>6</issue>
          (
          <year>2018</year>
          ),
          <fpage>1203</fpage>
          -
          <lpage>1227</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Romain</given-names>
            <surname>Lerallut</surname>
          </string-name>
          , Diane Gasselin, and Nicolas Le Roux.
          <year>2015</year>
          .
          <article-title>Large-Scale Real-Time Product Recommendation at Criteo</article-title>
          .
          <source>In Proceedings of the 9th ACM Conference on Recommender Systems, RecSys</source>
          <year>2015</year>
          , Vienna, Austria, September 16-20
          ,
          <year>2015</year>
          .
          <fpage>232</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Lei</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Dingding</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Tao</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Knox</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Balaji</given-names>
            <surname>Padmanabhan</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>SCENE: a scalable two-stage personalized news recommendation system</article-title>
          .
          <source>In Proceeding of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , SIGIR
          <year>2011</year>
          , Beijing, China, July 25-29
          ,
          <year>2011</year>
          .
          <fpage>125</fpage>
          -
          <lpage>134</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Lei</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Li</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Fan</given-names>
            <surname>Yang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Tao</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Modeling and broadening temporal user interest in personalized news recommendation</article-title>
          .
          <source>Expert Syst. Appl</source>
          .
          <volume>41</volume>
          ,
          <issue>7</issue>
          (
          <year>2014</year>
          ),
          <fpage>3168</fpage>
          -
          <lpage>3177</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Greg</given-names>
            <surname>Linden</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Eli Pariser is wrong</article-title>
          . http://glinden.blogspot.com/2011/05/eli-pariser-is-wrong.html Consulted on 21 June 2019.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Malte</given-names>
            <surname>Ludewig</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dietmar</given-names>
            <surname>Jannach</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Evaluation of session-based recommendation algorithms</article-title>
          .
          <source>User Model. User-Adapt. Interact</source>
          .
          <volume>28</volume>
          ,
          <issue>4-5</issue>
          (
          <year>2018</year>
          ),
          <fpage>331</fpage>
          -
          <lpage>390</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Rishabh</given-names>
            <surname>Mehrotra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>James</given-names>
            <surname>McInerney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Hugues</given-names>
            <surname>Bouchard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mounia</given-names>
            <surname>Lalmas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Fernando</given-names>
            <surname>Diaz</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Towards a Fair Marketplace: Counterfactual Evaluation of the trade-off between Relevance, Fairness &amp; Satisfaction in Recommendation Systems</article-title>
          .
          <source>In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018</source>
          , Torino, Italy, October 22-26, 2018.
          <fpage>2243</fpage>
          -
          <lpage>2251</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Tomoko</given-names>
            <surname>Murakami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Koichiro</given-names>
            <surname>Mori</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ryohei</given-names>
            <surname>Orihara</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Metrics for Evaluating the Serendipity of Recommendation Lists</article-title>
          .
          <source>In JSAI (Lecture Notes in Computer Science)</source>
          , Vol.
          <volume>4914</volume>
          . Springer,
          <fpage>40</fpage>
          -
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Nedelec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Elena</given-names>
            <surname>Smirnova</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Flavian</given-names>
            <surname>Vasile</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Specializing Joint Representations for the task of Product Recommendation</article-title>
          .
          <source>CoRR</source>
          abs/1706.07625 (
          <year>2017</year>
          ). arXiv:1706.07625 http://arxiv.org/abs/1706.07625
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>Nicholas</given-names>
            <surname>Negroponte</surname>
          </string-name>
          .
          <year>1996</year>
          .
          <source>Being Digital</source>
          . Random House Inc., New York, NY, USA.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Tien T.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Pik-Mai</given-names>
            <surname>Hui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. Maxwell</given-names>
            <surname>Harper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Loren G.</given-names>
            <surname>Terveen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Joseph A.</given-names>
            <surname>Konstan</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Exploring the filter bubble: the effect of using recommender systems on content diversity</article-title>
          .
          <source>In 23rd International World Wide Web Conference, WWW '14</source>
          , Seoul, Republic of Korea, April 7-11, 2014.
          <fpage>677</fpage>
          -
          <lpage>686</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
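          <string-name>
            <given-names>Özlem</given-names>
            <surname>Özgöbek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jon Atle</given-names>
            <surname>Gulla</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Riza Cenk</given-names>
            <surname>Erdur</surname>
          </string-name>
          .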
          <year>2014</year>
          .
          <article-title>A Survey on Challenges and Methods in News Recommendation</article-title>
          .
          <source>In WEBIST 2014 - Proceedings of the 10th International Conference on Web Information Systems and Technologies</source>
          , Volume
          <volume>2</volume>
          , Barcelona, Spain, 3-5 April, 2014.
          <fpage>278</fpage>
          -
          <lpage>285</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>Michael J.</given-names>
            <surname>Pazzani</surname>
          </string-name>
          and
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Billsus</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Content-Based Recommendation Systems</article-title>
          .
          <source>In The Adaptive Web (Lecture Notes in Computer Science)</source>
          , Vol.
          <volume>4321</volume>
          . Springer,
          <fpage>325</fpage>
          -
          <lpage>341</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>Massimo</given-names>
            <surname>Quadrana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Paolo</given-names>
            <surname>Cremonesi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Dietmar</given-names>
            <surname>Jannach</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Sequence-Aware Recommender Systems</article-title>
          .
          <source>ACM Comput. Surv</source>
          .
          <volume>51</volume>
          ,
          <issue>4</issue>
          (
          <year>2018</year>
          ),
          <fpage>66:1</fpage>
          -
          <lpage>66:36</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>Brent</given-names>
            <surname>Smith</surname>
          </string-name>
          and
          <string-name>
            <given-names>Greg</given-names>
            <surname>Linden</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Two Decades of Recommender Systems at Amazon.com</article-title>
          .
          <source>IEEE Internet Computing</source>
          <volume>21</volume>
          ,
          <issue>3</issue>
          (
          <year>2017</year>
          ),
          <fpage>12</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42]
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Valcarce</surname>
          </string-name>
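          ,
          <string-name>
            <given-names>Alejandro</given-names>
            <surname>Bellogín</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Javier</given-names>
            <surname>Parapar</surname>
          </string-name>
          , and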
          <string-name>
            <given-names>Pablo</given-names>
            <surname>Castells</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>On the robustness and discriminative power of information retrieval metrics for top-N recommendation</article-title>
          .
          <source>In Proceedings of the 12th ACM Conference on Recommender Systems, RecSys 2018</source>
          , Vancouver, BC, Canada, October 2-7, 2018.
          <fpage>260</fpage>
          -
          <lpage>268</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43]
          <string-name>
            <given-names>Shoujin</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Longbing</given-names>
            <surname>Cao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Yan</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>A Survey on Session-based Recommender Systems</article-title>
          .
          <source>CoRR</source>
          abs/1902.04864 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44]
          <string-name>
            <given-names>Jason</given-names>
            <surname>Weston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Hector</given-names>
            <surname>Yee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ron J.</given-names>
            <surname>Weiss</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Learning to rank recommendations with the k-order statistic loss</article-title>
          .
          <source>In Seventh ACM Conference on Recommender Systems, RecSys '13</source>
          , Hong Kong, China, October 12-16, 2013.
          <fpage>245</fpage>
          -
          <lpage>248</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref45">
        <mixed-citation>
          [45]
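          <string-name>
            <given-names>Shuai</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Lina</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Aixin</given-names>
            <surname>Sun</surname>
          </string-name>
          , and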
          <string-name>
            <given-names>Yi</given-names>
            <surname>Tay</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Deep Learning Based Recommender System: A Survey and New Perspectives</article-title>
          .
          <source>ACM Comput. Surv</source>
          .
          <volume>52</volume>
          ,
          <issue>1</issue>
          (
          <year>2019</year>
          ),
          <fpage>5:1</fpage>
          -
          <lpage>5:38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref46">
        <mixed-citation>
          [46]
          <string-name>
            <given-names>Lu</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Zhao</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kun</given-names>
            <surname>Han</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Ren</given-names>
            <surname>Mao</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Collaborative Multi-modal deep learning for the personalized product retrieval in Facebook Marketplace</article-title>
          .
          <source>CoRR</source>
          abs/1805.12312 (
          <year>2018</year>
          ). arXiv:1805.12312 http://arxiv.org/abs/1805.12312
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>