<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>September</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Combining Content-based and Collaborative Filtering for Personalized Sports News Recommendations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Philip Lenhart</string-name>
          <email>philip.lenhart@in.tum.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniel Herzog</string-name>
          <email>herzogd@in.tum.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Informatics, Technical University of Munich</institution>
          ,
          <addr-line>Boltzmannstr. 3, 85748 Garching</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <volume>16</volume>
      <issue>2016</issue>
      <abstract>
        <p>Sports news is a special case in the field of news recommendations, as readers often have a strong emotional attachment to selected sports, teams or players. Furthermore, the interest in a topic can suddenly change if, for example, an important sports event is taking place. In this work, we present a hybrid sports news recommender system that combines content-based recommendations with collaborative filtering. We developed a recommender dashboard and integrated it into the Sport1.de website. In a user study, we evaluated our solution. Results show that a pure content-based approach delivers accurate news recommendations and that users rate the usability of our recommender dashboard as high. Nevertheless, the collaborative filtering component of our hybrid approach is necessary to increase the diversity of the recommendations and to recommend older articles if they are of special importance to the user.</p>
      </abstract>
      <kwd-group>
        <kwd>Information systems → Recommender systems</kwd>
      </kwd-group>
      <kwd-group>
        <kwd>Recommender System</kwd>
        <kwd>Sports News</kwd>
        <kwd>Content-based</kwd>
        <kwd>Collaborative Filtering</kwd>
        <kwd>Hybrid</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Recommender systems (RSs) suggest items like movies,
songs or points of interest based on the user's preferences.
Traditional RSs have to face some challenges when
recommending such items. One of the most common problems
is the cold-start problem [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. News items without any
ratings cannot be recommended, while new users who have not yet
shared their preferences with the RS cannot receive
personalized recommendations. When recommending news,
recency plays a critical role [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. News has to be
up-to-date, but older articles can still be important if they are
connected to current events. Sports news represents only
one category of news, but it complicates the news
recommendation process. People interested in sports are often
characterized by a strong emotional attachment to selected
sports, teams or players. With regard to recommendations,
a user could favor a lot of news about one team
while she or he wants to avoid any information
about a rival. Furthermore, the user's interest in a topic can
suddenly change. For example, during the FIFA World Cup,
even some people who are not interested in soccer at all want
to be kept up-to-date on current results.
      </p>
      <p>In this work, we examine how well content-based
RSs work for recommending sports news. In addition, we
extend our RS with collaborative filtering. We develop a
recommender dashboard and integrate it into the website of the
German television channel and Internet portal Sport1. We
evaluate both algorithms and the usability of our prototype
in a user study.</p>
      <p>This paper is structured as follows: in Section 2, we present
related work and highlight our contribution to the current
state of research in content-based and hybrid news RSs. We
explain how we combine a content-based and a collaborative
filtering component into a hybrid sports news RS in Section 3.
Our development is evaluated in a user study. The results
of this study are summarized in Section 4. This work ends
with a conclusion and an outlook on future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. RELATED WORK</title>
      <p>
        Different approaches try to tackle the problem of
personalized news recommendations. One of the first news RSs
was developed and evaluated by the GroupLens project [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
The researchers used collaborative filtering to provide
personalized recommendations. A seven-week trial showed that
their predictions are meaningful and valuable to users.
Furthermore, they found that users value such predictions
for news: in the experiment, the participants tended
to read highly rated articles more often than less highly rated
articles. Liu et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] developed a news RS based on profiles
learned from user activity in Google News. They modeled
the user's interests by observing her or his past click history
and combined it with the local news trend. Compared with
an existing collaborative filtering method, their combined
method improved the quality of the recommendations and
attracted more frequent visits to the Google News website.
      </p>
      <p>
        Using article keywords to build user profiles for news
recommendations has already been researched. The
Personalized Information Network (PIN) creates user profiles from
so-called interest terms, which consist of one or more keywords
[
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Experiments show that PIN is able to deliver
personalized news recommendations on the fly.
      </p>
      <p>
        Some researchers used hybrid RSs that combine different
techniques to suggest news articles. Claypool et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
developed P-Tango, an online newspaper combining the strengths
of content-based and collaborative filtering. News@hand is
a system that makes use of semantic-based technologies to
recommend news [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. It creates ontology-based item
descriptions and user profiles to provide personalized,
context-aware, group-oriented and multi-facet recommendations. Its
hybrid models overcome some limitations of
traditional RS techniques such as the cold-start problem and
enable recommendations for grey sheep, i.e. users whose
preferences do not consistently agree or disagree with any
group of people [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The authors evaluated the
personalized and context-aware recommendation models in an
experiment with 16 participants. Results showed that the
combination of both models plus their semantic extension
provides the best results [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. De Pessemier et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] used
a hybrid approach to recommend news from different sources.
Their approach combines a search engine as a content-based
approach with collaborative filtering and uses implicit
feedback to determine whether the user is interested in a certain topic.
The recommendations are presented in a web application
optimized for mobile devices.
      </p>
      <p>
        Asikin and Worndl [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] presented approaches for
recommending news article by using spatial variables such as
geographic coordinates or the name and physical character of
a location. Their goal was to to deliver serendipitous
recommendation while improving the user satisfaction. A user
study showed that their approaches deliver news
recommendations that are more surprising than a baseline algorithm
but still favored by the users.
      </p>
      <p>To the best of our knowledge, no research has focused on the
special case of sports news. In this work,
we show how sports news can be recommended with
a content-based approach. In addition, we extend this RS
with a collaborative filtering component. In a user study, we
evaluate both approaches to find out whether the hybrid algorithm
improves the recommendations. We show how sports news
can be suggested to real users by developing and testing a
fully working recommender dashboard which can be
integrated into existing webpages.</p>
    </sec>
    <sec id="sec-4">
      <title>3. DEVELOPMENT OF A PERSONALIZED SPORTS NEWS RECOMMENDER SYSTEM</title>
      <p>This section explains the algorithms we used in our RS
and illustrates the user profile modeling that is needed
to provide personalized recommendations. Finally, the
prototype is shown to point out how our concepts are
implemented on a website.</p>
    </sec>
    <sec id="sec-5">
      <title>3.1 User Profile and Preference Elicitation</title>
      <p>The user's preferences with regard to sports news are
expressed by the keywords of the articles that she or he reads.
Each article in our recommendation database is
characterized by five to ten keywords which are automatically
generated by analyzing the article's text. We store a list
of keywords and how often each keyword occurs in articles
the user has read. The more articles the user reads, the
better the recommendations are tailored to
the user's preferences. In our first prototype, the counter for
each keyword is incremented by one when the user
reads an article containing this keyword. In future work,
the keywords in an article could be weighted according to
the relevance and importance of the keyword to the article.</p>
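      <p>The keyword counting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name <monospace>update_profile</monospace> and the example keywords are hypothetical, and the actual prototype keeps the profile in the browser rather than in Python.</p>
      <preformat preformat-type="code">
```python
from collections import Counter


def update_profile(profile, article_keywords):
    """Increment the counter of every keyword of an article the user has read."""
    # set() ensures a keyword listed twice for one article is counted once
    profile.update(set(article_keywords))
    return profile


# Hypothetical keywords of two articles read by the same user
profile = Counter()
update_profile(profile, ["bayern", "bundesliga", "transfer"])
update_profile(profile, ["bayern", "champions league"])
print(profile.most_common(1))
```
</preformat>
      <p>Keywords that appear in many read articles accumulate larger counters and therefore dominate the profile vector used for the similarity computation.</p>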
      <p>The new user problem affects every user who has not read
an article yet. As explained, sports news differs from other
kinds of news in the emotional attachment to selected sports,
teams or players. We use this finding to overcome the new
user problem. Before starting the recommendation process,
the user can specify her or his favorite sport and team. News
can then be recommended based on this selection, and the recommendations
improve as the user reads articles, thus providing
implicit feedback.</p>
    </sec>
    <sec id="sec-6">
      <title>3.2 Content-based Sports News Recommendations</title>
      <p>
        Content-based recommenders suggest items that are
similar to items the user liked in the past [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Since the user
profile uses weighted keywords, we use vector
representations of the profile and the articles to calculate the similarity
between two articles.
      </p>
      <p>One of the most important tasks of a news RS is to
provide articles that are not dated. Especially in the sports
news domain, the environment changes quickly, and usually
the user is not interested in news about a sports event or her
or his favorite team that is not up-to-date. The main
challenge for us was to determine how old sports news can be
before it is no longer considered for recommendation.
For our content-based RS, we only take news into account
that is not older than three days. Besides
restricting the recommendations to relevant articles, this decision promises a better
performance of the algorithm: the more articles are considered,
the longer the process of calculating the recommendations
takes. Our system currently uses only one news provider,
but if the system grows, this could lead to a significant loss
of performance. Our hybrid algorithm, which incorporates
collaborative filtering, is also able to provide older articles if
they are of high importance to the user (cf. Section 3.4).</p>
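      <p>A minimal sketch of the three-day recency filter, assuming each article is a dictionary with a hypothetical <monospace>published</monospace> timestamp; the field name and the function name are illustrative and not taken from the prototype.</p>
      <preformat preformat-type="code">
```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=3)  # threshold stated in the text


def recent_articles(articles, now):
    """Keep only articles published within the last three days."""
    return [a for a in articles if MAX_AGE >= now - a["published"]]


# Hypothetical article records
now = datetime(2016, 9, 16, 12, 0)
articles = [
    {"id": "a", "published": datetime(2016, 9, 15, 9, 0)},
    {"id": "b", "published": datetime(2016, 9, 10, 9, 0)},
]
print([a["id"] for a in recent_articles(articles, now)])
```
</preformat>
      <p>Filtering before the similarity computation keeps the number of candidate articles, and therefore the computation time, bounded.</p>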
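      <p>The formula below computes the similarity between two
articles (<italic>g</italic> and <italic>h</italic>):</p>
      <disp-formula>
        <label>(1)</label>
        <tex-math>\mathrm{sim}(g, h) = \frac{\sum_{i \in W} g_i \, h_i}{\sqrt{\sum_{i \in W} g_i^2 \cdot \sum_{i \in W} h_i^2}}</tex-math>
      </disp-formula>
      <p>where <italic>g</italic>, <italic>h</italic> are vectors representing articles with
weighted keywords, <italic>W</italic> is the set union of the particular keywords,
<italic>i</italic> is a keyword and
<italic>g</italic><sub><italic>i</italic></sub>, <italic>h</italic><sub><italic>i</italic></sub> are the weights of <italic>i</italic> in <italic>g</italic> and <italic>h</italic>, respectively.</p>
      <sec id="sec-6-1">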
        <p>In the computation of content-based similarity scores, we
only consider the relative magnitude of the keyword weights.
Because the weights in user profiles are of a different magnitude
than those in articles, using relative weights provides
better results for our system. As an illustration of the main
idea of the algorithm, consider the simple case where
the user profile contains two keywords with the weights 5
and 10, and an article contains the same two
keywords with the weights 1 and 2, respectively. In this
case, the similarity is 1, because the article and the user profile
have the same relative weights.</p>
        <p>The algorithm considers every article as an element of a
vector space in which the keywords form the basis. The
coordinate of an article in the direction of a keyword is given
by the weight of this keyword. If the keyword does not occur,
the weight is 0.</p>
        <p>We normalize each article vector by dividing it by its
Euclidean norm. Consequently,
the standard scalar product of two normalized vectors
yields the desired similarity measure. Even if there
are negative weights, e.g. for actively suppressed keywords,
the algorithm calculates similarities correctly.</p>
        <p>To understand the similarity calculation better,
consider how the algorithm works for an article compared with itself
(or with another article with the same weight proportions). In
this case, the scalar product is 1, because of the way the
vectors are normalized. If two articles have
disjoint keyword sets, however, the result is 0, because such vectors are
orthogonal to each other.</p>
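        <p>The similarity of formula 1 can be sketched as follows. The sketch assumes that profiles and articles are plain dictionaries mapping keywords to weights; the function name and the example keywords are hypothetical.</p>
        <preformat preformat-type="code">
```python
import math


def similarity(g, h):
    """Cosine similarity of two keyword-weight dictionaries (formula 1)."""
    keywords = set(g) | set(h)  # W: union of both keyword sets
    dot = sum(g.get(k, 0) * h.get(k, 0) for k in keywords)
    norm = math.sqrt(sum(v * v for v in g.values())
                     * sum(v * v for v in h.values()))
    # An empty or all-zero vector has no direction; treat it as dissimilar
    return dot / norm if norm else 0.0


profile = {"bayern": 5, "bundesliga": 10}
same_proportions = {"bayern": 1, "bundesliga": 2}
disjoint = {"darts": 3}
print(similarity(profile, same_proportions))
print(similarity(profile, disjoint))
```
</preformat>
        <p>With the weights 5 and 10 in the profile and 1 and 2 in the article, the scalar product is 25 and the product of the norms is also 25, which reproduces the similarity of 1 from the example above; disjoint keyword sets give 0.</p>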
        <p>In the end, the system sorts the articles in descending order of
similarity and returns the 50 articles with the highest score.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>3.3 Collaborative Filtering Component</title>
      <p>
        In contrast to content-based filtering, a collaborative RS
uses the ratings of other users to calculate the similarity of
articles [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Different algorithms for item-based
collaborative filtering exist. In the following, we describe some
common algorithms and explain our choice for a sports news RS.
For details, we refer to [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <sec id="sec-7-1">
        <title>Vector-based / Cosine-based Similarity:</title>
        <disp-formula>
          <label>(2)</label>
          <tex-math>\mathrm{sim}(i, j) = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\lVert \vec{i} \rVert \, \lVert \vec{j} \rVert}</tex-math>
        </disp-formula>
        <p>
          The first algorithm is the vector-based, also called
cosine-based, similarity. In this algorithm, two items are represented
as two vectors that contain the user ratings. The
similarity between item i and item j is calculated as the cosine of
the angle between the two vectors. The "·" denotes the dot
product of vector i and vector j [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. Cosine-based similarity does not consider the average rating
of an item; Pearson (correlation)-based similarity tries to
solve this issue.
        </p>
      </sec>
      <sec id="sec-7-2">
        <title>Pearson (Correlation)-based Similarity:</title>
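        <disp-formula>
          <label>(3)</label>
          <tex-math>\mathrm{sim}(i, j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_i)^2} \, \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_j)^2}}</tex-math>
        </disp-formula>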
        <p>The first step of this algorithm is to find the set of users U
that contains all users who rated both items i and j. These
items are called co-rated items. Items that are not co-rated are not
taken into consideration by this algorithm. The similarity
calculation is based on how much the rating of a user
deviates from the average rating of the item. <inline-formula><tex-math>R_{u,i}</tex-math></inline-formula> represents
the rating of a user u on item i and <inline-formula><tex-math>\bar{R}_i</tex-math></inline-formula> denotes the average
rating of an item i.</p>
      </sec>
      <sec id="sec-7-3">
        <title>Adjusted Cosine Similarity:</title>
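        <disp-formula>
          <label>(4)</label>
          <tex-math>\mathrm{sim}(i, j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_u)(R_{u,j} - \bar{R}_u)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_u)^2} \, \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_u)^2}}</tex-math>
        </disp-formula>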
        <p>Adjusted cosine similarity takes into account that the
rating behavior of different users differs. Some
users always give low ratings, while others
rate highly in general. To compensate for this,
the algorithm subtracts the average rating of a user <inline-formula><tex-math>\bar{R}_u</tex-math></inline-formula> from
each rating <inline-formula><tex-math>R_{u,i}</tex-math></inline-formula> and <inline-formula><tex-math>R_{u,j}</tex-math></inline-formula> on the items i and j.</p>
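        <p>A minimal sketch of the adjusted cosine similarity of formula 4. It assumes ratings are held in a nested dictionary of the form user, then item, then rating, and that the user average is taken over all ratings of that user; both choices and all names are illustrative assumptions rather than the prototype's data layout.</p>
        <preformat preformat-type="code">
```python
import math


def adjusted_cosine(ratings, i, j):
    """Adjusted cosine similarity of items i and j, a value in [-1, 1].

    ratings: {user: {item: rating}}
    """
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if i not in user_ratings or j not in user_ratings:
            continue  # only users who rated both items belong to U
        mean_u = sum(user_ratings.values()) / len(user_ratings)
        dev_i = user_ratings[i] - mean_u
        dev_j = user_ratings[j] - mean_u
        num += dev_i * dev_j
        den_i += dev_i * dev_i
        den_j += dev_j * dev_j
    den = math.sqrt(den_i) * math.sqrt(den_j)
    return num / den if den else 0.0


# Hypothetical ratings: u1 rates generously, u2 rates strictly
ratings = {
    "u1": {"a": 5, "b": 5, "c": 2},
    "u2": {"a": 1, "b": 1, "c": 4},
}
print(adjusted_cosine(ratings, "a", "b"))
print(adjusted_cosine(ratings, "a", "c"))
```
</preformat>
        <p>Although the two users rate on very different levels, both place items a and b on the same side of their personal average, so the two items come out as highly similar, while a and c come out as opposites.</p>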
        <p>The presented advantage is the reason why we apply the
adjusted cosine similarity in our development. First, the
system has to calculate the related articles list of all articles.
To compute the related article list of an article, we iterate
through the list of all articles. If the current article is not
the same as the article to compare, we will calculate the
similarity.</p>
        <p>The function returns a value between minus one and one.
Since the article rating range is from one to five, we map the
similarity to the rating range by using the linear function
          <disp-formula>
            <label>(5)</label>
            <tex-math>\mathrm{sim}' = 2 \cdot \mathrm{sim} + 3</tex-math>
          </disp-formula>
        </p>
        <p>There is one major problem in the adjusted cosine
similarity calculation. When there is just one common user
between two articles, the similarity for those items is one, which
corresponds to the highest value of the rating range. This is due to the
subtraction of the average rating from the user's rating. To
avoid the effect that the best rated articles are the articles
with just one common user, we specified a minimum
number of users that two articles need to have in common. In
our implementation, the minimum number of common users
is five. When there are fewer than five common users, the
articles are not considered in our related article list.</p>
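        <p>The mapping to the rating range and the minimum number of common users can be sketched together. The constant and function names are hypothetical; only the threshold of five users and the linear mapping are taken from the text.</p>
        <preformat preformat-type="code">
```python
MIN_COMMON_USERS = 5  # threshold stated in the text


def to_rating_scale(sim):
    """Map a similarity in [-1, 1] linearly onto the rating range [1, 5]."""
    return 2 * sim + 3


def related_score(sim, common_users):
    """Return the mapped similarity, or None if too few users rated both articles."""
    if MIN_COMMON_USERS > len(common_users):
        return None  # pair is left out of the related article list
    return to_rating_scale(sim)


print(to_rating_scale(-1), to_rating_scale(0), to_rating_scale(1))
print(related_score(1.0, {"u1"}))
print(related_score(0.5, {"u1", "u2", "u3", "u4", "u5"}))
```
</preformat>
        <p>The endpoints -1 and 1 map to 1 and 5, and a neutral similarity of 0 maps to the middle rating 3. A pair with a single common user is dropped regardless of its similarity value.</p>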
        <p>Afterwards, we sort the list by similarity. Moreover, we set
a limit of 50 related articles to avoid additional expenses due
to articles that are not considered for computation. When
the related article list is calculated, we can predict the top
articles for a user. For each article in the related articles
list, we check if the user has already read the article. If that
is the case, the article is not recommended anymore and the
system jumps to the next article. If it is a new article, the
prediction is calculated and added to the recommendation
list. After calculating the prediction for every article, we
sort the recommendation list by the predicted value.</p>
        <p>
          The prediction <inline-formula><tex-math>P_{u,i}</tex-math></inline-formula> can be calculated by the weighted sum
method [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]:
        </p>
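        <disp-formula>
          <label>(6)</label>
          <tex-math>P_{u,i} = \frac{\sum_{\text{all similar items},\, N} \left( s_{i,N} \cdot R_{u,N} \right)}{\sum_{\text{all similar items},\, N} \left| s_{i,N} \right|}</tex-math>
        </disp-formula>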
        <p>
          This approach is "computing the sum of the ratings given
by the user on the items similar to i" [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. Each
rating <inline-formula><tex-math>R_{u,N}</tex-math></inline-formula> is weighted by the similarity <inline-formula><tex-math>s_{i,N}</tex-math></inline-formula> between item i and
the similar item N. The basic idea of this approach is to find items
that are predicted to be liked by the user. The top predicted
items are recommended to the user.
        </p>
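        <p>A minimal sketch of the weighted sum prediction of formula 6 for a single target article. It assumes two dictionaries as input, one with the user's ratings and one with the similarities of related articles to the target; these structures and the function name are illustrative assumptions.</p>
        <preformat preformat-type="code">
```python
def predict(user_ratings, similarities):
    """Weighted-sum prediction for one target article.

    user_ratings: {article: rating given by the user}
    similarities: {article: similarity of that article to the target}
    """
    # Only related articles the user has actually rated contribute
    rated = [a for a in similarities if a in user_ratings]
    denom = sum(abs(similarities[a]) for a in rated)
    if denom == 0:
        return None  # no rated related article, so no prediction
    return sum(similarities[a] * user_ratings[a] for a in rated) / denom


# Hypothetical values: similarities already mapped to the rating range
user_ratings = {"a": 5, "b": 3}
similarities = {"a": 4.0, "b": 2.0, "c": 5.0}
print(predict(user_ratings, similarities))
```
</preformat>
        <p>In this example the prediction is (4 · 5 + 2 · 3) / (4 + 2), roughly 4.33; article c is ignored because the user has not rated it.</p>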
        <p>A key advantage over content-based filtering techniques
is that collaborative RSs are able to provide a
bigger variety of topics. Furthermore, with collaborative
techniques, it is possible to provide event- or trend-based
recommendations, such as news about the World Cup. A pure
content-based RS is not able to recommend news about
the darts championship if the user has only read football
articles before.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>3.4 Weighted Hybrid Recommender</title>
      <p>In this section, we explain how we combine the
content-based and the collaborative components into a hybrid sports
news RS.</p>
      <p>
        As the combination technique, we use the weighted hybrid
strategy as described in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. For our first version, we decided
to weight both components equally. The content-based
component is important for recommending new articles even if
no ratings exist. Additionally, the content-based system is
able to provide content to users with special interests.
Moreover, the content-based component is important because
of fan culture and constant interest in some topics. However, we
decided that the collaborative filtering part is as important
as the content-based component, due to the event-based
environment and the changing popularity of some sports. We
want the system to be able to recommend articles that are
attractive for just a short period of time. For example, many
people are interested in the Olympic Games, but not in the
different kinds of sports in general.
      </p>
      <p>We decided that the weights are only applied if both
components of our system recommend the corresponding
article. Otherwise, additional requests would have to be sent to
calculate a combined score for each article. If just one
component recommends the article, the score of this
component is taken with the full weight. Due to this procedure, we
are able to provide recommendations of both components.</p>
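      <p>The combination rule can be sketched as follows. The sketch assumes that both components return scores on a comparable scale, which the text does not specify; the function name, the dictionaries and the example scores are hypothetical.</p>
      <preformat preformat-type="code">
```python
def combine(content_scores, collab_scores, w_content=0.5, w_collab=0.5):
    """Weighted hybrid: weights apply only if both components scored the article.

    Assumption: both score dictionaries use the same scale.
    Returns the article IDs ordered by combined score, best first.
    """
    combined = {}
    for article in set(content_scores) | set(collab_scores):
        if article in content_scores and article in collab_scores:
            combined[article] = (w_content * content_scores[article]
                                 + w_collab * collab_scores[article])
        elif article in content_scores:
            combined[article] = content_scores[article]  # full weight
        else:
            combined[article] = collab_scores[article]   # full weight
    return sorted(combined, key=combined.get, reverse=True)


content = {"a": 0.9, "b": 0.4}
collab = {"b": 0.8, "c": 0.7}
print(combine(content, collab))
```
</preformat>
      <p>Article b is the only one scored by both components, so it is the only one whose score is averaged; a and c keep the score of the single component that recommended them.</p>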
    </sec>
    <sec id="sec-9">
      <title>3.5 Implementation</title>
      <p>We developed a dashboard widget which can be integrated
in existing websites to provide personalized sports news
recommendations. For the development of the front-end, we
used the JavaScript framework AngularJS, the style sheet
language Less and HTML5 local storage. Figure 1 shows a
current screenshot of our recommender dashboard. Nine
recommendations are presented at one time. When the mouse
is moved over one article, the user can read it ("Ansehen")
or reject the recommendation ("Entfernen").</p>
      <p>It is critical to identify the user every time she or he
accesses the RS in order to provide personalized recommendations. We
avoided implementing a mandatory login, as this could be a
big obstacle for new users visiting a sports news website.
Instead, we calculate a Globally Unique Identifier (GUID)
which is then stored in HTML5 local storage without an
expiration date. This is an important advantage for our RS.
Because HTML5 local storage has no explicit
lifecycle, we can use it not only for user identification, but
also for generating a profile of the user. Storing this data
on the client side decreases the amount of data stored on
the server, which makes the system more scalable. Only the
item similarities and recommendations are calculated on the
server, due to direct access to articles from our backend.</p>
      <p>In order to get content-based recommendations, the client
sends an Ajax request to a NodeJS server. For this purpose, the
user ID and the corresponding profile are sent as
parameters. We decided to use an Ajax request because
the computation causes no overhead at site loading if it is
triggered from JavaScript code. At our backend, the weighted
keyword profile is sorted alphabetically by keyword
name. As mentioned, we retrieve articles published within
the last three days. After obtaining those articles from our
article repository, we add the corresponding keywords to each of
them. The weighted keywords are in the same form as the
user profile to make them comparable to each other. The
system calculates the similarity of the user profile with
every article. To do so, the union of the keyword sets is built.
Subsequently, the similarity is computed using formula 1.</p>
      <p>A JSON response sends the 50 articles with the highest
score back to the client. The response is then processed by
the Angular directive of the personalized dashboard. If the
user removes an article, the next recommended article takes
over its place. In addition, further statistics like the last
read articles or last and next matches of the preferred team
are displayed.</p>
      <p>For the computation of collaborative recommended
articles we use the same NodeJS server. In contrast to other
systems, we do not store our data in a database. Due to the
fact that we have to iterate through lists most of the time
to compare ratings and users, we decided to use arrays to
store our data within the application. The ratings provided
by the user are collected in a rating variable that is kept in
memory. It stores JSON objects with the user ID, the
article path and the provided rating. Furthermore, the current
date is used to distinguish current data from dated ratings
that are not relevant for our system anymore.</p>
      <p>To speed up the similarity computation, we update the
average rating of a user every time she or he provides a new rating.
The average ratings are kept in an extra variable for
performance reasons. The current average rating and the number
of ratings provided by the given user are enough to update the
average. Just a few basic arithmetic operations are
necessary, which avoids calculating the average from the rating variable
from scratch every time. We minimize the accesses to the
rating variable because this variable is the main
component of our server. Most of the requests read or write
this variable. Every variable access that can be eliminated
helps to improve the system's performance.</p>
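      <p>The incremental update of a user's average rating can be sketched as follows. The function name is hypothetical; the sketch only illustrates that the previous average and the number of ratings are sufficient.</p>
      <preformat preformat-type="code">
```python
def updated_average(current_average, rating_count, new_rating):
    """Return the new average and count after one additional rating."""
    new_count = rating_count + 1
    # Shift the old average towards the new rating by 1/new_count of the gap
    new_average = current_average + (new_rating - current_average) / new_count
    return new_average, new_count


# A user with two ratings averaging 3.0 provides a new rating of 5
average, count = updated_average(3.0, 2, 5)
print(average, count)
```
</preformat>
      <p>The result equals (3 + 3 + 5) / 3, roughly 3.67, without touching the stored list of ratings.</p>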
      <p>Moreover, we store a list of users as well as a list of
articles so that we can iterate through these arrays without generating them
first. Using a list of all articles is primarily important when
the system computes related articles. The list of related
articles is updated every hour by a cronjob,
so that current news is considered as well. After one hour,
more ratings have been provided, and the new item problem of a
pure collaborative RS is mitigated.</p>
      <p>For the similarity calculation of two items, we need to find the
set of users that contains all users who rated both items.
Therefore, we generate a list of objects that contain the
articles and all the users who rated the corresponding article.
To compute the user set of two articles, we compare the two
user lists and determine the intersection.</p>
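      <p>A minimal sketch of this lookup, assuming rating records with hypothetical <monospace>user</monospace> and <monospace>article</monospace> fields; the real server keeps comparable JSON objects in memory, but the field and function names here are illustrative.</p>
      <preformat preformat-type="code">
```python
def users_by_article(ratings):
    """Group user IDs by article path from a flat list of rating records."""
    index = {}
    for record in ratings:
        index.setdefault(record["article"], set()).add(record["user"])
    return index


def common_users(index, article_a, article_b):
    """Users who rated both articles (the set U of the similarity formulas)."""
    return index.get(article_a, set()).intersection(index.get(article_b, set()))


# Hypothetical rating records
ratings = [
    {"user": "u1", "article": "/fussball/a", "rating": 4},
    {"user": "u2", "article": "/fussball/a", "rating": 2},
    {"user": "u1", "article": "/darts/b", "rating": 5},
]
index = users_by_article(ratings)
print(common_users(index, "/fussball/a", "/darts/b"))
```
</preformat>
      <p>Building the index once per update run avoids scanning the full rating list for every pair of articles.</p>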
      <p>The combination of the content-based and the
collaborative part of our RS is implemented in JavaScript. First, we
send an Ajax request to our backend to collect the
content-based recommended articles. In addition, another request
is sent to our NodeJS server where the collaboratively filtered
articles are computed. If the collaboratively filtered
recommendations are returned correctly, the system computes the
combination of both article sets. Finally, the recommended
articles are returned and the JSON response is sent to the
application.</p>
      <p>In the news domain, the age of an article is definitely one
of the most important properties when the article's
attractiveness for a user is determined. Because of the recency
problem, we decided to implement a route in our NodeJS
server to remove dated ratings and articles from our system.
Every two weeks a cronjob is executed and every rating that
is older than four weeks is removed from the ratings table.
The removal of those ratings has the side effect
that old users that do not exist anymore are removed as well.
This is a very common scenario in our system, because
we identify the user by using HTML5 local storage.
If the local storage is deleted, the old user ID does not occur
anymore. We decided to use these time intervals because
our content-based version considers only articles published
within the last three days, and we want to provide recommendations
of articles older than a few days as well if an older article
becomes popular again. In this case, our system is able
to recommend those articles as long as ratings have been
provided in the last four weeks.</p>
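      <p>The clean-up step can be sketched as follows, assuming rating records that carry a hypothetical <monospace>date</monospace> field. The four-week threshold is taken from the text; everything else is illustrative.</p>
      <preformat preformat-type="code">
```python
from datetime import datetime, timedelta

MAX_RATING_AGE = timedelta(weeks=4)  # threshold stated in the text


def purge_old_ratings(ratings, now):
    """Drop ratings older than four weeks; return kept ratings and active users."""
    kept = [r for r in ratings if MAX_RATING_AGE >= now - r["date"]]
    # Users without any remaining rating disappear implicitly
    active_users = {r["user"] for r in kept}
    return kept, active_users


now = datetime(2016, 9, 16)
ratings = [
    {"user": "u1", "article": "/a", "rating": 4, "date": datetime(2016, 9, 10)},
    {"user": "u2", "article": "/a", "rating": 3, "date": datetime(2016, 7, 1)},
]
kept, active_users = purge_old_ratings(ratings, now)
print(len(kept), active_users)
```
</preformat>
      <p>Deriving the user list from the remaining ratings reproduces the side effect described above: an ID whose local storage was deleted stops producing ratings and is dropped after four weeks.</p>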
    </sec>
    <sec id="sec-10">
      <title>4. EVALUATION</title>
      <p>We conducted user studies to evaluate our algorithms and
the usability of our recommender dashboard. In this section,
we present the goals, the procedure and the results of our
evaluation. We interpret our findings to answer the question
of how well content-based algorithms support users in receiving
interesting sports news and whether a hybrid algorithm can
improve the performance of our RS.</p>
    </sec>
    <sec id="sec-12">
      <title>4.1 Analysis of Usage Data of the Content-based Recommender</title>
      <p>In order to collect usage data from real users, we tested the
content-based approach on the live version of the Sport1
website. For this purpose, the recommender dashboard
prototype was presented to one percent of the users. Since
the website is visited by thousands of users every
day, one percent of the users is enough to evaluate not only
the functionality but also the usability of our RS. In the future,
we will gradually increase the number of test users
and adapt our implementation accordingly. We used Google
Analytics to measure relevant Key Performance Indicators
(KPIs) that help us to evaluate our solution. We analyzed
how often the users clicked on the read and the remove
button, respectively. Moreover, we tested how often the users
navigated to articles they had already read by using the
last read articles widget. In addition to the event tracking,
we analyzed whether the new personalized dashboard has an
impact on the article ratings by
comparing the average ratings of different articles. Articles are
only taken into account if they were rated by the one percent
of users who can use the personalized dashboard.</p>
      <p>At the end of our live study, 5132 user IDs were registered
on our server. This does not mean that more than 5000
different users used the dashboard, because every
device has its own GUID and a new ID is generated if the
browser history is deleted. However, there were enough
users producing events we can track.</p>
      <p>The click behavior of the users gives information about
the user acceptance of the different components. Figure
2 illustrates how many clicks were executed on the different
components of the personalized dashboard.</p>
      <p>Almost 50 percent of the clicks were executed on the
remove button of the news recommendation widget. At
first sight, this number is quite high. But if we consider that
all the other buttons navigate the user to another page,
it is plausible that the remove button is clicked more often
than all the other buttons. If the user clicks on remove, the
article disappears and a new one is displayed in the
dashboard. The user is then able to interact with the
dashboard again. 27 percent of the clicks were executed on the view
article button, which is a good proportion. Especially if we
consider that the RS is new, it is notable that after about every
third interaction, an article that potentially fits the
interests of the user is recommended. To get better information
about the quality of the recommendations, we need to
conduct a long-term study. The sports news domain is very
dynamic and the click behavior changes depending on
current events. For that reason, the two-week evaluation
is not enough to ensure that the number of clicks on an
element remains stable. Around one quarter of all clicks
were executed on links and buttons which are not part of the
recommender dashboard but provide additional information
such as last read articles and team-related statistics such as
last and next matches and top scorers.</p>
      <p>We expect that the quality of the recommendations
increases with the time of use. To test this assumption, we
analyzed the trend of the remove button clicks. Except for
a few days, the ratio of clicks on remove decreased with every
day of our test (cf. Figure 3). The exceptions
may be due to new users or users who do not read many
articles on the website. If no or only a few articles are read
before using our dashboard, the quality of the
recommendations will be low. Since the dashboard is only presented to
one percent of the users, we cannot provide evidence
that the subjective quality will be the same when
the dashboard is published to all users. When the number
of testers increases, the ratio of remove button clicks rises at the
beginning and then falls again with the time of use. We
observed this when we released the dashboard on the website.</p>
      <p>To analyze whether the dashboard has an effect on the article
ratings, we compared three types of articles: first, articles
that are generally rated poorly; second, articles with average ratings;
and finally, articles that are generally rated highly. A
generally poorly rated article gets a better score from our RS users.
This is mainly due to the fact that we record an implicit
rating of three stars if a user reads the article. Moreover, the
bulk of the users do not provide any explicit rating for an article.
So the average rating of the testers is almost at three stars
for poorly and average rated articles. For highly rated articles,
the RS users scored a little lower in general. The chance
that our system recommends such an article is higher
because more comparable users are available for the
most read articles. If the user clicks on remove, the lowest
rating of one star is implicitly provided and the average
rating decreases. Since the personalized dashboard is not yet
established on the website, we conclude that the
recommendations have almost no effect on the rating scores. This
may change if users adopt the dashboard as their first
contact point on the website.</p>
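      <p>The pull towards three stars can be illustrated with a small Python sketch. It assumes that every read counts as one implicit three-star rating and every remove as one implicit one-star rating, as stated above; how explicit and implicit ratings of the same user are reconciled is not modelled, and all numbers in the example calls are hypothetical.</p>
      <preformat>
```python
READ_STARS = 3     # implicit rating when a user reads an article (from the text)
REMOVE_STARS = 1   # implicit rating when a user clicks on remove (from the text)


def average_rating(explicit_ratings, reads, removes):
    """Average star rating over explicit ratings and implicit feedback events."""
    total = sum(explicit_ratings) + READ_STARS * reads + REMOVE_STARS * removes
    count = len(explicit_ratings) + reads + removes
    if count == 0:
        return None  # no feedback at all
    return total / count


# Hypothetical poorly rated article: two explicit one-star ratings, twenty silent readers.
print(round(average_rating([1, 1], reads=20, removes=0), 2))       # 2.82
# Hypothetical highly rated article: implicit feedback pulls the average down.
print(round(average_rating([5, 5, 5, 5], reads=4, removes=2), 2))  # 3.4
```
      </preformat>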
      <p>It was also noticeable that the users want to revisit articles
they have already read. The last read articles widget helps them
to navigate back and easily get an overview of their last
interactions. We expect that the number of clicks in this widget
will decrease in the future. Since new articles are
potentially more attractive for a user, we do not expect that
every tenth click will continue to be executed on an already read article. We
believe that the users in the study were curious and wanted
to test this new feature.</p>
    </sec>
    <sec id="sec-14">
      <title>Comparison of Content-based and Hybrid Recommendations</title>
      <p>Our hybrid algorithm extends the presented content-based
approach by a collaborative filtering component. This
algorithm is not part of the live version of the RS yet. We tested
the hybrid recommendations with a selected user group. We
took care to choose people from different backgrounds
and with diverging interests to ensure comparability to our
users.</p>
      <p>The participants had to rate the RS on two criteria, each on a
scale from 1 (worst rating) to 5 (best rating): how well the
recommendations fit their interests and how diversified they
are. The pure content-based solution served as the baseline
algorithm. In total, we received 40 completed questionnaires
for the content-based approach and 20 for the extended,
hybrid RS.</p>
      <p>The results show that the recommendations are not
diversified enough in our pure content-based approach (average rating 2.9), but
they improve in our hybrid implementation, where the
average rating was 3.3. The content-based recommendations
represent the interests of the user (average rating 3.4), which shows us
that the dashboard provides additional value. This value did
not change in our hybrid version. It was noticeable that the
more frequently a user visits the website, the more satisfied
he or she is with the recommendations. The users
that visited the website every day gave an average rating of
3.6. This supports our assumption that the quality of our recommendations
increases over time.</p>
    </sec>
    <sec id="sec-15">
      <title>Usability of the Recommender Dashboard</title>
      <p>
        Besides evaluating our recommendation algorithms, we
asked the study participants to rate the usability of our
recommender dashboard. We used the well-established System
Usability Scale (SUS) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This questionnaire consists of ten
questions providing a global view of subjective assessments
of usability. Participants respond using a Likert scale with
five response options, ranging from Strongly agree to Strongly
disagree. Furthermore, our participants were allowed to add
further thoughts in a free-text field.
      </p>
      <p>
        To calculate the SUS score, the answer to each question
is converted to a new score from 0 to 4, where 4 is the
best score and 0 represents the worst possible answer to
the question. Afterwards, the ten scores are added
together and multiplied by 2.5 to get an overall value between
0 and 100 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. A SUS score above 68 is considered above average;
a score below 68 is considered below average
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The average scores of each question, collected in our
user feedback, are shown in Table 1.
      </p>
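      <p>The score is calculated by adding the converted scores <inline-formula><tex-math>s_i</tex-math></inline-formula> of the ten questions and
multiplying the sum by 2.5:
<disp-formula><tex-math>\mathit{score} = 2.5 \cdot \sum_{i=1}^{10} s_i</tex-math></disp-formula></p>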
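      <p>As a hedged illustration of this calculation, the following Python sketch converts raw Likert responses to the range from 0 to 4 and computes the SUS score. It assumes the standard SUS convention that odd-numbered items are positively worded and even-numbered items negatively worded, and that responses are coded from 1 (Strongly disagree) to 5 (Strongly agree). The function names are illustrative and not part of the evaluated system.</p>
      <preformat>
```python
def convert_response(item_number, response):
    """Map a raw Likert response (1..5) to a converted score (0..4, 4 is best).

    Assumes standard SUS wording: odd items are positive, even items negative.
    """
    if response not in (1, 2, 3, 4, 5):
        raise ValueError("response must be an integer from 1 to 5")
    if item_number % 2 == 1:
        return response - 1      # positive item: Strongly agree (5) is best
    return 5 - response          # negative item: Strongly disagree (1) is best


def sus_score(converted_scores):
    """Sum ten converted scores (each 0..4) and scale the sum to 0..100."""
    if len(converted_scores) != 10:
        raise ValueError("SUS needs exactly ten question scores")
    if not (min(converted_scores) >= 0 and 4 >= max(converted_scores)):
        raise ValueError("converted scores must lie between 0 and 4")
    return 2.5 * sum(converted_scores)


# One hypothetical participant who answers every item as favourably as possible.
responses = [5, 1, 5, 1, 5, 1, 5, 1, 5, 1]
converted = [convert_response(i, r) for i, r in enumerate(responses, start=1)]
print(sus_score(converted))  # 100.0
```
      </preformat>
      <p>The same function can be applied to per-question averages over all participants, since the sum of the averages equals the average of the sums.</p>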
      <p>The score of 87 exceeded our expectations, although we
attached great importance to the design and usability of
our system. This was required because the dashboard is
deployed on the live website of Sport1. Nevertheless, the
users mentioned some wishes concerning the usability. For
example, some users wished to change the design by choosing
their own colors. Since this feedback has no direct
impact on our RS implementation, we do not elaborate on these
suggestions here.</p>
    </sec>
    <sec id="sec-16">
      <title>Discussion</title>
      <p>The user study results show that our content-based RS
is a promising approach to suggest sports news to users.
The recommendations fit the users' interests and improve
when the users provide more feedback. Nevertheless, the
diversity of the recommended articles remains low.
This is a typical problem of pure content-based RSs and can
be overcome by using a hybrid solution. We extended the
RS by a collaborative filtering component, which increased
the diversity of the recommendations.</p>
      <p>As described before, it is very important that a news RS
provides current articles. Especially in sports, the
environment is very dynamic and the news topics are changing all
the time. For that reason, the system cannot be a pure
collaborative RS. With collaborative filtering, it is almost
impossible to recommend new items. But this problem can be
solved by using a content-based component as well.
Content-based RSs can provide content that fits the general
interests of the users. In addition, attention should be paid
to event-based interests, e.g., the Super Bowl, an event that
is closely followed by many people. If a user has no
interest in American football in general, the content-based RS
does not provide articles about the Super Bowl. So there
must be a combination of both techniques to benefit from
the strengths of each component.</p>
    </sec>
    <sec id="sec-17">
      <title>CONCLUSION AND FUTURE WORK</title>
      <p>In this work, we tackled the problem of recommending
sports news. Sports news is a special case in the field of
news recommendations, as users often come with a strong
emotional attachment to selected sports, teams or players.
Furthermore, the interest in a topic is event-driven and can
suddenly change. We developed a content-based RS that
creates user profiles based on implicit feedback the user
shares when reading articles. Using automatically created
keywords, the similarity between articles can be measured
and the relevance for the user can be predicted. This
approach delivers accurate recommendations but lacks
diversity. In a first prototype, we designed and evaluated a hybrid
algorithm that extends our content-based RS by a
collaborative component. This hybrid approach increases diversity
and also allows us to recommend older articles if they are of
particular interest to the user.</p>
      <p>To improve the quality of the hybrid recommendations,
we will adjust our implementation iteratively and
test whether the adaptations serve their purpose. First, we will test
different weights for the two components. One idea is to
increase the weight of the content-based component. Decreasing
the weight of the collaborative component does not exclude
the event-based recommendations. Even if the collaborative
part only counts for one third, it is still able to provide
recommendations: if an article is only recommended by
our collaborative component, just the score of this component is
taken into account. If both components recommend the
article, the content-based score, which is more closely adapted
to the user's interests, dominates. To find out which weight ratio is
best for our case, we have to analyze the implicit and
explicit user feedback over a longer time period. The evaluation
of the weights is only meaningful if the feedback is collected
for a few months to avoid temporal fluctuations, which are
quite common in the news domain.</p>
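      <p>A minimal Python sketch of such a weighted combination is given below. It is an illustration under stated assumptions, not the deployed implementation: both components are assumed to return relevance scores on the same scale, a missing recommendation is represented by None, and the symmetric fallback for articles recommended only by the content-based component is an assumption. The function name and the default weight of two thirds are illustrative.</p>
      <preformat>
```python
def hybrid_score(content_score, collaborative_score, content_weight=2 / 3):
    """Combine two relevance scores; None means a component did not recommend the article."""
    if not (content_weight >= 0 and 1 >= content_weight):
        raise ValueError("content_weight must lie between 0 and 1")
    if content_score is None and collaborative_score is None:
        return None                  # neither component recommends the article
    if content_score is None:
        return collaborative_score   # only the collaborative score is taken into account
    if collaborative_score is None:
        return content_score         # assumed symmetric fallback
    return content_weight * content_score + (1 - content_weight) * collaborative_score


# Event-driven article that only the collaborative component recommends.
print(hybrid_score(None, 0.9))               # 0.9
# Article recommended by both components; the content-based score dominates.
print(round(hybrid_score(0.6, 0.3), 3))      # 0.5
```
      </preformat>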
      <p>Furthermore, we want to implement a switching hybrid
as well. If there is a new item, the collaborative filtering
method cannot provide recommendations from the first
second. This is the strength of our content-based component. The
RS has to switch to the content-based component if the
article is newer than a specific date. Recommendations for a
new user are calculated by our collaborative filtering
component, because the preferences available at the
first use of the system are not sufficient to compute pure
content-based recommendations. If a larger user profile has been
constructed and an article was not published within the last
minutes, the combination of both techniques will be applied as
described before.</p>
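      <p>The switching rule can be sketched in Python as follows. The thresholds fresh_minutes and min_profile_size are hypothetical placeholders, as is the function name. The sketch also assumes that the check for a new article takes precedence when a new user meets a new article, a case the description above leaves open.</p>
      <preformat>
```python
def choose_strategy(article_age_minutes, profile_size,
                    fresh_minutes=30, min_profile_size=5):
    """Pick the recommender used to score one article for one user.

    fresh_minutes and min_profile_size are hypothetical thresholds.
    """
    if fresh_minutes > article_age_minutes:
        return "content-based"     # new item: no collaborative data yet
    if min_profile_size > profile_size:
        return "collaborative"     # new user: profile too small for content-based
    return "weighted-hybrid"       # enough data for both components


print(choose_strategy(article_age_minutes=2, profile_size=40))    # content-based
print(choose_strategy(article_age_minutes=600, profile_size=0))   # collaborative
print(choose_strategy(article_age_minutes=600, profile_size=40))  # weighted-hybrid
```
      </preformat>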
      <p>We tested our first developments in a two-week user study.
Our content-based RS has been tested with live users, while
the hybrid approach was only accessible to a selected user
group. In the future, we want to conduct larger studies with
more users for all algorithms we develop. Our first results
will serve as the baseline for future extensions and other
algorithms.</p>
      <sec id="sec-17-1">
        <title>Question</title>
        <p>I think that I would like to use this system frequently.</p>
        <p>I found the system unnecessarily complex.</p>
        <p>I thought the system was easy to use.</p>
        <p>I think that I would need the support of a technical person to be able to use this system.</p>
        <p>I found the various functions in this system were well integrated.</p>
        <p>I thought there was too much inconsistency in this system.</p>
        <p>I would imagine that most people would learn to use this system very quickly.</p>
        <p>I found the system very cumbersome to use.</p>
        <p>I felt very confident using the system.</p>
        <p>I needed to learn a lot of things before I could get going with this system.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Adomavicius</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Tuzhilin</surname>
          </string-name>
          .
          <article-title>Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions</article-title>
          .
          <source>IEEE Trans. on Knowl. and Data Eng</source>
          .,
          <volume>17</volume>
          (
          <issue>6</issue>
          ):
          <fpage>734</fpage>
          –
          <lpage>749</lpage>
          ,
          <year>June 2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y. A.</given-names>
            <surname>Asikin</surname>
          </string-name>
          and
          <string-name>
            <given-names>W.</given-names>
            <surname>Wörndl</surname>
          </string-name>
          .
          <article-title>Stories around you: Location-based serendipitous recommendation of news articles</article-title>
          .
          <source>In Proceedings of 2nd International Workshop on News Recommendation and Analytics</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Brooke</surname>
          </string-name>
          .
          <article-title>SUS: A quick and dirty usability scale</article-title>
          .
          <source>Usability Evaluation in Industry</source>
          , pages
          <fpage>189</fpage>
          –
          <lpage>194</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Burke</surname>
          </string-name>
          .
          <article-title>Hybrid recommender systems: Survey and experiments</article-title>
          .
          <source>User Modeling and User-Adapted Interaction</source>
          ,
          <volume>12</volume>
          (
          <issue>4</issue>
          ):
          <fpage>331</fpage>
          –
          <lpage>370</lpage>
          , Nov.
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>I.</given-names>
            <surname>Cantador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bellogín</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Castells</surname>
          </string-name>
          .
          <article-title>News@hand: A semantic web approach to recommending news</article-title>
          . In W. Nejdl,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pu</surname>
          </string-name>
          , and E. Herder, editors,
          <source>Adaptive Hypermedia and Adaptive Web-Based Systems: 5th International Conference, AH 2008, Hannover, Germany, July 29 - August 1, 2008. Proceedings</source>
          , pages
          <fpage>279</fpage>
          –
          <lpage>283</lpage>
          . Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>I.</given-names>
            <surname>Cantador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bellogín</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Castells</surname>
          </string-name>
          .
          <article-title>Ontology-based personalised and context-aware recommendations of news items</article-title>
          .
          <source>In Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01, WI-IAT '08</source>
          , pages
          <fpage>562</fpage>
          –
          <lpage>565</lpage>
          , Washington, DC, USA,
          <year>2008</year>
          . IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Claypool</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gokhale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miranda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Murnikov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Netes</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Sartin</surname>
          </string-name>
          .
          <article-title>Combining content-based and collaborative filters in an online newspaper</article-title>
          .
          <source>In Proceedings of ACM SIGIR workshop on recommender systems</source>
          , volume
          <volume>60</volume>
          .
          <publisher-name>Citeseer</publisher-name>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>De Pessemier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Leroux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Vanhecke</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Martens</surname>
          </string-name>
          .
          <article-title>Combining collaborative filtering and search engine into hybrid news recommendations</article-title>
          .
          <source>In Proceedings of the 3rd International Workshop on News Recommendation and Analytics (INRA 2015) co-located with 9th ACM Conference on Recommender Systems (RecSys 2015), Vienna, Austria, September 20, 2015</source>
          , pages
          <fpage>14</fpage>
          –
          <lpage>19</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Konstan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. N.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Maltz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Herlocker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Gordon</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Riedl</surname>
          </string-name>
          .
          <article-title>GroupLens: Applying collaborative filtering to Usenet news</article-title>
          .
          <source>Commun. ACM</source>
          ,
          <volume>40</volume>
          (
          <issue>3</issue>
          ):
          <fpage>77</fpage>
          –
          <lpage>87</lpage>
          , Mar.
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dolan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. R.</given-names>
            <surname>Pedersen</surname>
          </string-name>
          .
          <article-title>Personalized news recommendation based on click behavior</article-title>
          .
          <source>In Proceedings of the 15th International Conference on Intelligent User Interfaces</source>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Ö.</given-names>
            <surname>Özgöbek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Gulla</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Erdur</surname>
          </string-name>
          .
          <article-title>A survey on challenges and methods in news recommendation</article-title>
          .
          <source>In Proceedings of the 10th International Conference on Web Information Systems and Technologies</source>
          , pages
          <fpage>278</fpage>
          –
          <lpage>285</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>B.</given-names>
            <surname>Sarwar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Karypis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Konstan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Riedl</surname>
          </string-name>
          .
          <article-title>Item-based collaborative filtering recommendation algorithms</article-title>
          .
          <source>In Proceedings of the 10th International Conference on World Wide Web, WWW '01</source>
          , pages
          <fpage>285</fpage>
          –
          <lpage>295</lpage>
          , New York, NY, USA,
          <year>2001</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Sauro</surname>
          </string-name>
          .
          <article-title>Measuring Usability with the System Usability Scale (SUS)</article-title>
          ,
          <year>2011</year>
          . Retrieved June 20,
          <year>2016</year>
          from http://www.measuringu.com/sus.php.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>X.</given-names>
            <surname>Su</surname>
          </string-name>
          and
          <string-name>
            <given-names>T. M.</given-names>
            <surname>Khoshgoftaar</surname>
          </string-name>
          .
          <article-title>A survey of collaborative filtering techniques</article-title>
          .
          <source>Hindawi Publishing Corporation</source>
          ,
          <year>2009</year>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.-H.</given-names>
            <surname>Tan</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Teo</surname>
          </string-name>
          .
          <article-title>Learning user profiles for personalized information dissemination</article-title>
          .
          <source>In Neural Networks Proceedings, 1998. IEEE World Congress on Computational Intelligence</source>
          . The 1998 IEEE International Joint Conference on, volume
          <volume>1</volume>
          , pages
          <fpage>183</fpage>
          –
          <lpage>188</lpage>
          , May
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>