=Paper= {{Paper |id=Vol-1673/paper1 |storemode=property |title=Combining Content-based and Collaborative Filtering for Personalized Sports News Recommendations |pdfUrl=https://ceur-ws.org/Vol-1673/paper1.pdf |volume=Vol-1673 |authors=Philip Lenhart,Daniel Herzog |dblpUrl=https://dblp.org/rec/conf/recsys/LenhartH16 }} ==Combining Content-based and Collaborative Filtering for Personalized Sports News Recommendations== https://ceur-ws.org/Vol-1673/paper1.pdf
     Combining Content-based and Collaborative Filtering for
         Personalized Sports News Recommendations

                               Philip Lenhart                                            Daniel Herzog
                     Department of Informatics                                      Department of Informatics
                   Technical University of Munich                                 Technical University of Munich
             Boltzmannstr. 3, 85748 Garching, Germany                       Boltzmannstr. 3, 85748 Garching, Germany
                      philip.lenhart@in.tum.de                                       herzogd@in.tum.de

ABSTRACT
Sports news are a special case in the field of news recommendations as readers often come with a strong emotional attachment to selected sports, teams or players. Furthermore, the interest in a topic can suddenly change if, for example, an important sports event is taking place. In this work, we present a hybrid sports news recommender system that combines content-based recommendations with collaborative filtering. We developed a recommender dashboard and integrated it into the Sport1.de website. In a user study, we evaluated our solution. Results show that a pure content-based approach delivers accurate news recommendations and that users rate the usability of our recommender dashboard as high. Nevertheless, the collaborative filtering component of our hybrid approach is necessary to increase the diversity of the recommendations and to recommend older articles if they are of special importance to the user.

CCS Concepts
•Information systems → Recommender systems;

Keywords
Recommender System; Sports News; Content-based; Collaborative Filtering; Hybrid

1.    INTRODUCTION
Recommender systems (RSs) suggest items like movies, songs or points of interest based on the user’s preferences. Traditional RSs have to face some challenges when recommending such items. One of the most common problems is the cold-start problem [1]. News items without any ratings cannot be recommended, while new users who have not yet shared their preferences with the RS cannot receive any personalized recommendations. When recommending news, recency plays a critical role [11]. News have to be up-to-date, but sometimes older articles are important if there is a connection to current events. Sports news represent only one category of news but they complicate the news recommendation process. People interested in sports are often characterized by a strong emotional attachment to selected sports, teams or players. With regard to recommendations, a user could be in favor of a lot of news about one team while she or he absolutely wants to avoid any information about a rival. Furthermore, the user’s interest in a topic can suddenly change. For example, during the FIFA World Cup, even some people who are not interested at all in soccer want to be kept up-to-date with regard to current results.

In this work, we want to examine how well content-based RSs work for recommending sports news. In addition, we extend our RS by collaborative filtering. We develop a recommender dashboard and integrate it into the website of the German television channel and Internet portal Sport1¹. We evaluate both algorithms and the usability of our prototype in a user study.

This paper is structured as follows: in Section 2 we present related work and highlight our contribution to the current state of research in content-based and hybrid news RSs. We explain how we combine a content-based and a collaborative filtering component to a hybrid sports news RS in Section 3. Our development is evaluated in a user study. The results of this study are summarized in Section 4. This work ends with a conclusion and an outlook on future work.

2.    RELATED WORK
Different approaches try to tackle the problem of personalized news recommendations. One of the first news RSs was developed and evaluated by the GroupLens project [9]. The researchers used collaborative filtering to provide personalized recommendations. A seven-week trial showed that their predictions are meaningful and valuable to users. Furthermore, they found that users value such predictions for news because, in the experiment, the participants tended to read highly rated articles more than less highly rated articles. Liu et al. [10] developed a news RS based on profiles learned from user activity in Google News. They modeled
                                                                        the user’s interests by observing her or his past click history
                                                                        and combined it with the local news trend. Compared with
                                                                        an existing collaborative filtering method, their combined
                                                                        method improved the quality of the recommendations and
                                                                        attracted more frequent visits to the Google News website.
                                                                           Using article keywords to build user profiles for news rec-
                                                                        ommendations has already been researched. The Personal-
                                                                        ized Information Network (PIN) creates user profiles by so
CBRecSys 2016, September 16, 2016, Boston, MA, USA.
Copyright remains with the authors and/or original copyright holders.
¹ http://www.sport1.de/
called interest terms which consist of one or more keywords [15]. Experiments show that PIN is able to deliver personalized news recommendations on-the-fly.

Some researchers used hybrid RSs combining different techniques to suggest news articles. Claypool et al. [7] developed P-Tango, an online newspaper combining the strengths of content-based and collaborative filtering. News@hand is a system that makes use of semantic-based technologies to recommend news [5]. It creates ontology-based item descriptions and user profiles to provide personalized, context-aware, group-oriented and multi-facet recommendations. Its hybrid models allow overcoming some limitations of traditional RS techniques such as the cold-start problem and enable recommendations for grey sheep, i.e. users whose preferences do not consistently agree or disagree with any group of people [7]. The authors evaluated the personalized and context-aware recommendation models in an experiment with 16 participants. Results showed that the combination of both models plus their semantic extension provides the best results [6]. De Pessemier et al. [8] used a hybrid approach to recommend news from different sources. Their approach combines a search engine as a content-based approach with collaborative filtering and uses implicit feedback to determine if the user is interested in a certain topic. The recommendations are presented in a web application optimized for mobile devices.

Asikin and Wörndl [2] presented approaches for recommending news articles by using spatial variables such as geographic coordinates or the name and physical character of a location. Their goal was to deliver serendipitous recommendations while improving user satisfaction. A user study showed that their approaches deliver news recommendations that are more surprising than those of a baseline algorithm but still favored by the users.

To the best of our knowledge, no research focusing on the special case of sports news has been done. In this work we want to show how sports news can be recommended in a content-based approach. In addition, we extend this RS by a collaborative filtering component. In a user study, we evaluate both approaches to find out if the hybrid algorithm improves the recommendations. We show how sports news can be suggested to real users by developing and testing a fully working recommender dashboard which can be integrated into existing webpages.

3.    DEVELOPMENT OF A PERSONALIZED SPORTS NEWS RECOMMENDER SYSTEM
This section explains the algorithms we used in our RS, but also illustrates the user profile modeling that is needed to provide personalized recommendations. Finally, the prototype is shown to point out how our concepts are implemented on a website.

3.1    User Profile and Preference Elicitation
The user’s preferences with regard to sports news are expressed by keywords of articles that she or he is reading. Each article of our recommendation database is characterized by five to ten keywords which are automatically generated by analyzing the article’s text. We store a list of keywords and how often each keyword occurs in articles the user has read. The more articles the user reads, the better the recommendations are optimized with regard to the user’s preferences. In our first prototype the counter for each present keyword is incremented by one when the user reads the article containing this keyword. In future work, the keywords in an article could be weighted according to the relevance and importance of the keyword to the article.

The new user problem affects every user who has not read an article yet. As explained, sports news differ from other kinds of news in the emotional attachment to selected sports, teams or players. We use this finding to overcome the new user problem. Before starting the recommendation process, the user can specify her or his favorite sport and team. News can then be recommended based on this selection, and the recommendations will improve as the user reads articles, thus providing implicit feedback.

3.2    Content-based Sports News Recommendations
Content-based recommenders suggest items that are similar to items the user liked in the past [1]. Since the user profile uses weighted keywords, we use vector representations of the profile and the articles to calculate the similarity between two articles.

One of the most important things for a news RS is to provide articles that are not outdated. Especially in the sports news domain the environment is fast changing and usually the user is not interested in news about a sports event or her or his favorite team that are not up-to-date. The main challenge for us was to determine how old sports news can be before they are not considered for recommendation anymore. For our content-based RS we only take news into account that are not older than three days. Besides only providing relevant articles, this decision promises a better performance of the algorithm. The more articles are considered, the longer the process of calculating the recommendations takes. Our system currently uses only one news provider, but if the system grows, this could lead to a significant loss of performance. Our hybrid algorithm which incorporates collaborative filtering is also able to provide older articles if they are of high importance to the user (cf. Section 3.4).

The formula below computes the similarity between two articles (g and h),

\[ \mathrm{sim}(g, h) = \frac{\sum_{i \in W} (g_i \ast h_i)}{\sqrt{\sum_{i \in W} g_i^2 \ast \sum_{i \in W} h_i^2}} , \tag{1} \]

where
   g, h        are vectors representing articles with weighted keywords,
   W           is the set union of the particular keywords,
   i           is a keyword and
   g_i, h_i    are the weights of i in g and h, respectively.

In the computation of content-based similarity scores we only consider the relative dimension of the keyword weights. Because user profiles have different dimensions compared to articles, the use of relative dimensions provides better results for our system. As an illustration of the main idea of the algorithm, let us consider the simple case where the user profile contains two keywords with the weights 5 and 10. Additionally there is another article with these two keywords but the weights are 1 and 2, respectively. In this
case the similarity is 1, because of the same relative dimensions of the article and the user profile.

The algorithm considers every article as an element in a vector space, where the keywords form the basis. The coordinate of an article in the direction of a keyword is given by the weight of this keyword. If the keyword does not occur, the weight will be 0.

We normalize each article relative to the standard scalar product by dividing it by its absolute value. Consequently, the standard scalar product of the two normalized vectors conforms to the desired comparison features. Even if there are negative weights, e.g. for actively suppressed keywords, the algorithm calculates similarities correctly.

In order to understand the similarity calculation better, we explain how the algorithm works for an article with itself (or another article with the same weight proportions). In this case, the scalar product is 1, because of the way the vectors are normalized. But if two articles have disjoint keyword sets, the result is 0, because such vectors are orthogonal to each other.

In the end, the system sorts the articles by similarity in descending order and returns the 50 articles with the highest score.

3.3    Collaborative Filtering Component
In contrast to content-based filtering, a collaborative RS uses the ratings of other users to calculate the similarity of articles [1]. Different algorithms for item-based collaborative filtering exist. We explain some common algorithms in the following and explain our choice for a sports news RS. For this, we refer to [12] and [14].

Vector-based / Cosine-based Similarity:

\[ \mathrm{sim}(i, j) = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\|\vec{i}\| \ast \|\vec{j}\|} \tag{2} \]

The first algorithm is the vector-based, also called cosine-based, similarity. In this algorithm, items are represented as two vectors that contain the user ratings. The similarity between item i and item j is calculated by the cosine of the angle between the two vectors. The “·” denotes the dot product of vector $\vec{i}$ and vector $\vec{j}$ [12]. Since cosine-based similarity does not consider the average rating of an item, Pearson (correlation)-based similarity tries to solve this issue.

Pearson (Correlation)-based Similarity:

\[ \mathrm{sim}(i, j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_i)^2} \, \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_j)^2}} \tag{3} \]

The first part of this algorithm is to find a set of users U that contains all users who rated both items i and j. These items are called co-rated items. Items that are not co-rated are not taken into consideration by this algorithm. This similarity calculation is based on how much the rating of a user deviates from the average rating of this item. $R_{u,i}$ represents the rating of a user u on item i and $\bar{R}_i$ denotes the average rating of an item i.

Adjusted Cosine Similarity:

\[ \mathrm{sim}(i, j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_u)(R_{u,j} - \bar{R}_u)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_u)^2} \, \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_u)^2}} \tag{4} \]

Adjusted cosine similarity takes into account that the rating preferences of the different users differ. There are some users who always give low ratings, but on the other hand there are users who rate highly in general. To avoid this drawback, this algorithm subtracts the average rating of a user $\bar{R}_u$ from each rating $R_{u,i}$ and $R_{u,j}$ on the items i and j.

The presented advantage is the reason why we apply the adjusted cosine similarity in our development. First, the system has to calculate the related articles list of all articles. To compute the related article list of an article, we iterate through the list of all articles. If the current article is not the same as the article to compare, we calculate the similarity.

The function returns a value between minus one and one. Since the article rating range is from one to five, we map the similarity to the rating range by using the linear function:

\[ \mathrm{sim} = 2 \ast \mathrm{sim} + 3 \tag{5} \]

There is one bigger problem in the adjusted cosine similarity calculation. When there is just one common user between articles, the similarity for those items is one, which corresponds to the highest value of the rating range. This is due to the subtraction of the average rating from the user’s rating. To avoid the effect that the best rated articles are the articles with just one common user, we specified a minimum number of users that two articles need to have in common. In our implementation the minimum number of common users is five. When there are fewer than five common users, the articles are not considered in our related article list.

Afterwards, we sort the list by similarity. Moreover, we set a limit of 50 related articles to avoid additional expenses due to articles that are not considered for computation. When the related article list is calculated, we can predict the top articles for a user. For each article in the related articles list, we check if the user has already read the article. If that is the case, the article is not recommended anymore and the system jumps to the next article. If it is a new article, the prediction is calculated and added to the recommendation list. After calculating the prediction for every article, we sort the recommendation list by the predicted value.

The prediction $P_{u,i}$ can be calculated by the weighted sum method [12]:

\[ P_{u,i} = \frac{\sum_{\text{all similar items}, N} (s_{i,N} \ast R_{u,N})}{\sum_{\text{all similar items}, N} (|s_{i,N}|)} \tag{6} \]

This approach is “computing the sum of the ratings given by the user on the items similar to i” [12]. Afterwards, each rating $R_{u,j}$ is weighted by the similarity between item i and item j ∈ N. The basic idea of this approach is to find items that are forecasted to be liked by the user. The top predicted items are recommended to the user.

A key advantage over content-based filtering techniques is the fact that collaborative RSs are able to provide a bigger variety of topics. Furthermore, with collaborative techniques, it is possible to provide event- or trend-based recommendations, such as news about the World Cup. A pure
content-based RS is not able to recommend news about the darts championship if the user has just read football articles before.

3.4    Weighted Hybrid Recommender
In this section, we explain how we combine the content-based and the collaborative components to a hybrid sports news RS.

As combination technique, we use the weighted hybrid strategy as described in [4]. For our first version, we decided to weight both components equally. The content-based component is important for recommending new articles even if no ratings exist. Additionally, the content-based system is able to provide content to users with special interests as well. Moreover, the content-based version is important because of fan culture and constant interest in some topics. But we decided that the collaborative filtering part is as important as the content-based component, due to the event-based environment and the changing popularity of some sports. We want the system to be able to recommend articles that are attractive for just a small time slot. For example, many persons are interested in the Olympic Games, but not in the different kinds of sports in general.

We determined that the weights are only applied if both components of our system recommend the corresponding article. Otherwise, additional requests would have to be sent to calculate a combined score for each article. If just one component recommends the article, just the score of this component is taken with the full weight. Due to this procedure, we are able to provide recommendations of both components.

3.5    Implementation
We developed a dashboard widget which can be integrated in existing websites to provide personalized sports news recommendations. For the development of the front-end, we used the JavaScript framework AngularJS, the style sheet language Less and HTML5 local storage. Figure 1 shows a current screenshot of our recommender dashboard. Nine recommendations are presented at one time. When the mouse is moved over one article, the user can read it (“Ansehen”) or reject the recommendation (“Entfernen”).

Figure 1: Recommender Dashboard

It is critical to identify the user every time she or he accesses the RS to provide personalized recommendations. We avoided implementing a mandatory login as this could be a big obstacle for new users visiting a sports news website. Instead, we calculate a Globally Unique Identifier (GUID) which is then stored in HTML5 local storage without an expiration date. This is an important advantage for our RS. Because HTML5 local storage has no explicit lifecycle, we can use it not only for user identification, but also for generating a profile of the user. Storing this data on the client side decreases the amount of data stored on the server, which makes the system more scalable. Only the item similarities and recommendations are calculated on the server, due to direct access to articles from our backend.

In order to get content-based recommendations, the client sends an Ajax request to a NodeJS server. For this purpose, the user ID and the corresponding profile are sent as parameters. We decided to use an Ajax request because the computation causes no overhead at site loading if it is done from JavaScript code. At our backend, the weighted keyword profile is sorted alphabetically by keyword name. As mentioned, we receive articles published within the last three days. After obtaining those articles from our article repository, we add the suitable keywords to each of them. The weighted keywords are in the same form as the user profile to make them comparable to each other. The system calculates the similarity of the user profile with every article. For this purpose, the union of the keyword sets is built. Subsequently, the similarity is computed using formula 1.

A JSON response sends the 50 articles with the highest score back to the client. The response is then processed by the Angular directive of the personalized dashboard. If the user removes an article, the next recommended article takes its place. In addition, further statistics like the last read articles or last and next matches of the preferred team are displayed.

For the computation of collaboratively recommended articles we use the same NodeJS server. In contrast to other systems, we do not store our data in a database. Because we have to iterate through lists most of the time to compare ratings and users, we decided to use arrays to store our data within the application. The ratings provided by the user are collected in a rating variable that is kept in memory. It stores JSON objects with the user ID, the article path and the provided rating. Furthermore, the current date is used to distinguish current data from dated ratings that are not relevant for our system anymore.

To speed up the similarity computation, we adapt the average rating of a user every time the user provides a new rating. The average ratings are kept in an extra variable for performance reasons. The current average rating and the number of ratings provided by the given user are enough to adapt the average. Just a few basic arithmetic operations are necessary to avoid calculating the average from the rating variable every time from scratch. We minimize the accesses to the rating variable because this variable is the main component of our server. Most of the requests read or write this variable. Every variable access that can be eliminated helps to improve the system’s performance.

Moreover, we store a list of users as well as a list of articles to iterate through these arrays without generating them first. Using a list of all articles is primarily important when the system computes related articles. The list of related articles is updated every hour. A cronjob is executed every hour to consider current news as well. After one hour there are more ratings provided and the new item problem of a pure collaborative RS is suppressed.

For similarity calculation of two items, we need to find a set of users that contains all users who rated both items. For this purpose, we generate a list of objects that contain the articles and all the users who rated the corresponding article. To compute the user set of two articles, we compare the two user lists and determine the intersection.

The combination of the content-based and the collaborative part of our RS is implemented in JavaScript. First, we send an Ajax request to our backend to collect the content-based recommended articles. In addition, another request is sent to our NodeJS server where the collaboratively filtered articles are computed. If the collaboratively filtered recommendations are returned correctly, the system computes the combination of both article sets. Finally, the recommended articles are returned and the JSON response is sent to the application.

In the news domain the age of an article is definitely one of the most important properties when the article’s attractiveness is determined for a user. Because of the recency problem, we decided to implement a route in our NodeJS server to remove dated ratings and articles from our system. Every two weeks a cronjob is executed and every rating that is older than four weeks is removed from the ratings table. The removal of those ratings implies the secondary effect that old users that do not exist anymore are removed as well. This is a very common scenario in our system, because we identify the user by using HTML5 local storage.

4.1   Analysis of Usage Data of the Content-based Recommender
In order to collect usage data of real users, we tested the content-based approach on the live version of the Sport1 website. For this purpose, the recommender dashboard prototype is presented to one percent of the users. Because the website is visited by thousands of users every day, one percent of the users is enough to evaluate not only the functionality but also the usability of our RS. In future,
If the local storage is deleted, the old user ID does not occur   we will increase the amount of test users from time to time
anymore. We decided to use these time intervals, because          and adapt our implementation accordingly. We used Google
our content-based version considers only articles published       Analytics to measure relevant Key Performance Indicators
the last three days and we want to provide recommendations        (KPIs) that help us to evaluate our solution. We analyzed
of articles older than a few days as well if an older article     how much the users clicked on the read and the remove but-
is getting popular again. In this case, our system is able        ton, respectively. Moreover, we tested how often the users
to recommend those articles as well as long as ratings are        navigated to articles they have already read by using the
provided in the last four weeks.                                  last read articles widget. In addition to the event tracking,
                                                                  we analyzed if there is an impact on the article ratings due
4.   EVALUATION                                                   to the new personalized dashboard. This is why we com-
  We conducted user studies to evaluate our algorithms and        pare the average ratings of different articles. Articles are
the usability of our recommender dashboard. In this section       just taken into account, if they are rated by the one percent
we present the goals, the procedure and the results of our        of users that can use the personalized dashboard.
evaluation. We interpret our findings to answer the question         At the end of our live study 5132 user IDs were registered
how well content-based algorithms support user in receiving       on our server. This does not mean that more than 5000
interesting sports news and if a hybrid algorithm can im-         different users used the dashboard due to the fact that every
prove the performance of our RS.                                  device has its own GUID and if the history of the browser
                                                                  is deleted, a new ID is generated. But there were enough
                                                                  users producing events we can track.
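The gap between registered IDs and real users follows directly from the identification scheme. The following minimal sketch illustrates this, together with the four-week pruning of ratings. It is not the code running on the Sport1 website: the storage key `rsUserId`, the rating fields `userId` and `ratedAt`, and all function names are assumptions made for illustration only.

```javascript
// Hypothetical storage key; the paper does not name it.
const USER_ID_KEY = 'rsUserId';
const FOUR_WEEKS_MS = 28 * 24 * 60 * 60 * 1000;

// Creates a GUID-like string. Uses the platform UUID generator when it is
// available and otherwise a simple (not cryptographically strong) fallback.
function generateId() {
  if (globalThis.crypto && typeof globalThis.crypto.randomUUID === 'function') {
    return globalThis.crypto.randomUUID();
  }
  return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, (c) => {
    const r = Math.floor(Math.random() * 16);
    const v = c === 'x' ? r : (r % 4) + 8;
    return v.toString(16);
  });
}

// Returns the stored user ID or creates and stores a new one.
// `storage` is any object with getItem/setItem, e.g. window.localStorage.
function getOrCreateUserId(storage) {
  let id = storage.getItem(USER_ID_KEY);
  if (id === null) {
    id = generateId();
    storage.setItem(USER_ID_KEY, id);
  }
  return id;
}

// In-memory stand-in for localStorage so that the sketch also runs in Node.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
    clear: () => data.clear(),
  };
}

// Keeps only ratings that are at most four weeks old at time `now` (ms).
function pruneOldRatings(ratings, now) {
  return ratings.filter((rating) => now - rating.ratedAt <= FOUR_WEEKS_MS);
}

// Users are only known through their ratings, so users whose ratings were
// all pruned disappear from this set as well.
function activeUserIds(ratings) {
  return new Set(ratings.map((rating) => rating.userId));
}
```

Clearing the storage and calling `getOrCreateUserId` again yields a different ID for the same person, which is why the number of registered IDs is only an upper bound on the number of users. Stale IDs vanish once `pruneOldRatings` has dropped their last rating.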
   The click behavior of the users gives information about the user acceptance of the
different components. Figure 2 illustrates how many clicks are executed on the different
components of the personalized dashboard.

   Figure 2: Analysis of the user’s click behavior

   Almost 50 percent of the clicks were executed on the remove button of the news
recommendation widget. At first glance, this number is quite high. But considering that all
the other buttons navigate the user to another page, it is not surprising that the remove
button is clicked more often than all the other buttons. If the user clicks on remove, the
article disappears and a new one is displayed in the dashboard. The user is then able to
interact with the dashboard again. 27 percent of the clicks are executed on the view article
button, which is a good proportion. Especially considering that the RS is new, it is notable
that after every third interaction, an article that potentially fits the interests of the user
is recommended. To get better information about the quality of the recommendations, we need
to organize a long-term study. The sports news domain is very dynamic and the click behavior
changes depending on the current events. For that reason, the two-week evaluation is not long
enough to ensure that the number of clicks on an element remains similar over time. Around one
quarter of all clicks are executed on links and buttons which are not part of the recommender
dashboard but provide additional information, such as last read articles and team-related
statistics like last and next matches and top scorers.
   We expect that the quality of the recommendations increases with the time of use. To test
this assumption, we analyzed the trend of the remove button clicks. Except for some days, the
ratio of clicks on remove decreased with every day of testing (cf. Figure 3). The exceptions
may be caused by new users or users that do not read many articles on the website. If no or
just a few articles are read before using our dashboard, the quality of the recommendations
will be low. Since the dashboard is only presented to one percent of the users, we cannot show
that the subjective quality will be the same when the dashboard is published to all users.
When the number of testers increases, the ratio of remove button clicks rises at the beginning
and then falls again with the time of use. We observed this when we released the dashboard on
the website.

   Figure 3: Daily clicks on remove (figures in per cent)

   To analyze whether the dashboard has an effect on the article ratings, we compared three
types of articles: first, articles that are poorly rated in general, second, articles with an
average rating, and finally articles that are highly rated in general. A generally poorly
rated article gets a better score from our RS users. This is mainly because we use the
implicit feedback of three stars if a user reads the article. Moreover, the bulk of the users
does not provide any rating for an article. So the average rating of the testers is almost at
three stars for poorly and average rated articles. For highly rated articles, the RS users
gave slightly lower scores in general. The chance of getting such an article recommended by
our system is higher because more comparable users are available for the most read articles.
If the user clicks on remove, the lowest rating of one star is implicitly provided and the
average rating decreases. Since the personalized dashboard is not yet established on the
website, we can sum up that the recommendations have almost no effect on the rating scores.
This may change if the users use the dashboard as their first contact point on the website.
   It was also noticeable that the users want to read already read articles again. The last
read article widget helps them to navigate back and easily get an overview of their last
interactions. We expect that the number of clicks in this widget will decrease in the future.
Since new articles are potentially more attractive for a user, we cannot imagine that every
tenth click will continue to go to an already read article. We believe that the users in the
study were curious and wanted to test this new feature.

4.2   Comparison of Content-based and Hybrid Recommendations
   Our hybrid algorithm extends the presented content-based approach by a collaborative
filtering component. This algorithm is not part of the live version of the RS yet. We tested
the hybrid recommendations with a selected user group. We took care to choose persons from
different backgrounds and with diverging interests to ensure comparability to our users.
   The participants had to rate the RS on two criteria, each on a scale from 1 (worst rating)
to 5 (best rating): how well the recommendations fit their interests and how diversified they
are. The pure content-based solution served as a baseline algorithm. In total, we received 40
completed questionnaires for the content-based approach and 20 for the extended, hybrid RS.
   The results show that the recommendations are not diversified enough in our pure
content-based approach (Ø2.9), but they improve in our hybrid implementation where the average
rating was 3.3. The content-based recommendations represent the interests of the user (Ø3.4),
which shows us that the dashboard provides additional value. This value did not change in our
hybrid version. It was noticeable that the more frequently users visit the website, the more
satisfied they are with the recommendations. The users that visited the website every day gave
an average rating of 3.6. This confirms that the quality of our recommendations increases over
time.

4.3   Usability of the Recommender Dashboard
   Besides evaluating our recommendation algorithms, we asked the study participants to rate
the usability of our recommender dashboard. We used the well-established System Usability
Scale (SUS) [3]. This questionnaire consists of ten questions providing a global view of
subjective assessments of usability. Participants respond using a Likert scale with five
response options, from Strongly agree to Strongly disagree. Furthermore, our participants were
allowed to add further thoughts in a free-text field.
   To calculate the SUS score, the answers for each question are converted to a new score from
0 to 4 where 4 is the best score and 0 represents the worst possible answer to this question.
Afterwards the different scores are added together and multiplied by 2.5 to get a ranking
value between 0 and 100 [3]. Every SUS score above 68 is considered above average, everything
lower than 68 below average [13]. The average scores of each question, collected in our user
feedback, are shown in Table 1.
   For our study, adding the average scores and multiplying the sum by 2.5 yields:
        score = sum ∗ 2.5 = 34.8 ∗ 2.5 = 87               (7)

   The score of 87 exceeded our expectations although we attached great importance to the
design and usability of our system. This was required because the dashboard is implemented on
the live website of Sport1. Nevertheless, the users mentioned some wishes concerning the
usability. For example, some users wished to change the design by choosing their own colors.
Since this feedback has no direct impact on our RS implementation, we will not discuss these
suggestions further here.

4.4   Discussion
   The user study results show that our content-based RS is a promising approach to suggest
sports news to users. The recommendations fit the users’ interests and improve when the users
provide more feedback. Nevertheless, the diversity of the recommended articles remains low.
This is a typical problem of pure content-based RSs and can be overcome by using a hybrid
solution. We extended the RS by a collaborative filtering component which increased the
diversity of the recommendations.
   As described before, it is very important that a news RS provides current articles.
Especially in sports, the environment is very dynamic and the news topics are changing all the
time. For that reason the system cannot be a pure collaborative RS. With collaborative
filtering it is almost impossible to recommend new items. But this problem can be solved by
using a content-based component as well. Content-based RSs can provide content that fits the
general interests of the users. In addition, attention should be paid to event-based
interests, e.g. the Super Bowl, an event that is closely followed by many people. If a user
has no interest in American Football in general, the content-based RS does not provide
articles about the Super Bowl. So there must be a combination of both techniques to benefit
from the strengths of each component.

5.   CONCLUSION AND FUTURE WORK
   In this work, we tackled the problem of recommending sports news. Sports news are a special
case in the field of news recommendations as users often come with a strong emotional
attachment to selected sports, teams or players. Furthermore, the interest in a topic is
event-driven and can suddenly change. We developed a content-based RS that creates user
profiles based on implicit feedback the user shares when reading articles. Using automatically
created keywords, the similarity between articles can be measured and the relevance for the
user can be predicted. This approach delivers accurate recommendations but lacks diversity. In
a first prototype, we designed and evaluated a hybrid algorithm that extends our content-based
RS by a collaborative component. This hybrid approach increases diversity and also allows us
to recommend older articles if they are of particular interest for the user.
   To improve the quality of the hybrid recommendations, we will adjust our implementation
from time to time and test whether the adaptations serve their purpose. First, we will test
different weights for the two components. One idea is to increase the weight of the
content-based version. Decreasing the weight of the collaborative version does not exclude the
event-based recommendations. Even if the collaborative part counts only one third, it is able
to provide recommendations, because if an article is only recommended by our collaborative
version, just the score of this component is taken into account. If both components recommend
the article, the content-based version is better adapted to the user’s interests. To find out
which weight ratio is the best for our case, we have to analyze the implicit and explicit user
feedback over a longer time period. The evaluation of the weights is only meaningful if the
feedback is collected for a few months to avoid temporal fluctuations, which are quite common
in the news domain.
   Furthermore, we want to implement a switching hybrid as well. If there is a new item, the
collaborative filtering method cannot provide recommendations from the first second. This is
the strength of our content-based version. The RS has to switch to the content-based version
if the article is newer than a specific date. Recommendations for a new user are calculated by
our collaborative filtering component, because the preferences available at the first use of
the system are not sufficient to compute pure content-based recommendations. If a larger user
profile has been constructed and an article was not published in the last minutes, the
combination of both techniques will be applied as described before.
   We tested our first developments in a two-week user study. Our content-based RS has been
tested with live users while the hybrid approach was only accessible for a selected user
group. In future, we want to conduct larger studies with more users for all algorithms we
develop. Our first results will serve as the baseline for future extensions and other
algorithms.
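The weighting and switching rules discussed for future work can be illustrated with a minimal sketch. It assumes that both components return scores on the same scale and that the combination is a weighted sum. The weights of two thirds and one third, the ten-minute threshold for new articles, the minimum profile size and all names are assumptions for illustration, not values from our implementation.

```javascript
// Illustrative defaults; every value here is an assumption.
const HYBRID_DEFAULTS = {
  contentWeight: 2 / 3,          // "collaborative part counts one third"
  collaborativeWeight: 1 / 3,
  newArticleMs: 10 * 60 * 1000,  // article published "in the last minutes"
  minProfileSize: 5,             // fewer keywords means "new user"
};

// Returns a hybrid score for one article, or null if no component can score it.
// contentScore / collaborativeScore are undefined when the respective
// component did not recommend the article.
function hybridScore(candidate, options = HYBRID_DEFAULTS) {
  const { contentScore, collaborativeScore, articleAgeMs, profileSize } = candidate;
  const hasContent = typeof contentScore === 'number';
  const hasCollaborative = typeof collaborativeScore === 'number';

  // Switching rule 1: very new articles have no ratings yet,
  // so only the content-based component is used.
  if (articleAgeMs < options.newArticleMs) {
    return hasContent ? contentScore : null;
  }

  // Switching rule 2: a new user has no meaningful keyword profile,
  // so only the collaborative component is used.
  if (profileSize < options.minProfileSize) {
    return hasCollaborative ? collaborativeScore : null;
  }

  // Weighted rule: combine both scores if both components recommend the article.
  if (hasContent && hasCollaborative) {
    return options.contentWeight * contentScore
      + options.collaborativeWeight * collaborativeScore;
  }

  // Only one component recommends the article: its score is used unweighted.
  if (hasContent) return contentScore;
  if (hasCollaborative) return collaborativeScore;
  return null;
}
```

With these assumed defaults, an article scored 0.9 by the content-based part and 0.3 by the collaborative part receives a combined score of about 0.7, while an article that only the collaborative part recommends keeps its full collaborative score. The latter case is what keeps event-based recommendations possible even with a low collaborative weight.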
                              Table 1: Questions and Results of the SUS questionnaire
     Number   Question                                                                                     Average Score
        1     I think that I would like to use this system frequently.                                          3.4
        2     I found the system unnecessarily complex.                                                         3.3
        3     I thought the system was easy to use.                                                             3.6
        4     I think that I would need the support of a technical person to be able to use this system.        3.9
        5     I found the various functions in this system were well integrated.                                3.3
        6     I thought there was too much inconsistency in this system.                                        3.1
        7     I would imagine that most people would learn to use this system very quickly.                     3.6
        8     I found the system very cumbersome to use.                                                        3.6
        9     I felt very confident using the system.                                                           3.2
       10     I needed to learn a lot of things before I could get going with this system.                      3.8
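The SUS calculation can be retraced with a short sketch. The conversion of raw answers follows the standard SUS scoring by Brooke [3]: odd-numbered items contribute the answer minus 1, even-numbered items contribute 5 minus the answer. The function and variable names are chosen for illustration, and the array of averages is copied from Table 1.

```javascript
// Converts one participant's raw answers (1 = Strongly disagree ... 5 = Strongly agree)
// into item scores from 0 (worst) to 4 (best).
function toItemScores(responses) {
  if (responses.length !== 10) {
    throw new RangeError('SUS needs exactly ten answers');
  }
  // Index 0, 2, 4, ... are the positively worded items 1, 3, 5, ...
  return responses.map((answer, index) => (index % 2 === 0 ? answer - 1 : 5 - answer));
}

// Sums ten item scores (0..4 each) and scales the sum to the range 0..100.
function susScore(itemScores) {
  const sum = itemScores.reduce((total, score) => total + score, 0);
  return sum * 2.5;
}

// Average converted scores per question as reported in Table 1.
const tableOneAverages = [3.4, 3.3, 3.6, 3.9, 3.3, 3.1, 3.6, 3.6, 3.2, 3.8];

// Rounded to one decimal to hide floating-point noise in the sum.
console.log(susScore(tableOneAverages).toFixed(1));
```

The averages in Table 1 add up to 34.8, so the sketch reproduces the reported score of 87. A participant who gives the best possible answer to every question reaches 100, and one who always chooses the middle option reaches 50.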


6.   REFERENCES
 [1] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A
     survey of the state-of-the-art and possible extensions. IEEE Trans. on Knowl. and Data
     Eng., 17(6):734–749, June 2005.
 [2] Y. A. Asikin and W. Wörndl. Stories around you: Location-based serendipitous
     recommendation of news articles. In Proceedings of 2nd International Workshop on News
     Recommendation and Analytics, 2014.
 [3] J. Brooke. SUS-A quick and dirty usability scale. Usability evaluation in industry,
     pages 189–194, 1996.
 [4] R. Burke. Hybrid recommender systems: Survey and experiments. User Modeling and
     User-Adapted Interaction, 12(4):331–370, Nov. 2002.
 [5] I. Cantador, A. Bellogín, and P. Castells. News@hand: A semantic web approach to
     recommending news. In W. Nejdl, J. Kay, P. Pu, and E. Herder, editors, Adaptive
     Hypermedia and Adaptive Web-Based Systems: 5th International Conference, AH 2008,
     Hannover, Germany, July 29 - August 1, 2008. Proceedings, pages 279–283. Springer Berlin
     Heidelberg, Berlin, Heidelberg, 2008.
 [6] I. Cantador, A. Bellogín, and P. Castells. Ontology-based personalised and context-aware
     recommendations of news items. In Proceedings of the 2008 IEEE/WIC/ACM International
     Conference on Web Intelligence and Intelligent Agent Technology - Volume 01, WI-IAT ’08,
     pages 562–565, Washington, DC, USA, 2008. IEEE Computer Society.
 [7] M. Claypool, A. Gokhale, T. Miranda, P. Murnikov, D. Netes, and M. Sartin. Combining
     content-based and collaborative filters in an online newspaper. In Proceedings of ACM
     SIGIR workshop on recommender systems, volume 60. Citeseer, 1999.
 [8] T. De Pessemier, S. Leroux, K. Vanhecke, and L. Martens. Combining collaborative
     filtering and search engine into hybrid news recommendations. In Proceedings of the 3rd
     International Workshop on News Recommendation and Analytics (INRA 2015) co-located with
     9th ACM Conference on Recommender Systems (RecSys 2015), Vienna, Austria, September 20,
     2015., pages 14–19, 2015.
 [9] J. A. Konstan, B. N. Miller, D. Maltz, J. L. Herlocker, L. R. Gordon, and J. Riedl.
     Grouplens: Applying collaborative filtering to usenet news. Commun. ACM, 40(3):77–87,
     Mar. 1997.
[10] J. Liu, P. Dolan, and E. R. Pedersen. Personalized news recommendation based on click
     behavior. In Proceedings of the 15th International Conference on Intelligent User
     Interfaces, 2010.
[11] Ö. Özgöbek, J. A. Gulla, and R. C. Erdur. A survey on challenges and methods in news
     recommendation. In Proceedings of the 10th International Conference on Web Information
     Systems and Technologies, pages 278–285, 2014.
[12] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Item-based collaborative filtering
     recommendation algorithms. In Proceedings of the 10th International Conference on World
     Wide Web, WWW ’01, pages 285–295, New York, NY, USA, 2001. ACM.
[13] J. Sauro. Measuring Usability with the System Usability Scale (SUS), 2011. Retrieved
     June 20, 2016 from http://www.measuringu.com/sus.php.
[14] X. Su and T. M. Khoshgoftaar. A survey of collaborative filtering techniques. Hindawi
     Publishing Corporation, 2009, 2009.
[15] A.-H. Tan and C. Teo. Learning user profiles for personalized information dissemination.
     In Neural Networks Proceedings, 1998. IEEE World Congress on Computational Intelligence.
     The 1998 IEEE International Joint Conference on, volume 1, pages 183–188, May 1998.