=Paper=
{{Paper
|id=Vol-1609/16090679
|storemode=property
|title=Clicks Pattern Analysis for Online News Recommendation Systems
|pdfUrl=https://ceur-ws.org/Vol-1609/16090679.pdf
|volume=Vol-1609
|authors=Jing Yuan,Andreas Lommatzsch,Benjamin Kille
|dblpUrl=https://dblp.org/rec/conf/clef/YuanLK16
}}
==Clicks Pattern Analysis for Online News Recommendation Systems==
<pdf width="1500px">https://ceur-ws.org/Vol-1609/16090679.pdf</pdf>
<pre>
      Clicks Pattern Analysis for Online News
             Recommendation Systems

              Jing Yuan, Andreas Lommatzsch, Benjamin Kille

                DAI-Labor, Technische Universität Berlin, Germany
       {jing.yuan, andreas.lommatzsch, benjamin.kille}@dai-labor.de


      Abstract. The NewsREEL challenge provides researchers with an
      opportunity to evaluate their news recommendation algorithms live,
      based on real users’ feedback. Since 2014, participants have evaluated
      a variety of approaches on the Open Recommendation Platform (ORP), yet
      popularity-based algorithms remain the most successful ones. In this
      working note, we chronologically describe our participation in the
      NewsREEL online task in 2016. With approaches including “most
      impressed”, “newest”, “most impressed by category”, “content similar”
      and “most clicked”, we reconfirm that content relevance is not a very
      good indicator for recommending news. Meanwhile, for the dominating
      portal Sport1, extrapolating the time series of impressions and clicks
      enables us to predict the items most likely to be clicked in the next
      hours. A sample analysis of one week of data shows that an item stays
      popular for much longer than we expected. Thus, we propose that
      recommenders designed for this contest pay more attention to the
      time series patterns of clicks and impressions.


1   Introduction
News, as important media content, still plays its role in guiding public opinion,
even in a modern world full of virtual social networks and personal ideas.
Many news providers employ recommender systems and similar personalization
techniques to assist users in finding relevant news quickly and conveniently.
    Recommendations have been successfully integrated into news publishing
in several ways in the current digital news market. We exemplify three ways in
which recommendations are pushed to news consumers. First, as an e-Magazine
Provider, Flipboard aggregates news content from different third-party
providers and then selects news relevant to a user’s pre-defined topics to form
a personalized news board. Second, some Content Providers, such as ByteDance,
generate content themselves. They recommend within a closed system based on
internal users, news, and the interactions between both. Third, Recommendation
Providers, e.g. plista and Outbrain, offer recommendation services for
different kinds of websites, including news websites. Table 1 compares the
three mainstream kinds of news recommenders with respect to whether they
generate content themselves, the stability of their users, and the stability
of the range of news items. As a representative of the Recommendation
Providers, plista faces non-trivial conditions in terms of the variety of news
portals and the differences in users’ expectations. Since the NewsREEL
competition receives its data stream from plista, participants have to cope
with all these difficult conditions to win the contest [8].


            Table 1: Characteristics of Main-stream News Recommender

        News Recommender                         Generating    Stability    Stable Range
                                                 Content       of Users     of News

        e-Magazine Provider (e.g. Flipboard)         ✗             ✓             ✗
        News Content Provider (e.g. ByteDance)       ✓             ✓             ✓
        Recommendation Provider (e.g. plista)        ✗             ✗             ✗


    The NewsREEL challenge 2016 gives participants the chance to evaluate
recommender algorithms with live online user feedback [4, 3]. In the challenge,
teams registered on the Open Recommendation Platform (ORP) receive streamed
messages describing published news articles, users’ impressions of and clicks
on items, as well as recommendation requests from plista. The challenging
aspects of participating in NewsREEL include: (1) recommendations must be
provided within 100 ms of a request; (2) participants need to deal with news
portals from different domains; (3) the user groups of a specific portal
change over time; (4) the number of messages varies largely among portals
[8, 5].
    In contrast to recommending movies or music, news items continuously emerge
and become outdated constituting a dynamic environment. This makes the
NewsREEL competition particularly challenging. Algorithms have to consider
these dynamics in news articles and users’ preferences. We focused on popular-
ity and freshness to cope with the dynamics following the notion that users prefer
important and recent news over insignificant and outdated articles. The success
of the “most clicked” strategy in terms of CTR further supports this notion.
Even though the method is rather simple, it captures crucial aspects. Visualizing
clicks on items over time, we observe continued click activity stretching several
hours for popular items. We compared contents of popular items and discovered
that they overlap. Still, content-based algorithms have failed to benefit from
these overlaps in previous editions of NewsREEL.
    The remainder of this paper is structured as follows. In Section 2, we briefly
introduce the approaches we used in 2016 and discuss other algorithms
developed in previous years. Subsequently, in Section 3, we analyze
characteristic user-item interaction patterns for different news portals and
find that the “most clicked” items have some power to predict themselves.
Finally, Section 4 gives a conclusion and an outlook on future work.
2   Approach Used
In this section, we chronologically describe the approaches we deployed on
ORP, i.e. in the online task of NewsREEL 2016, and how our thinking changed
along the way. Because the simplest approach, “most clicked”, eventually
outperformed the other algorithms, it prompted us to dig deeper into click
patterns from the perspective of time series analysis in the next section.

Most Impressed Inspired by the good performance of “baseline” in past years
(see [9]), which directly uses the most recently impressed items as
recommendation candidates, we implemented a similar method that sorts the 2000
most recent impressions by their frequencies. Typically, this approach is
called “most popular”, but to distinguish it from “most clicked”, which is
introduced later on, we refer to it as “most impressed” in this paper. The
approach ran on ORP for two weeks (January 31 to February 13, 2016). It
achieved a CTR of 1.21% in the first week (ranked 3rd; team “artificial
intelligence” took first place with a CTR of 1.48%) and 1.35% in the second
week (ranked 2nd; team “abc” took first place with a CTR of 1.4%).
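
The “most impressed” ranking described above can be sketched as follows. This
is a minimal illustration and not the code deployed on ORP: the class name
MostImpressed, its method names, and the example item ids are assumptions.
Only the window of 2000 recent impressions and the ranking by frequency come
from the text.

```python
from collections import Counter, deque


class MostImpressed:
    """Rank items by their frequency among the most recent impressions."""

    def __init__(self, window_size=2000):
        # A deque with maxlen silently drops the oldest impression.
        self.window = deque(maxlen=window_size)

    def record_impression(self, item_id):
        self.window.append(item_id)

    def recommend(self, limit=6, exclude=None):
        # Hypothetical detail: skip the item the user is currently reading.
        counts = Counter(self.window)
        ranked = [item for item, _ in counts.most_common() if item != exclude]
        return ranked[:limit]


# Tiny window so the eviction of old impressions is visible.
recommender = MostImpressed(window_size=5)
for item in ["a", "b", "a", "c", "a", "b"]:
    recommender.record_impression(item)
top_items = recommender.recommend(limit=2)
```

With a window of five, the first impression of "a" is evicted, so "a" and "b"
each count twice and "c" once.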

Newest Considering that freshness is a vital aspect of news, we also
implemented an approach, “newest”, which recommends the most recently created
items from the same category as the currently visited item. Given the good
performance of “most impressed” mentioned above, we used it as a fallback
when the request lacked an item id, i.e. when the category could not be
determined. In addition, for a recommendation request with 6 candidate slots,
3 positions were still filled by the “most impressed” approach. Therefore,
this approach can be seen as a simple ensemble of “most impressed” and
“newest”. With this solution, from 21–27 February, our team “news ctr”
achieved a CTR of 1.19% (ranked 5th; team “artificial intelligence” took first
place with a CTR of 1.45%) on the contest leader board.
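
One way to read the ensemble described above is sketched below. The function
name, the input structures, the slot order (newest items first), and the
de-duplication step are assumptions made for illustration. The text only
states that half of the six slots come from “most impressed” and that “most
impressed” serves as the fallback when the category is unknown.

```python
def ensemble_recommend(category, newest_by_category, most_impressed, limit=6):
    """Combine "newest" (same category) with "most impressed".

    newest_by_category maps a category to item ids, newest first.
    most_impressed is a list of item ids, most impressed first.
    """
    # Fallback: without a known category only "most impressed" is used.
    if category is None or category not in newest_by_category:
        return most_impressed[:limit]

    # Half of the slots go to the newest items of the category ...
    result = list(newest_by_category[category][: limit // 2])
    # ... and the remaining slots are filled with "most impressed" items.
    for item in most_impressed:
        if len(result) >= limit:
            break
        if item not in result:  # avoid recommending an item twice
            result.append(item)
    return result


# Hypothetical item ids and category name.
newest = {"football": ["n1", "n2", "n3", "n4"]}
popular = ["p1", "n2", "p2", "p3", "p4", "p5", "p6"]
with_category = ensemble_recommend("football", newest, popular)
without_category = ensemble_recommend(None, newest, popular)
```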

Most Impressed by Category After witnessing how “newest” weakened the effect
of “most impressed”, we conducted another experiment that considered only the
number of impressions but counted impressions separately per category. Thus,
for a recommendation request with an item id, the recommendation candidates
are only the “most impressed” items in the relevant category. The approach
ran on ORP for three weeks: from 6–12 March 2016 it achieved a CTR of 0.82%
(ranked 7th; team “is@uniol” took first place with a CTR of 1.03%), from
13–19 March 2016 a CTR of 0.97% (ranked 11th, i.e. last; team “xyz” took
first place with a CTR of 1.85%), and from 20–26 March 2016 a CTR of 1.24%
(ranked 6th; team “xyz” took first place with a CTR of 2.16%).

Content Similar Having confirmed that considering categories in combination
with popularity leads to worse performance, we implemented a pure
content-based recommender using Apache Lucene to see how content relevance
influences recommendation quality after all. We deployed this content-based
recommender on ORP; from 27 March to 2 April it achieved a CTR of 0.77%
(ranked 11th; team “xyz” took first place with a CTR of 1.51%). This confirmed
that in a real-time news recommendation scenario such as this contest, pure
content similarity is not sufficient for a successful recommendation strategy.
Said et al. [10] came to the same conclusion, hypothesizing that content
similarity fails to pick up on new stories and instead redirects users to
similar content.

Most Clicked While switching between different algorithms, we discovered an
interesting phenomenon in the click messages we received from ORP. Even
though different contest teams used different algorithms, the clicked items
for all of these recommendations tended to be similar. This regularity
prompted us to ask whether characteristic patterns exist within clicks along
the time axis. Hence we implemented the simplest approach, “most clicked”,
which serves only the items clicked most frequently in the last hour in
response to recommendation requests. From 3–9 April 2016, this simple
approach achieved a CTR of 1.14%, winning the leader board ahead of “xyz”
(0.96%). Figure 1 shows the result for this week.
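
A sliding one-hour window over click events is one simple way to realise the
“most clicked” strategy described above. The sketch below is illustrative
only: the class and method names, the use of timestamps in seconds, and the
assumption that clicks arrive in chronological order are not taken from the
paper.

```python
from collections import Counter, deque


class MostClicked:
    """Serve the items clicked most often within a sliding time window."""

    def __init__(self, window_seconds=3600):
        self.window_seconds = window_seconds
        self.clicks = deque()  # (timestamp, item_id) pairs, oldest first

    def record_click(self, timestamp, item_id):
        # Assumes clicks are recorded in chronological order.
        self.clicks.append((timestamp, item_id))

    def recommend(self, now, limit=6):
        # Discard clicks that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self.clicks and self.clicks[0][0] <= cutoff:
            self.clicks.popleft()
        counts = Counter(item for _, item in self.clicks)
        return [item for item, _ in counts.most_common(limit)]


# Hypothetical click stream: "old" was popular more than an hour ago.
recommender = MostClicked()
for timestamp, item in [(0, "old"), (0, "old"), (0, "old"),
                        (3000, "x"), (3500, "y"), (3550, "x")]:
    recommender.record_click(timestamp, item)
current_top = recommender.recommend(now=3700)
```

At time 3700 the three clicks on "old" are more than an hour old and no
longer count, so only "x" and "y" remain as candidates.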


            Fig. 1: news ctr got first place in the period 3–9 April, 2016


    Having observed this interesting phenomenon, we looked into previous work
related to this contest. Kliegr and Kuchař [6, 7] implemented an approach based
on association rules. They used contextual features (e.g. ISP, OS, GEO, WEEKDAY,
LANG, ZIP, and CLASS) to train the rule engine. The results obtained in the online
evaluation indicate that association rules do not outperform other algorithms.
In [2], Gebremeskel and de Vries found that including geographic information
brings no striking improvement to news recommendation, and that the
randomness of the system should be taken into account more when evaluating
recommenders. Doychev et al. introduced 6 popularity-based and 6
similarity-based approaches in [1], but their algorithms seemed to perform
poorly compared with “baseline” due to being influenced by content
aggregating. As Said et al. concluded in [10], news readers might be reluctant
to be confronted with similar topics all the time and are more pleased to be
distracted by something breaking or interesting. In the following section, we
dig into how this breaking phenomenon is reflected in click behavior.


3     Clicks Pattern Analysis

As a further exploration of click patterns, we focus on clicks following recom-
mendation requests in Sport1 and Tagesspiegel on April 5, 2016. Clicks in Sport1
follow more obvious and stable trends. We analyze this consistency for the “most
clicked” recommender’s suggestions in terms of the Jaccard similarity of tempo-
rally adjacent item groups.
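
For reference, the Jaccard similarity of two item groups A and B is the size
of their intersection divided by the size of their union. The sketch below
computes it for temporally adjacent groups. The function names and the item
ids are hypothetical and are not taken from the contest data.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two item collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty groups
    return len(a & b) / len(a | b)


def adjacent_similarities(hourly_top_items):
    """Similarity between each hour's top items and those of the next hour."""
    return [
        jaccard(current, following)
        for current, following in zip(hourly_top_items, hourly_top_items[1:])
    ]


# Hypothetical top-6 groups for three consecutive hours.
hourly_top6 = [
    [1, 2, 3, 4, 5, 6],
    [1, 2, 3, 7, 8, 9],
    [1, 2, 3, 7, 8, 9],
]
similarities = adjacent_similarities(hourly_top6)
```

The first two hours share three of nine distinct items, giving a similarity
of 1/3, while the second and third hours are identical and give 1.0.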


3.1   Clicks Pattern for Sport1

First, we draw the histogram of clicks regarding recommendation requests on
items in the portal Sport1. Since plista delivers only part of all
recommendation requests to ORP participants, we suppose that the click
patterns might differ slightly between the scope of the contest teams and
the scope of plista as a whole. ORP encodes this scope information in the
“context.simple” object of its click notification JSON, where the key ’41’
stands for “contest team” and the value is the number of the specific team.
For instance, “news ctr” is the contest team with team number 2465, while team
number -1 signals that the click happened outside the contest team range.
Thus, the figures are drawn separately for these two scales: the contest
teams scale, which excludes clicks outside the contest scope, and the whole
plista range, without any restriction on the contest team.
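
Separating the two scopes amounts to reading key ’41’ from the
“context.simple” object of each click notification. The sketch below assumes
that a notification is a JSON string with “simple” nested under “context” and
that a missing key means the scope is unknown. The example messages are
reduced to this one field, and the helper names are hypothetical. Real ORP
messages carry many more fields.

```python
import json

TEAM_KEY = "41"        # key for "contest team" in context.simple
OUTSIDE_CONTEST = -1   # team number for clicks outside the contest range


def contest_team(click_message):
    """Return the team number of a click notification, or None if absent."""
    simple = json.loads(click_message).get("context", {}).get("simple", {})
    value = simple.get(TEAM_KEY)
    return None if value is None else int(value)


def in_contest_scope(click_message):
    """True if the click belongs to a contest team (team number is not -1)."""
    team = contest_team(click_message)
    return team is not None and team != OUTSIDE_CONTEST


# Hypothetical, heavily simplified notifications.
team_click = '{"context": {"simple": {"41": 2465}}}'
outside_click = '{"context": {"simple": {"41": -1}}}'
scopes = [in_contest_scope(team_click), in_contest_scope(outside_click)]
```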
    Figure 2 shows the click conditions among contest teams. In order to track the
top clicked items in a time sequence, we draw the figure for each hour for April
5, 2016, i.e. 24 subfigures covering the whole day. In each subfigure, news items
are located on the x-axis as points sorted by the click frequency in descending
order. A red vertical line separates the six most frequently clicked items—most
recommendation requests ask for six suggestions. For a majority of intervals, we
observe a power law distribution. Few highly popular items occupy a majority
of clicks. The percentage of clicks occupied by the top six items is shown in the
red boxes.
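
The percentage attributed to the top six items is the number of clicks on the
six most clicked items divided by all clicks in that hour. A minimal sketch,
with a hypothetical function name and made-up clicks:

```python
from collections import Counter


def top_k_share(clicked_items, k=6):
    """Fraction of all clicks that fall on the k most clicked items."""
    counts = Counter(clicked_items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(k))
    return top / total


# Hypothetical clicks of one hour: 10 clicks on 4 distinct items.
clicks = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
share_top2 = top_k_share(clicks, k=2)
share_top6 = top_k_share(clicks)
```

Here the two most clicked items account for 8 of 10 clicks. With fewer than
six distinct items, the top-six share is trivially 1.0.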
    We analyze how popularity transitions into the future. To this end, we
highlight the item ids of the six most popular items in the top right corner of
each subfigure. We color item ids to facilitate tracking individual items
throughout the plot. Items tinted in gray appear in only a single one-hour
interval. From Figure 2, we see that on April 5, 2016, items ranked in the top
three show more continuity, i.e. they are more likely to re-appear among the
six most clicked items of the next hour.

[Figure 2 plot: “Items Clicked Condition in Sport1 for Contest Teams on April 5,
2016”; 24 hourly panels from 0:00 - 1:00 to 23:00 - 24:00, each a histogram of
clicks per item annotated with the top-six click share and the ids of the six
most clicked items.]

           Fig. 2: Top clicks condition of contest teams on Sport1 on April 5, 2016


    Aside from the scope of contest teams in ORP, we are also interested in the
power law distribution of clicked items in the whole plista range. In Figure 3,
we find that, along with the increasing number of distinct clicked items in the
“whole” range, the power law distribution of clicked items is even more
pronounced. In all one-hour time windows, more than 87% of clicks are
contributed by the top 6 items. The more complete data may explain the
increased steepness of the histograms. The distribution can be described by
Zipf’s law. The significant advantage of the six most frequently clicked items
reminds us to pay more attention to the short head, which has higher business
value; this way, we can keep a relatively high CTR. Still, more sophisticated
methods are required to leverage the potential of the long tail.
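
Zipf’s law states that the frequency of the item at rank r is roughly
proportional to 1 / r^s for some exponent s. One rough way to check how well
click counts follow it is a least-squares line through the points
(log rank, log frequency). The sketch below is only a heuristic illustration
and not the analysis performed in this paper. The function name and the
synthetic counts are assumptions.

```python
import math


def zipf_exponent(frequencies):
    """Estimate s in frequency ~ 1 / rank**s via a log-log least-squares fit."""
    freqs = sorted((f for f in frequencies if f > 0), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two items with clicks")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    # The fitted slope is negative for a decaying distribution.
    return -covariance / variance


# Synthetic click counts that follow Zipf's law exactly with s = 1.
ideal_counts = [1000 / rank for rank in range(1, 11)]
exponent = zipf_exponent(ideal_counts)
```

A larger exponent corresponds to a steeper histogram, i.e. a heavier
concentration of clicks on the short head.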
    When focusing on the most frequently clicked items within this one day, we
find some clues for future work. First, Table 2 lists the four items occurring
most frequently in the top 6 group, with their item id, the duration for which
they stayed in the top 6, their ranking trends, and their creation dates. All
four items remained in the top 6 for at least eleven hours. We had expected
considerably less time, as news items emerge continuously. We notice that the
rankings of these four items fluctuate. Recognizing patterns in shifting
rankings will be subject to future work. The dates of creation subvert our
previous expectations. We as-

[Figure 3 plot: “Items Clicked Condition in Sport1 for Whole Range on April 5,
2016”; hourly panels from 0:00 - 1:00 to 23:00 - 24:00, each a histogram of
clicks per item annotated with the top-six click share and the ids of the six
most clicked items.]
                                                                                                                   273877703            273877703
                 274110878 300            274110878 300             274062086 500               274062086                274224023               274224023
  300            274062086 250            274062086 250             273770166                   274224023 600            274062086 600           274062086
                 274224023                274224023                 274224023 400               274008188                274219226               274219226
                 273795547 200            273795547 200             274219226 300               274219226 400            274008188 400           274275555
  200                       150                     150
                            100                     100                         200
  100                                                                                                     200                      200
                             50                      50                         100
    0                         0                       0                           0                         0                        0


            Fig. 3: Top clicks condition of whole range on Sport1 on April 5, 2016


sumed that news would remain relevant for a very limited time. In contrast,
news articles created on April 2, 2016, dominated the top 6 news three days
later. This indicates a noticeably longer life-cycle of news than we anticipated.


                         Table 2: Stable top clicked items on April 5, 2016


                Item Id        In Top 6         Ranking                   Created At

            273681975          0:00–11:00       Most of the time 1st      April 2, 2016
            273707540          0:00–13:00       2nd → 1st                 April 2, 2016
            273877703          0:00–24:00       3rd/4th → 2nd → 1st       April 3, 2016
            273770166          0:00–21:00       2nd → 3rd/4th → 1st       April 2, 2016
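
The "In Top 6" column of Table 2 can be derived mechanically from the hourly top 6 sets. The following Python sketch shows one possible way to do so; the helper names and the `min_hours` threshold are illustrative assumptions, since the text does not state how "stable" was operationalized.

```python
from collections import defaultdict


def hours_in_top(hourly_top_sets):
    """Map each item ID to the list of hour indices in which it is in the top group."""
    presence = defaultdict(list)
    for hour, top_items in enumerate(hourly_top_sets):
        for item in top_items:
            presence[item].append(hour)
    return dict(presence)


def stable_items(hourly_top_sets, min_hours):
    """Keep only items that appear in the top group in at least min_hours hours.

    min_hours is a hypothetical threshold chosen for illustration.
    """
    presence = hours_in_top(hourly_top_sets)
    return {item: hours for item, hours in presence.items() if len(hours) >= min_hours}
```

For example, `stable_items([{1, 2}, {1, 3}, {1, 2}], min_hours=3)` keeps only item 1, which is present in all three hours.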


    Next, we analyze the categories, titles, and descriptions of these four items
in order to better understand their contents. Table 3 highlights co-occurring
terms in these four items. Among those, we see that the breaking news that
coach Pep Guardiola will be leaving Bayern drew many users’ interest to related
articles. We hypothesize that content similarity affects recommenders, but only
for popular items. Still, a majority of users pay attention only to the most
popular articles, which is why pure content-based recommenders frequently
suggest articles with little chance of being clicked.

                Table 3: Stable top clicked items description on April 5, 2016

      Item Id 273681975
        category: fussball
        title:    Guardiola macht Götze froh
        text:     Der Coach des FC Bayern lässt Youngster Felix Götze erstmals mit
                  den Profis trainieren. Javi Martinez und Manuel Neuer stehen
                  derweil vor dem Comeback.

      Item Id 273707540
        category: intenational-fussball
        title:    Van Gaal watscht di Maria ab
        text:     Als Rekordtransfer geholt, nach nur einer Saison wieder vom Hof
                  gejagt. Jetzt geht die Geschichte zwischen United-Trainer Louis
                  van Gaal und Angel di Maria in die Verlängerung.

      Item Id 273877703
        category: fussball
        title:    Robben: “Van Gaal ist wie Guardiola”
        text:     Arjen Robben vergleicht den ehemaligen Bayern-Coach Louis van
                  Gaal mit dem aktuellen Trainer Pep Guardiola. Mit dem FC Bayern
                  will Robben noch viel erreichen.

      Item Id 273770166
        category: fussball
        title:    Mittelfeldbestie Götze: Wechsel nur im Notfall.
        text:     Mario Götzes beherzter Auftritt gegen Frankfurt zeigt: Er will
                  sich unbedingt beim FC Bayern durchsetzen. Der Verein mauert
                  noch beim Thema Transfer.


3.2    Jaccard Similarity Between Clicks in Neighbor Hours
In this subsection, we quantify the continuity of the most frequently clicked
items and analyze how this continuity depends on contextual factors such as
time of day and day of week. Jaccard similarity, as defined in Equation 1,
measures the similarity of two sets A and B. Its value equals the cardinality of
the intersection divided by the cardinality of the union of the two sets. In our
scenario, A and B are the sets of the six most frequently clicked items in two
neighboring one-hour time slots. The higher the Jaccard similarity, the more
items keep attracting users’ attention from one hour to the next.


    \[
      \mathrm{Jaccard}(A, B) = \frac{|A \cap B|}{|A \cup B|}
                             = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}
      \tag{1}
    \]
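
As a minimal sketch of Equation 1, assuming that the clicks of one hour are available as a plain list of item IDs, the metric can be computed in Python as follows. The helper names `top_k` and `jaccard` are illustrative and not taken from the original study.

```python
from collections import Counter


def top_k(clicked_item_ids, k=6):
    """Return the set of the k most frequently clicked item IDs."""
    counts = Counter(clicked_item_ids)
    return {item for item, _ in counts.most_common(k)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / (|A| + |B| - |A ∩ B|).

    Returns 0.0 for two empty sets (a convention chosen here to avoid
    division by zero; the paper does not discuss this case).
    """
    a, b = set(a), set(b)
    intersection = len(a & b)
    union = len(a) + len(b) - intersection
    return intersection / union if union else 0.0
```

Two top 6 sets that share four items give 4 / (6 + 6 − 4) = 0.5, so `jaccard({1, 2, 3, 4, 5, 6}, {3, 4, 5, 6, 7, 8})` evaluates to 0.5.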

    We expand our view from a single day to the week of April 3–9, 2016. This
yields 24 × 7 = 168 one-hour time windows, from which we derive 167 pairs of
consecutive time windows to compute the average Jaccard similarity. Figure 4
illustrates our findings overall, for specific times of day, and for each weekday. We
distinguish the contest scope and the whole plista scope by cornflower blue and
violet colors, respectively. Throughout the three subfigures, the Jaccard metric
of the plista scope exceeds that of the contest scope. The gap is most obvious at
night (0:00–8:00). Still, we have to consider that the night has relatively few
interactions compared with the daytime. Independent of context, we observe
Jaccard scores in the range of 40–60%. For two sets of six items that share k
items, the Jaccard similarity equals k/(12 − k), so this range corresponds to
roughly 3.4 to 4.5 shared items. Hence, more than half of the most popular items
re-occur in the next hour’s top 6 group. Thus, recommending popular items
offers a good chance to perform well. This explains the good performance of the
“baseline” in previous editions of NewsREEL.
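
The averaging over consecutive windows can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the input is a chronological list of hourly top 6 sets whose first window starts at 0:00, and each pair of windows is assigned to the time-of-day bucket of its earlier hour. The function and bucket names are hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def period_of(hour_of_day):
    """Assign an hour (0-23) to one of three time-of-day buckets."""
    if hour_of_day < 8:
        return "0:00-8:00"
    if hour_of_day < 16:
        return "8:00-16:00"
    return "16:00-24:00"


def neighbor_jaccard(hourly_top_sets):
    """Return (overall average, per-period averages) over consecutive window pairs.

    n windows yield n - 1 pairs, e.g. 168 windows yield 167 pairs.
    Assumption: index 0 corresponds to the window starting at 0:00.
    """
    scores = [jaccard(a, b) for a, b in zip(hourly_top_sets, hourly_top_sets[1:])]
    by_period = defaultdict(list)
    for index, score in enumerate(scores):
        by_period[period_of(index % 24)].append(score)
    overall = sum(scores) / len(scores) if scores else 0.0
    per_period = {period: sum(v) / len(v) for period, v in by_period.items()}
    return overall, per_period
```

A per-weekday breakdown would follow the same pattern with `index // 24` as the grouping key.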


  [Figure 4: three bar-chart panels titled "Total", "Time of Day" (0:00-8:00,
  8:00-16:00, 16:00-24:00), and "Day of Week" (Sun. to Sat.); y-axis: Average
  Jaccard Similarity (0.0 to 1.0); series: "Contest Teams" and "Whole".]


Fig. 4: Jaccard similarity of top clicked items between consecutive hours for Sport1 in
                             the week of April 3–9, 2016


3.3   Predicting Ability of Impressions and Clicks
Empirically, the “Most Impressed” approach has consistently performed well in
terms of CTR. We therefore compare how well impressions and clicks predict the
clicks in the next hour. As the six most frequently clicked items receive more
than 80% of all hourly clicks, we define predicting ability here as the Jaccard
similarity between the set of recommended items and the set of the six most
frequently clicked items in a specific hour. The six items most frequently viewed
in the previous hour form the recommending set “Most Impressed”. In turn, the
six items most frequently clicked after having been suggested in the previous
hour form the recommending set “Most Clicked”. Figure 5 shows both methods’
performance over time. The cyan curve refers to “Most Clicked”, while the
magenta curve refers to “Most Impressed”. The upper subfigure compares “Most
Impressed” and “Most Clicked” in the scope “contest teams”, and the lower
subfigure presents the same comparison in the scope “whole plista”. “Most
Clicked” outperforms “Most Impressed” in both scenarios. This indicates that,
at least on April 5, 2016, users’ reactions to recommendations let the system
predict future clicks better than what they read.
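
The predicting ability defined in this subsection can be sketched in Python as follows. The sketch assumes that impressions and clicks are available as one dictionary per hour that maps item IDs to counts; this data layout and the function names are assumptions made for illustration.

```python
from collections import Counter


def top_k(counts, k=6):
    """Return the set of the k items with the highest counts."""
    return {item for item, _ in Counter(counts).most_common(k)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predicting_ability(signal_by_hour, clicks_by_hour, k=6):
    """Score, for each hour h >= 1, the top-k items of the signal in hour h - 1
    against the top-k clicked items in hour h.

    Pass hourly impression counts as the signal for "Most Impressed" and
    hourly click counts for "Most Clicked".
    """
    return [
        jaccard(top_k(signal_by_hour[h - 1], k), top_k(clicks_by_hour[h], k))
        for h in range(1, len(clicks_by_hour))
    ]
```

Calling the function once with impressions and once with clicks as the signal gives two score series that can be plotted against each other over the hours of a day.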

3.4   Clicks Pattern for Tagesspiegel
Hitherto, we focused on Sport1. We repeated our experiments for the second
largest publisher, Tagesspiegel. Figure 6 shows a considerably smaller number of
clicks compared with Sport1. Some one-hour intervals have fewer than six clicks
in total. Even considering the whole plista range of clicks, Figure 7 shows a
  [Figure 5: two line-chart panels titled "Predicting Ability of 'Most Impressed' vs.
  'Most Clicked' within Contest Teams on April 5, 2016" and "... on Whole Range on
  April 5, 2016"; x-axis: 0:00 to 24:00; y-axis: Jaccard Similarity between
  Recommended Set and Most Clicked 6 Items in this Hour (0.0 to 1.0); legend:
  "most clicked 6 items in last hour", "most impressed 6 items in last hour".]


Fig. 5: “Most Impressed” vs. “Most Clicked” in predicting the next hour’s “Most
                   Clicked” items on Sport1 on April 5, 2016


flatter power law distribution. The top items account for only 20–35% of clicks.
In addition, we observe more variation in the most frequently clicked items, such
that many items appear in the top 6 group for only a single hour. We hypothesize
that the increased variation is caused by the higher diversity of topics. Sport1
exclusively provides sports-related news. In contrast, Tagesspiegel covers a wide
range of topics including politics, economy, sports, and local news.
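
The concentration of clicks on the top items, which separates the two publishers here, can be measured with a short Python sketch. It assumes the clicks of one hour are given as a list of item IDs; the function name is hypothetical.

```python
from collections import Counter


def top_k_click_share(clicked_item_ids, k=6):
    """Fraction of all clicks that fall on the k most frequently clicked items."""
    counts = Counter(clicked_item_ids)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no clicks in this hour
    top_clicks = sum(count for _, count in counts.most_common(k))
    return top_clicks / total
```

Note that an hour with at most k distinct clicked items always yields a share of 1.0, so the measure is only informative for hours with enough clicks.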


4      Conclusion and Future Work

In this working note, we describe our experience with the online task of the
real-time news recommendation contest NewsREEL in 2016. By evaluating
approaches such as “Most Impressed”, “Newest”, “Most Impressed by Category”,
“Content Similar”, and “Most Clicked”, we found that a small subset of news
items attracted most clicks. This holds true beyond the scope of individual
algorithms. Hence, we started analyzing the patterns of clicked items on the
dominating portals Sport1 and Tagesspiegel. For Sport1 in particular, item
popularity followed a power law distribution and items continued to be popular
for hours. This phenomenon was less pronounced on Tagesspiegel. Monitoring
which articles users clicked provided better information to predict future clicks
than tracking which articles users read. These observations inspire us to shift our
perspective on implementing recommenders from analyzing features and
contextual factors to investigating the time series patterns of clicked items. Thus,
as long as Sport1 continues to be the dominant news source in the contest, we can
focus on the following points as future work: (1) analyzing regularities in how
long an item stays in the group of most clicked items; (2) predicting the ranking of
  [Figure 6: "Items Clicked Condition in Tagesspiegel for Contest Teams on April 5,
  2016"; one bar-chart panel per hour from 0:00 - 1:00 to 23:00 - 24:00; each panel
  lists the IDs of the clicked items (y-axis: number of clicks, at most 8) and
  carries a percentage label (0% to 100%).]


 Fig. 6: Top clicks condition of contest teams range on Tagesspiegel on April 5, 2016


an item of being popular; (3) making use of the long tail to find relevant features
and contexts.


5   Acknowledgement

The work of the first author has been continuously funded by the China Scholarship
Council (CSC). The research leading to these results is partially supported by the
CrowdRec project, which has received funding from the European Union Seventh
Framework Program FP7/2007–2013 under grant agreement No. 610594.


       Fig. 7: Top clicks condition of whole range on Tagesspiegel on April 5, 2016


References

 1. D. Doychev, R. Rafter, A. Lawlor, and B. Smyth. News recommenders: Real-time,
    real-life experiences. In Proceedings of UMAP 2015, pages 337–342, 2015.
 2. G. Gebremeskel and A. P. de Vries. The degree of randomness in a live
    recommender systems evaluation. In Working Notes of CLEF 2015 - Conference and
    Labs of the Evaluation forum, Toulouse, France, September 8-11, 2015. CEUR,
    2015.
 3. F. Hopfgartner, T. Brodt, J. Seiler, B. Kille, A. Lommatzsch, M. Larson, R. Turrin,
    and A. Serény. Benchmarking news recommendations: The CLEF NewsREEL use case.
    SIGIR Forum, 49(2):129–136, Jan. 2016.
 4. B. Kille, A. Lommatzsch, G. Gebremeskel, F. Hopfgartner, M. Larson, J. Seiler,
    D. Malagoli, A. Serény, T. Brodt, and A. de Vries. Overview of NewsREEL’16:
    Multi-dimensional evaluation of real-time stream-recommendation algorithms. In
    N. Fuhr, P. Quaresma, B. Larsen, T. Goncalves, K. Balog, C. Macdonald,
    L. Cappellato, and N. Ferro, editors, Experimental IR Meets Multilinguality,
    Multimodality, and Interaction - 7th International Conference of the CLEF
    Association, CLEF 2016, Évora, Portugal, September 5-8, 2016. Springer, 2016.
 5. B. Kille, A. Lommatzsch, R. Turrin, A. Serény, M. Larson, T. Brodt, J. Seiler, and
    F. Hopfgartner. Stream-based recommendations: Online and offline evaluation
    as a service. In Proceedings of the Sixth International Conference of the CLEF
    Association, CLEF’15, pages 497–517, 2015.
 6. T. Kliegr and J. Kuchar. Benchmark of rule-based classifiers in the news recom-
    mendation task. In J. Mothe, J. Savoy, J. Kamps, K. Pinel-Sauvagnat, G. J. F.
    Jones, E. SanJuan, L. Cappellato, and N. Ferro, editors, CLEF, volume 9283 of
    Lecture Notes in Computer Science, pages 130–141. Springer, 2015.
 7. J. Kuchar and T. Kliegr. InBeat: Recommender System as a Service. In Working
    Notes for CLEF 2014 Conference, Sheffield, UK, September 15-18, 2014, pages
    837–844, 2014.
 8. A. Lommatzsch. Real-time news recommendation using context-aware ensembles.
    In Advances in Information Retrieval - 36th European Conference on IR Research,
    ECIR 2014, Amsterdam, The Netherlands, April 13-16, 2014. Proceedings, pages
    51–62, 2014.
 9. A. Lommatzsch. Real-time recommendations for user-item streams. In Proc. of
    the 30th Symposium On Applied Computing, SAC 2015, SAC ’15, pages 1039–1046,
    New York, NY, USA, 2015. ACM.
10. A. Said, A. Bellogín, J. Lin, and A. P. de Vries. Do recommendations matter?: news
    recommendation in real life. In Computer Supported Cooperative Work, CSCW ’14,
    Baltimore, MD, USA, February 15-19, 2014, Companion Volume, pages 237–240,
    2014.

</pre>