=Paper=
{{Paper
|id=Vol-1609/16090679
|storemode=property
|title=Clicks Pattern Analysis for Online News Recommendation Systems
|pdfUrl=https://ceur-ws.org/Vol-1609/16090679.pdf
|volume=Vol-1609
|authors=Jing Yuan,Andreas Lommatzsch,Benjamin Kille
|dblpUrl=https://dblp.org/rec/conf/clef/YuanLK16
}}
==Clicks Pattern Analysis for Online News Recommendation Systems==
Jing Yuan, Andreas Lommatzsch, Benjamin Kille

DAI-Labor, Technische Universität Berlin, Germany

{jing.yuan, andreas.lommatzsch, benjamin.kille}@dai-labor.de

Abstract. The NewsREEL challenge provides researchers with an opportunity to evaluate their news recommendation algorithms live, based on real users’ feedback. Since 2014, participants have evaluated a variety of approaches on the Open Recommendation Platform (ORP), yet popularity-based algorithms remain the most successful ones. In this working note, we chronologically describe our participation in the NewsREEL online task in 2016. With approaches including “most impressed”, “newest”, “most impressed by category”, “content similar” and “most clicked”, we reconfirm that content relevance is not a very good indicator for recommending news. Meanwhile, for the dominating portal Sport1, extrapolating the time series of impressions and clicks enables us to predict the items most likely to be clicked in the next hours. A sample analysis of one week of data shows that an item stays popular for much longer than we expected. Thus, we propose that recommenders designed for this contest should pay more attention to the time series patterns of clicks and impressions.

===1 Introduction===
News, as important media content, still plays its role in guiding social opinion, even in a modern world full of virtual social networks and personal ideas. Many news providers employ recommender systems and similar personalization techniques to help users find relevant news quickly and conveniently. Different ways of incorporating recommendations into news publishing have been successfully launched in the current digital news content market. We exemplify three ways in which recommendations are pushed to news consumers.
First, as an e-Magazine Provider, Flipboard aggregates news content from different third-party providers and then selects news relevant to a user’s pre-defined topics to form a personalized news board. Second, some Content Providers, such as ByteDance, generate content themselves. They recommend in a closed system based on internal users, news, and the interactions between both. Third, Recommendation Providers, e.g. plista and outbrain, offer recommendation services for different kinds of websites, including news websites. Table 1 compares the three mainstream kinds of news recommenders with respect to whether they generate content themselves, the stability of their users, and whether the range of news items is stable. As a representative of the Recommendation Providers, plista faces non-trivial conditions in terms of the variety of news portals and the differences in users’ expectations. Since the NewsREEL competition receives its data stream from plista, participants have to cope with all these difficult conditions to win the contest [8].

Table 1: Characteristics of mainstream news recommenders
{| class="wikitable"
! News Recommender !! Generating Content !! Stability of Users !! Stable Range of News
|-
| e-Magazine Provider (e.g. Flipboard) || ✗ || ✓ || ✗
|-
| News Content Provider (e.g. ByteDance) || ✓ || ✓ || ✓
|-
| Recommendation Provider (e.g. plista) || ✗ || ✗ || ✗
|}

The NewsREEL challenge 2016 gives participants the chance to evaluate recommender algorithms with live user feedback [4, 3]. In the challenge, teams registered on the Open Recommendation Platform (ORP) receive streamed messages describing published news articles, users’ impressions and clicks on items, as well as recommendation requests from plista.
The challenging aspects of participating in NewsREEL include: (1) recommendations must be provided within 100 ms of a request; (2) participants need to deal with news portals from different domains; (3) the user groups on a specific portal change; (4) the number of messages varies widely among portals [8, 5].

In contrast to movies or music, news items continuously emerge and become outdated, which creates a dynamic environment. This makes the NewsREEL competition particularly challenging. Algorithms have to consider these dynamics in news articles and users’ preferences. We focused on popularity and freshness to cope with the dynamics, following the notion that users prefer important and recent news over insignificant and outdated articles. The success of the “most clicked” strategy in terms of click-through rate (CTR) further supports this notion. Even though the method is rather simple, it captures crucial aspects. Visualizing clicks on items over time, we observe continued click activity stretching over several hours for popular items. We compared the contents of popular items and discovered that they overlap. Still, content-based algorithms have failed to benefit from these overlaps in previous editions of NewsREEL.

The remainder of this paper is structured as follows. In Section 2, we briefly introduce the approaches we used in 2016 and discuss other algorithms developed in previous years. Subsequently, in Section 3, we analyze characteristic user-item interaction patterns for different news portals and find that the “most clicked” items have predictive power of their own. Finally, Section 4 gives a conclusion and an outlook on future work.

===2 Approach Used===
In this section, we chronologically describe the approaches we deployed in ORP, i.e. the online task of NewsREEL 2016, and how our thinking changed along the way.
Once the simplest approach, “most clicked”, finally outperformed the other algorithms, it motivated us to dig deeper into click patterns from the perspective of time series analysis in the next section.

Most Impressed. Inspired by the good performance of the “baseline” in past years (see [9]), which directly uses the most recently impressed items as recommendation candidates, we implemented a similar method that ranks the items in the 2000 most recent impressions by their frequency. Typically, this approach is called “most popular”, but to distinguish it from “most clicked”, which is introduced later, we refer to it as “most impressed” in this paper. The approach ran on ORP for two weeks (January 31 to February 13, 2016) and achieved a CTR of 1.21% in the first week (ranked 3rd; team “artificial intelligence” took first place with a CTR of 1.48%) and 1.35% in the second week (ranked 2nd; team “abc” took first place with a CTR of 1.4%).

Newest. Considering that freshness is a vital aspect of news, we also implemented an approach “newest”, which recommends the most recently created items from the same category as the currently visited item. Given the good performance of “most impressed”, we used it as a fallback when the request lacked an item id, i.e. when the category could not be determined. In addition, for a recommendation request with 6 candidate slots, 3 positions are still filled by the “most impressed” approach. Therefore, this approach can be seen as a simple ensemble of “most impressed” and “newest”. With this solution, from 21–27 February, our team “news ctr” achieved a CTR of 1.19% (ranked 5th; team “artificial intelligence” took first place with a CTR of 1.45%) on the contest leader board.
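The “most impressed” ranking described above can be sketched as a frequency count over a sliding window of recent impressions. This is a minimal illustration, not the code that ran on ORP: only the window size of 2000 comes from the text, while the class name, the method names, and the `exclude` parameter are assumptions made for the example.

```python
from collections import Counter, deque


class MostImpressedRecommender:
    """Rank items by how often they occur among the most recent impressions."""

    def __init__(self, window_size=2000):
        # window_size=2000 follows the text; all other names are illustrative.
        self.window_size = window_size
        self.window = deque()    # item ids of recent impressions, oldest first
        self.counts = Counter()  # frequency of each item id inside the window

    def observe_impression(self, item_id):
        """Add one impression and evict the oldest one if the window is full."""
        self.window.append(item_id)
        self.counts[item_id] += 1
        if len(self.window) > self.window_size:
            oldest = self.window.popleft()
            self.counts[oldest] -= 1
            if self.counts[oldest] == 0:
                del self.counts[oldest]

    def recommend(self, limit=6, exclude=None):
        """Return up to `limit` item ids, most frequently impressed first."""
        ranked = [item for item, _ in self.counts.most_common() if item != exclude]
        return ranked[:limit]
```

Keeping the counter in step with the window makes each update constant-time, which matters under a tight response deadline; whether the deployed recommender worked this way is not stated in the text.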
Most Impressed by Category. After witnessing how “newest” weakened the effect of “most impressed”, we conducted another experiment that considers only the number of impressions but keeps separate impression counts per category. Thus, for a recommendation request with an item id, the recommendation candidates are only the “most impressed” items in the relevant category. The approach ran on ORP for three weeks: from 6–12 March 2016 it achieved a CTR of 0.82% (ranked 7th; team “is@uniol” took first place with a CTR of 1.03%), from 13–19 March 2016 a CTR of 0.97% (ranked 11th, i.e. last; team “xyz” took first place with a CTR of 1.85%), and from 20–26 March 2016 a CTR of 1.24% (ranked 6th; team “xyz” took first place with a CTR of 2.16%).

Content Similar. Having confirmed that considering categories in combination with popularity leads to worse performance, we implemented a pure content-based recommender using Apache Lucene to see how content relevance influences recommendation quality. We deployed this content-based recommender on ORP; from 27 March to 2 April it achieved a CTR of 0.77% (ranked 11th; team “xyz” took first place with a CTR of 1.51%). This confirmed that in a real-time news recommendation scenario such as this contest, pure content similarity is not sufficient for a successful recommendation strategy. Said et al. [10] came to the same conclusion, hypothesizing that content similarity fails to pick up on new stories and instead redirects users to similar content.

Most Clicked. While trying different algorithms, we discovered an interesting phenomenon in the click messages we received from ORP. Even though different contest teams used different algorithms, the clicked items for all of these recommendations tended to be similar. This regularity prompted us to ask whether characteristic patterns exist within clicks along the time axis.
Hence we implemented the simplest approach, “most clicked”, which answers recommendation requests with only the most frequently clicked items of the last hour. From 3–9 April 2016, this simple approach achieved a CTR of 1.14%, winning the leader board ahead of “xyz” (0.96%). Figure 1 shows the result for this week.

Fig. 1: news ctr got first place in the period 3–9 April, 2016

Having observed this interesting phenomenon, we looked into previous work related to this contest. Kliegr and Kuchař [6, 7] implemented an approach based on association rules. They used contextual features (e.g. ISP, OS, GEO, WEEKDAY, LANG, ZIP, and CLASS) to train the rule engine. The results obtained in the online evaluation indicate that association rules do not outperform other algorithms. In [2], Gebremeskel and de Vries found that including geographic information brings no striking improvement to news recommendation, and that the randomness of the system should be taken into account when evaluating recommenders. Doychev et al. introduced 6 popularity-based and 6 similarity-based approaches in [1], but their algorithms seemed to perform poorly compared with the “baseline” due to being influenced by content aggregating. Said et al. concluded in [10] that news readers might be reluctant to be confronted with similar topics all the time and are more pleased to be distracted by something breaking or interesting. In the following section, we examine how this breaking phenomenon is reflected in click behavior.

===3 Clicks Pattern Analysis===
As a further exploration of click patterns, we focus on clicks following recommendation requests on Sport1 and Tagesspiegel on April 5, 2016. Clicks on Sport1 follow more obvious and stable trends. We analyze this consistency for the “most clicked” recommender’s suggestions in terms of the Jaccard similarity of temporally adjacent item groups.
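The “most clicked” recommender from Section 2, whose suggestions are analyzed in this section, serves the items clicked most often during the last hour. The following is a rough sketch of that idea under stated assumptions: timestamps are given in seconds and arrive in non-decreasing order, and the class and method names are hypothetical rather than taken from the authors' implementation.

```python
from collections import Counter, deque


class MostClickedRecommender:
    """Recommend the items clicked most often within a trailing time window."""

    def __init__(self, window_seconds=3600):
        # One hour, as in the text; timestamps in seconds are an assumption.
        self.window_seconds = window_seconds
        self.clicks = deque()  # (timestamp, item_id) pairs, oldest first

    def observe_click(self, timestamp, item_id):
        """Record one click; timestamps are assumed to be non-decreasing."""
        self.clicks.append((timestamp, item_id))

    def recommend(self, now, limit=6):
        """Return up to `limit` item ids ranked by clicks in the last window."""
        cutoff = now - self.window_seconds
        # Drop clicks that have fallen out of the trailing window.
        while self.clicks and self.clicks[0][0] < cutoff:
            self.clicks.popleft()
        counts = Counter(item for _, item in self.clicks)
        return [item for item, _ in counts.most_common(limit)]
```

Recounting on every request keeps the sketch short; a production variant would more likely maintain the counts incrementally to stay within the response deadline.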
====3.1 Clicks Pattern for Sport1====
First, we draw histograms of the clicks that follow recommendation requests on items of the portal Sport1. Since plista delivers only part of all recommendation requests to ORP participants, we suppose that the click patterns might differ slightly between the contest teams’ scope and the whole plista scope. ORP encodes this scope information in the “context.simple” object of its click notification JSON, where key ’41’ stands for “contest team” and the value is the specific team number. For instance, “news ctr” is the contest team with team number 2465, while team number -1 signals that the click happened outside the contest team range. Thus, the figures are drawn separately for two scopes: the contest teams scope, which excludes clicks outside the contest, and the whole plista range without any restriction on contest team.

Figure 2 shows the clicks among contest teams. To track the top clicked items over time, we draw one subfigure per hour of April 5, 2016, i.e. 24 subfigures covering the whole day. In each subfigure, news items are placed on the x-axis as points sorted by click frequency in descending order. A red vertical line separates the six most frequently clicked items, since most recommendation requests ask for six suggestions. For a majority of intervals, we observe a power law distribution: a few highly popular items account for a majority of clicks. The percentage of clicks received by the top six items is shown in the red boxes. We analyze how popularity transitions into the future.
Therefore, we highlight the item ids of the six most popular items in the top right corner of each subfigure. We color item ids to facilitate tracking individual items throughout the plot. Items tinted in gray appear in only a single one-hour interval.

[Figure 2: 24 hourly histograms (0:00–1:00 to 23:00–24:00) titled “Items Clicked Condition in Sport1 for Contest Teams on April 5, 2016”; each panel lists the ids of the six most frequently clicked items and the share of clicks they receive.]

Fig. 2: Top clicks condition of contest teams on Sport1 on April 5, 2016

From Figure 2, we see that on April 5, 2016, items ranked in the top three show more continuity, i.e. they are more likely to reappear in the next hour’s group of the six most clicked items.

Aside from the scope of contest teams in ORP, we are also interested in the power law distribution of clicked items in the whole plista range. In Figure 3, we find that as the number of distinct clicked items increases in the “whole” range, the power law distribution of clicked items becomes even more pronounced. In all one-hour time windows, more than 87% of the clicks are contributed by the top 6 items. The more complete data may cause the increased steepness of the histograms. The distribution can be described by Zipf’s law. The significant advantage of the six most frequently clicked items reminded us that we should pay more attention to the short head with its higher business value. Thereby, we can keep a relatively high CTR. Still, more sophisticated methods are required to leverage the potential of the long tail.
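To make the two scopes and the top-six share concrete, here is a minimal sketch, not the authors' analysis code. It assumes each click notification has already been parsed into a Python dict whose `context.simple` object carries the team number under key `41`, with `-1` meaning “outside the contest”, as described in Section 3.1. The helper names and the `item_id` field mentioned in the usage note are hypothetical.

```python
from collections import Counter

TEAM_KEY = "41"       # key in context.simple holding the contest team number (from the text)
OUTSIDE_CONTEST = -1  # team number signalling a click outside the contest (from the text)


def in_contest_scope(click):
    """Return True if the click is attributed to one of the contest teams."""
    team = click["context"]["simple"].get(TEAM_KEY, OUTSIDE_CONTEST)
    return team != OUTSIDE_CONTEST


def top_share(item_ids, k=6):
    """Return the k most clicked item ids and the fraction of clicks they receive."""
    counts = Counter(item_ids)
    total = sum(counts.values())
    top = counts.most_common(k)
    share = sum(n for _, n in top) / total if total else 0.0
    return [item for item, _ in top], share
```

For one hour of parsed clicks, `top_share([c["item_id"] for c in clicks if in_contest_scope(c)])` would give the contest-scope figure, and dropping the filter would give the whole-range figure.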
When focusing on the most frequently clicked items within this one day, we find some clues for future work. First, Table 2 presents the four items occurring most frequently in the top 6 group. It lists their item id, the period during which they were in the top 6, their ranking trends, and their creation dates. All four items remained in the top 6 for at least eleven hours. We had expected considerably less time, as news continuously emerges. We notice fluctuating rankings of these four items. Recognizing patterns in shifting rankings will be subject to future work. The dates of creation subvert our previous expectations. We assumed that news would remain relevant for a very limited time. In contrast, news articles created on April 2, 2016, dominated the top 6 news three days later. This indicates a noticeably longer life-cycle of news than we anticipated.

[Figure 3: 24 hourly histograms (0:00–1:00 to 23:00–24:00) titled “Items Clicked Condition in Sport1 for Whole Range on April 5, 2016”; each panel lists the ids of the six most frequently clicked items and the share of clicks they receive.]

Fig. 3: Top clicks condition of whole range on Sport1 on April 5, 2016

Table 2: Stable top clicked items condition on April 5, 2016
{| class="wikitable"
! Item Id !! Being in Top 6 !! Ranking !! Created At
|-
| 273681975 || 0:00–11:00 || Most of the time 1st || 2 Apr. 2016
|-
| 273707540 || 0:00–13:00 || 2nd → 1st || 2 Apr. 2016
|-
| 273877703 || 0:00–24:00 || 3rd/4th → 2nd → 1st || 3 Apr. 2016
|-
| 273770166 || 0:00–21:00 || 2nd → 3rd/4th → 1st || 2 Apr. 2016
|}

Next, we analyze the categories, titles, and descriptions of these four items in order to get a better understanding of the contents. Table 3 highlights co-occurring terms in these four items. Among those, we see that the breaking news that coach Pep Guardiola will be leaving Bayern attracted many users’ interest in related articles. We hypothesize that content similarity affects recommenders, but only for popular items. Still, a majority of users pay attention only to the most popular articles, which is why pure content-based recommenders frequently suggest articles with minor click chances.

Table 3: Stable top clicked items description on April 5, 2016
{| class="wikitable"
! Item Id !! 273681975 !! 273707540 !! 273877703 !! 273770166
|-
! category
| fussball || international-fussball || fussball || fussball
|-
! title
| Guardiola macht Götze froh || Van Gaal watscht di Maria ab || Robben: “Van Gaal ist wie Guardiola” || Mittelfeldbestie Götze: Wechsel nur im Notfall.
|-
! text
| Der Coach des FC Bayern lässt Youngster Felix Götze erstmals mit den Profis trainieren. Javi Martinez und Manuel Neuer stehen derweil vor dem Comeback. || Als Rekordtransfer geholt, nach nur einer Saison wieder vom Hof gejagt. Jetzt geht die Geschichte zwischen United-Trainer Louis van Gaal und Angel di Maria in die Verlängerung. || Arjen Robben vergleicht den ehemaligen Bayern-Coach Louis van Gaal mit dem aktuellen Trainer Pep Guardiola. Mit dem FC Bayern will Robben noch viel erreichen. || Mario Götzes beherzter Auftritt gegen Frankfurt zeigt: Er will sich unbedingt beim FC Bayern durchsetzen. Der Verein mauert noch beim Thema Transfer.
|}

====3.2 Jaccard Similarity Between Clicks in Neighbor Hours====
In this subsection, we quantify the continuity of the most frequently clicked items and analyze this continuity with respect to contextual factors such as time of day and day of week. Jaccard similarity, as defined in Equation 1, is a metric that measures the similarity of two sets A and B.
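As a small illustration of this set-overlap measure applied to neighboring hours, the sketch below computes the Jaccard similarity of two sets and averages it over consecutive hourly top-k groups. It is a simplified example under assumptions: `hourly_clicks` is taken to be a list with one list of clicked item ids per hour, and all function names are hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k(item_ids, k=6):
    """Set of the k most frequently clicked item ids in one time window."""
    return {item for item, _ in Counter(item_ids).most_common(k)}


def mean_adjacent_jaccard(hourly_clicks, k=6):
    """Average Jaccard similarity between the top-k sets of consecutive hours."""
    tops = [top_k(hour, k) for hour in hourly_clicks]
    pairs = list(zip(tops, tops[1:]))
    if not pairs:
        return 0.0
    return sum(jaccard(x, y) for x, y in pairs) / len(pairs)
```

With n hourly windows the helper averages over n − 1 adjacent pairs, so a full week of 168 windows yields 167 comparisons.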
The value of this metric equals the cardinality of the intersection divided by the cardinality of the union of the two sets. In our scenario, A and B refer to the sets of the six most frequently clicked items in two neighboring one-hour time slots. The higher the Jaccard similarity, the more items users keep clicking across neighboring hours.

<math>\mathrm{Jaccard}(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}</math> (1)

We expand our view from a single day to the week 3–9 April, 2016. Thereby, we obtain 24 × 7 = 168 one-hour time windows. From these, we derive 167 pairs of subsequent time windows to compute the average Jaccard similarity. Figure 4 illustrates our findings overall, for specific times of day, and for each weekday. We distinguish the contest scope and the whole plista scope by cornflower blue and violet colors. Throughout the three subfigures, the Jaccard score of the plista scope exceeds that of the contest scope. The gap is most obvious at night (0:00–8:00). Still, we have to consider that the night has relatively few interactions compared with the daytime. Independent of context, we observe Jaccard scores in the range of 40–60%. These signal that more than half of the most popular items re-occur in the next hour’s top 6 group: for two six-item sets with x shared items, the score is x/(12 − x), so scores of 0.4 to 0.6 correspond to roughly 3.4 to 4.5 shared items. Thus, recommending popular items offers a good chance to perform well. This explains the good performance of the “baseline” in previous editions of NewsREEL.

[Figure 4: three bar charts of the average Jaccard similarity (y-axis 0.0 to 1.0) for the series “Contest Teams” and “Whole”: Total; Time of Day (0:00–8:00, 8:00–16:00, 16:00–24:00); Day of Week (Sun. to Sat.).]

Fig. 4: Jaccard similarity of top clicked items between continuous hours for Sport1 in the week 3–9 April, 2016

====3.3 Predicting Ability of Impressions and Clicks====
Empirically, the “Most Impressed” approach always performs well in terms of CTR. We therefore compare the predicting ability of impressions and clicks with respect to clicks in the next hour. As the six most frequently clicked items receive more than 80% of all hourly clicks, we define predicting ability here as the Jaccard similarity between the set of recommended items and the set of the six most frequently clicked items in a specific hour. The six items most frequently viewed in the last hour form the recommending set “Most Impressed”. Correspondingly, the six items most frequently clicked after having been suggested in the last hour form the recommending set “Most Clicked”. Figure 5 shows both methods’ performance over time. The cyan curve refers to “Most Clicked” while the magenta line refers to “Most Impressed”. The upper subfigure shows the comparison of “Most Impressed” and “Most Clicked” in the range “contest teams”, and the bottom subfigure presents the same comparison in the range “whole plista”. “Most Clicked” outperforms “Most Impressed” in both scenarios. This indicates that, at least on 5 April 2016, users’ reactions to recommendations let the system predict future clicks better than what they read.

[Figure 5: two line plots over the hours of April 5, 2016 (x-axis 0:00 to 24:00; y-axis “Jaccard Similarity between Recommended Set and Most Clicked 6 Items in this Hour”, 0.0 to 1.0), titled “Predicting Ability of 'Most Impressed' vs. 'Most Clicked' within Contest Teams” and “Predicting Ability of 'Most Impressed' vs. 'Most Clicked' on Whole Range”; legend: most clicked 6 items in last hour, most impressed 6 items in last hour.]

Fig. 5: “Most Impressed” vs. “Most Clicked” on predicting next hour “Most Clicked” in Sport1 on April 5, 2016

====3.4 Clicks Pattern for Tagesspiegel====
So far, we have focused on Sport1. We repeated our experiments for the second largest publisher, Tagesspiegel. Figure 6 shows considerably fewer clicks than for Sport1. Some one-hour intervals have fewer than six clicks in total. Even considering the whole plista range of clicks, Figure 7 shows a flatter power law distribution. The top items account for only 20–35% of clicks. In addition, we observe more variation in the most frequently clicked items, such that many items appear in the top 6 group for only a single hour. We hypothesize that the increased variation is caused by the higher diversity of topics. Sport1 exclusively provides sports-related news. In contrast, Tagesspiegel covers a wide range of topics including politics, economy, sports, and local news.

===4 Conclusion and Future Work===
In this working note, we describe our experience with the NewsREEL online task, a real-time news recommendation contest, in 2016. By evaluating approaches such as “Most Impressed”, “Newest”, “Most Impressed by Category”, “Content Similar”, and “Most Clicked”, we found that a small subset of news items attracted most clicks. This holds true beyond the scope of individual algorithms. Hence we started analyzing the patterns of clicked items on the dominating portals Sport1 and Tagesspiegel.
In particular for Sport1, item popularity followed a power law distribution and items continued to be popular for hours. This phenomenon was less pronounced on Tagesspiegel. Monitoring which articles users clicked provided better information for predicting future clicks than tracking which articles users read. These observations inspire us to shift the perspective of recommender implementation from analyzing features and contextual factors to investigating the time series patterns of clicked items. Thus, as long as Sport1 continues to be the dominant news source in the contest, we can focus on the following points as future work: (1) analyzing the regularity in how long an item stays in the group of most clicked items; (2) predicting the ranking of an item that is popular; (3) making use of the long tail to find relevant features and contexts.

[Figure 6: 24 hourly histograms (0:00–1:00 to 23:00–24:00) titled “Items Clicked Condition in Tagesspiegel for Contest Teams on April 5, 2016”; each panel lists the ids of the most frequently clicked items and the share of clicks they receive.]

Fig. 6: Top clicks condition of contest teams range on Tagesspiegel on April 5, 2016

===5 Acknowledgement===
The work of the first author has been continuously funded by China Scholarship Council (CSC).
The research leading to these results is partially supported by the CrowdRec project, which has received funding from the European Union Seventh Framework Program FP7/2007–2013 under grant agreement No. 610594.

Fig. 7: Top clicks condition of whole range on Tagesspiegel on April 5, 2016 (figure title: "Items Clicked Condition in Tagesspiegel for Whole Range on April 5, 2016"; one panel per hour, 0:00 - 1:00 through 23:00 - 24:00)

References

1. D. Doychev, R. Rafter, A. Lawlor, and B. Smyth. News recommenders: Real-time, real-life experiences. In Proceedings of UMAP 2015, pages 337–342, 2015.

2. G. Gebremeskel and A. P. de Vries. The degree of randomness in a live recommender systems evaluation. In Working Notes of CLEF 2015 - Conference and Labs of the Evaluation forum, Toulouse, France, September 8-11, 2015. CEUR, 2015.

3. F. Hopfgartner, T. Brodt, J. Seiler, B. Kille, A. Lommatzsch, M. Larson, R. Turrin, and A. Serény. Benchmarking news recommendations: The CLEF NewsREEL use case. SIGIR Forum, 49(2):129–136, Jan. 2016.

4. B. Kille, A. Lommatzsch, G. Gebremeskel, F. Hopfgartner, M. Larson, J. Seiler, D. Malagoli, A. Serény, T. Brodt, and A. de Vries. Overview of NewsREEL'16: Multi-dimensional evaluation of real-time stream-recommendation algorithms. In N. Fuhr, P. Quaresma, B. Larsen, T. Goncalves, K. Balog, C. Macdonald, L. Cappellato, and N. Ferro, editors, Experimental IR Meets Multilinguality, Multimodality, and Interaction: 7th International Conference of the CLEF Association, CLEF 2016, Évora, Portugal, September 5-8, 2016. Springer, 2016.

5. B. Kille, A. Lommatzsch, R. Turrin, A. Serény, M. Larson, T. Brodt, J. Seiler, and F. Hopfgartner. Stream-based recommendations: Online and offline evaluation as a service. In Proceedings of the Sixth International Conference of the CLEF Association, CLEF'15, pages 497–517, 2015.

6. T. Kliegr and J. Kuchar. Benchmark of rule-based classifiers in the news recommendation task. In J. Mothe, J. Savoy, J. Kamps, K. Pinel-Sauvagnat, G. J. F. Jones, E. SanJuan, L. Cappellato, and N. Ferro, editors, CLEF, volume 9283 of Lecture Notes in Computer Science, pages 130–141. Springer, 2015.

7. J. Kuchar and T. Kliegr. InBeat: Recommender System as a Service. In Working Notes for CLEF 2014 Conference, Sheffield, UK, September 15-18, 2014, pages 837–844, 2014.

8. A. Lommatzsch. Real-time news recommendation using context-aware ensembles. In Advances in Information Retrieval - 36th European Conference on IR Research, ECIR 2014, Amsterdam, The Netherlands, April 13-16, 2014. Proceedings, pages 51–62, 2014.

9. A. Lommatzsch. Real-time recommendations for user-item streams. In Proc. of the 30th Symposium On Applied Computing, SAC 2015, SAC '15, pages 1039–1046, New York, NY, USA, 2015. ACM.

10. A. Said, A. Bellogín, J. Lin, and A. P. de Vries. Do recommendations matter?: news recommendation in real life. In Computer Supported Cooperative Work, CSCW '14, Baltimore, MD, USA, February 15-19, 2014, Companion Volume, pages 237–240, 2014.
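
As an illustration of future-work item (1) in the conclusion, the following minimal sketch measures how many consecutive hours an item stays in the hourly group of most clicked items. It is not the analysis code used in this work. The input format (a list of `(hour, item_id)` click events), the function names `top_k_per_hour` and `longest_top_k_run`, the group size `k`, and the sample item IDs are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def top_k_per_hour(clicks, k):
    """Map each hour to the set of its k most clicked item ids.

    `clicks` is an iterable of (hour, item_id) pairs (assumed format).
    Ties are broken by first appearance, as Counter.most_common does.
    """
    per_hour = defaultdict(Counter)
    for hour, item_id in clicks:
        per_hour[hour][item_id] += 1
    return {
        hour: {item for item, _ in counts.most_common(k)}
        for hour, counts in per_hour.items()
    }


def longest_top_k_run(clicks, k):
    """Map each item to its longest run of consecutive hours in the top-k."""
    tops = top_k_per_hour(clicks, k)
    if not tops:
        return {}
    longest = Counter()
    current = Counter()
    for hour in range(min(tops), max(tops) + 1):
        in_top = tops.get(hour, set())
        # Items missing from this hour's top-k drop out, which resets their run.
        current = Counter({item: current[item] + 1 for item in in_top})
        for item, run in current.items():
            longest[item] = max(longest[item], run)
    return dict(longest)


# Hypothetical click events: (hour of day, item id).
sample_clicks = [
    (0, "a"), (0, "a"), (0, "b"),
    (1, "a"), (1, "c"), (1, "c"),
    (2, "c"),
    (3, "a"),
]
print(longest_top_k_run(sample_clicks, k=1))
```

With `k=1`, item `c` leads for two consecutive hours, while item `a` leads twice for one hour each, so the longest runs are 2 and 1. Applying the same function to real hourly click logs per portal would give the duration distribution that item (1) proposes to study.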