  Recommending News Articles in the CLEF
News Recommendation Evaluation Lab with the
 Data Stream Management System Odysseus

                           Cornelius A. Ludmann
                    cornelius.ludmann@uni-oldenburg.de

                             University of Oldenburg
                         Department of Computer Science
                      Escherweg 2, 26121 Oldenburg, Germany



      Abstract. A crucial aspect of recommending news articles is the recency
      of articles. Every day, news portals add plenty of new articles. Typically,
      users are more interested in recently published articles (or articles that
      provide background information on recently published articles) than in
      older ones. This creates the need to continuously adapt the set of
      recommendable items in a recommender system.
      In this paper, we share our experiences with the usage of the generic and
      open source Data Stream Management System (DSMS) Odysseus as a
      Recommender System in the CLEF NewsREEL 2017. We continuously
      calculate the currently most read articles based on a stream of impression
      events. Our approach uses operators of a stream-based variant of the
      relational algebra that respects validity intervals of events. This allows
      us to continuously calculate the K most popular articles (the Top-K set
      with regard to the number of views) in a sliding time window based on
      well-established relational operations. The flexible composition of
      operators allows us to vary, e.g., the grouping of impressions to obtain
      different recommendation sets for different user groups, or to exclude
      articles the user already knows.

      Keywords: Recommender Systems, Data Stream Processing


1   Introduction
A recommender system as a component of modern information systems provides
a user with a set of items (e.g., news articles, movies, or products) which might
be of interest to him or her. The challenge is to select a small sample from a big
collection of items that hopefully meets the taste of the user. A common approach
is to learn from previous actions (e.g., page impressions of a web news portal) to
estimate the usefulness of items. The K most useful items are presented to the
user as recommendations.
    The CLEF News Recommendation Evaluation Lab [4] (NewsREEL) allows
researchers to evaluate recommender systems with real user data in real time.
Its organizers provide a platform, called Open Recommendation Platform (ORP), that
sends the participants events (impressions and clicks on recommendations) of
real commercial news portals. The participants have to analyze the data and
must answer requests for recommendations with a set of news articles within a
timespan of 100 ms. The recommendations are displayed on the news portal next
to the article a user currently reads. For each participant, the organizers measure
the clicks on the recommendations and calculate the Click Through Rate (CTR).
    In order to participate, a participant needs to implement a server that pro-
cesses the events and replies to recommendation requests. The organizers provide
a framework1 that parses the events and lets participants implement their own
algorithms. However, using that framework means one still needs to take care of
the data management and the processing of potentially unbounded data streams of events.
    In this paper we present our solution that uses the general purpose Data Stream
Management System (DSMS) called Odysseus 2 [1]. It adapts the concepts of
Database Management Systems (DBMS) to process data streams with continuous
queries.


2     Background

In this section we give a summary of the ORP platform, the CLEF NewsREEL,
and the DSMS Odysseus.


2.1    ORP and CLEF NewsREEL

ORP as the technical platform for NewsREEL provides events of user activity
of different news web portals in real-time. Participants are invited to use these
events to calculate recommendations. The recommendations have to be returned
upon request within 100 ms. ORP selects a set of recommendations at random
out of all valid recommendation responses of all participants and displays the
recommendations to the users next to the articles. As evaluation criterion, ORP uses
the click through rate (CTR). It counts how many of the requested recommendations
are clicked by the users of the news portals. The number of clicks divided by the
number of displayed recommendation sets is the CTR.
    Participants have to subscribe to the following data streams:
1. Notifications about user events (e.g., a user reads a news article).
2. Requests for recommendations, to which the participants have to respond.
3. Notifications about new or updated news articles.
4. Error notifications (e.g., a participant response could not be parsed).
   In our approach we use the first two streams: The user events stream provides
impressions for news articles in real-time. It comprises the news article ID,
publisher ID, user/session ID, user geolocation code, and other information about
the impression. That is our source of data to calculate recommendation sets.
1 https://github.com/plista/orp-sdk-java
2 http://odysseus.uni-oldenburg.de/
The requests stream triggers the reply of an ordered set of article IDs—the
recommendations. A request consists of similar attributes as the impressions:
The article the user reads currently, the publisher of the article, the user, etc.
    The events are pushed via HTTP POST requests as JSON documents. The
response to a request for recommendations has to contain the article IDs intended
to be recommended to the user. For each participant, ORP sends the data at an
individual data rate: participants that are able to process more data in time get a
higher data rate. ORP determines the individual rate by increasing it until
the participant is no longer able to answer in time. According to the organizers,
each algorithm has to give at least 75 % of the number of recommendations the
baseline approach gave in order to be considered. This ensures that the
algorithms are able to handle the workload and allows a comparison of the different
approaches.
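
To illustrate the kind of interface a participant has to provide, the following
minimal sketch (in Java, using the JDK's built-in HTTP server) accepts POSTed
messages and returns a JSON reply within the request/response cycle; the endpoint
path and the response fields are hypothetical placeholders, not the actual ORP
message schema.

import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Minimal sketch of a participant endpoint: ORP pushes events and
// recommendation requests as HTTP POST bodies; a reply must be sent
// within 100 ms. Path and response fields are hypothetical placeholders.
public class OrpEndpointSketch {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/orp", exchange -> {
            byte[] body;
            try (InputStream in = exchange.getRequestBody()) {
                body = in.readAllBytes();
            }
            // In a real system: parse the JSON in body, distinguish impressions
            // from recommendation requests, and look up the precomputed
            // recommendation set for the requested publisher.
            byte[] reply = "{\"recs\": []}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, reply.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(reply);
            }
        });
        server.start();
    }
}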
    Table 1 shows the results of NewsREEL 2017 [6] as provided by the organizers.
It shows the name of the recommender approach, the number of recommendations
given, the number of clicks on recommendations, and the click through rate.
Additionally, we added a last column that indicates whether the approach gave at least
75 % as many recommendations as the baseline recommender (BL2Beat). Each
recommender that did not meet this criterion has been grayed out; that applies to
every recommender that gave fewer than 46,539 recommendations. Our approaches
are highlighted in bold face.


            Recommender       # Recomm.      # Clicks     CTR 75 % BL
            Riadi NV 01               443          12    0.0271     no
            ORLY KS                42,786         896    0.0209     no
         1. ody4                   72,601       1,139    0.0157    yes
            IRS5                    3,708          58    0.0156     no
         2. ody5                   81,245       1,268    0.0156    yes
         3. ody3                   59,227         813    0.0137    yes
         4. ody2                   63,950         875    0.0137    yes
         5. IT5                    68,582         925    0.0135    yes
         6. eins                   61,524         817    0.0133    yes
         7. yl-2                   60,814         747    0.0123    yes
         8. WIRG                   49,830         600    0.0120    yes
         9. ody1                   68,768         810    0.0118    yes
        10. BL2Beat                62,052         726    0.0117    yes
        11. RIADI pn               77,723         879    0.0113    yes
        12. IL                     79,120         813    0.0103    yes
        13. RIADI nehyb            75,535         764    0.0101    yes
            Has logs                  816           6    0.0074     no
            ody0                   23,023         166    0.0072     no
            RIADI hyb                 349           2    0.0057     no

                  Table 1: Results of CLEF NewsREEL 2017
[Figure: a data stream with elements E58 to E65 between application times t = 102
and t = 112; a sliding time window of size w = 5 assigns each element a validity
interval, so that at the time of observation t = 109 the elements with overlapping
intervals form the valid set, older elements are out-dated, and later ones are
prospective.]

                                Fig. 1: Snapshot Reducibility



2.2    DSMS—The Odysseus System

The concept of a DSMS is based on the relational algebra of Database Management
Systems (DBMS). Instead of managing a static database of relations, a DSMS
manages volatile data that is pushed to the system by active sources. This results
in a potentially unbounded sequence of elements—the data stream. Similar to
a DBMS, a user of a DSMS writes a query using a query language and
the system builds a Query Execution Plan (QEP), which consists of reusable
operators. Each operator is responsible for a certain operation. Examples of
common operators are selection (filtering of elements), projection (filtering
of attributes of elements), join (combining of elements of two data streams), and
aggregation (combining a set of elements to a single resulting value).
    A relational DSMS extends the concept of a relational DBMS by time-aware
and incremental data processing. The underlying concept has been extensively
studied, e.g. in [3, 2, 5, 1].
    We treat events as tuples with a fixed schema. Each event occurs at a certain
point in time t ∈ T with T as set of discrete and ordered time instances, e.g., the
UNIX timestamp or an arbitrary application/event time (cf. [5]).
    The most important differences to a DBMS are that a DSMS holds its data in
memory and supports (sliding time) windows.3 This makes it possible to define a
finite set of valid tuples for each point in time t ∈ T, e.g., to calculate a moving
average over all elements not older than 60 seconds.
3 In this paper we limit ourselves to sliding time windows. However, there are also other
    types of windows, e.g., element windows, tumbling windows etc. that are supported
    by our DSMS.
1    /* Input and Output Stream Definitions */
2    CREATE STREAM events (type STRING, articleid INT,
3      publisherid INT, userid INT, userloc INT, ...) ...;
4    CREATE STREAM requests (publisherid INT, userid INT,
5      userloc INT, ...) ...;
6    CREATE SINK recommendations (recs LIST_INT) ...;
7
8    /* Continuous Query Definition */
9    CREATE VIEW impressions FROM (
10     SELECT articleid, publisherid
11     FROM events [SIZE 30 Minutes TIME]
12     WHERE type = "impression" AND articleid > 0
13   );
14   CREATE VIEW counted_impressions FROM (
15      SELECT publisherid, articleid, count(*) AS counts
16      FROM impressions GROUP BY publisherid, articleid
17   );
18   CREATE VIEW topk_sets FROM (
19      SELECT publisherid,
20             nest(articleid) AS most_popular_articles
21      FROM counted_impressions
22      GROUP BY publisherid ORDER BY counts DESC
23      GROUP LIMIT 6
24   );
25
26   /* Join of Requests and TopK Sets */
27   STREAM TO recommendations FROM (
28      SELECT topk_sets.most_popular_articles AS recs
29      FROM topk_sets, requests [SIZE 1 TIME] AS req
30      WHERE topk_sets.publisherid = req.publisherid
31   );

                            Listing 1.1: CQL Query Definition




     The term snapshot reducibility [5] states that the resulting tuples of a query of
     a stream processing system at a specific point in time t are the same as the
     resulting tuples of a corresponding query of a DBMS over the set of all valid
     tuples at t.
         In this paper we use the concept of validity intervals (cf. [5]). This approach
     adds an additional operator to the relational algebra—the window operator
     ωw(R). It assigns validity intervals to the elements of a relational stream R. E.g.,
     a sliding time window of size w states that the processing at a point in time t0
     should respect all events not older than t0 − w. Therefore, the operator assigns a
     (half-open) validity interval [ts, te) with ts = t and te = t + w to an event that
     arose at time t. Thus, the other operators do not need to know the
     window size: the calculation of a result at time t0 needs to respect all events
     with t0 ∈ [ts, te). These are exactly the events that are not older
     than t0 − w.
         Fig. 1 illustrates the concept of snapshot reducibility with validity intervals.
     It depicts a part of a data stream from t = 102 to t = 112 with the elements
     E58 to E65 and their validity intervals under a sliding time window of size
     w = 5. Such a window defines that at each point in time t all elements that are
     not older than t − 5 are considered valid. At t = 109 in Fig. 1, all elements
     not older than t0 = 105 are valid, which corresponds to a window from 105 to 109.
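
As a minimal sketch of this mechanism (illustrative Java; the class and field names
are our own, not the Odysseus API), a window operator can assign the half-open
validity interval [ts, te) = [t, t + w) to each incoming element:

// Illustrative sketch (not the Odysseus API): a sliding time window operator
// that assigns the half-open validity interval [t, t + w) to every element.
final class WindowedElement<T> {
    final T payload;
    final long startTs;   // t_s = application time t of the element
    final long endTs;     // t_e = t + w (exclusive)

    WindowedElement(T payload, long startTs, long endTs) {
        this.payload = payload;
        this.startTs = startTs;
        this.endTs = endTs;
    }

    /** An element is valid at time t' iff t' lies in [t_s, t_e). */
    boolean isValidAt(long t) {
        return startTs <= t && t < endTs;
    }
}

final class SlidingTimeWindow<T> {
    private final long windowSize;   // w, in application-time units

    SlidingTimeWindow(long windowSize) {
        this.windowSize = windowSize;
    }

    WindowedElement<T> apply(T element, long applicationTime) {
        return new WindowedElement<>(element, applicationTime, applicationTime + windowSize);
    }
}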
    To process the events of a data stream, the user writes one or more queries.
In contrast to queries of a DBMS, a query will be executed until the user stops
it explicitly. Such a query is called a continuous query since it processes the data
continuously. To write a query, users can use a query language such as the
SQL-like stream-based query language CQL [2] or the functional query
language PQL [1].
    Similar to a DBMS, the stream processing system translates the query to a
Query Execution Plan (QEP; in the streaming context also called data flow graph).
A QEP is a directed graph of operators. It determines the order of processing
and can be optimized, e.g., by changing the order of operators.
    In addition to the actual stream processing, a DSMS is also responsible for
resource and user management, authentication, authorization, accounting, etc.


3    General Approach

To illustrate how to use a DSMS as a RecSys in the CLEF NewsREEL challenge,
we present an approach that recommends the most popular articles of the last 30
minutes; it serves as a base query for further improvements. Listing 1.1 shows the
definition of the query in CQL. It is very similar to SQL and consists of a data
definition (DDL) and a data manipulation (DML) part.
     The first part is the definition of the input streams (Lines 2-5) and the output
 stream (called sink, Line 6). This is similar to a CREATE TABLE statement of
 a DBMS. It consists of the name of the stream (e.g., events) and a schema
 definition (truncated in Listing 1.1). The stream or sink definition also includes
 information about the data connection (e.g., HTTP, TCP) and the data format (e.g.,
 CSV, JSON); this is omitted in Listing 1.1 as it is out of the scope of
 this paper. After parsing the incoming data, each event results in a tuple with a
 fixed schema.
     The second part is the definition of the actual queries. It consists of three
 parts: the pre-processing of the events (lines 9-13), the counting of the number
 of impressions for each article (lines 14-17) and the aggregation of the six most
 popular articles for each publisher (lines 18-24). Each of these is expressed as a view
 definition, which allows referencing the results in the subsequent queries. The
 impressions view selects the article ID and publisher ID of all events of type
“impression” that have an article ID. Additionally, it defines a sliding time window
 of a fixed size (e.g., 30 minutes). The result is used in the counted_impressions
view. It counts the number of impressions (in the time window) of each unique
 pair of publisher and article ID. When a new impression arrives or an event
 gets invalid, the count changes and a new tuple is produced that updates the
 previous value. The topk_sets view aggregates the counted impressions into a list
 of the article IDs of the six most popular articles for each publisher.
     The third part uses the list of most popular articles to answer the recom-
 mendation requests (lines 27-31). For this, we join a request event with an event
 of the topk_sets view where the publisher ID of the request and of the recommen-
 dation set are the same. The stream-based join operator only joins events
that are valid at the same time (which means they have overlapping validity
intervals). Because the aggregation of the topk_sets view updates the most
popular articles set when they change, all resulting tuples of the same publisher
have non-overlapping validity intervals: For each point in time there is exactly one
valid aggregation result for the same group. By defining a sliding time window
of size 1 (validity of 1 time slice) over the stream of requests, the join assigns
exactly one set of article IDs to each request event.




[Figure: Query Execution Plan — the impression stream (Impr.) passes through a
window ω ( 1 ), a selection σ ( 3 ), a projection π ( 4 ), and two aggregations
γ ( 5 , 6 ); the request stream (Req.) passes through a window ω ( 2 ); both
streams are combined by a join ./ ( 7 ) and a final projection π ( 8 ) that produces
the recommendations.]

                              Fig. 2: Query Execution Plan



   The resulting QEP is depicted in Fig. 2. It consists of the following types of
operators:
– A window ωw (R) as described in the previous section.
  We use this operator to control how long the impression events are considered
  by the subsequent aggregation operator ( 1 in Fig. 2) and to limit the validity
  of a recommendation request to 1 time slice ( 2 ) in order to join it with exactly
  one set of recommendations.
– A selection σϕ (R) removes all tuples t ∈ R for which the propositional
  expression ϕ does not hold.
  We use the selection to filter the impressions from the event stream ( 3 ).
– A projection πa1 ,...,an (R) restricts the set of attributes of each tuple t ∈ R
  to the attributes {a1, . . . , an}. An extended version, called map, allows the use of
  functions (e.g., πf(a1),a2,f(a3,a4),...,an(R)) that are invoked with one or more
  attributes.
  The projection is used in our example to filter the needed attributes of the
  impression events and of the recommendation responses ( 4 and 8 ).
– An aggregation γG,F (R) takes a set of tuples t ∈ R and returns for each
  group defined by the grouping attribute g ∈ G a single tuple as result that
  contains the results of the aggregate functions f ∈ F . Typical aggregate
  functions are sum, count, avg, max, and min. The stream-based variant
  continuously aggregates tuples that lie in the same window (which means they
  have overlapping validity intervals).
  We use the aggregation to count the number of impressions of each article
  ( 5 ) and to nest the 6 most viewed articles into one outgoing tuple ( 6 ).
State[] state     % the current state for each group
Event[] events    % events in the current window

for each incoming Event e:
  for each Event r in events with r.endts ≤ e.startts:
    state[group(r)] ← remove(state[group(r)], r)
    events ← events − r
    output: eval(state[group(r)])
  end for
  state[group(e)] ← add(state[group(e)], e)
  output: eval(state[group(e)])
  events ← events + e
end for

                      Listing 1.2: Algorithm of the aggregation operator.



     – A join R ./θ S combines the tuples of R and S for which a condition θ holds.
        The stream-based variant additionally requires that the tuples have overlapping
        validity intervals. The validity interval of a resulting tuple is the intersection of the
       validity intervals of the input tuples.
       The join ( 7 ) is responsible for the matching of the requests to the recommen-
       dation sets.
          The important parts of the query are the calculation of a ranking score
      (aggregation 5 ), the building of the recommendation sets (aggregation 6 ),
      and the matching of requests and recommendation sets (join 7 ). These operators
      are heavily influenced by the validity intervals of the events, which are assigned by the
      window operator.
          The calculation of the recommendation sets is implemented by two aggrega-
     tion operators: One to calculate the popularity of an article and one to combine
     the most popular articles to a recommendation set. Our aggregation operator
     is data driven. That means it outputs an updated result for every incoming event.
     The operator internally holds a state s for each group and updates this state using
      an aggregation function add(s, e) ↦ s for each incoming event e. For the count
      function in Listing 1.1, the aggregation function is defined as add(s, e) = s + 1.
          Since the aggregation is defined over a sliding time window, the operator
      has to remove all values that no longer lie in the window. For that, all
      events whose validity interval ends at or before the start timestamp of the
      incoming event have to be removed. This is done
      before adding a new event by using a function remove(s, e) ↦ s, e.g., for count:
      remove(s, e) = s − 1.
          After each change of the state the operator outputs a result by calling the
      function eval(s) ↦ e, which calculates the resulting event based on the state. For
      count, it just outputs the state value s. Other functions such as avg
      need to calculate the resulting value: consider a state for avg that consists of a
      count and a sum, then the eval function is eval(s) = s.sum / s.count. Listing 1.2
      illustrates the algorithm of the aggregation operator.
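
The following sketch captures the add/remove/eval contract described above for
count and avg (illustrative Java; the interface and class names are assumptions,
not the actual Odysseus operator code):

// Sketch of the add/remove/eval contract: the state is updated incrementally
// for every event entering or leaving the window.
interface AggregateFunction<S, E, R> {
    S add(S state, E event);      // event enters the window
    S remove(S state, E event);   // event leaves the window
    R eval(S state);              // compute the current result
}

// count: the state is the number of events in the window.
final class Count<E> implements AggregateFunction<Long, E, Long> {
    public Long add(Long s, E e)    { return s + 1; }
    public Long remove(Long s, E e) { return s - 1; }
    public Long eval(Long s)        { return s; }
}

// avg: the state keeps a running sum and count; eval divides them.
final class Avg implements AggregateFunction<double[], Double, Double> {
    public double[] add(double[] s, Double e)    { return new double[]{s[0] + e, s[1] + 1}; }
    public double[] remove(double[] s, Double e) { return new double[]{s[0] - e, s[1] - 1}; }
    public Double eval(double[] s)               { return s[0] / s[1]; }
}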
     The second aggregation operator uses the nest function with add(s, e) =
s.insertSorted(e) (sorted by count), remove(s, e) = s.remove(e), and eval(s) =
s.sublist(0, 6). The output is an ordered set of the article IDs of the top 6 events.
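
A corresponding sketch of the nest-based top-k state (again illustrative Java; the
tuple layout CountedImpression is an assumption about the incoming events) keeps
the counted impressions sorted by count and returns the article IDs of the first k
elements:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of the nest-based top-k state.
record CountedImpression(long articleId, long count) {}

final class TopKState {
    private final List<CountedImpression> sorted = new ArrayList<>();
    private final int k;

    TopKState(int k) { this.k = k; }

    void add(CountedImpression e) {           // insertSorted(e), highest count first
        sorted.add(e);
        sorted.sort(Comparator.comparingLong(CountedImpression::count).reversed());
    }

    void remove(CountedImpression e) {        // remove(e)
        sorted.remove(e);
    }

    List<Long> eval() {                       // sublist(0, k) projected to article IDs
        return sorted.stream().limit(k)
                     .map(CountedImpression::articleId)
                     .collect(Collectors.toList());
    }
}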
    Since the recommendation sets are calculated incrementally and continuously,
answering a request amounts to executing the join operator.
This leads to very short latencies. For each group (here, the publisher
ID), the join holds the latest known recommendation set in a hash map. The operator
appends the matching recommendation set to each request and passes the result to the
sink, which transfers it back to the inquirer.
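
The join step can therefore be sketched as a simple hash-map lookup (illustrative
Java, not the Odysseus join operator): the latest top-k set per publisher is stored
and appended to every incoming request.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the join step: the latest recommendation set per publisher is
// kept in a hash map and appended to every incoming request.
final class RecommendationJoin {
    private final Map<Long, List<Long>> latestTopKByPublisher = new HashMap<>();

    // Called whenever the top-k aggregation emits an updated set.
    void onTopKUpdate(long publisherId, List<Long> articleIds) {
        latestTopKByPublisher.put(publisherId, articleIds);
    }

    // Called for every recommendation request; returns the article IDs to send
    // back to ORP (empty if no set is known for the publisher yet).
    List<Long> onRequest(long publisherId) {
        return latestTopKByPublisher.getOrDefault(publisherId, List.of());
    }
}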


4    Variations

During the CLEF NewsREEL 2017, we evaluated six different approaches that
are based on the general approach presented in the previous section. Table 2
shows an overview of our approaches and their resulting CTRs.
   The approach ody0 uses the aggregation function nest over a sliding time
window of 60 min. to aggregate a set of unique item IDs that have been read in
the past 60 min. After that, a map function is used to randomly draw 6 items to
recommend them. This approach acts as our internal baseline.
   All other approaches follow the same pattern. Given a data stream of article
IDs and optional further attributes:
1. Define a sliding time window of size w over the data stream. (In our DSMS,
   this is done by annotating validity intervals.)
2. Partition the data by the article IDs and optionally other attributes p1 , p2 , . . . , pn .
3. Apply an aggregation function over each partition that calculates a score for
   each item in each partition (over the sliding time window). (In our approach
   we used the count function. In case of explicit item rating, the average function
   would be an alternative.)
4. Optional: Combine the results of different approaches to prevent underfull
   recommendation sets.
    This results in a stream of recommendation sets for each partition that is
calculated incrementally and in a data-driven manner.
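
To make the pattern concrete, the following sketch (illustrative Java; the event and
configuration types are assumptions, not part of our implementation) shows how,
for the count-based approaches described below, only the window size and the
partition key vary:

import java.time.Duration;
import java.util.function.Function;

// Illustrative parameterization of the pattern above.
record ImpressionEvent(long articleId, long publisherId, int userLocation) {}

record TopKConfig<K>(Duration windowSize, int k,
                     Function<ImpressionEvent, K> partitionKey) {}

final class Configurations {
    // ody2: most viewed articles of a 5 min window, partitioned by publisher
    static final TopKConfig<Long> ODY2 =
        new TopKConfig<>(Duration.ofMinutes(5), 6, ImpressionEvent::publisherId);

    // ody3: 30 min window, partitioned by publisher and user location
    static final TopKConfig<String> ODY3 =
        new TopKConfig<>(Duration.ofMinutes(30), 6,
                         e -> e.publisherId() + "/" + e.userLocation());
}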
    The approaches ody1 and ody2 partition the data by publisher IDs as described
in the previous section. They differ solely in the window size. ody1 considers all
impressions of the past 30 min., ody2 those of the past 5 min. The evaluation shows
that the 5 min. window leads to a higher CTR than the 30 min. window.
    ody3 additionally partitions by the user location. Because more partitions lead
to less data in each partition, our tests showed that a window size larger than
5 min. is necessary to have enough data in each partition and each window;
ody3 therefore uses a 30 min. window (cf. Table 2). This approach reaches a CTR
similar to ody2.
    In contrast, ody4 and ody5 calculate the recommendations by aggregating
the previously successfully given recommendations of the past 12 hours. This
information is part of the user events stream. For each news article i we cal-
culated the set of news articles that have been successfully recommended to
visitors of item i most often. Additionally, ody5 fills the recommendation set up
with recommendations from ody1 in case there are fewer than six successfully
recommended items in the past 12 hours.
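
The fill-up step of ody5 can be sketched as follows (illustrative Java; function and
type names are our own): if the click-based set contains fewer than six articles, it
is completed with the most viewed articles of the ody1 query.

import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the ody5 fill-up step: if fewer than k articles were
// successfully recommended for the current item in the past 12 hours, the set
// is completed with the most viewed articles (the ody1 set).
final class FillUp {
    static List<Long> fillUp(List<Long> clickedRecs, List<Long> mostViewed, int k) {
        List<Long> result = new ArrayList<>(clickedRecs);
        for (Long articleId : mostViewed) {
            if (result.size() >= k) {
                break;
            }
            if (!result.contains(articleId)) {
                result.add(articleId);
            }
        }
        return result;
    }
}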


          Approach                                                              CTR
ody0:     Random sample of 60 min. sliding time window.                        0.0072
ody1:     Most viewed articles of 30 min. sliding time window partitioned      0.0118
          by publisher.
ody2:     Most viewed articles of 5 min. sliding time window partitioned by    0.0137
          publisher.
ody3:     Most viewed articles of 30 min. sliding time window partitioned      0.0137
          by publisher and user location.
ody4:     Most clicked recommendations of 12 hour sliding time window          0.0157
          partitioned by item.
ody5:     Most clicked recommendations of 12 hour sliding time window          0.0156
          partitioned by item filled up with most viewed articles of 30 min.
          sliding time window partitioned by publisher.

      Table 2: Overview of our approaches in the CLEF NewsREEL 2017




5    Evaluation of the Window Size

A crucial parameter is the window size over the impression events. Since we want
to count the current views of each article, we have to figure out which time span
leads to a set of impression events that represents the current popularity of articles:
if the window is too large, the system is not sensitive to popularity changes (concept
drift); if it is too small, there is not enough data to distinguish the popularity of
articles.
    To determine an appropriate window size we conducted an experiment. We
ran 21 queries as shown in Listing 1.1 with different window sizes in parallel over
the same stream (1 to 10 min, 20 min, 30 min, 40 min, 50 min, 60 min, 90 min,
2 hrs, 3 hrs, 6 hrs, 12 hrs, 24 hrs). Similar to the Interleaved Test-Then-Train
(ITTT) evaluation method, we calculated the reciprocal rank of the article of each
impression event before the event has influenced the recommendation set. That gave us
a mean reciprocal rank (MRR) for each query, i.e., for each window size. A higher MRR
means that frequently viewed articles are more often near the top. Additionally,
we calculated how often an article is part of the top 6 most popular items
in the different time windows (Precision@6).
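
The following sketch outlines this test-then-train style measurement (illustrative
Java, not our actual evaluation code): for every impression, the rank of the viewed
article in the current top list is determined before the impression updates the
counts, and reciprocal rank and Precision@6 are accumulated.

import java.util.List;

// Sketch of the test-then-train style evaluation described above.
final class WindowSizeEvaluation {
    private double reciprocalRankSum = 0;
    private long hitsInTop6 = 0;
    private long impressions = 0;

    // currentRanking: article IDs ordered by current popularity (best first),
    // taken from the query state before this impression is applied.
    void onImpression(long articleId, List<Long> currentRanking) {
        int rank = currentRanking.indexOf(articleId);   // 0-based, -1 if absent
        if (rank >= 0) {
            reciprocalRankSum += 1.0 / (rank + 1);
            if (rank < 6) {
                hitsInTop6++;
            }
        }
        impressions++;
        // ...afterwards the impression is fed into the counting query.
    }

    double meanReciprocalRank() { return reciprocalRankSum / impressions; }
    double precisionAt6()       { return (double) hitsInTop6 / impressions; }
}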
   Because the data rates of the publishers are different (some publishers have
many more impressions than others) we evaluated the window sizes for each
publisher separately.


[Figure: three plots over window sizes of 1 to 10 min for the publisher with ID
35774 — (a) Mean Reciprocal Rank, with values between 0.44721 and 0.44804 and a
peak at the 2 min window; (b) Precision@6, with values between about 0.586 and
0.589; (c) percentage of recommendation sets with fewer than 6 articles, decreasing
from about 0.05 % at 1 min over 0.023 % at 2 min to about 0.001 % at 10 min.]

Fig. 3: Mean Reciprocal Rank, Precision@6, and underfull recommendation sets
for window sizes 1 to 10 min and the publisher with ID 35774.




    Fig. 3 shows the result for the publisher with ID 35774 (window sizes 1 to
10 min). As can be seen in Fig. 3a, a window of 2 min. leads to the highest MRR
of 0.44804. Correspondingly, the Precision@6 in Fig. 3b shows that in about 59 %
of the impressions the user views an article that is among the 6 most popular
articles of the last 2 min. For this publisher, smaller or larger window sizes lead to
lower values.
    Besides the metrics MRR and Precision@6, we measured how often a window
does not contain impressions of at least 6 distinct articles, so that the recommendation
sets have fewer than 6 recommendations (underfull recommendation sets). Fig. 3c
shows the fraction of underfull recommendation sets. As expected, this fraction
decreases with larger window sizes. Overall, the fraction of underfull recommendation
sets for publisher 35774 is quite low: even a window size of 2 min. produces just
0.023 % recommendation sets with fewer than 6 recommendations.
    The results of the other publishers follow this pattern, although publishers
with lower data rates achieve better results with larger window sizes. As a tradeoff
between the different publishers, a window size of five minutes has worked well in
our experience.


6    Conclusions
In this paper we presented an approach for the CLEF NewsREEL 2017. We
implemented a RecSys by using a generic Data Stream Management System.
By analyzing impression events of users we calculated a set of recommendations
based on the popularity in a given time window. To find an appropriate time
window we evaluated different sizes in parallel.


References
1. Appelrath, H.J., Geesen, D., Grawunder, M., Michelsen, T., Nicklas, D.: Odysseus:
   A highly customizable framework for creating efficient event stream management
   systems. In: DEBS’12. pp. 367–368. ACM (2012)
2. Arasu, A., Babu, S., Widom, J.: The CQL continuous query language: semantic
   foundations and query execution. VLDB Journal 15(2), 121–142 (2006)
3. Babcock, B., Babu, S., Datar, M., Motwani, R., Widom, J.: Models and issues in
   data stream systems. In: PODS 2002. pp. 1–16. ACM (2002)
4. Hopfgartner, F., Brodt, T., Seiler, J., Kille, B., Lommatzsch, A., Larson, M., Turrin,
   R., Serény, A.: Benchmarking news recommendations: The CLEF NewsREEL use case. In:
   ACM SIGIR Forum. vol. 49, pp. 129–136. ACM (2016)
5. Krämer, J., Seeger, B.: Semantics and implementation of continuous sliding window
   queries over data streams. ACM TODS 34(1), 4 (2009)
6. Lommatzsch, A., Kille, B., Hopfgartner, F., Larson, M., Brodt, T., Seiler, J., Özgöbek,
   Ö.: CLEF 2017 NewsREEL overview: A stream-based recommender task for evaluation
   and education. In: CLEF 2017. Springer Verlag (2017)