=Paper=
{{Paper
|id=Vol-1905/recsys2017_poster7
|storemode=property
|title=Alpenglow: Open Source Recommender Framework with Time-aware Learning and Evaluation
|pdfUrl=https://ceur-ws.org/Vol-1905/recsys2017_poster7.pdf
|volume=Vol-1905
|authors=Erzsébet Frigó,Róbert Pálovics,Domokos Kelen,Levente Kocsis,András A. Benczúr
|dblpUrl=https://dblp.org/rec/conf/recsys/FrigoPKKB17
}}
==Alpenglow: Open Source Recommender Framework with Time-aware Learning and Evaluation==
Erzsébet Frigó, Róbert Pálovics, Domokos Kelen, Levente Kocsis, András A. Benczúr
Institute for Computer Science and Control, Hungarian Academy of Sciences (MTA SZTAKI)
{frigo.erzsebet, palovics, kdomokos, kocsis, benczur}@sztaki.hu

RecSys 2017 Poster Proceedings, August 27-31, Como, Italy

Support from the EU H2020 grant Streamline No 688191 and the “Big Data—Momentum” grant of the Hungarian Academy of Sciences.

===ABSTRACT===
Alpenglow (https://github.com/rpalovics/Alpenglow) is a free and open source C++ framework with an easy-to-use Python API. Alpenglow is capable of training and evaluating industry standard recommendation algorithms, including variants of popularity, nearest neighbor, and factorization models. Traditional recommender algorithms may periodically rebuild their models, but they cannot adjust online to quick changes in trends. Besides batch training and evaluation, Alpenglow supports online training of recommendation models capable of adapting to concept drift in non-stationary environments.

===1 INTRODUCTION===
Available free and open source recommender systems (see https://github.com/grahamjenson/list_of_recommender_systems) mostly follow the needs of static research data such as the Netflix Prize competition, with a predefined subset of the data for training and another for evaluation. In a real service, users request one or a few recommendations at a time and get exposed to new information that may change their preferences for their next visit. Furthermore, recommendation applications usually require top-k list recommendations and provide implicit feedback for training recommender models [1]. In a real application, top item recommendation by online learning is hence more relevant than batch rating prediction. We target top-k recommendation in highly non-stationary environments with implicit feedback [1, 2, 4]. Our goal is to promptly update the recommender models after each user interaction by online learning.

We present Alpenglow, a conjoint batch and online learning recommender framework. When Alpenglow reads a stream of user–item interaction events, it first constructs a recommendation top list for the user. Next, the consumed item is revealed, the relevance of the top list is assessed, and the model is immediately updated. Alpenglow works in a single-server, shared-memory, multithreaded architecture.

===2 ALPENGLOW IMPLEMENTATION===
Alpenglow is capable of training various factorization, similarity, recency, and popularity based models, including
• temporal popularity and item-to-item models;
• time sensitive variants of nearest neighbor, e.g. with the introduction of a time-decay;
• batch and online matrix factorization (MF), including asymmetric MF, SVD++ and other MF variants;
• time-aware online combination [2] of all these models.

The framework is composed of a large number of components written in C++ and a thin Python API for combining them into reusable experiments. It is compatible with popular packages such as the Jupyter Notebook, and is able to process data from Pandas data frames. Furthermore, the framework also provides a scikit-learn style API for traditional batch training and evaluation.

The Python API is illustrated in Algorithm 1. In the code sample, user–item pairs are read into a data frame. Then we set up an experiment in which 10-dimensional factor models are periodically batch trained and then continuously updated by online learning. The learning rates, the batch iterations, the batch training periods, and possibly negative and past event sample counts and other parameters are passed to the object. When running the experiment, we obtain the list of rankings, which is finally evaluated by online DCG (see Section 3). Both the rankings and the online DCG scores are stored in data frames. Ongoing work includes further modularizing the components of mixed batch and online models.

Algorithm 1: An Alpenglow experiment in Python.
<pre>
import alpenglow
from alpenglow.experiments import BatchAndOnlineExperiment
import pandas

data = pandas.read_csv("/path/to/sample_dataset")

factor_model_experiment = BatchAndOnlineExperiment(
    top_k=100,
    dimension=10,
    online_learning_rate=0.2,
    batch_learning_rate=0.07,
    number_of_iterations=9,
    period_length=604800  # 1 week in sec
)

rankings = factor_model_experiment.run(data, verbose=True)
results = alpenglow.DcgScore(rankings)
</pre>
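Since the rankings and the online DCG scores come back as data frames, standard Pandas operations suffice for post-processing. The following minimal sketch shows one way to aggregate per-event DCG scores into weekly averages of the kind plotted later in Figure 2; the column names time and dcg and the helper weekly_average_dcg are illustrative assumptions, not part of the Alpenglow API.
<pre>
import pandas as pd

SECONDS_PER_WEEK = 7 * 24 * 3600  # 604800

def weekly_average_dcg(results, time_col="time", score_col="dcg"):
    """Average per-event online DCG scores by week.

    `results` is assumed to be a pandas DataFrame with one row per
    evaluated event, a Unix timestamp column and a DCG score column;
    these column names are hypothetical and may differ from the exact
    layout produced by the framework.
    """
    df = results.copy()
    df["week"] = (df[time_col] - df[time_col].min()) // SECONDS_PER_WEEK
    return df.groupby("week")[score_col].mean()

# Toy demonstration: two events in week 0, one in week 1.
demo = pd.DataFrame({"time": [0, 3600, 700000], "dcg": [0.0, 0.5, 0.25]})
print(weekly_average_dcg(demo))  # week 0 -> 0.25, week 1 -> 0.25
</pre>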
Another advantage of the Python API is that it gives access to the modular construction of the C++ core implementation. Users may construct recommenders from various models, objectives, updaters, and learners, and run their experiments in different experimental settings. After a new record is evaluated, the training process is orchestrated by learners, which execute batch, online or sampling learning strategies. Updaters train the models by altering their states; the models in turn use the trained states to provide the predictions to be evaluated. The updaters are often defined using modular objectives. The framework also provides a number of preconfigured experiments ready to be run on a given data set. For evaluation, both rating and ranking based measures are available in an online evaluation framework, including MSE, DCG, recall, and precision, which are all continuously updated.

===3 TEMPORAL EVALUATION===
In an online setting, as in Figure 1, whenever a new user-item interaction is observed, we assume that the user becomes active and requests a recommendation. Hence for every single unique event, Alpenglow executes the following steps:
(1) generates a top-k recommendation list for the active user,
(2) evaluates the list against the single relevant item that the user interacted with,
(3) updates its model on the revealed user-item interaction.

Figure 1: Temporal evaluation and learning in Alpenglow.

We use DCG computed individually for each event and averaged in time as an appropriate measure for real-time recommender evaluation [2]. If i is the next item consumed by the user, the online DCG@K is defined as the following function of the rank of i returned by the recommender system:

DCG@K(i) = \begin{cases} 0 & \text{if } \mathrm{rank}(i) > K, \\ 1 / \log_2(\mathrm{rank}(i) + 1) & \text{otherwise.} \end{cases}
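The evaluation protocol is independent of the particular model. The following minimal Python sketch restates steps (1)-(3) together with the online DCG@K formula; it is an illustration rather than the Alpenglow implementation, and the recommend_top_k and update methods stand for an assumed model interface.
<pre>
import math

def online_dcg_at_k(rank, k):
    """Per-event online DCG@K: 0 if the consumed item is outside the
    top-k list, 1 / log2(rank + 1) otherwise (ranks start at 1)."""
    if rank is None or rank > k:
        return 0.0
    return 1.0 / math.log2(rank + 1)

def evaluate_stream(events, model, k=100):
    """Prequential evaluation of a temporally ordered (user, item)
    stream: rank first, score the list against the consumed item,
    then let the model learn from the revealed interaction."""
    scores = []
    for user, item in events:
        top_list = model.recommend_top_k(user, k)    # step (1)
        rank = top_list.index(item) + 1 if item in top_list else None
        scores.append(online_dcg_at_k(rank, k))      # step (2)
        model.update(user, item)                     # step (3)
    return scores
</pre>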
===4 ONLINE LEARNING===
Online algorithms read the data in temporal order and may process each record only once. For example, the online variant of a matrix factorization model with gradient descent updates the corresponding user and item latent vectors after each observed interaction. Compared to batch recommenders, online models may be advantageous, as they
• can adapt to temporal effects, hence may handle concept drift,
• can often be trained significantly faster than their batch variants.

Next we present an experiment with the simplest batch and online trained Alpenglow models. We use data carefully distilled from the (user, artist, timestamp) tuples crawled from Last.fm [3].
• We deleted artists appearing less than 10 times.
• For each user-artist pair, we kept only the first occurrence and deleted all others, so that we only recommend new artists.
• We discarded playlist effects: to avoid learning automatically generated sequences of items, we kept only those user-item interactions that start a user session.

The final data contains 1,500-2,000 events per day for over one year. Figure 2 shows the best performing 10-dimensional factor models by batch and online training. All of them are trained with stochastic gradient descent (SGD) for mean squared error (MSE) on implicit data. The batch model is only retrained weekly, with several iterations of lower learning rate lr = 0.07. In contrast, the online model uses a single iteration and processes each record only once, immediately after it is observed, and applies a higher learning rate lr = 0.2. We evaluated top-k recommendation for each single interaction by using DCG and then computed weekly averages. As seen in Figure 2, the performance of the online model is close to that of the batch model, despite the fact that it cannot iterate on the data.

Finally, we describe the batch & online model, which is implemented in Algorithm 1. In this model, we periodically re-train the model at the end of each week by batch SGD. Afterwards, we train the model during the next week via lightweight online updates. Batch & online results in a significant improvement over both individual models.

Figure 2: Performance of the batch, online and batch & online methods over the Last.fm data set. DCG scores are computed individually for each unique user-artist interaction and then averaged weekly.

===5 CONCLUSIONS AND FUTURE WORK===
We presented Alpenglow, a C++ recommender framework with a Python API. The current version of the code is able to produce recommender models that can adapt to non-stationary effects in real recommendation scenarios. It includes batch and online variants of several standard recommendation models.

The goal of Alpenglow is twofold. First, it produces temporal recommendation models that can be combined with batch models to achieve significant performance gains. Second, the framework can simulate the streaming recommendation scenario offline; hence it supports the selection and hyperparameter tuning of models trained on streaming data.

In our future work, we intend to connect Alpenglow with other popular Python packages. Furthermore, our plan is to advance the architecture of the framework towards distributed recommender APIs (Apache Flink, Apache Spark) and data streams, thus making it possible to rapidly prototype and evaluate online learning recommenders.

===REFERENCES===
[1] X. Amatriain and J. Basilico. Past, present, and future of recommender systems: An industry perspective. In Proceedings of the 10th ACM RecSys, 2016.
[2] R. Pálovics, A. A. Benczúr, L. Kocsis, T. Kiss, and E. Frigó. Exploiting temporal influence in online recommendation. In Proceedings of the 8th ACM RecSys, 2014.
[3] R. Turrin, M. Quadrana, A. Condorelli, R. Pagano, and P. Cremonesi. 30Music listening and playlists dataset. In RecSys Posters, 2015.
[4] J. Vinagre, A. M. Jorge, and J. Gama. Evaluation of recommender systems in streaming environments. In Workshop on Recommender Systems Evaluation, Silicon Valley, United States, October 10, 2014.