Incorporating Dwell Time in Session-Based Recommendations with
                       Recurrent Neural Networks
                                                              Research In Progress†


                                       Veronika Bogina                                     Tsvi Kuflik
                                       Haifa University                                  Haifa University
                                            Israel                                            Israel
                                      sveron@gmail.com                                tsvikak@is.haifa.ac.il


ABSTRACT

Recurrent Neural Networks (RNNs) are a frequently used technique for sequence data prediction.
Recently, they have gained popularity in the Recommender Systems domain, especially for
session-based recommendations, where each session is naturally defined as a sequence of clicks
and timestamped data per click is available. In our research, which is in its early stages, we
explore the value of incorporating dwell time into an existing RNN framework for session-based
recommendations by boosting items whose dwell time exceeds a predefined threshold. We show
improvement in recall@20 and MRR@20 by evaluating the proposed approach on the e-commerce
RecSys’15 challenge dataset.

CCS CONCEPTS
Information retrieval -> Retrieval tasks and goals -> Recommender systems

KEYWORDS
Recommender systems, temporal aspects, dwell time, deep learning, recurrent neural networks

ACM Reference format:
V. Bogina, T. Kuflik. 2017. Incorporating Dwell Time in Session-Based Recommendations with
Recurrent Neural Networks. In Proceedings of RecTemp Workshop co-located with ACM RecSys’2017,
Como, Italy.

1   MOTIVATION
Nowadays, users are flooded with a wide variety of items to purchase/listen/read in the Sea of
Possibilities over the Internet. In this scenario, relevant and useful recommendations can be a
life saver for a customer by reducing the number of alternatives. Hence, predicting the next
items a customer will be interested in (to click on, listen to, read through etc.) is an ongoing,
attractive and interesting research task, as it is important to the service provider and the
customer alike [5]. To succeed in predicting the “next item to be clicked”, researchers have
considered past sessions of the user [5][8], item repetition in past interactions, favorite items,
item co-occurrences, topic similarity [7] and more.

However, as previous studies [2][10] show, dwell time plays a significant role in predictions
based on implicit user feedback, which is defined as a sequence of clicks in the news, music and
e-commerce domains. Hitherto, with the development of RNNs, sequences of events have been taken
into consideration, but without incorporating the dwell time, i.e. the time that the user spent
examining a specific item [4][5][6][9].

Therefore, we decided to explore the effect of incorporating dwell time into the input data of
the RNN based on the existing framework, and to evaluate it on a dataset taken from the
e-commerce domain.

Next, we first describe our method, then present an initial evaluation and, finally, discuss our
findings and future research directions.

2    METHOD
The general idea of the proposed method is that the longer a user examines an item (stays on its
web page), the more interested s/he is in that item. This does not apply to outliers, where the
user may simply have left the application or kept the web page open while away from the
computer. Under this assumption, the approach can be used in next-click recommendation
techniques.
Let an e-commerce session be a sequence of clicks {x_1, …, x_n}, where each click x_i has a dwell
time dt_i. We propose to boost items whose dwell time is greater than a predefined threshold t.
To do so, we multiply the number of such items in the session and define the number of
occurrences of item x_i in the session as floor(dt_i / t) + 1.
Hidasi et al. [5] proposed to use a Gated Recurrent Unit (GRU) based RNN for session-based
recommendations, which was preferable to LSTM in its performance [3]. One of the challenges in
modelling session-based recommendations with RNNs is that sessions differ in length. Therefore,
the authors proposed to represent each mini-batch as a set of elements from parallel sessions
(see Fig1) and then to predict the next elements in the parallel sessions by predicting the next
mini-batch. On the left of Fig1 are the original click sequences, while on the right are the
input to the model and the output from the model. All sessions are ordered by session id and
time. The first elements of the first few sessions are then used in the first mini-batch, the
second elements in the second mini-batch and so forth. Only elements for which a next element is
available are used in the input. Once a session ends, the next available session takes its place
(as in the game Tetris). The

†
  Copyright©2017 for this paper by its authors. Copying
permitted for private and academic purposes
    RecTemp Workshop, RecSys’2017, August 2017, Como, Italy                                                                V.Bogina, T.Kuflik

output forms the set of the next items for each item in the input         by clicks, including the timestamp of each click. Dwell time can
batch. Therefore, the last element in the session is not part of the      therefore be extracted for all clicks except the last one in each
input (no item follows it), and the first element of the session is       session, for which no following timestamp is available. The dwell
not part of the output (no item precedes it). We refer to their           time is calculated as the difference between the timestamp of the
model in this paper as GRU4Rec.                                           next click and that of the current one.
In our study, we propose to enrich the model of Hidasi et al. [5]         Following Hidasi et al. [5], the one-click sessions were dropped
(GRU4Rec) with a representation of dwell time as additional               from the original training dataset, since they have no next click
elements by incorporating item boosting.                                  to predict.
                                                                          The Yoochoose click data was split into training and test sets in
                                                                          the same way as in [5]. To be able to compare our results with
                                                                          those of GRU4Rec, the same metrics were used: recall@20 and
                                                                          MRR@20.
                                                                          The statistics of the dwell time (in seconds, shown on the X axis)
                                                                          are presented as a boxplot in Fig3. The distribution of item dwell
                                                                          times is presented in Fig4.
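The two evaluation metrics named above can be sketched as follows. This is a minimal
illustration, assuming the usual definitions for next-item prediction: recall@20 is the share of
test clicks whose true next item appears among the top 20 recommendations, and MRR@20 is the mean
reciprocal rank of the true next item, counted as zero when it falls outside the top 20. The
function names and the toy data are hypothetical and are not taken from the GRU4Rec code.

```python
from typing import Hashable, Sequence, Tuple


def recall_at_k(ranked_items: Sequence[Hashable], target: Hashable, k: int = 20) -> float:
    """Return 1.0 if the true next item is among the top-k recommendations, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def mrr_at_k(ranked_items: Sequence[Hashable], target: Hashable, k: int = 20) -> float:
    """Return the reciprocal rank of the true next item, or 0.0 if it is not in the top k."""
    for rank, item in enumerate(ranked_items[:k], start=1):
        if item == target:
            return 1.0 / rank
    return 0.0


def evaluate(
    rankings: Sequence[Sequence[Hashable]],
    targets: Sequence[Hashable],
    k: int = 20,
) -> Tuple[float, float]:
    """Average recall@k and MRR@k over all test events (one ranking per event)."""
    if len(rankings) != len(targets) or not targets:
        raise ValueError("rankings and targets must be non-empty and of equal length")
    n = len(targets)
    recall = sum(recall_at_k(r, t, k) for r, t in zip(rankings, targets)) / n
    mrr = sum(mrr_at_k(r, t, k) for r, t in zip(rankings, targets)) / n
    return recall, mrr


# Toy example (hypothetical item ids): three test events, each with a ranked list.
example_rankings = [[3, 1, 2], [5, 4, 6], [7, 8, 9]]
example_targets = [1, 6, 10]
example_recall, example_mrr = evaluate(example_rankings, example_targets, k=20)
```

In the toy example the third target is never recommended, so it contributes zero to both
averages.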


Figure 1. Session-parallel mini-batch creation (following
Hidasi et al. [5]).
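The session-parallel mini-batch creation of Fig1 can be sketched roughly as follows. This is a
simplified illustration rather than the GRU4Rec implementation: the function name, the toy
sessions and the choice to stop as soon as a finished session cannot be replaced are assumptions
made for the example.

```python
from typing import Iterator, List, Sequence, Tuple


def session_parallel_batches(
    sessions: Sequence[Sequence[int]], batch_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (inputs, targets) pairs built from `batch_size` sessions read in parallel.

    Each slot of the batch follows one session. The input is the current click and the
    target is the click that follows it, so the last click of a session is never an
    input and the first click is never a target.
    """
    # One-click sessions have no next click to predict, so they are skipped.
    usable = [s for s in sessions if len(s) >= 2]
    if batch_size < 1 or len(usable) < batch_size:
        raise ValueError("need at least batch_size sessions with two or more clicks")

    active = list(range(batch_size))  # index of the session followed by each slot
    position = [0] * batch_size       # current click position inside that session
    next_session = batch_size         # next unused session

    while True:
        inputs = [usable[s][p] for s, p in zip(active, position)]
        targets = [usable[s][p + 1] for s, p in zip(active, position)]
        yield inputs, targets

        for slot in range(batch_size):
            position[slot] += 1
            # The session in this slot is finished when no next click remains.
            if position[slot] + 1 >= len(usable[active[slot]]):
                if next_session >= len(usable):
                    return  # assumption: stop when a slot cannot be refilled
                active[slot] = next_session
                position[slot] = 0
                next_session += 1


# Toy sessions (hypothetical item ids), read two at a time.
toy_sessions = [[1, 2, 3, 4], [5, 6, 7], [8, 9], [10, 11, 12]]
toy_batches = list(session_parallel_batches(toy_sessions, batch_size=2))
# toy_batches == [([1, 5], [2, 6]), ([2, 6], [3, 7]), ([3, 8], [4, 9])]
```

In the third batch the second slot has already switched from the finished session [5, 6, 7] to
the next session [8, 9], which is the slot-filling behaviour described in the text.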

In our proposed method, each session is represented by its clicks,
based on their dwell time. Assume that the predefined threshold for
the dwell time is t seconds and that, in session 2, the dwell time of
the first element is greater than 2t seconds but less than 3t. This
session-parallel mini-batch then differs from the previous one
(described in Fig1) by the inclusion of 2 additional instances of
the first element (see Fig2). We call this item boosting. Since
item dwell times differ, we propose to increase the presence in           Figure 3. Boxplot with statistics on the data set’s dwell time
the session of those items whose dwell time is greater than the
predefined threshold. The added instances remain at the same              The boxplot shows that the average dwell time is around 148
location in the sequence as the original item.                            seconds, while the median is around 60 seconds and the standard
                                                                          deviation is 326.05 seconds. The 25th percentile is 26 seconds, the
                                                                          50th is 58.5 seconds and the 75th is 130 seconds.
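A minimal sketch of the dwell time extraction and of the kind of summary statistics reported for
the boxplot, assuming click logs given as (session id, timestamp in seconds, item id) tuples. The
tuple layout, the function names and the toy values are hypothetical; the Yoochoose files would
need to be parsed into this form first.

```python
from collections import defaultdict
from statistics import mean, median, quantiles, stdev
from typing import Dict, Iterable, List, Optional, Tuple

Click = Tuple[int, float, str]  # (session id, timestamp in seconds, item id)


def dwell_times(clicks: Iterable[Click]) -> Dict[int, List[Tuple[str, Optional[float]]]]:
    """Map each session to its (item, dwell time) pairs in click order.

    The dwell time of a click is the timestamp of the next click in the same session
    minus its own timestamp. The last click of a session has no dwell time (None).
    """
    by_session: Dict[int, List[Tuple[float, str]]] = defaultdict(list)
    for session_id, timestamp, item in clicks:
        by_session[session_id].append((timestamp, item))

    result: Dict[int, List[Tuple[str, Optional[float]]]] = {}
    for session_id, events in by_session.items():
        events.sort(key=lambda event: event[0])  # order clicks by time
        rows: List[Tuple[str, Optional[float]]] = [
            (item, next_ts - ts) for (ts, item), (next_ts, _) in zip(events, events[1:])
        ]
        rows.append((events[-1][1], None))
        result[session_id] = rows
    return result


def summarize(dwell: Dict[int, List[Tuple[str, Optional[float]]]]) -> Dict[str, float]:
    """Mean, median, standard deviation and quartiles of all available dwell times."""
    values = [dt for rows in dwell.values() for _, dt in rows if dt is not None]
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),
        "q1": q1,
        "q3": q3,
    }


# Toy click log (hypothetical values): two sessions, timestamps in seconds.
toy_clicks = [(1, 0, "a"), (1, 30, "b"), (1, 120, "c"), (2, 5, "d"), (2, 15, "a")]
toy_dwell = dwell_times(toy_clicks)
toy_summary = summarize(toy_dwell)
```

A threshold such as the 75 seconds used in this paper could then be compared against the mean
and the quartiles returned by the summary.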


Figure 2. Dwell-time-based session-parallel mini-batch creation
using item boosting
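The item boosting illustrated in Fig2 can be sketched as follows. The sketch assumes that the
number of occurrences of an item is the integer part of its dwell time divided by the threshold,
plus one, and that the last click of a session, which has no dwell time, is kept once. The
function name and the example values are hypothetical.

```python
from typing import List, Optional, Sequence


def boost_session(
    items: Sequence[str],
    dwell: Sequence[Optional[float]],
    threshold: float,
) -> List[str]:
    """Repeat each item floor(dwell / threshold) + 1 times, keeping its position.

    Items whose dwell time is below the threshold, and items without a dwell time
    (None, e.g. the last click of a session), appear exactly once.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if len(items) != len(dwell):
        raise ValueError("items and dwell must have the same length")

    boosted: List[str] = []
    for item, dt in zip(items, dwell):
        occurrences = 1 if dt is None else int(dt // threshold) + 1
        boosted.extend([item] * occurrences)
    return boosted


# Toy session with a 75-second threshold: item "a" was viewed for 160 seconds,
# which lies between 2t and 3t, so it receives 2 additional instances.
toy_boosted = boost_session(["a", "b", "c"], [160, 40, None], threshold=75)
# toy_boosted == ["a", "a", "a", "b", "c"]
```

The boosted sessions can then be fed to the session-parallel mini-batch creation unchanged,
since the repeated instances simply lengthen the click sequence.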

Recently, Hidasi and Karatzoglou [6] (referred to below as
GRU4Rec with sampling) improved the performance of GRU4Rec
by changing the sampling strategy, so that for each example in the
mini-batch another example is used as a negative sample, and by
presenting a novel family of ranking loss functions based on              Figure 4. Dwell time distribution, limited to items with
individual pairwise losses. Their recall@20 and MRR@20 results            dwell time less than 450 seconds.
outperformed GRU4Rec. We also evaluate this method in the next
section.                                                                  We decided to test our approach initially with a threshold of 75
                                                                          seconds, which is about half of the average dwell time (chosen for
3     EVALUATION                                                          practical performance reasons) and lies between the 50th and 75th
Our proposed method was evaluated on the RecSys’2015 challenge            percentiles. Since we were also curious about other thresholds, we
data set, Yoochoose [1]. Each e-commerce session is represented           ran an experiment with a 100-second threshold and got worse results


than with 75 seconds, as presented in Table 1. Table 1 shows the results of the two baseline
methods, GRU4Rec [5] and GRU4Rec with sampling [6]. Under each of them we present the results
for the same method enriched with dwell time, with the threshold set to 75 seconds and to 100
seconds. According to the table, incorporating dwell time into GRU4Rec with sampling, with the
threshold set to 75 seconds, provides the best results.

Table 1. Comparison between different results
 Method (dwell time threshold in seconds)              recall@20   MRR@20
 GRU4Rec [5]                                           0.5853      0.2305
 GRU4Rec with dwell time threshold 75                  0.7885      0.5834
 GRU4Rec with dwell time threshold 100                 0.7754      0.548
 GRU4Rec with sampling [6]                             0.7117      0.308
 GRU4Rec with sampling and dwell time threshold 100    0.84        0.61
 GRU4Rec with sampling and dwell time threshold 75     0.853       0.636

We will continue to experiment and test the effects of different thresholds on the next-item
recommendation task.
The progression of the loss function values over epochs in the extended version of the
algorithm, based on GRU4Rec with sampling [6], is presented in Fig5. The goal is to minimize the
loss function for the training data. Therefore, the smaller the value of the loss function, the
better our predictions. A frequently used loss function is the cross-entropy loss. In GRU4Rec,
the TOP1 loss was used [5]. However, in GRU4Rec with sampling, the Bayesian Personalized Ranking
(BPR)-max loss provided the best results.

Figure 5. Loss function values for methods based on GRU4Rec
with sampling with and without dwell time.

The loss function value is also output for each training epoch of the RNN. In the first epoch
(epoch 0) the values are the worst, and after that the loss slowly converges. When the threshold
is set to 75 seconds, the learning process is faster, since more data is available (relative to
GRU4Rec with sampling). This should also be seen in light of the fact that the models of Hidasi
et al. were optimized.

4     DISCUSSION AND FUTURE WORK
We proposed and evaluated a method for incorporating dwell time in session-based recommendations
with RNNs for next-item prediction. By multiplying item instances we enlarged the original
training dataset. As a result, we also improved recall@20 and MRR@20 by boosting the items that
are significant to the user, based on their dwell time.
We showed that the newly suggested method enhances and outperforms the method suggested in [5]
in a specific case study. This supports our claim that next-item recommendations are dynamic and
also depend on the time the user spends on a specific item. Moreover, they evolve over time.
In future work, we plan to explore the optimal threshold for the dwell time and to check our
findings on a few more data sets, not only on the RecSys’2015 challenge data set.

ACKNOWLEDGMENTS
The work is partially supported by the Israeli Innovation Authority, the Ministry of Economy and
Industry, MAGNET “Infomedia” project.

REFERENCES
[1]  Ben-Shimon, D., Tsikinovsky, A., Friedmann, M., Shapira, B., Rokach, L., & Hoerle, J.
     (2015). Recsys challenge 2015 and the yoochoose dataset. In Proceedings of the 9th ACM
     Conference on Recommender Systems (pp. 357-358). ACM.
[2]  Bogina, V., Kuflik, T., & Mokryn, O. (2016). Learning Item Temporal Dynamics for Predicting
     Buying Sessions. In Proceedings of the 21st International Conference on Intelligent User
     Interfaces (pp. 251-255).
[3]  Cho, K., Van Merriënboer, B., Bahdanau, D., & Bengio, Y. (2014). On the properties of neural
     machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259.
[4]  Greenstein-Messica, A., Rokach, L., & Friedman, M. (2017, March). Session-Based
     Recommendations Using Item Embedding. In Proceedings of the 22nd International Conference on
     Intelligent User Interfaces (pp. 629-633). ACM.
[5]  Hidasi, B., Karatzoglou, A., Baltrunas, L., & Tikk, D. (2015). Session-based recommendations
     with recurrent neural networks. arXiv preprint arXiv:1511.06939.
[6]  Hidasi, B., & Karatzoglou, A. (2017). Recurrent Neural Networks with Top-k Gains for
     Session-based Recommendations. arXiv preprint arXiv:1706.03847.
[7]  Jannach, D., Kamehkhosh, I., & Lerche, L. (2017, April). Leveraging multi-dimensional user
     models for personalized next-track music recommendation. In Proceedings of the Symposium on
     Applied Computing (pp. 1635-1642). ACM.
[8]  Park, S. E., Lee, S., & Lee, S. G. (2011). Session-based collaborative filtering for
     predicting the next song. In Computers, Networks, Systems and Industrial Engineering (CNSI),
     2011 First ACIS/JNU International Conference on (pp. 353-358). IEEE.
[9]  Tan, Y. K., Xu, X., & Liu, Y. (2016, September). Improved recurrent neural networks for
     session-based recommendations. In Proceedings of the 1st Workshop on Deep Learning for
     Recommender Systems (pp. 17-22). ACM.
[10] Yi, X., Hong, L., Zhong, E., Liu, N. N., & Rajan, S. (2014). Beyond clicks: dwell time for
     personalization. In Proceedings of the 8th ACM Conference on Recommender Systems
     (pp. 113-120).

