=Paper=
{{Paper
|id=Vol-1922/paper11
|storemode=property
|title=Incorporating Dwell Time in Session-Based Recommendations with Recurrent Neural Networks
|pdfUrl=https://ceur-ws.org/Vol-1922/paper11.pdf
|volume=Vol-1922
|authors=Veronika Bogina,Tsvi Kuflik
|dblpUrl=https://dblp.org/rec/conf/recsys/BoginaK17
}}
==Incorporating Dwell Time in Session-Based Recommendations with Recurrent Neural Networks==
Research In Progress†

Veronika Bogina, Haifa University, Israel, sveron@gmail.com
Tsvi Kuflik, Haifa University, Israel, tsvikak@is.haifa.ac.il
ABSTRACT
Recurrent Neural Networks (RNNs) are a frequently used technique for sequence prediction. Recently, they have gained popularity in the recommender systems domain, especially for session-based recommendations, where each session is naturally defined as a sequence of clicks and timestamped data is available per click. In our research, which is in its early stages, we explore the value of incorporating dwell time into an existing RNN framework for session-based recommendations by boosting items whose dwell time exceeds a predefined threshold. We show improvement in recall@20 and MRR@20 by evaluating the proposed approach on the e-commerce RecSys'15 challenge dataset.

CCS CONCEPTS
Information retrieval -> Retrieval tasks and goals -> Recommender systems

KEYWORDS
Recommender systems, temporal aspects, dwell time, deep learning, recurrent neural networks

ACM Reference format:
V. Bogina and T. Kuflik. 2017. Incorporating Dwell Time in Session-Based Recommendations with Recurrent Neural Networks. In Proceedings of the RecTemp Workshop co-located with ACM RecSys'2017, Como, Italy.

† Copyright © 2017 for this paper by its authors. Copying permitted for private and academic purposes.

1 MOTIVATION
Nowadays, users are flooded with a wide variety of items to purchase, listen to, or read in the sea of possibilities over the Internet. In this scenario, relevant and useful recommendations can be a life saver for a customer by reducing the number of alternatives. Hence, predicting the next items a customer will be interested in (to click on, listen to, read through, etc.) is an ongoing and attractive research task, as it is important to the service provider and the customer alike [5]. To succeed in predicting the "next item to be clicked", researchers have considered past sessions of the user [5][8], item repetition in past interactions, favorite items, item co-occurrences, topic similarity [7] and more.

However, as previous studies [2][10] show, dwell time plays a significant role in predictions based on implicit user feedback, which is defined as a sequence of clicks in the news, music and e-commerce domains. Hitherto, with the development of RNNs, sequences of events have been taken into consideration, still without incorporating dwell time - the time a user spent examining a specific item [4][5][6][9].

Therefore, we decided to explore the effect of incorporating dwell time into the input data of an RNN based on an existing framework, and to evaluate it on a dataset taken from the e-commerce domain.

Next, we first describe our method, then present an initial evaluation and, finally, discuss our findings and future research directions.

2 METHOD
The general idea of the proposed method is that the longer a user examines an item (stays on its web page), the more interested s/he is in that item. Obviously, we are not talking about outliers, where the user may simply have left the application or kept the web page open while moving away from the computer. This approach can therefore be used in next-click recommendation techniques.

Let an e-commerce session be a sequence of clicks {x1,…,xn}, where each click xi has a dwell time dti. We propose to use boosting for items whose dwell time is greater than a predefined threshold t: we multiply the number of such items in the session and define the number of occurrences of the item in the session as (dti/t + 1).

Hidasi et al. [5] proposed a Gated Recurrent Unit (GRU) based RNN for session-based recommendations, the GRU being preferable to the LSTM in its performance [3]. One of the challenges in modelling session-based recommendations with RNNs is that sessions differ in length. Therefore, the authors proposed to represent each mini-batch as a set of elements from parallel sessions (see Fig. 1) and then predict the next elements of the parallel sessions by predicting the next mini-batch. On the left of Fig. 1 we see the original click sequences, while on the right are the input to the model and the output from the model. All sessions are ordered by session id and time. The first elements of several sessions are used in the first mini-batch, the second elements in the second mini-batch, and so forth. Only elements with a next element available are used in the input. Once a session ends, another one takes its place (as in the game of Tetris).
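The item-boosting rule above can be sketched as follows. This is a minimal illustration, not the authors' code: integer division is assumed for the (dti/t + 1) occurrence count, and the item ids are made up.

```python
def boost_session(clicks, threshold):
    """Expand a session by repeating items whose dwell time exceeds
    the threshold; the occurrence count follows the (dt/t + 1) rule,
    with integer division assumed. Items at or below the threshold
    keep a single occurrence, and the in-session order is preserved."""
    boosted = []
    for item, dwell in clicks:
        # boost only items examined longer than the threshold
        reps = int(dwell // threshold) + 1 if dwell > threshold else 1
        boosted.extend([item] * reps)
    return boosted

# Hypothetical session: (item_id, dwell time in seconds), threshold t = 75
session = [("214536502", 130), ("214536500", 20), ("214536506", 80)]
print(boost_session(session, 75))
# ['214536502', '214536502', '214536500', '214536506', '214536506']
```

The boosted list is what would feed the session-parallel mini-batch construction; an item repeated this way occupies consecutive positions at its original location in the sequence.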
The output forms the set of the next items for each item in the input batch. Therefore, the last element of a session is not part of the input (no item follows it), and the first element of a session is not part of the output (no item precedes it). We will refer to their model in this paper as GRU4Rec.

In our study, we propose to enrich the model of Hidasi et al. [5] (GRU4Rec) with a representation of dwell time as additional elements, by incorporating item boosting.

Figure 1. Session-parallel mini-batch creation (following the Hidasi et al. paper).

In our proposed method, session i is represented by its clicks, based on their dwell time. Assume the predefined dwell-time threshold is t seconds and that in session 2 the dwell time of the first element is greater than 2t seconds but less than 3t; then this session-parallel mini-batch differs from the previous one (as described in Fig. 1) by the inclusion of 2 instances of the first element (see Fig. 2). We call this items boosting. Items' dwell times differ, so to differentiate items accordingly we propose to increase the in-session presence of those having a dwell time greater than the predefined threshold. Note that the item remains at the same location in the sequence.

Figure 2. Dwell-time-based session-parallel mini-batch creation using items boosting.

Recently, Hidasi and Karatzoglou [6] (we will refer to it later as GRU4Rec with sampling) improved the performance of GRU4Rec by changing the sampling strategy, so that for each example in the mini-batch other examples are used as negative samples, and by presenting a novel family of ranking loss functions based on individual pairwise losses. Their recall@20 and MRR@20 results outperformed GRU4Rec. We also address this method in the evaluation section.

3 EVALUATION
Our proposed method was evaluated on the RecSys'2015 challenge data set, Yoochoose [1]. Each e-commerce session is represented by clicks, including the timestamp of each click. Dwell time can thus be extracted for all clicks except the last one in a session, for which no dwell time is available in the data set. The dwell time is calculated as the difference between the timestamps of the current item and the next one. Following the research of Hidasi et al. [5], one-click sessions were dropped from the original training dataset due to their nature - the lack of a next click to predict.

The Yoochoose click data was split into training and test sets the same way as in [5]. To be able to compare our results with those of GRU4Rec, the same metrics were used: recall@20 and MRR@20.

Statistics on the dwell time (in seconds, shown on the X axis) are presented as a boxplot in Fig. 3. The item dwell-time distribution is presented in Fig. 4.

Figure 3. Boxplot with statistics on the data set's dwell time.

The boxplot shows that the average dwell time is around 148 seconds, while the median is around 60 seconds and the standard deviation is 326.05. The 25th percentile is 26 seconds, the 50th 58.5 seconds and the 75th percentile 130 seconds.

Figure 4. Dwell-time distribution, limited to items with dwell time less than 450 seconds.

We decided to test our approach initially with a threshold of half the average dwell time, 75 seconds (chosen for practical performance reasons), which also lies between the 50th and 75th percentiles. Moreover, since we were curious about other threshold values, we conducted an experiment with 100 seconds as the threshold and got worse results than with 75 seconds, as presented in Table 1.
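The dwell-time extraction used in the evaluation (the timestamp difference between consecutive clicks, with no dwell time for the last click) can be sketched as follows. This is a minimal illustration with made-up item ids and timestamps, not the actual Yoochoose preprocessing:

```python
from datetime import datetime

def session_dwell_times(clicks):
    """Given a session as (timestamp, item_id) pairs ordered in time,
    return (item_id, dwell time in seconds) for every click except the
    last one, whose dwell time is unavailable. One-click sessions
    therefore yield an empty list and are effectively dropped."""
    times = [datetime.fromisoformat(ts) for ts, _ in clicks]
    return [
        (item, (times[i + 1] - times[i]).total_seconds())
        for i, (_, item) in enumerate(clicks[:-1])
    ]

# Hypothetical session in the (timestamp, item_id) shape of a click log
clicks = [
    ("2014-04-07T10:51:09", "214536502"),
    ("2014-04-07T10:54:09", "214536500"),
    ("2014-04-07T10:54:46", "214536506"),
]
print(session_dwell_times(clicks))
# [('214536502', 180.0), ('214536500', 37.0)]
```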
Looking at Table 1, the reader can see the results of two baseline methods, GRU4Rec [5] and GRU4Rec with sampling [6]. Under each of them we present the results for the same method with dwell-time enrichment, with the threshold set to 75 seconds and to 100 seconds. According to the table, incorporating dwell time into GRU4Rec with sampling, with the threshold set to 75, provides the best results.

Table 1. Comparison between different results

Method                                              recall@20  MRR@20
GRU4Rec [5]                                         0.5853     0.2305
GRU4Rec with dwell time threshold 75                0.7885     0.5834
GRU4Rec with dwell time threshold 100               0.7754     0.548
GRU4Rec with sampling [6]                           0.7117     0.308
GRU4Rec with sampling and dwell time threshold 100  0.84       0.61
GRU4Rec with sampling and dwell time threshold 75   0.853      0.636

We will continue to experiment and test the effects of different thresholds on the next-item recommendation task.

The progression of the loss function values over epochs in the extended version of the algorithm, based on GRU4Rec with sampling [6], is presented in Fig. 5. The goal is to minimize the loss function on the training data; therefore, the smaller the value of the loss function, the better our predictions. A frequently used loss function is the cross-entropy loss. In GRU4Rec, the TOP1 loss was used [5], while in GRU4Rec with sampling, Bayesian Personalized Ranking (BPR)-max provided the best results.

Figure 5. Loss function values for methods based on GRU4Rec with sampling, with and without dwell time.

For each epoch of the RNN the loss function value is also output. At epoch 0 the values are the worst; after that, the loss slowly converges. When the threshold is set to 75 seconds, the learning process is faster, since more data is available (relative to GRU4Rec with sampling), and given the fact that the models of Hidasi et al. were optimized.

4 DISCUSSION AND FUTURE WORK
We proposed and evaluated a method for incorporating dwell time in session-based recommendations with RNNs for next-item prediction. By multiplying item instances, we enlarged the original training dataset. As a result, we also improved recall@20 and MRR@20 by boosting items significant to the user, based on their dwell time.

We showed that the newly suggested method enhances and outperforms the method suggested by [5] in a specific case study. This supports our claim that next-item recommendations are dynamic and depend also on the time a user spends on a specific item; moreover, they evolve over time.

In future work, we plan to explore the optimal threshold for the dwell time and to check our findings on a few more data sets, beyond the RecSys'2015 challenge.

ACKNOWLEDGMENTS
The work is partially supported by the Israeli Innovation Authority, the Ministry of Economy and Industry, MAGNET "Infomedia" project.

REFERENCES
[1] Ben-Shimon, D., Tsikinovsky, A., Friedmann, M., Shapira, B., Rokach, L., & Hoerle, J. (2015). RecSys Challenge 2015 and the YOOCHOOSE dataset. In Proceedings of the 9th ACM Conference on Recommender Systems (pp. 357-358). ACM.
[2] Bogina, V., Kuflik, T., & Mokryn, O. (2016). Learning item temporal dynamics for predicting buying sessions. In Proceedings of the 21st International Conference on Intelligent User Interfaces (pp. 251-255). ACM.
[3] Cho, K., Van Merriënboer, B., Bahdanau, D., & Bengio, Y. (2014). On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259.
[4] Greenstein-Messica, A., Rokach, L., & Friedman, M. (2017). Session-based recommendations using item embedding. In Proceedings of the 22nd International Conference on Intelligent User Interfaces (pp. 629-633). ACM.
[5] Hidasi, B., Karatzoglou, A., Baltrunas, L., & Tikk, D. (2015). Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939.
[6] Hidasi, B., & Karatzoglou, A. (2017). Recurrent neural networks with top-k gains for session-based recommendations. arXiv preprint arXiv:1706.03847.
[7] Jannach, D., Kamehkhosh, I., & Lerche, L. (2017). Leveraging multi-dimensional user models for personalized next-track music recommendation. In Proceedings of the Symposium on Applied Computing (pp. 1635-1642). ACM.
[8] Park, S. E., Lee, S., & Lee, S. G. (2011). Session-based collaborative filtering for predicting the next song. In Computers, Networks, Systems and Industrial Engineering (CNSI), 2011 First ACIS/JNU International Conference on (pp. 353-358). IEEE.
[9] Tan, Y. K., Xu, X., & Liu, Y. (2016). Improved recurrent neural networks for session-based recommendations. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (pp. 17-22). ACM.
[10] Yi, X., Hong, L., Zhong, E., Liu, N. N., & Rajan, S. (2014). Beyond clicks: dwell time for personalization. In Proceedings of the 8th ACM Conference on Recommender Systems (pp. 113-120). ACM.