Incorporating Dwell Time in Session-Based Recommendations with Recurrent Neural Networks
Research In Progress†

Veronika Bogina
Haifa University, Israel
sveron@gmail.com

Tsvi Kuflik
Haifa University, Israel
tsvikak@is.haifa.ac.il

RecTemp Workshop, RecSys’2017, August 2017, Como, Italy

† Copyright ©2017 for this paper by its authors. Copying permitted for private and academic purposes.

ABSTRACT
Recurrent Neural Networks (RNN) are a frequently used technique for sequence data prediction. Recently, they have gained popularity in the Recommender Systems domain, especially for session-based recommendations, where each session is naturally defined as a sequence of clicks and timestamped data per click is available. In our research, which is in its early stages, we explore the value of incorporating dwell time into an existing RNN framework for session-based recommendations by boosting items whose dwell time is above a predefined threshold. We show an improvement in recall@20 and MRR@20 by evaluating the proposed approach on the e-commerce RecSys’15 challenge dataset.

CCS CONCEPTS
Information retrieval -> Retrieval tasks and goals -> Recommender systems

KEYWORDS
Recommender systems, temporal aspects, dwell time, deep learning, recurrent neural networks

ACM Reference format:
V. Bogina, T. Kuflik. 2017. Incorporating Dwell Time in Session-Based Recommendations with Recurrent Neural Networks. In Proceedings of RecTemp Workshop co-located with ACM RecSys’2017, Como, Italy.

1 MOTIVATION
Nowadays, users are flooded with a wide variety of items to purchase, listen to or read in the Sea of Possibilities over the Internet. In this scenario, relevant and useful recommendations can be a life saver for a customer by reducing the number of alternatives. Hence, predicting the next items a customer will be interested in (to click on, listen to, read through etc.) is an ongoing, attractive and interesting research task, as it is important to the service provider and the customer alike [5]. To succeed in predicting the “next item to be clicked”, some researchers considered past sessions of the user [5][8], item repetition in past interactions, favorite items, item co-occurrences, topic similarity [7] and more.

However, as previous studies [2][10] show, dwell time plays a significant role in predictions based on implicit user feedback, which is defined as a sequence of clicks in the news, music and e-commerce domains. With the development of RNNs, sequences of events have been taken into consideration, but still without incorporating the dwell time, i.e., the time that the user spent examining a specific item [4][5][6][9].

Therefore, we decided to explore the effect of incorporating dwell time into the input data of the RNN, based on the existing framework, and to evaluate it on a dataset taken from the e-commerce domain.

Next, we describe our method, then present an initial evaluation and, finally, discuss our findings and future research directions.

2 METHOD
The general idea of the proposed method is that the longer a user examines an item (stays on its web page), the more interested s/he is in that item. Obviously, we are not talking about outliers, where the user may have just left the application or simply kept the web page open while moving away from the computer. Therefore, this approach can be used in next-click recommendation techniques.

Let an e-commerce session be a sequence of clicks {x_1, …, x_n}, where each click x_i has a dwell time dt_i. We propose to boost items that have a dwell time greater than a predefined threshold t. This way we multiply the number of such items in the session and define the number of occurrences of the item in the session as (dt_i/t + 1).

Hidasi et al. [5] proposed to use a Gated Recurrent Unit (GRU) based RNN for session-based recommendations, which was preferable to LSTM in its performance [3]. One of the challenges in modelling session-based recommendations with RNNs is that sessions differ in their length. Therefore, the authors proposed to represent each mini-batch as a set of elements from parallel sessions (see Fig1) and then to predict the next elements in the parallel sessions by predicting the next mini-batch. On the left of Fig1 we see the original click sequences, while on the right we see the input to the model and the output from the model. All sessions are ordered by session id and time. Then the first elements of the first few sessions are used in the first mini-batch, the second elements in the second mini-batch, and so forth. Only elements that have a next element available are used in the input. Once a session ends, another one takes its place (as in the Tetris game). The output forms the set of next items for each item in the input batch. Therefore, the last element in a session is not part of the input (no item follows it), and the first element of a session is not part of the output (no item precedes it). We refer to their model in this paper as GRU4Rec.

In our study, we propose to enrich the model of Hidasi et al. [5] (GRU4Rec) with a representation of dwell time as additional elements, by incorporating item boosting.

Figure 1. Session-parallel mini batch creation (following Hidasi et al. paper).

In our proposed method, session i is represented by clicks, based on their dwell time. Let us assume that the predefined threshold for the dwell time is t seconds and that in session2 the dwell time of the first element is greater than 2t seconds but less than 3t. Then this session-parallel mini-batch differs from the previous one (as described in Fig1) by the inclusion of 2 instances of the first element (see Fig2). We call this item boosting. Item dwell times differ. To differentiate items accordingly, we propose to increase the presence in the session of those items whose dwell time is greater than the predefined threshold. The added instances remain at the same location in the sequence.

Figure 2. Dwell time based session-parallel mini batch creation using items boosting

Recently, Hidasi and Karatzoglou [6] (referred to below as GRU4Rec with sampling) improved the performance of GRU4Rec by changing the sampling strategy, where for each example in the mini-batch another example is used as a negative sample, and by presenting a novel family of ranking loss functions based on individual pairwise losses. Their recall@20 and MRR@20 results outperformed GRU4Rec. We also evaluate against this method in the next section.

3 EVALUATION
Our proposed method was evaluated on the RecSys’2015 challenge data set, Yoochoose [1]. Each e-commerce session is represented by clicks, including the timestamp of each click. Dwell time can therefore be extracted for all clicks except the last one in each session, for which it is not available in the data set. The dwell time is calculated as the difference between the timestamp of the next item and that of the current item.

Following Hidasi et al. [5], one-click sessions were dropped from the original training dataset because they have no next click to predict. The Yoochoose click data was split into training and test sets in the same way as in [5]. To be able to compare our results with the results of GRU4Rec, the same metrics were used: recall@20 and MRR@20.

The statistics on the dwell time (in seconds, as shown on the X axis) are presented as a boxplot in Fig3. The distribution of item dwell times is presented in Fig4.

Figure 3. Boxplot with statistics on the data set’s dwell time

The boxplot shows that the average dwell time is around 148 seconds, while the median is around 60 seconds and the standard deviation is 326.05. The 25th percentile is 26 seconds, the 50th is 58.5 seconds and the 75th is 130 seconds.

Figure 4. Dwell time distribution that is limited to items with dwell time less than 450 seconds.

We decided to test our approach initially with a threshold of 75 seconds, which is roughly half of the average dwell time (chosen for practical performance reasons) and lies between the 50th and 75th percentiles. Moreover, since we were curious about other threshold values, we conducted an experiment with a threshold of 100 seconds and got worse results than with 75 seconds, as presented in Table 1. Table 1 shows the results of two baseline methods, GRU4Rec [5] and GRU4Rec with sampling [6]. Under each of them we present results for the same method with the dwell time enrichment, with the threshold set to 75 seconds and to 100 seconds. According to the table, incorporating dwell time into GRU4Rec with sampling, with the threshold set to 75 seconds, provides the best results.

Table 1. Comparison between different results

Method                                                recall@20   MRR@20
GRU4Rec [5]                                           0.5853      0.2305
GRU4Rec with dwell time, threshold 75                 0.7885      0.5834
GRU4Rec with dwell time, threshold 100                0.7754      0.548
GRU4Rec with sampling [6]                             0.7117      0.308
GRU4Rec with sampling and dwell time, threshold 100   0.84        0.61
GRU4Rec with sampling and dwell time, threshold 75    0.853       0.636

We will continue to experiment with different thresholds and test their effect on the next item recommendation task.

The progression of the loss function values over epochs in the extended version of the algorithm, based on GRU4Rec with sampling [6], is presented in Fig5. The goal is to minimize the loss function on the training data. Therefore, the smaller the value of the loss function, the better our predictions. A frequently used loss function is the cross-entropy loss. In GRU4Rec, the TOP1 loss was used [5]. In GRU4Rec with sampling, however, Bayesian Personalized Ranking (BPR)-max provided the best results.

Figure 5. Loss function values for methods based on GRU4Rec with sampling with and without dwell time.

The loss function value is output for each epoch of the RNN. In the first epoch (epoch 0) the values are the worst, and after that the loss slowly converges. When the threshold is set to 75 seconds, the learning process is faster, since more data is available (relative to GRU4Rec with sampling). This should also be read in light of the fact that the models of Hidasi et al. were already optimized.

4 DISCUSSION AND FUTURE WORK
We proposed and evaluated a method for incorporating dwell time in session-based recommendations with RNN for next item prediction. By multiplying item instances we increased the size of the original training dataset. As a result, we also improved recall@20 and MRR@20 by boosting items that are significant to the user, based on their dwell time.

We showed that the newly suggested method enhances and outperforms the method suggested by [5] in a specific case study. This supports our claim that recommendations of the next item are dynamic and depend also on the time the user spends on a specific item. Moreover, they evolve over time.

In future work, we plan to explore the optimal threshold for the dwell time and to check our findings on a few more data sets, not only on the RecSys’2015 challenge data.

ACKNOWLEDGMENTS
The work is partially supported by the Israeli Innovation Authority, the Ministry of Economy and Industry, MAGNET “Infomedia” project.

REFERENCES
[1] Ben-Shimon, D., Tsikinovsky, A., Friedmann, M., Shapira, B., Rokach, L., & Hoerle, J. (2015). Recsys challenge 2015 and the yoochoose dataset. In Proceedings of the 9th ACM Conference on Recommender Systems (pp. 357-358). ACM.
[2] Bogina, V., Kuflik, T., & Mokryn, O. (2016). Learning Item Temporal Dynamics for Predicting Buying Sessions. In Proceedings of the 21st International Conference on Intelligent User Interfaces (pp. 251-255).
[3] Cho, K., Van Merriënboer, B., Bahdanau, D., & Bengio, Y. (2014). On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259.
[4] Greenstein-Messica, A., Rokach, L., & Friedman, M. (2017, March). Session-Based Recommendations Using Item Embedding. In Proceedings of the 22nd International Conference on Intelligent User Interfaces (pp. 629-633). ACM.
[5] Hidasi, B., Karatzoglou, A., Baltrunas, L., & Tikk, D. (2015). Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939.
[6] Hidasi, B., & Karatzoglou, A. (2017). Recurrent Neural Networks with Top-k Gains for Session-based Recommendations. arXiv preprint arXiv:1706.03847.
[7] Jannach, D., Kamehkhosh, I., & Lerche, L. (2017, April). Leveraging multi-dimensional user models for personalized next-track music recommendation. In Proceedings of the Symposium on Applied Computing (pp. 1635-1642). ACM.
[8] Park, S. E., Lee, S., & Lee, S. G. (2011). Session-based collaborative filtering for predicting the next song. In Computers, Networks, Systems and Industrial Engineering (CNSI), 2011 First ACIS/JNU International Conference on (pp. 353-358). IEEE.
[9] Tan, Y. K., Xu, X., & Liu, Y. (2016, September). Improved recurrent neural networks for session-based recommendations. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (pp. 17-22). ACM.
[10] Yi, X., Hong, L., Zhong, E., Liu, N. N., & Rajan, S. (2014). Beyond clicks: dwell time for personalization. In Proceedings of the 8th ACM Conference on Recommender Systems (pp. 113-120).
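
The item boosting rule of Section 2 can be illustrated with a minimal Python sketch. This is not the code used for the reported experiments. The function names dwell_times and boost_session are hypothetical, and the sketch assumes that dt_i/t is read as integer (floor) division, which the text does not state. The last click of a session has no dwell time and is kept once.

```python
from typing import List, Sequence


def dwell_times(timestamps: Sequence[float]) -> List[float]:
    """Dwell time of click i = timestamp of click i+1 minus timestamp of click i.

    The last click has no successor, so the result has one element fewer
    than the input.
    """
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def boost_session(items: Sequence[int], timestamps: Sequence[float], t: float) -> List[int]:
    """Repeat each item floor(dt_i / t) + 1 times, keeping its position.

    Assumption: floor division. Items with dwell time below t appear once.
    """
    if len(items) != len(timestamps):
        raise ValueError("items and timestamps must have the same length")
    if t <= 0:
        raise ValueError("threshold t must be positive")

    dts = dwell_times(timestamps)
    boosted: List[int] = []
    for i, item in enumerate(items):
        # The last click has no dwell time in the data, so it is not boosted.
        copies = int(dts[i] // t) + 1 if i < len(dts) else 1
        boosted.extend([item] * copies)
    return boosted
```

For example, with a threshold of 75 seconds, boost_session([10, 20, 30], [0, 160, 200], 75) returns [10, 10, 10, 20, 30]: the first item has a dwell time of 160 seconds, between 2t and 3t, so under this reading it appears three times, while the other two items appear once.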
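
The session-parallel mini-batch creation described in Section 2 can be sketched in the same way. The generator name session_parallel_batches is hypothetical, and the stopping rule (stop as soon as a finished session cannot be replaced by a fresh one) is an assumption of this sketch, since the text does not specify what happens when sessions run out. Boosted sessions, if used, would simply be passed in place of the original ones.

```python
from typing import Iterator, List, Sequence, Tuple


def session_parallel_batches(
    sessions: Sequence[Sequence[int]], batch_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (inputs, targets) pairs; slot b always follows one session.

    inputs[b] is the current click of the session in slot b and targets[b]
    is the click that follows it. One-click sessions are skipped because
    they have no next click to predict.
    """
    usable = [s for s in sessions if len(s) >= 2]
    if batch_size <= 0 or len(usable) < batch_size:
        return

    slots = list(range(batch_size))   # index of the session held by each slot
    pos = [0] * batch_size            # current position inside that session
    next_session = batch_size         # next session not yet assigned to a slot

    while True:
        inputs = [usable[s][p] for s, p in zip(slots, pos)]
        targets = [usable[s][p + 1] for s, p in zip(slots, pos)]
        yield inputs, targets

        for b in range(batch_size):
            pos[b] += 1
            # The last click of a session is never an input: refill the slot.
            if pos[b] + 1 >= len(usable[slots[b]]):
                if next_session >= len(usable):
                    return  # assumption: stop when no fresh session is left
                slots[b] = next_session
                pos[b] = 0
                next_session += 1
```

With sessions [[1, 2, 3], [4, 5], [6, 7, 8]] and a batch size of 2, the first batch has inputs [1, 4] and targets [2, 5]. The second session then ends and the third one takes its slot, so the second batch has inputs [2, 6] and targets [3, 7].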
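
The two metrics used in Section 3 can be sketched as follows, assuming the usual next-item definitions: recall@k is the share of test events whose true next item is among the top k recommendations, and MRR@k is the mean of 1/rank of the true next item, counted as zero when it falls outside the top k. The function name recall_and_mrr_at_k is hypothetical, and the sketch is illustrative rather than the evaluation code behind Table 1.

```python
from typing import Sequence, Tuple


def recall_and_mrr_at_k(
    ranked_lists: Sequence[Sequence[int]], targets: Sequence[int], k: int = 20
) -> Tuple[float, float]:
    """Return (recall@k, MRR@k) over paired recommendation lists and targets.

    ranked_lists[i] holds item ids ordered from most to least recommended;
    targets[i] is the item that was actually clicked next.
    """
    if len(ranked_lists) != len(targets):
        raise ValueError("ranked_lists and targets must have the same length")
    if not targets:
        raise ValueError("at least one test event is required")

    hits = 0
    reciprocal_rank_sum = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k = list(ranked[:k])
        if target in top_k:
            hits += 1
            # Ranks are 1-based: the first recommended item has rank 1.
            reciprocal_rank_sum += 1.0 / (top_k.index(target) + 1)

    n = len(targets)
    return hits / n, reciprocal_rank_sum / n
```

For three test events with lists [1, 2, 3], [4, 5, 6], [7, 8, 9], targets 1, 5, 0 and k = 2, the target is found at rank 1, at rank 2 and not at all, giving a recall@2 of 2/3 and an MRR@2 of 0.5.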