=Paper=
{{Paper
|id=Vol-1441/recsys2015_poster8
|storemode=property
|title=How to Interpret Implicit User Feedback?
|pdfUrl=https://ceur-ws.org/Vol-1441/recsys2015_poster8.pdf
|volume=Vol-1441
|dblpUrl=https://dblp.org/rec/conf/recsys/PeskaV15
}}
==How to Interpret Implicit User Feedback?==
Ladislav Peska, Peter Vojtas
Faculty of Mathematics and Physics, Charles University in Prague
[peska|vojtas]@ksi.mff.cuni.cz

ABSTRACT
Our research is focused on interpreting user preference from implicit user behavior. There are many types of relevant behavior, e.g. time on page, scrolling, clickstream etc., which we further denote as Relevant Behavior Types (RBTs). RBTs vary both in quality and incidence, and thus we may need different approaches to process them. In this early work we focus on how to derive user preference from each RBT separately. We selected a number of common indicators, designed two novel e-commerce-specific RBT interpretation methods and conducted a series of off-line experiments. After the off-line evaluation, an A/B test on real-world users of a travel agency was conducted, comparing the best off-line method with simple binary feedback. The experiments, although preliminary, showed the importance of considering multiple RBTs together.

Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Information Filtering

General Terms
Measurement, Human Factors, Experimentation.

Keywords
Implicit Feedback, Recommender Systems, User Preference

Copyright is held by the author(s). RecSys 2015 Poster Proceedings, September 16-20, 2015, Vienna, Austria.

1. INTRODUCTION
Recommender systems have been widely studied in the last two decades. They successfully complement search engines or on-site catalogues on video streaming services, book databases (e.g. Librarything.com), e-commerce sites (e.g. Amazon.com) etc. Although recommender systems are relatively widespread nowadays, we focus on a yet neglected domain: recommending on small e-commerce websites without a dominant position on the market. Among the most severe challenges of this domain are users' disloyalty and the high ratio between the number of objects and the number of users. This disqualifies otherwise successful Collaborative Filtering (CF) methods, as they get stuck in a persistent cold-start problem [4]. Another related challenge is the scarcity of explicit feedback. Due to users' disloyalty and the absence of incentives to provide it, users generally do not give explicit feedback on small e-commerce websites. Our only option is to focus on implicit user feedback. Unlike e.g. Hu et al. [2], we focus on multiple behavior types relevant for user preference (e.g. time on page, scrolling, purchases etc.), which will hopefully provide better user understanding and thus better recommendations than a single type of feedback.

In our previous works we focused on deriving negative preference from implicit feedback [6] and on various approaches to combine RBTs, e.g. [7]. Our current research has switched towards the correct interpretation of RBT values as the first step towards learning user preference. We can track several similar approaches in the literature, e.g. [1] comparing implicit signals with explicit user ratings in an open-web user study, [8] categorizing several user activities as positive or negative feedback on an online music service, an RSS feed recommender analyzing implicit reading-related user actions [3], or using normalized item-level dwell time as a relevance measure [9]. However, to the best of our knowledge, there is no approach in the literature focusing on interpreting RBTs in small e-commerce, and thus our set of RBTs and methods for their interpretation based on purchasing behavior are rather unique.

2. IMPLICIT PREFERENCE INDICATORS
Virtually any observable user behavior can serve as implicit feedback. The majority of user behavior consists of separate user actions (mouse click, typing, scrolling event etc.). Although it is possible to consider these actions as a stream, we opted for aggregating actions of the same type over the user's visit to a particular webpage. Thus the Relevant Behavior Types (RBTs) are integer variables containing the volume of each type of action, aggregated throughout the user's visit to the webpage. So far we have considered only several basic types of behavior, as shown in Table 1; however, we plan to use the full scope of the RBT collecting component [5] in future work. Note that not all RBTs are triggered for all visits.

Table 1: Considered RBTs. The Coverage column describes for how many visits information from this RBT is available.

RBT       Triggered event            Coverage
Pageview  JavaScript Load()          99%
Mouse     JavaScript MouseOver()     44%
Scroll    JavaScript Scroll()        49%
Time      Total time spent on page   69%
Purchase  Object was purchased       0.5%
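As an illustration of the aggregation described above, the following is a minimal sketch of turning raw client-side events into per-visit RBT counts. The event-log format and field names are our own assumptions for illustration, not the actual schema of the authors' collecting component [5].

```python
from collections import Counter, defaultdict

# Hypothetical event log: one record per client-side action, as emitted by
# the JavaScript handlers listed in Table 1 (Load(), MouseOver(), Scroll()).
events = [
    ("u1", "tripA", "pageview"),
    ("u1", "tripA", "mouse"),
    ("u1", "tripA", "mouse"),
    ("u1", "tripA", "scroll"),
    ("u1", "tripB", "pageview"),
]

# RBTs as integer volumes: actions of the same type are aggregated over a
# user's visit to a particular webpage, giving one count vector per visit.
rbt = defaultdict(Counter)
for user, page, action in events:
    rbt[(user, page)][action] += 1

print(dict(rbt[("u1", "tripA")]))  # {'pageview': 1, 'mouse': 2, 'scroll': 1}
```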
3. PREFERENCE LEARNING METHODS
The key research question of this poster is how to learn the dependence between the values of each RBT and user preference. It is possible to use a simple binary model such as "all visited objects are equally preferred", or a simple numeric model with a linear dependence between the value of an RBT and user preference [2]. We added two collaborative approaches specific to the e-commerce domain, considering other users' purchasing behavior.

Binary user preference is defined as r(u, o) = 1 for all objects o visited by user u.

Direct preference normalization is a user-wise linear normalization of each indicator into the [0,1] interval. This approach is similar to [2]. For user u and type t, the preference of object o based on type t is:

r_t(u, o) = (val_t(u, o) - min val_t(u)) / (max val_t(u) - min val_t(u)),

where the minimum and maximum are taken over all objects visited by u.

Purchase-based approaches consider whether other users with similar values of RBTs purchased the object or not, and compute the purchase rate PR. The approaches differ in the definition of the neighborhood ε for RBT values:
- KNN: use the K nearest neighbor visits to compute PR, where K is defined as ε * total number of all visits.
- Distance: use all visits from the interval [(1-ε) * val(RBT), (1+ε) * val(RBT)].

The PR is then computed as PR = #purch / (ε * #all_purch), where #purch is the volume of purchases from the defined ε-neighborhood and #all_purch is the volume of all purchases in the dataset. Intuitively, PR for KNN represents the ratio between the mass of purchases in the current interval and the mass expected under a uniform distribution. Finally, we feed PR into a sigmoid function, r = 1 / (1 + e^(-PR)), to smoothly normalize the user rating into the [0,1] interval.

The hypothesis behind purchase-based approaches is that purchase is the only RBT with a "guaranteed" effect on user preference, so if users evaluate other objects similarly, then although they did not purchase them, they still probably like them. Another reason for this approach is that although we can expect that a higher value of each RBT implies higher preference, the exact dependence is unknown. Purchase-based approaches allow us to derive a non-linear parametric dependence between the value of an RBT and the expected user preference. On the other hand, this approach neglects different behavior patterns of different users, as well as the varying cognitive demands of evaluating different objects. We would like to perform clustering of more loyal users with enough feedback in future work.
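To make the three interpretation methods concrete, below is a small sketch in Python. The min-max normalization follows the stated definition of a user-wise linear mapping into [0,1]; the normalization of PR by ε * #all_purch follows the stated intuition of comparing observed purchases with those expected under a uniform distribution, and is our assumption rather than the authors' exact formula; the sigmoid is the standard logistic function.

```python
import math

def direct_normalization(values):
    """User-wise linear (min-max) normalization of one RBT into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # degenerate case: one distinct value
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def purchase_rate(target, visits, eps, method="knn"):
    """Purchase rate PR of a target RBT value against all observed visits.

    visits: list of (rbt_value, purchased) pairs, purchased being 0 or 1.
    """
    all_purch = sum(p for _, p in visits)
    if method == "knn":
        # K nearest neighbor visits, K = eps * total number of all visits
        k = max(1, int(eps * len(visits)))
        hood = sorted(visits, key=lambda v: abs(v[0] - target))[:k]
    else:
        # Distance: all visits with value in [(1-eps)*val, (1+eps)*val]
        hood = [v for v in visits
                if (1 - eps) * target <= v[0] <= (1 + eps) * target]
    purch = sum(p for _, p in hood)
    # Observed purchases vs. purchases expected under a uniform distribution
    # (assumed normalization: a KNN hood covers an eps-fraction of visits).
    return purch / (eps * all_purch) if all_purch else 0.0

def smooth_preference(pr):
    """Sigmoid normalization of PR into the [0, 1] interval."""
    return 1.0 / (1.0 + math.exp(-pr))
```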
4. EVALUATION
4.1 Off-line Evaluation
In the first phase of the evaluation we compared various RBT interpretation methods on a travel agency dataset. As we did not consider any specific method for aggregating RBTs, we opted for a pairwise comparison of purchased and non-purchased objects for each user. For each (strictly rated) pair and each indicator preference, we state that the pair is correctly ordered if the indicator preference of the purchased object is greater than the preference of the non-purchased object. Incorrectly and equally ordered pairs are defined likewise. Let Corr/Inc/Eq be the sums of all correctly/incorrectly/equally ordered pairs. We can now define the paired error metric as:

err = (Inc + α * Eq) / (Corr + Inc + Eq).

Note that we consider Eq as an error too; however, its significance is lower than that of Inc. We use α = 0.5 in the evaluation.

The evaluation dataset contains 9 months of usage data from a travel agency. For the purpose of the experiment, the dataset was restricted to users with at least one purchase and at least two visited objects (some outliers were also removed), leaving over 8400 pairs of objects from 380 users with 450 purchases.

Table 2: Off-line results for various values of ε.

RBT       Direct   Dist, 0.2   Dist, 0.9   KNN, 0.01   KNN, 0.7
Pageview  0.797    0.695       0.850       0.753       0.825
Mouse     0.772    0.561       0.799       0.695       0.822
Scroll    0.569    0.555       0.578       0.582       0.573
Time      0.791    0.502       0.589       0.632       0.649

According to the off-line evaluation, each RBT needs to be treated differently. For Time, the best method was direct normalization; for Mouse it was KNN with larger ε; for Pageview, Distance with large ε was optimal; and Scrolling, on the other hand, requires KNN with small ε. For the Distance method, we can clearly see gradual improvement with increasing ε for all RBTs, while KNN has peak performance around ε = 0.7 for all RBTs except scrolling.
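A sketch of the pairwise off-line evaluation described above. The exact formalization is our reading of the text: equally ordered pairs count as errors with weight α, and we report the complement 1 - err so that higher values are better, which appears consistent with how Table 2 is discussed.

```python
def paired_score(pairs, alpha=0.5):
    """Pairwise comparison of purchased vs. non-purchased objects.

    pairs: (pref_purchased, pref_non_purchased) tuples for one
    interpretation method; Eq pairs are penalized with weight alpha < 1,
    so they count as less severe than strictly incorrect orderings.
    """
    corr = sum(1 for p, n in pairs if p > n)
    inc = sum(1 for p, n in pairs if p < n)
    eq = sum(1 for p, n in pairs if p == n)
    err = (inc + alpha * eq) / (corr + inc + eq)
    return 1.0 - err

# Example: three pairs, one correct, one incorrect, one tie.
print(paired_score([(0.9, 0.2), (0.1, 0.5), (0.4, 0.4)]))  # 0.5
```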
4.2 A/B Testing
After the off-line evaluation we selected three methods for on-line testing: binary user preference as a baseline, the average of the direct normalization of all RBTs, and the average of the best-performing methods for each RBT according to Table 2. The recommendations were computed via the VSM algorithm, and we opted for the number of click-throughs (CT) as the target metric. The evaluation was carried out in June 2015, with over 2900 users in total randomly assigned to one of the preference learning methods.

Table 3: Results of the on-line evaluation.

       Binary   Direct norm.   Best according to Table 2
CT     208      232            213
Users  971      979            976

The on-line experiments are not conclusive yet, but it seems that direct normalization outperforms the other methods. We will further experiment with other settings and definitions of the purchase-based approaches in future work. Our current working hypothesis is to use exact values instead of ε-based expressions.

5. CONCLUSIONS AND FUTURE WORK
In this poster, our aim was to design novel methods to infer user preference from relevant behavior types and to determine optimal approaches to handle different RBTs. The purchase-based methods succeeded in the off-line experiments; however, further tuning and an enhanced on-line evaluation are necessary. The incorporation of various aggregation methods is also part of our future work.

Acknowledgements: The work on this paper was supported by the grants SVV-2015-260222, GAUK-126313 and P46.

REFERENCES
[1] Claypool, M.; Le, P.; Wased, M. & Brown, D.: Implicit interest indicators. In IUI 2001, ACM, 2001, 33-40.
[2] Hu, Y.; Koren, Y. & Volinsky, Ch.: Collaborative Filtering for Implicit Feedback Datasets. In ICDM '08, IEEE, 2008, 263-272.
[3] Lai, Y.; Xu, X.; Yang, Z. & Liu, Z.: User interest prediction based on behaviors analysis. Int. Jour. of Digital Content Technology and its Applications, 6(13), 2012, 192-204.
[4] Peska, L. & Vojtas, P.: Recommending for Disloyal Customers with Low Consumption Rate. In SOFSEM 2014, Springer, LNCS 8327, 2014, 455-465.
[5] Peska, L.: IPIget - The Component for Collecting Implicit User Preference Indicators. In ITAT 2014, Ustav informatiky AV CR, 2014, 22-26, http://itat.ics.upjs.sk/workshops.pdf.
[6] Peska, L. & Vojtas, P.: Negative Implicit Feedback in E-commerce Recommender Systems. In WIMS 2013, ACM, 2013, 45:1-45:4.
[7] Peska, L. & Vojtas, P.: Evaluating Various Implicit Factors in E-commerce. In RUE 2012, CEUR-WS, Vol. 910, 2012, 51-55.
[8] Yang, B.; Lee, S.; Park, S. & Lee, S.: Exploiting Various Implicit Feedback for Collaborative Filtering. In WWW 2012, ACM, 2012, 639-640.
[9] Yi, X.; Hong, L.; Zhong, E.; Liu, N. & Rajan, S.: Beyond Clicks: Dwell Time for Personalization. In RecSys '14, ACM, 2014, 113-120.