=Paper=
{{Paper
|id=None
|storemode=property
|title=From Online Browsing to Offline Purchases: Analyzing Contextual Information in the Retail Business
|pdfUrl=https://ceur-ws.org/Vol-889/paper7.pdf
|volume=Vol-889
}}
==From Online Browsing to Offline Purchases: Analyzing Contextual Information in the Retail Business==
Simon Chan, Licia Capra

University College London, London, United Kingdom

{mhchan,l.capra}@cs.ucl.ac.uk

CARS-2012, September 9, 2012, Dublin, Ireland. Copyright is held by the author/owner(s).

===ABSTRACT===
Accurate recommender systems can enhance consumers’ shopping experiences. In retail and many other business environments, extra contextual factors are usually available for building even more accurate recommender systems. The influence of some factors is controversial in the industry. For instance, consumers’ recent online exposure to products can decrease the chance of in-store purchase, as consumers may choose to purchase products online. On the other hand, online exposure can be seen as evidence of consumers’ preference for products, which implies a higher chance of in-store purchase. Understanding the true influence is important for in-store product recommendation in this case. The question is how to evaluate the relevance and the influence of potential factors for prediction. Existing literature focuses on applying machine learning techniques to identify relevant contextual factors. While these methods have proven effective in some experiments, an alternative approach that provides easy-to-interpret analysis of relevance and influence is preferred in many situations. This paper introduces a computationally inexpensive approach to conduct preliminary relevance and influence analysis for contextual information in the retail business. Statistical techniques from the medical research field are applied to analyze the relationship between consumers’ online exposure to a retailer’s e-commerce website, i.e., a contextual factor, and their offline in-store purchase decisions, i.e., the outcome to be predicted, based on a retail dataset provided by a large UK retail business with both online and offline presence. Unlike machine learning approaches, this analysis can be done even before a recommender system is built. This research further shows that the influence of this contextual factor depends on extraneous attributes, such as consumers’ age and gender. This paper serves as a preliminary step to analyze relevant contextual factors for building context-aware recommender systems.

===Keywords===
contextual information, odds ratio, stratified analysis, retail

===1. INTRODUCTION===
Recommender systems are increasingly important for retail businesses. Retailers commonly provide personalized product recommendations to consumers through various channels, for instance, web advertisement, email voucher advertisement and in-store location-based mobile advertisement. Recommender systems try to predict the outcomes accurately so that they can recommend products according to the best expected scenario. Research has shown that the use of proper contextual information in recommender systems can improve the accuracy of prediction in some situations [1]. For instance, contextual factors such as purchase intent have been shown to improve prediction [14]. How could we, however, know the purchase intent of a consumer who has just entered a retail store without requiring additional user interaction? In an offline in-store environment, other than time and location context, the type of user context we can obtain without intruding on consumers’ experience is limited. To address this issue, this paper explores the possibility of using consumers’ recent online behaviors on the retailer’s e-commerce website to derive contextual factors for their in-store purchase decisions. The consumer behaviors that can be collected non-intrusively online are usually richer than those that can be collected in-store.

In this paper, we consider consumers’ recent online exposure to brands at a retailer’s website as a potential, but controversial, contextual factor. Consumers’ online exposure could imply a tendency to purchase products online, so that they are less likely to purchase them in-store. Conversely, it could be seen as a user context that represents consumers’ recent preference for brands, which implies a higher chance of in-store purchase. Understanding the influence of this contextual factor is important for a number of recommendation scenarios. For simplicity, we focus on a specific scenario: a large retailer that has both an online e-commerce website and offline stores wants to predict which product brands customers are going to purchase when they enter offline stores. For a brand that a targeted consumer has browsed online recently, a recommender system can consider three cases of influence: 1) if it is strongly positive, the consumer will likely purchase products of this brand in-store anyway, so no recommendation is needed; 2) if it is negative or if there is no influence, the chance of in-store purchase is not high; 3) if it is moderately positive, the system may try to nudge the consumer to purchase products of this brand in-store. The design of this recommender system is part of our future work.

From the point of view of utilizing contextual information, this paper studies the relevance and influence between consumers’ online exposure to brands at the retailer’s website, i.e., a contextual factor, and the probabilities of their in-store purchasing decisions, i.e., the outcome to be predicted, for product brands, stratified by different consumer groups. We propose the use of statistical techniques to analyze the contextual factor. Unlike the machine learning techniques implemented in the literature, this approach is independent of the prediction model of the system. Besides, unlike traditional correlation analysis techniques, such as the sign test and the chi-squared test, our approach can estimate the influence of the factor on the probability of in-store purchase of products of a brand. This information can possibly be used to improve the prediction model directly in our future work. A challenge of using statistical techniques, namely that a basic probabilistic measurement is sensitive to external noise, is illustrated by a numerical example in this paper. More robust techniques, odds ratio and stratified analysis, are then proposed.

The dataset used for the experiments in this research is provided by a large UK retail business. Ten product brands are analyzed. The dataset consists of 1-year anonymized records of loyalty card holders who have browsed products of the selected brands online at the retailer’s website and of those who have purchased products of the selected brands in any store of the retailer in the UK. All data is collected in a real non-experimental setting. In our experiments, whether a consumer has browsed any product page of a targeted brand at the retailer’s website within a month is regarded as a binary contextual factor, and the outcome to be predicted is the consumer’s in-store purchase decision. Results show that the influence of this contextual factor on the outcome to be predicted varies with consumers’ attributes such as age and gender.

The rest of the paper is organized as follows. Section 2 is a review of related literature about techniques that identify relevant contextual information in recommender systems. Section 3 describes the challenges of data analysis in a retail scenario; solutions are then proposed to analyze contextual factors. Section 4 presents experiments conducted on a real retail dataset, and Section 5 is the discussion and future work.

===2. RELATED WORK===
Our literature review focuses on techniques that identify and evaluate relevant contextual information in recommender systems. The technical goal of a recommender system can generally be seen as the problem of predicting ratings, or any other outcome, for the items that have not been seen by a user [2]. The outcome to be predicted in retail recommender systems, for instance, may be consumers’ purchase decisions instead of ratings. The use of contextual information in recommender systems has been shown to improve the accuracy of prediction in some situations [1]. Context is a multifaceted concept that is defined differently in multiple research disciplines [3]. Various kinds of attributes can be defined as context. For instance, there are context of users, context of items and context of interactions or situations [4]. Regardless of the definition, the selection of relevant contextual factors to be used in recommender systems is a critical issue.

To deal with this issue, some literature applies machine learning techniques to identify relevant factors automatically. Decision trees and feature selection techniques are used to rank the relevance of user preferences and system settings in a news recommender system for accurate recommendations [5]. Another feature selection technique, the Las Vegas Filter algorithm, has been applied in a more recent work to identify relevant factors [17]. A pre-filtering algorithm that pre-processes and selects contextual segments offline has also been described in detail [3]. In [10], users are clustered based on the value of some contextual factors. The predictive accuracy of each cluster is then compared with that of the whole, non-contextual dataset in order to understand whether and where the performance improves. The advantage of this kind of algorithm is that factors are considered only in situations where the contextual method outperforms the standard non-contextual algorithm. In current literature, the relevance of contextual factors is measured based on their effects on the system’s predictive accuracy. Recent research, however, shows that the recommendation accuracy of context-aware recommender systems can be affected by conditions other than the contextual factors themselves, such as the task requirement and the overall number of items in the recommended list [15]. As a result, contextual factors may be omitted simply because they are not integrated into the system or the prediction model properly.

Other literature proposes the use of statistical methods to evaluate the relevance of contextual factors. In contrast to a machine learning approach, a statistical approach is fast to compute and is independent of the prediction model implemented by the system. Not all types of data fulfill the assumptions of these statistical models though. For instance, the Pearson Correlation Coefficient, or its binary form, the Phi Coefficient, expects a linear relationship between the two variables. The paired t-test discussed in previous literature [1] is not suitable for binary data with a binomial distribution. They are, therefore, not suitable for our scenario. Although other statistical methods, such as the Sign Test and the Chi-squared Test, could be suitable for our binary data, this paper presents an alternative statistical methodology that not only evaluates the relevance of a contextual factor, but also estimates its influence on the probability of expected outcomes at the same time. By knowing the influence in probability, it is possible to make use of this contextual information to improve the prediction models directly in our future work.

===3. STATISTICAL ANALYSIS===
====3.1 Problem Formulation====
In this paper, we consider a retailer that operates both an online e-commerce website and physical retail stores. Consumers’ recent online exposure at the retailer’s website is the contextual factor to be evaluated. In particular, we define whether a consumer has browsed at least one page of a targeted brand at the retailer’s website within a month as the contextual condition: browse = 1 if the condition exists, 0 otherwise. For illustration purposes, we assume that the outcome to be predicted is the purchase decision of any product of the targeted brand at any physical store (in-store purchase), which is a binary variable: purchase = 1 if purchased, 0 otherwise. In order to evaluate the relevance and influence between this contextual factor and the outcome, we need to compare the probability of in-store purchase of consumers who have browsed and of those who have not, i.e. p(purchase = 1|browse = 1) and p(purchase = 1|browse = 0). In a population of N potential consumers, we can construct a table to represent the online browsing and in-store purchasing situation of the dataset:

{| class="wikitable"
! !! purchase=1 !! purchase=0 !! Total
|-
! browse=1
| <math>a_{11}</math> || <math>a_{10}</math> || <math>a_{1*}</math>
|-
! browse=0
| <math>a_{01}</math> || <math>a_{00}</math> || <math>a_{0*}</math>
|-
! Total
| <math>a_{*1}</math> || <math>a_{*0}</math> || <math>N</math>
|}

where <math>a_{11}</math>, <math>a_{10}</math>, <math>a_{01}</math> and <math>a_{00}</math> are the numbers of consumers in the corresponding purchasing and browsing situations. <math>a_{*1} = a_{11} + a_{01}</math> is the number of consumers who have purchased in-store, <math>a_{*0} = a_{10} + a_{00}</math> is the number of consumers who have not purchased in-store, <math>a_{1*} = a_{11} + a_{10}</math> is the number of consumers who have browsed online and <math>a_{0*} = a_{01} + a_{00}</math> is the number of consumers who have not browsed online.

A direct way to express the relationship is to compare the two probabilities with relative correlation (RC), where

<math>RC = \frac{p(\text{purchase} = 1 \mid \text{browse} = 1)}{p(\text{purchase} = 1 \mid \text{browse} = 0)} = \frac{a_{11}/a_{1*}}{a_{01}/a_{0*}} \qquad (1)</math>

There is no correlation if RC = 1; the influence is positive if RC > 1 and negative if RC < 1. This approach, however, suffers from two problems when the data is collected from a non-experimental retail environment.

First, RC is sensitive to the total number of consumers who have purchased and also to the total number of consumers who have not purchased. These two numbers, unfortunately, can be affected by external irrelevant factors, such as marketing campaigns or product promotions, which should be isolated from this analysis. This problem can be illustrated with a numerical example. Suppose the data looks like the following table when there is no sales promotion:

{| class="wikitable"
! !! purchase=1 !! purchase=0 !! Total
|-
! browse=1
| 5 || 500 || 505
|-
! browse=0
| 60 || 25,000 || 25,060
|-
! Total
| 65 || 25,500 || 25,565
|}

Let us assume that a sales promotion successfully attracts new consumers to purchase and the number of purchases increases 10 times, as shown in the following table:

{| class="wikitable"
! !! purchase=1 !! purchase=0 !! Total
|-
! browse=1
| 50 || 500 || 550
|-
! browse=0
| 600 || 25,000 || 25,600
|-
! Total
| 650 || 25,500 || 26,150
|}

All other things being equal, a temporary sales promotion should not affect the relationship between the contextual factor and the outcome. In reality, however, <math>RC = \frac{5/505}{60/25060} = 4.14</math> in the first case while <math>RC = \frac{50/550}{600/25600} = 3.88</math> in the second one. In other words, RC is sensitive to the change in the number of consumers who purchase (<math>a_{*1}</math>). This problem is present in many real-world environments since businesses can always attract new consumers to stores or the website dynamically, which affects N, and thus <math>a_{*1}</math> and <math>a_{*0}</math> can be manipulated. An odds ratio technique to estimate RC that is insensitive to the change of N is proposed in Section 3.2.

The second problem is the existence of extraneous attributes, such as age and gender, that potentially affect the influence of the targeted contextual factor on the outcome to be predicted. This problem occurs when an attribute is associated with the contextual factor and at the same time affects the outcome, dependently or independently. This kind of extraneous attribute is called a confounder in the statistics discipline. Stratified analysis is proposed to evaluate the impact of possible confounding attributes.

====3.2 Odds Ratio====
The odds ratio (OR) is commonly used as an estimator of RC in medical and epidemiological research for case-control studies where disease cases are not easy to obtain [6, 13, 7]. Similar to our problem, N is also adjustable in medical studies because the numbers of people with and without the disease in the dataset are determined artificially by the design of the case-control study. In our case, OR can be calculated as:

<math>OR = \frac{a_{11}/a_{01}}{a_{10}/a_{00}} = \frac{a_{11}\,a_{00}}{a_{10}\,a_{01}} \qquad (2)</math>

As with RC, there is no correlation if OR = 1; the influence is positive if OR > 1 and negative if OR < 1. Unlike RC, OR is insensitive to the row and column scaling operations of the data table. Using the same example above, <math>OR = \frac{5 \times 25000}{60 \times 500} = 4.17</math> when there is no sales promotion, and <math>OR = \frac{50 \times 25000}{600 \times 500} = 4.17</math> as well when there is a sales promotion. OR is a good estimator statistically if one requirement is fulfilled: for each of the two groups of consumers separately, i.e. those who have browsed online and those who have not, the number of consumers who have purchased in-store must be a small percentage (less than 10%) of the total number of consumers in the group. This requirement is reasonably fulfilled in most retail situations.

A confidence interval (CI) is used to determine the reliability of the results. The wider the CI, the less reliable the result is. The CI of the odds ratio [12] can be approximated with:

<math>CI = \exp\left(\ln\frac{a_{11}\,a_{00}}{a_{10}\,a_{01}} \pm z\sqrt{\frac{1}{a_{11}} + \frac{1}{a_{10}} + \frac{1}{a_{01}} + \frac{1}{a_{00}}}\right) \qquad (3)</math>

where z is the score of the standard normal distribution associated with the confidence level; z = 1.96 for a 95% confidence interval.

====3.3 Stratified Analysis====
Extraneous attributes, such as consumers’ age and gender, potentially affect the influence of the contextual factor on the outcome. Stratified analysis is a computationally inexpensive solution to reveal their effects. This technique is commonly used in medical research when setting up control group experiments is not feasible and so the existence of extraneous factors is common [9]. It analyzes subgroups (strata) of the study population separately according to the attributes. For instance, two strata are created for the gender attribute: female consumers and male consumers. The odds ratio is measured for each stratum separately. Stratified analysis provides an independent view of each stratum, each with its own odds ratio, so that the differences among strata can be compared. In addition, a common strata-adjusted odds ratio is estimated by the Mantel-Haenszel (MH) method [11]. This adjusted value represents a weighted average of the stratum-specific odds ratios, which is an approximation to the maximum likelihood estimate. According to [8], the formula of approximation can be written as:

<math>OR_{MH} = \frac{\sum_{i=1}^{k} a_{11i}\,a_{00i}/N_i}{\sum_{i=1}^{k} a_{01i}\,a_{10i}/N_i} \qquad (4)</math>

where k is the total number of strata in an analysis and i indexes one of them. For this Mantel-Haenszel method of estimation to be accurate, the overall sample size must be large. [16] provides a more robust but more complicated approximation method for data with small sample size. A confidence interval (CI) can again be used to indicate the reliability of the result:

<math>95\%\ \text{CI for } OR_{MH} = \exp\left[\ln OR_{MH} \pm z \cdot SE(\ln OR_{MH})\right] \qquad (5)</math>

with z = 1.96 as in Equation (3), where

<math>SE(\ln OR_{MH}) = \sqrt{\frac{\sum_{i=1}^{k}\left(\frac{a_{10i}\,a_{01i}}{N_i}\right)^{2} v_i}{\left(\sum_{i=1}^{k}\frac{a_{10i}\,a_{01i}}{N_i}\right)^{2}}}</math>

and

<math>v_i = \frac{1}{a_{11i}} + \frac{1}{a_{10i}} + \frac{1}{a_{01i}} + \frac{1}{a_{00i}}</math>

===4. EXPERIMENT===
====4.1 Dataset====
Our dataset, which is provided by a large UK retail business, consists of 1-year anonymized records of loyalty card holders who have browsed the selected products online on the retailer’s website and of those who have purchased the selected products in any store of the retailer in the UK. It contains 10,217,972 unique loyalty card holders and 2,939 unique products under 10 selected brands. There are 21,668,137 in-store purchase transaction records and 299,070 online browsing records. We associate consumers’ online browsing and in-store purchasing behaviors through unique loyalty card numbers. All data is collected in a real non-experimental setting.

====4.2 Experimental Design====
This experiment investigates the relevance and influence between consumers’ recent online browsing behaviors and the probabilities of their in-store purchase decisions for ten product brands carried nationally by a large UK retail business. These ten brands are selected randomly; some of them are luxury brands while the others are mid-range brands. We define whether a consumer has browsed at least one page of a targeted brand at the retailer’s website within a month as the context of the consumer: browse = 1 if the condition exists, 0 otherwise. The odds ratio is used to compare the influence of this contextual factor on the probabilities of consumers’ binary purchase decision of any product of the targeted brand at any physical store (in-store purchase). We pre-process the dataset to filter out consumers who have not visited any page at the retailer’s website at least once in the past year. This process ensures that the remaining N consumers have successfully accessed the retailer’s website recently.

We start with the hypothesis that age and gender are two attributes of consumers that may confound the influence. We conduct monthly stratum-specific measurements of the odds ratio based on these two attributes for each brand. In practice, age and gender information is missing in some records. In each analysis, therefore, we analyze a population of size <math>N_{age}</math> or <math>N_{gender}</math>, the total number of consumers with age information or with gender information respectively. In these experiments, we calculate the monthly crude (unadjusted) odds ratio for each stratum of each brand. If the odds ratio for a stratum of a brand is X, it means that, in this stratum and in this particular month, the probability of purchasing at least one product of this brand in-store for consumers who have browsed at least one webpage of this brand online is approximately X times the probability for those who have not. We also calculate the monthly common strata-adjusted odds ratio as well as the 95% confidence interval (CI).

====4.3 Results====
Only three stratified odds ratio analyses are presented in this paper due to length constraints. All figures show that the odds ratio measurements are well above 1, i.e., the influence is positive for all brands. It means that the probability of purchasing at least one product of a selected brand in-store for consumers who have browsed at least one webpage of that brand online at the retailer’s website is higher than the probability for those who have not. The values and patterns are different for each brand though, which means that the impact of this contextual factor of online exposure varies with brands.

Figure 1: Odds Ratio by Gender (Brand A)

Figure 1 presents the gender-stratified analysis of brand A. The influence on female consumers is much stronger than the one on male consumers. An interesting discovery is that the odds ratio measurements for both genders follow a very similar up-and-down monthly pattern. Both strata have their peak odds ratio in February. This finding hints that time is a contextual factor that should also be considered in future work.

Figure 2: Odds Ratio by Age (Brand B)

Figure 2 shows that the odds ratio ranges of different age groups for brand B are clearly separated. The influence for consumers of age 18-25 is the highest while the one for consumers of age 26-35 is the lowest. It means that the probability for consumers of age 18-25 is higher than the one for consumers of age 36-45 and both of them are higher than the one for consumers of age 26-35. This finding implies that, for brand B, the age attribute itself can be correlated with consumers’ in-store purchase decisions.

Figure 3: Odds Ratio by Age (Brand C)

Figure 3, on the other hand, draws a different conclusion for brand C. In this case, the odds ratio measurements of the age groups are mixed together in a close range. There is no clear monthly pattern either. It means that age, gender and month are not confounding factors for this brand.

===5. DISCUSSION AND FUTURE WORK===
This paper derives a contextual factor from consumers’ recent online browsing behaviors on the retailer’s website for the prediction of their offline in-store purchases. A statistical approach is presented to conduct a preliminary analysis of the relevance and influence between this factor and the offline purchase decisions on brands using odds ratio and stratified analysis techniques. The initial concern that consumers who browse online on the retailer’s website tend to purchase online and therefore have a lower chance of purchasing in-store has been shown not to hold for the brands we have analyzed. In addition, as expected, the influence of online exposure on offline purchases varies with brands and with consumers’ age and gender. In our future work, the analysis for non-binary contextual factors will be illustrated. Besides the factor we have evaluated in this paper, it is interesting to see whether other relevant contextual factors can be derived from consumers’ recent online behaviors for their in-store purchase decisions. Future work is to build a context-aware recommender system for in-store product recommendation based on these findings. We are interested in using the OR value directly to improve prediction. Also, a comparison of the predictive performance of recommender systems using contextual factors selected by this approach and by existing machine learning techniques is part of our future work.

===6. REFERENCES===
[1] G. Adomavicius, R. Sankaranarayanan, S. Sen, and A. Tuzhilin. Incorporating contextual information in recommender systems using a multidimensional approach. ACM Trans. Inf. Syst., 23(1):103–145, Jan. 2005.

[2] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6):734–749, 2005.

[3] G. Adomavicius and A. Tuzhilin. Context-aware recommender systems. Recommender Systems Handbook, pages 217–253, 2011.

[4] M. Bazire and P. Brezillon. Understanding context before using it. Modeling and using context, pages 113–192, 2005.

[5] A. Bellogín, I. Cantador, P. Castells, and A. Ortigosa. Discovering relevant preferences in a personalised recommender system using machine learning techniques. In Proceedings of the ECML-PKDD 2008 Workshop on Preference Learning, 2008.

[6] J. Cornfield et al. A method of estimating comparative rates from clinical data; applications to cancer of the lung, breast, and cervix. Journal of the National Cancer Institute, 11(6):1269, 1951.

[7] A. Edwards. The measure of association in a 2×2 table. Journal of the Royal Statistical Society. Series A (General), pages 109–114, 1963.

[8] W. Hauck. The large sample variance of the Mantel-Haenszel estimator of a common odds ratio. Biometrics, pages 817–819, 1979.

[9] D. Kleinbaum, L. Kupper, and H. Morgenstern. Epidemiologic research: principles and quantitative methods. Wiley, 1982.

[10] S. Lombardi, S. Anand, and M. Gorgoglione. Context and customer behavior in recommendation. In RecSys09: Workshop on context-aware recommender systems (CARS-2009), 2009.

[11] N. Mantel and W. Haenszel. Statistical aspects of the analysis of data from retrospective studies of disease. The Challenge of Epidemiology: Issues and Selected Readings, 1(1):533–553, 2004.

[12] J. Morris and M. Gardner. Statistics in medicine: Calculating confidence intervals for relative risks (odds ratios) and standardised ratios and rates. British medical journal (Clinical research ed.), 296(6632):1313, 1988.

[13] F. Mosteller. Association and estimation in contingency tables. Journal of the American Statistical Association, 63(321):1–28, 1968.

[14] C. Palmisano, A. Tuzhilin, and M. Gorgoglione. Using context to improve predictive modeling of customers in personalization applications. Knowledge and Data Engineering, IEEE Transactions on, 20(11):1535–1549, Nov. 2008.

[15] U. Panniello and M. Gorgoglione. Does the recommendation task affect a CARS performance? In RecSys10: Workshop on context-aware recommender systems (CARS-2010), 2010.

[16] J. Robins, N. Breslow, and S. Greenland. Estimators of the Mantel-Haenszel variance consistent in both sparse data and large-strata limiting models. Biometrics, pages 311–323, 1986.

[17] B. Vargas-Govea, G. González-Serna, and R. Ponce-Medellín. Effects of relevant contextual features in the performance of a restaurant recommender system. In RecSys11: Workshop on context-aware recommender systems (CARS-2011), 2011.
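
The numerical example of Sections 3.1 and 3.2 (the 2×2 tables with and without a sales promotion) can be reproduced with a few lines of Python. This is only an illustrative sketch: the names `Table2x2`, `relative_correlation` and `odds_ratio` are chosen here for the example and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table2x2:
    """Counts of a browse-by-purchase table (hypothetical helper type)."""
    a11: int  # browsed online, purchased in-store
    a10: int  # browsed online, did not purchase
    a01: int  # did not browse, purchased in-store
    a00: int  # did not browse, did not purchase


def relative_correlation(t: Table2x2) -> float:
    """RC: p(purchase=1 | browse=1) / p(purchase=1 | browse=0)."""
    p_browsed = t.a11 / (t.a11 + t.a10)
    p_not_browsed = t.a01 / (t.a01 + t.a00)
    return p_browsed / p_not_browsed


def odds_ratio(t: Table2x2) -> float:
    """OR: cross-product ratio a11*a00 / (a10*a01)."""
    return (t.a11 * t.a00) / (t.a10 * t.a01)


# The two tables from the numerical example in Section 3.1.
no_promo = Table2x2(a11=5, a10=500, a01=60, a00=25_000)
promo = Table2x2(a11=50, a10=500, a01=600, a00=25_000)

for name, table in (("no promotion", no_promo), ("promotion", promo)):
    print(f"{name}: RC={relative_correlation(table):.2f}, "
          f"OR={odds_ratio(table):.2f}")
# no promotion: RC=4.14, OR=4.17
# promotion: RC=3.88, OR=4.17
```

Scaling the purchase column by ten changes RC but leaves OR untouched, which is the insensitivity property that motivates using OR in a non-experimental setting.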
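
For the confidence interval of the odds ratio in Section 3.2, a minimal sketch is given below. It assumes the usual log-scale approximation, in which the interval is exp(ln OR ± z·SE) with SE being the square root of the sum of the reciprocal cell counts. The function name `odds_ratio_ci` is hypothetical, and rejecting zero cells is a design choice made here, not something the paper prescribes.

```python
import math
from typing import Tuple


def odds_ratio_ci(a11: int, a10: int, a01: int, a00: int,
                  z: float = 1.96) -> Tuple[float, float]:
    """Approximate CI of the odds ratio on the log scale.

    z = 1.96 corresponds to a 95% confidence level.
    """
    if min(a11, a10, a01, a00) <= 0:
        # The reciprocal terms are undefined for empty cells.
        raise ValueError("all four cell counts must be positive")
    log_or = math.log((a11 * a00) / (a10 * a01))
    se = math.sqrt(1 / a11 + 1 / a10 + 1 / a01 + 1 / a00)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Tables from the numerical example in Section 3.1.
low, high = odds_ratio_ci(5, 500, 60, 25_000)
low_promo, high_promo = odds_ratio_ci(50, 500, 600, 25_000)
print(f"no promotion: [{low:.2f}, {high:.2f}]")
print(f"promotion:    [{low_promo:.2f}, {high_promo:.2f}]")
```

Both tables share the same point estimate, but the promotion table has larger cell counts and therefore a narrower interval, in line with the remark that a wider CI indicates a less reliable result.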
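
The Mantel-Haenszel estimate of Section 3.3 can be sketched in the same way. The code below is an assumption-laden illustration rather than the authors' implementation: the function name `mantel_haenszel` is hypothetical, the standard error follows the large-sample form with weights a10·a01/N per stratum, the interval uses the multiplier z = 1.96 for a 95% level, and the two strata in the usage lines simply reuse the tables of Section 3.1 as stand-ins because the per-stratum retail counts are not published.

```python
import math
from typing import Iterable, Tuple

# One stratum is (a11, a10, a01, a00), in the notation of Section 3.1.
Stratum = Tuple[int, int, int, int]


def mantel_haenszel(strata: Iterable[Stratum],
                    z: float = 1.96) -> Tuple[float, float, float]:
    """Return (OR_MH, CI lower bound, CI upper bound)."""
    numerator = 0.0      # sum of a11 * a00 / N over strata
    weight_sum = 0.0     # sum of w_i = a10 * a01 / N over strata
    weighted_var = 0.0   # sum of w_i^2 * v_i over strata
    for a11, a10, a01, a00 in strata:
        if min(a11, a10, a01, a00) <= 0:
            raise ValueError("all cell counts must be positive")
        n = a11 + a10 + a01 + a00
        w = a10 * a01 / n
        v = 1 / a11 + 1 / a10 + 1 / a01 + 1 / a00
        numerator += a11 * a00 / n
        weight_sum += w
        weighted_var += w * w * v
    or_mh = numerator / weight_sum
    se_log = math.sqrt(weighted_var) / weight_sum
    return (or_mh,
            math.exp(math.log(or_mh) - z * se_log),
            math.exp(math.log(or_mh) + z * se_log))


# Stand-in strata (NOT real gender or age strata from the dataset).
example_strata = [(5, 500, 60, 25_000), (50, 500, 600, 25_000)]
or_mh, ci_low, ci_high = mantel_haenszel(example_strata)
print(f"OR_MH={or_mh:.2f}, 95% CI=[{ci_low:.2f}, {ci_high:.2f}]")
```

With a single stratum the estimate reduces to the crude odds ratio of Equation (2), which is a convenient sanity check. For sparse strata, the variance estimator of [16] would be the more robust choice.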