=Paper= {{Paper |id=None |storemode=property |title=From Online Browsing to Offline Purchases: Analyzing Contextual Information in the Retail Business |pdfUrl=https://ceur-ws.org/Vol-889/paper7.pdf |volume=Vol-889 }} ==From Online Browsing to Offline Purchases: Analyzing Contextual Information in the Retail Business== https://ceur-ws.org/Vol-889/paper7.pdf
        From Online Browsing to Offline Purchases:
   Analyzing Contextual Information in the Retail Business

                                                   Simon Chan, Licia Capra
                                                     University College London
                                                     London, United Kingdom
                                                 {mhchan,l.capra}@cs.ucl.ac.uk


ABSTRACT
Accurate recommender systems can enhance consumers’ shopping experiences. In retail and many other business environments, extra contextual factors are usually available for building even more accurate recommender systems. The influence of some factors is controversial in the industry. For instance, consumers’ recent online exposure to products can decrease the chance of in-store purchase, as consumers may choose to purchase products online. On the other hand, online exposure can be seen as evidence of consumers’ preference for products, which implies a higher chance of in-store purchase. Understanding the true influence is important for in-store product recommendation in this case. The question is how to evaluate the relevance and the influence of potential factors for prediction. Existing literature focuses on applying machine learning techniques to identify relevant contextual factors. While these methods have proven effective in some experiments, an alternative approach that provides easy-to-interpret analysis of relevance and influence is preferred in many situations. This paper introduces a computationally inexpensive approach to conduct preliminary relevance and influence analysis for contextual information in the retail business. Statistical techniques from the medical research field are applied to analyze the relationship between consumers’ online exposure to a retailer’s e-commerce website, i.e., a contextual factor, and their offline in-store purchase decisions, i.e., the outcome to be predicted, based on a retail dataset provided by a large UK retail business with both online and offline presence. Unlike machine learning approaches, this analysis can be done even before a recommender system is built. This research further shows that the influence of this contextual factor depends on extraneous attributes, such as consumers’ age and gender. This paper serves as a preliminary step to analyze relevant contextual factors for building context-aware recommender systems.

Keywords
contextual information, odds ratio, stratified analysis, retail

CARS-2012, September 9, 2012, Dublin, Ireland.
Copyright is held by the author/owner(s).

1. INTRODUCTION
Recommender systems are increasingly important for retail businesses. Retailers commonly provide personalized product recommendations to consumers through various channels, for instance web advertisements, email voucher advertisements and in-store location-based mobile advertisements. Recommender systems try to predict outcomes accurately so that they can recommend products according to the best expected scenario. Research has shown that the use of proper contextual information in recommender systems can improve the accuracy of prediction in some situations [1]. For instance, contextual factors such as purchase intent have been shown to improve prediction [14]. How could we, however, know the purchase intent of a consumer who has just entered a retail store without requiring additional user interaction? In an offline in-store environment, other than time and location context, the type of user context we can obtain without intruding on consumers’ experience is limited. To address this issue, this paper explores the possibility of using consumers’ recent online behaviors on the retailer’s e-commerce website to derive contextual factors for their in-store purchase decisions. The consumer behaviors that can be collected non-intrusively online are usually richer than those that can be collected in-store.

In this paper, we consider consumers’ recent online exposure to brands at a retailer’s website as a potential, but controversial, contextual factor. Consumers’ online exposure could imply a tendency to purchase products online, so that they are less likely to purchase them in-store. Conversely, it could be seen as a user context that represents consumers’ recent preference for brands, which implies a higher chance of in-store purchase. Understanding the influence of this contextual factor is important for a number of recommendation scenarios. For simplicity, we focus on a specific scenario: a large retailer that has both an online e-commerce website and offline stores wants to predict which product brands customers are going to purchase when they enter offline stores. For a brand that a targeted consumer has browsed online recently, a recommender system can consider three cases of influence: 1) if it is strongly positive, the consumer is likely to purchase products of this brand in-store anyway, so no recommendation is needed; 2) if it is negative or if there is no influence, the chance of in-store purchase is not high; 3) if it is moderately positive, the system may try to nudge the consumer to purchase products of this brand in-store. The design of this recommender system is part of our future work.
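The three cases of influence listed in the Introduction can be read as a simple decision rule. The sketch below is only an illustration and not the recommender design, which the paper leaves to future work. It assumes that the influence is summarized as a ratio-type measure in which 1.0 means no influence; the function name `in_store_action` and the cut-off `strong_threshold` are hypothetical and do not appear in the paper.

```python
def in_store_action(influence, strong_threshold=4.0):
    """Map an estimated influence of online browsing to one of the three cases.

    `influence` is assumed to be a ratio-type measure where 1.0 means
    "no influence"; `strong_threshold` is a hypothetical cut-off.
    """
    if influence <= 1.0:
        # Case 2: negative or no influence.
        return "low chance of in-store purchase"
    if influence >= strong_threshold:
        # Case 1: strongly positive influence.
        return "no recommendation needed"
    # Case 3: moderately positive influence.
    return "nudge towards in-store purchase"


for value in (0.8, 2.5, 6.0):
    print(value, "->", in_store_action(value))
```

Where the boundary between "moderately" and "strongly" positive should lie is an open design question; the default above is a placeholder only.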
From the point of view of utilizing contextual information, this paper studies the relevance and influence between consumers’ online exposure to brands at the retailer’s website, i.e., a contextual factor, and the probabilities of their in-store purchasing decisions, i.e., the outcome to be predicted, for product brands, stratified by different consumer groups. We propose the use of statistical techniques to analyze the contextual factor. Unlike the machine learning techniques implemented in the literature, this approach is independent of the prediction model of the system. Besides, unlike traditional correlation analysis techniques, such as the sign test and the chi-squared test, our approach can estimate the influence of the factor on the probability of in-store purchase of products of a brand. This information can possibly be used to improve the prediction model directly in our future work. A challenge of using statistical techniques, namely that basic probabilistic measurement is sensitive to external noise, is illustrated with a numerical example in this paper. More robust techniques, namely odds ratio and stratified analysis, are then proposed.

The dataset used in the experiments is provided by a large UK retail business. Ten product brands are analyzed. The dataset consists of one year of anonymized records of loyalty card holders who have browsed products of the selected brands online at the retailer’s website and of those who have purchased products of the selected brands in any store of the retailer in the UK. All data is collected in a real non-experimental setting. In our experiments, whether a consumer has browsed any product page of a targeted brand at the retailer’s website within a month is regarded as a binary contextual factor, and the outcome to be predicted is the consumer’s in-store purchase decision. Results show that the influence of this contextual factor on the outcome to be predicted varies with consumers’ attributes such as age and gender.

The rest of the paper is organized as follows. Section 2 reviews related literature on techniques that identify relevant contextual information in recommender systems. Section 3 describes the challenges of data analysis in a retail scenario and proposes solutions to analyze contextual factors. Section 4 presents experiments conducted on a real retail dataset, and Section 5 contains the discussion and future work.

2. RELATED WORK
Our literature review focuses on techniques that identify and evaluate relevant contextual information in recommender systems. The technical goal of a recommender system can generally be seen as the problem of predicting ratings, or any other outcome, for the items that have not been seen by a user [2]. The outcome to be predicted in retail recommender systems, for instance, may be consumers’ purchase decisions instead of ratings. The use of contextual information in recommender systems has been shown to improve the accuracy of prediction in some situations [1]. Context is a multifaceted concept that is defined differently in multiple research disciplines [3]. Various kinds of attributes can be defined as context. For instance, there are contexts of users, contexts of items and contexts of interactions or situations [4]. Regardless of the definition, the selection of relevant contextual factors to be used in recommender systems is a critical issue.

To deal with this issue, some literature applies machine learning techniques to identify relevant factors automatically. Decision trees and feature selection techniques are used to rank the relevance of user preferences and system settings in a news recommender system for accurate recommendations [5]. Another feature selection technique, the Las Vegas Filter algorithm, has been applied in a more recent work to identify relevant factors [17]. A pre-filtering algorithm that pre-processes and selects contextual segments offline has also been described in detail [3]. In [10], users are clustered based on the values of some contextual factors. The predictive accuracy of each cluster is then compared with that of the whole, non-contextual dataset in order to understand whether and where the performance improves. The advantage of this kind of algorithm is that factors are considered only in situations where the contextual method outperforms the standard non-contextual algorithm. In the current literature, the relevance of contextual factors is measured based on their effects on the system’s predictive accuracy. Recent research, however, shows that the recommendation accuracy of context-aware recommender systems can be affected by conditions other than the contextual factors themselves, such as the task requirement and the overall number of items in the recommended list [15]. As a result, contextual factors may be omitted simply because they are not integrated into the system or the prediction model properly.

Other literature proposes the use of statistical methods to evaluate the relevance of contextual factors. In contrast to a machine learning approach, a statistical approach is fast to compute and is independent of the prediction model implemented by the system. Not all types of data fulfill the assumptions of these statistical models though. For instance, the Pearson Correlation Coefficient, or its binary form, the Phi Coefficient, expects a linear relationship between the two variables. The paired t-test discussed in previous literature [1] is not suitable for binary data with a binomial distribution. They are, therefore, not suitable for our scenario. Although other statistical methods, such as the Sign Test and the Chi-squared Test, could be suitable for our binary data, this paper presents an alternative statistical methodology that not only evaluates the relevance of a contextual factor, but also estimates its influence on the probability of expected outcomes at the same time. By knowing the influence in terms of probability, it is possible to make use of this contextual information to improve the prediction models directly in our future work.

3. STATISTICAL ANALYSIS
3.1 Problem Formulation
In this paper, we consider a retailer that operates both an online e-commerce website and physical retail stores. Consumers’ recent online exposure at the retailer’s website is the contextual factor to be evaluated. In particular, we define whether a consumer has browsed at least one page of a targeted brand at the retailer’s website within a month as the contextual condition: browse = 1 if the condition holds, 0 otherwise. For illustration purposes, we assume that the outcome to be predicted is the purchase decision of any product of the targeted brand at any physical store (in-store purchase), which is a binary variable: purchase = 1 if purchased, 0 otherwise. In order to evaluate the relevance and influence between this contextual factor and the outcome, we need to compare the probability of in-store purchase of consumers who have browsed and of those who have not, i.e. p(purchase = 1 | browse = 1) and p(purchase = 1 | browse = 0). In a population of N potential consumers, we can construct a table to represent the online browsing and in-store purchasing situation of the dataset:
                purchase=1       purchase=0       Total
 browse=1           a11              a10          a1∗
 browse=0           a01              a00          a0∗
 Total              a∗1              a∗0          N

where a11, a10, a01 and a00 are the numbers of consumers for the corresponding purchasing and browsing situations. a∗1 = a11 + a01 is the number of consumers who have purchased in-store, a∗0 = a10 + a00 is the number of consumers who have not purchased in-store, a1∗ = a11 + a10 is the number of consumers who have browsed online and a0∗ = a01 + a00 is the number of consumers who have not browsed online.

A direct way to express the relationship is to compare the two probabilities with relative correlation (RC), where

$$ RC = \frac{p(\text{purchase} = 1 \mid \text{browse} = 1)}{p(\text{purchase} = 1 \mid \text{browse} = 0)} = \frac{a_{11}/a_{1*}}{a_{01}/a_{0*}} \tag{1} $$

There is no correlation if RC = 1; the influence is positive if RC > 1 and negative if RC < 1. This approach, however, suffers from two problems when the data is collected from a non-experimental retail environment. First, RC is sensitive to the total number of consumers who have purchased and also to the total number of consumers who have not purchased. These two numbers, unfortunately, can be affected by external irrelevant factors, such as marketing campaigns or product promotions, which should be isolated from this analysis. This problem can be illustrated with a numerical example. Suppose the data looks like the following table when there is no sales promotion:

                purchase=1       purchase=0       Total
 browse=1            5               500          505
 browse=0           60              25,000        25,060
 Total              65              25,500        25,565

Let us assume that a sales promotion successfully attracts new consumers to purchase and the number of purchases increases 10 times as shown in the following table:

                purchase=1       purchase=0       Total
 browse=1            50              500          550
 browse=0           600             25,000        25,600
 Total              650             25,500        26,150

All other things being equal, a temporary sales promotion should not affect the relationship between the contextual factor and the outcome. In reality, however, RC = (5/505)/(60/25060) = 4.14 in the first case while RC = (50/550)/(600/25600) = 3.88 in the second one. In other words, RC is sensitive to the change in the number of consumers who purchase (a∗1). This problem is present in many real-world environments since businesses can always attract new consumers to stores or websites dynamically, which affects N, and thus a∗1 and a∗0 can be manipulated. An odds ratio technique to estimate RC that is insensitive to the change of N is proposed in Section 3.2.

The second problem is the existence of extraneous attributes, such as age and gender, that potentially affect the influence of the targeted contextual factor on the outcome to be predicted. This problem occurs when an attribute is associated with the contextual factor and at the same time affects the outcome, dependently or independently. This kind of extraneous attribute is called a confounder in the statistics discipline. Stratified analysis is proposed to evaluate the impact of possible confounding attributes.

3.2 Odds Ratio
Odds ratio (OR) is commonly used as an estimator of RC in medical and epidemiological research for case-control studies where disease cases are not easy to obtain [6, 13, 7]. Similar to our problem, N is also adjustable in medical studies because the number of people with and without a disease in the dataset is determined artificially by the design of the case-control study. In our case, OR can be calculated as:

$$ OR = \frac{a_{11}/a_{01}}{a_{10}/a_{00}} = \frac{a_{11} a_{00}}{a_{10} a_{01}} \tag{2} $$

As with RC, there is no correlation if OR = 1; the influence is positive if OR > 1 and negative if OR < 1. Unlike RC, OR is insensitive to the row and column scaling operations of the data table. Using the same example above, OR = (5 × 25000)/(60 × 500) = 4.17 when there is no sales promotion, and OR = (50 × 25000)/(600 × 500) = 4.17 as well when there is a sales promotion. OR is a good estimator statistically if one requirement is fulfilled: for each of the two groups of consumers, i.e. those who have browsed online and those who have not, the number of consumers who have purchased in-store must be a small percentage (less than 10%) of the total number of consumers in the group. This requirement is reasonably fulfilled in most retail situations. A confidence interval (CI) is used to determine the reliability of the results. The wider the CI, the less reliable the result is. The CI of the odds ratio [12] can be approximated with:

$$ CI = \frac{a_{11} a_{00}}{a_{10} a_{01}} \exp\left( \pm z \sqrt{\frac{1}{a_{11}} + \frac{1}{a_{10}} + \frac{1}{a_{01}} + \frac{1}{a_{00}}} \right) \tag{3} $$

where z is the score of the standard normal distribution associated with the confidence level; z = 1.96 for a 95% confidence interval.

3.3 Stratified Analysis
Extraneous attributes, such as consumers’ age and gender, potentially affect the influence of the contextual factor on the outcome. Stratified analysis is a computationally inexpensive solution to reveal their effects. This technique is commonly used in medical research when setting up control group experiments is not feasible and so the existence of extraneous factors is common [9]. It analyzes subgroups (strata) of the study population separately according to the attributes. For instance, two strata are created for the gender attribute: female consumers and male consumers. The odds ratio is measured for each stratum separately. Stratified analysis provides an independent view of each stratum, each with its own odds ratio, so that differences among the strata can be compared. In addition, a common strata-adjusted odds ratio is estimated by the Mantel-Haenszel (MH) method [11]. This adjusted value represents a weighted average of the stratum-specific odds ratios, which is an approximation to the maximum likelihood estimate. According to [8], the formula of the approximation can be written as:

$$ OR_{MH} = \frac{\sum_{i=1}^{k} a_{11i} a_{00i} / N_i}{\sum_{i=1}^{k} a_{01i} a_{10i} / N_i} \tag{4} $$

where k is the total number of strata in an analysis and i represents one of them. For this Mantel-Haenszel method of
estimation to be accurate, the overall sample size must be large. [16] provides a more robust but more complicated approximation method for data with a small sample size. A confidence interval (CI) can again be used to indicate the reliability of the result:

$$ 95\%\ \text{CI for } OR_{MH} = \exp\left[ \ln OR_{MH} \pm 1.96\, SE(\ln OR_{MH}) \right] \tag{5} $$

where

$$ SE(\ln OR_{MH}) = \sqrt{ \frac{\sum_{i=1}^{k} \left( a_{10i} a_{01i} / N_i \right)^2 v_i}{\left( \sum_{i=1}^{k} a_{10i} a_{01i} / N_i \right)^2} } $$

and

$$ v_i = \frac{1}{a_{11i}} + \frac{1}{a_{10i}} + \frac{1}{a_{01i}} + \frac{1}{a_{00i}} $$

4. EXPERIMENT
4.1 Dataset
Our dataset, which is provided by a large UK retail business, consists of one year of anonymized records of loyalty card holders who have browsed the selected products online on the retailer’s website and of those who have purchased the selected products in any store of the retailer in the UK. It contains 10,217,972 unique loyalty card holders and 2,939 unique products under 10 selected brands. There are 21,668,137 in-store purchase transaction records and 299,070 online browsing records. We associate consumers’ online browsing and in-store purchasing behaviors through unique loyalty card numbers. All data is collected in a real non-experimental setting.

4.2 Experimental Design
This experiment investigates the relevance and influence between consumers’ recent online browsing behaviors and the probabilities of their in-store purchase decisions for ten product brands carried nationally by a large UK retail business. These ten brands are selected randomly; some of them are luxury brands while the others are mid-range brands. We define whether a consumer has browsed at least one page of a targeted brand at the retailer’s website within a month as the context of the consumer: browse = 1 if the condition holds, 0 otherwise. Odds ratio is used to measure the influence of this contextual factor on the probabilities of consumers’ binary purchase decision of any product of the targeted brand at any physical store (in-store purchase). We pre-process the dataset to filter out consumers who have not visited any page of the retailer’s website at least once in the past year. This process ensures that the remaining N consumers have successfully accessed the retailer’s website at least once recently. We start with the hypothesis that age and gender are two attributes of consumers that may confound the influence. We conduct monthly strata-specific measurements of odds ratio based on these two attributes for each brand. In practice, age and gender information is missing in some records. In each analysis, therefore, we analyze a population of size N_age or N_gender, which represent the total number of consumers with age information or with gender information respectively. In these experiments, we calculate the monthly crude (unadjusted) odds ratio for each stratum for each brand. If the odds ratio for a stratum of a brand is X, it means that, in this stratum and in this particular month, the probability of purchasing at least one product of this brand in-store for consumers who have browsed at least one webpage of this brand online is X times the probability for those who have not. We also calculate the monthly common strata-adjusted odds ratio as well as the 95% confidence interval (CI).

4.3 Results
Results of only three stratified odds ratio analyses are presented in this paper due to length constraints. All figures show that the odds ratio measurements are well above 1, i.e., the influence is positive for all brands. It means that the probability of purchasing at least one product of a selected brand in-store for consumers who have browsed at least one webpage of that brand online at the retailer’s website is higher than the probability for those who have not. The values and patterns are different for each brand though, which means that the impact of this contextual factor of online exposure varies with brands.

Figure 1 represents the gender-stratified analysis of brand A. The influence on female consumers is much stronger than that on male consumers. An interesting discovery is that the odds ratio measurements for both genders follow a very similar up-and-down monthly pattern. Both strata have their peak odds ratio in February. This finding hints that time is a contextual factor that should also be considered in future work. Figure 2 shows that the odds ratio ranges of different age groups for brand B are clearly separated. The influence for consumers of age 18-25 is the highest while that for consumers of age 26-35 is the lowest. It means that the probability for consumers of age 18-25 is higher than that for consumers of age 36-45, and both are higher than that for consumers of age 26-35. This finding implies that, for brand B, the age attribute itself can be correlated with consumers’ in-store purchase decisions. Figure 3, on the other hand, draws a different conclusion for brand C. In this case, the odds ratio measurements of these age groups are mixed together within a narrow range. There is no clear monthly pattern either. It means that age, gender and month are not confounding factors for this brand.

Figure 1: Odds Ratio by Gender (Brand A)
Figure 2: Odds Ratio by Age (Brand B)
Figure 3: Odds Ratio by Age (Brand C)

5. DISCUSSION AND FUTURE WORK
This paper derives a contextual factor from consumers’ recent online browsing behaviors on the retailer’s website for the prediction of their offline in-store purchases. A statistical approach is presented to conduct a preliminary analysis of the relevance and influence between this factor and the offline purchase decisions on brands using odds ratio and stratified analysis techniques. The initial concern that consumers who browse online on the retailer’s website tend to




purchase online and therefore have a lower chance of purchasing in-store has been proven untrue for the brands we have analyzed. In addition, as expected, the influence of online exposure on offline purchases varies with brands and with consumers’ age and gender. In our future work, the analysis of non-binary contextual factors will be illustrated. Besides the factor we have evaluated in this paper, it is interesting to see whether other relevant contextual factors can be derived from consumers’ recent online behaviors for their in-store purchase decisions. Future work is to build a context-aware recommender system for in-store product recommendation based on these findings. We are interested in using the OR value directly to improve prediction. Also, a comparison of the predictive performance of recommender systems using contextual factors selected by this approach and by existing machine learning techniques is part of our future work.

6. REFERENCES
 [1] G. Adomavicius, R. Sankaranarayanan, S. Sen, and A. Tuzhilin. Incorporating contextual information in recommender systems using a multidimensional approach. ACM Trans. Inf. Syst., 23(1):103–145, Jan. 2005.
 [2] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6):734–749, 2005.
 [3] G. Adomavicius and A. Tuzhilin. Context-aware recommender systems. Recommender Systems Handbook, pages 217–253, 2011.
 [4] M. Bazire and P. Brezillon. Understanding context before using it. Modeling and using context, pages 113–192, 2005.
 [5] A. Bellogín, I. Cantador, P. Castells, and A. Ortigosa. Discovering relevant preferences in a personalised recommender system using machine learning techniques. In Proceedings of the ECML-PKDD 2008 Workshop on Preference Learning, 2008.
 [6] J. Cornfield et al. A method of estimating comparative rates from clinical data; applications to cancer of the lung, breast, and cervix. Journal of the National Cancer Institute, 11(6):1269, 1951.
 [7] A. Edwards. The measure of association in a 2×2 table. Journal of the Royal Statistical Society. Series A (General), pages 109–114, 1963.
 [8] W. Hauck. The large sample variance of the Mantel-Haenszel estimator of a common odds ratio. Biometrics, pages 817–819, 1979.
 [9] D. Kleinbaum, L. Kupper, and H. Morgenstern. Epidemiologic research: principles and quantitative methods. Wiley, 1982.
[10] S. Lombardi, S. Anand, and M. Gorgoglione. Context and customer behavior in recommendation. In RecSys09: Workshop on context-aware recommender systems (CARS-2009), 2009.
[11] N. Mantel and W. Haenszel. Statistical aspects of the analysis of data from retrospective studies of disease. The Challenge of Epidemiology: Issues and Selected Readings, 1(1):533–553, 2004.
[12] J. Morris and M. Gardner. Statistics in medicine: Calculating confidence intervals for relative risks (odds ratios) and standardised ratios and rates. British medical journal (Clinical research ed.), 296(6632):1313, 1988.
[13] F. Mosteller. Association and estimation in contingency tables. Journal of the American Statistical Association, 63(321):1–28, 1968.
[14] C. Palmisano, A. Tuzhilin, and M. Gorgoglione. Using context to improve predictive modeling of customers in personalization applications. Knowledge and Data Engineering, IEEE Transactions on, 20(11):1535–1549, Nov. 2008.
[15] U. Panniello and M. Gorgoglione. Does the recommendation task affect a cars performance? In RecSys10: Workshop on context-aware recommender systems (CARS-2010), 2010.
[16] J. Robins, N. Breslow, and S. Greenland. Estimators of the Mantel-Haenszel variance consistent in both sparse data and large-strata limiting models. Biometrics, pages 311–323, 1986.
[17] B. Vargas-Govea, G. González-Serna, and R. Ponce-Medellín. Effects of relevant contextual features in the performance of a restaurant recommender system. In RecSys11: Workshop on context-aware recommender systems (CARS-2011), 2011.
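As a worked illustration of the measures defined in Section 3, the following minimal sketch computes the relative correlation, the odds ratio with its approximate confidence interval, and the Mantel-Haenszel common odds ratio with the standard error attributed to [8]. It is not the authors' implementation: the function names are hypothetical, each 2x2 table is assumed to be passed as the tuple (a11, a10, a01, a00), and every cell is assumed to be non-zero. The two tables are the no-promotion and promotion examples from Section 3.1; using them as two strata is done purely to exercise the code and is not part of the paper's experiments.

```python
import math


def relative_correlation(a11, a10, a01, a00):
    """RC: p(purchase=1 | browse=1) / p(purchase=1 | browse=0)."""
    return (a11 / (a11 + a10)) / (a01 / (a01 + a00))


def odds_ratio(a11, a10, a01, a00):
    """OR: cross-product ratio a11*a00 / (a10*a01) of the 2x2 table."""
    return (a11 * a00) / (a10 * a01)


def cell_variance(a11, a10, a01, a00):
    """Sum of the reciprocal cell counts (v_i in Section 3.3)."""
    return 1 / a11 + 1 / a10 + 1 / a01 + 1 / a00


def odds_ratio_ci(a11, a10, a01, a00, z=1.96):
    """Approximate CI of the odds ratio; z = 1.96 gives a 95% interval."""
    centre = odds_ratio(a11, a10, a01, a00)
    half_width = z * math.sqrt(cell_variance(a11, a10, a01, a00))
    return centre * math.exp(-half_width), centre * math.exp(half_width)


def mantel_haenszel(strata, z=1.96):
    """Common odds ratio over strata and its CI.

    `strata` is a list of (a11, a10, a01, a00) tuples, one per stratum.
    Returns (OR_MH, lower bound, upper bound).
    """
    numerator = denominator = weighted_variance = 0.0
    for a11, a10, a01, a00 in strata:
        n_i = a11 + a10 + a01 + a00
        weight = a10 * a01 / n_i
        numerator += a11 * a00 / n_i
        denominator += weight
        weighted_variance += weight ** 2 * cell_variance(a11, a10, a01, a00)
    or_mh = numerator / denominator
    se_log = math.sqrt(weighted_variance) / denominator
    return or_mh, or_mh * math.exp(-z * se_log), or_mh * math.exp(z * se_log)


# The two example tables of Section 3.1, as (a11, a10, a01, a00).
no_promotion = (5, 500, 60, 25000)
promotion = (50, 500, 600, 25000)

for name, table in (("no promotion", no_promotion), ("promotion", promotion)):
    low, high = odds_ratio_ci(*table)
    print(f"{name}: RC = {relative_correlation(*table):.2f}, "
          f"OR = {odds_ratio(*table):.2f}, 95% CI = ({low:.2f}, {high:.2f})")

# Treating the two tables as two strata, for illustration only.
print("OR_MH, lower, upper:", mantel_haenszel([no_promotion, promotion]))
```

Run on the two example tables, RC comes out as 4.14 and 3.88 while OR stays at 4.17 in both cases, which matches the figures quoted in Sections 3.1 and 3.2. With a single stratum, `mantel_haenszel` reduces to the plain odds ratio and its confidence interval.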