=Paper=
{{Paper
|id=None
|storemode=property
|title=Evaluating the Importance of Various Implicit Factors in E-commerce
|pdfUrl=https://ceur-ws.org/Vol-910/paper11.pdf
|volume=Vol-910
|dblpUrl=https://dblp.org/rec/conf/recsys/PeskaV12
}}
==Evaluating the Importance of Various Implicit Factors in E-commerce==
Evaluating Various Implicit Factors in E-commerce
Ladislav Peska (peska@ksi.mff.cuni.cz) and Peter Vojtas (vojtas@ksi.mff.cuni.cz)
Department of software engineering, Charles University in Prague
Malostranske namesti 25, Prague, Czech Republic
ABSTRACT
In this paper, we focus on a typical e-commerce portal employing personalized recommendation. In addition to explicit feedback, such a website can monitor many different patterns of implicit user behavior, which we call implicit factors. The problem arises when trying to infer connections between observed implicit behavior and user preferences: while some connections are obvious, others may not be.
We have selected several frequently used implicit factors and conducted an online experiment on a travel agency website to find out which implicit factors could replace explicit ratings and, if there are more of them, how to combine their values. Click through rate and conversion rate were selected as the utility functions determining recommendation efficiency.
Our experiments corroborate the importance of considering multiple implicit factors and their different weights. The best individual results were achieved by the scrolling factor; the best combination was the Prior_to method (lexicographical ordering based on factor values).
Categories and Subject Descriptors
H.3.3 [Information Systems]: Information Search and Retrieval - Information Filtering
General Terms
Measurement, Human Factors.
Keywords
Recommender systems, implicit factors, user feedback, e-commerce success metrics
Copyright is held by the author/owner(s). Workshop on Recommendation Utility Evaluation: Beyond RMSE (RUE 2012), held in conjunction with ACM RecSys 2012. September 9, 2012, Dublin, Ireland.
1.    INTRODUCTION
Recommending on the web is both an important commercial application and a popular research topic. The amount of data on the web grows continuously and it is nearly impossible for a human to process it directly. Keyword search engines were adopted to cope with information overload, but despite their undoubted successes, they have certain limitations. Recommender systems can complement onsite search engines, especially when the user does not know exactly what he/she wants. Many recommender systems, algorithms and methods have been presented so far. We can mention the Amazon.com recommender [12] as one of the most popular commercial examples. Recommender systems vary in type (collaborative, content-based, context, hybrid, etc.), input (user feedback types, object attributes, etc.) and output. We refer to [17] for a detailed taxonomy of recommender systems.
Explicit feedback (given by the user consciously, e.g. rating objects with stars) is often used in research and also in some commercial applications. Although it is quite easy to understand and reflects the user's preference very well, it also has drawbacks. The biggest ones are its scarcity and the unwillingness of some users to provide any explicit feedback [7]. In contrast, implicit feedback (events triggered by a user unconsciously) can provide an abundant amount of data, but it is much more difficult to understand the true meaning of such feedback.
The rest of the paper is organized as follows: Section 2 reviews some related work. In Section 3 we describe our model of user preferences and in Section 4 a method to learn it. Section 5 contains the results of our online experiment on a travel agency website. Finally, Section 6 concludes the paper and points to our future work.
1.1       Motivation
In this paper we focus on an e-commerce website employing personalized object recommendation, e.g. a travel agency. On such a site we can record several types of implicit user feedback, such as page views, actions or time spent on a page, purchase-related actions, click through or click stream, etc. Each of these factors is believed to be related to the user's preference on an object. However, this relation can be non-trivial, dependent on other factors, etc. In this work, we focus on whether and how such relations can be compared against each other. Our second aim is to find out how to use or combine them in order to improve recommendations.
1.2       Contribution
The main contributions of this paper are:
- Evaluation of recommendation based on various implicit factors using typical e-commerce success metrics.
- A generic model that combines various types of user feedback.
- Experiments with several combining methods (average, weighted aggregation and prioritization).
- Gathered data for possible future off-line experiments.
2.    RELATED WORK
The area of recommender systems has been extensively studied recently. Much effort has been devoted to creating different recommendation algorithms, e.g. [3], [4], [5], and to designing whole recommender systems, e.g. [6], [15] and [16]. Our work can be placed in front of some of those systems, as we can supply them with a single-value object rating based on multiple implicit factors instead of an explicit user rating or a single implicit factor.
Many recommendation algorithms aim to decompose the user's preference on an object into preferences on the object's attributes [3], [4], [5] and [15], which can be a future extension to our work.
Some authors employ context information when deciding about the true meaning of user feedback. For example, Eckhardt et al. [2] propose that a good rating of an object is more relevant when the object appears among other good objects. Joachims et al. [8] propose the "Search Engine Trust Bias" after observing that the first result of a search engine has a higher click through rate than the second one even if the results were swapped, so that the less relevant result was shown in first place.
Important for our research is the work of Kiessling et al. on the Preference SQL system, e.g. [10]. Preference SQL is an extension of the SQL language that allows the user to specify preferences (so-called "soft constraints") directly and to combine them in order to receive the best objects. In our model of user preference we use the three combination operators described there: Prior to (hierarchical), Ranking and Pareto.
Several authors have studied various aspects of implicit feedback. Quite common are studies comparing implicit and explicit feedback, e.g. Claypool et al. [1] using an adapted web browser or Jawaheer et al. [7] on an online music server. Using a utility function based only on implicit feedback is a common approach when it is impossible to get explicit feedback [6], [14]. Lee and Brusilovsky proposed a job recommender directly employing negative implicit feedback [11]. In our case we have focused on e-commerce recommenders, so we have used two typical e-commerce utility functions: Click Through Rate and user Conversion Rate. In contrast to several studies, e.g. [1], which studied the behavior of a closed, small group of users (who installed a special browser) on the open web, we have focused on a single website and all its users, which let us gather more feedback data and introduce a wider variety of feedback factors.
For our experiments, we use the UPComp [13] recommender, which is deployable into running e-commerce applications. Compared to our previous work [14], we have conducted a larger on-line experiment, revised the utility functions in our learning method and introduced a new model of user preference.
3.    MODELS OF USER PREFERENCE
We assume that any feedback is in the form Feedback(user, object, feedback type, value). At this stage of our research, we do not employ preference relations or feedback related to object groups (e.g. categories) and object attributes.
We based our models on the work of Kiessling et al. and their model of user preferences in Preference SQL [10]. The authors defined several patterns for expressing preferences (soft conditions) on a single attribute, e.g. "price around 2000" or "Highest distance", etc. Each soft condition assigns to each object a value from the [0, 1] interval. Then they defined three types of operators combining soft conditions together:
- Preferring Operator: preferring one (or more) condition against others.
- Ranking Operator to combine conditions by a ranking function. At this time we use weighted average as a ranking.
- Pareto Operator for combining equally important conditions, or conditions where their relation is unknown. We plan to use this operator in our future work.
In our research, we have replaced the soft conditions by the implicit factors, forming the Preference algebra model. Each implicit factor value is assigned a preference value from the [0, 1] interval; currently we simply linearly normalize the space between the highest and lowest factor values. Those preference values can then be freely combined with the operators, e.g.:
Scrolling PRIOR TO Avg(Time, MouseClicks)
We will demonstrate the behavior of our model on a small two-dimensional example: Table 1 contains four sample objects and their scrolling and time on page feedback for a fixed user (data already normalized into [0, 1]). They are visualized in Figure 1: as can be seen, we receive different top-k objects for different combinations.
Table 1: Example objects and their scrolling and time on page implicit factor values.
Object       Amount of scrolling      Time on page
Object1      1.0 (e.g. 10 times)      0.4 (e.g. 200 sec)
Object2      0.7 (e.g. 7 times)       1.0 (e.g. 500 sec)
Object3      0.8 (e.g. 8 times)       0.6 (e.g. 300 sec)
Object4      0.4 (e.g. 4 times)       0.3 (e.g. 150 sec)
Figure 1: Combining single implicit factor values into the preference for objects from Table 1.
4.    LEARNING PREFERENCE MODEL
The idea behind our learning model is the following: if we supply a fixed recommendation method with data from various implicit factors and then compare the effectiveness of the resulting recommendations, we can estimate how successful each implicit factor is.
For the purpose of our experiment, we have divided our learning model into two phases. In the first phase, we have learned the successfulness of the considered implicit factors (see Table 2 for their list and description). In the second phase, we have implemented several methods combining various implicit factors together based on the Preference algebra model.
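The two-dimensional example of Table 1 can also be reproduced in a few lines of code. The following Python fragment is only an illustrative sketch, not the UPComp implementation: the helper names (factor, avg, weighted_avg, prior_to, top_k) and the 3:1 weights are assumptions made for this illustration, and only the preference values are taken from Table 1. It shows that lexicographical ordering (Prior_to), average and weighted average may each return a different top-k.

```python
# Preference values from Table 1 (already normalized into [0, 1]) for one fixed user.
TABLE_1 = {
    "Object1": {"scrolling": 1.0, "time": 0.4},
    "Object2": {"scrolling": 0.7, "time": 1.0},
    "Object3": {"scrolling": 0.8, "time": 0.6},
    "Object4": {"scrolling": 0.4, "time": 0.3},
}


def factor(name):
    """Scorer returning the preference value of a single implicit factor."""
    return lambda prefs: prefs[name]


def avg(*scorers):
    """Ranking operator: plain average of the given scorers."""
    return lambda prefs: sum(s(prefs) for s in scorers) / len(scorers)


def weighted_avg(*weighted_scorers):
    """Ranking operator: weighted average of (scorer, weight) pairs."""
    total = sum(weight for _, weight in weighted_scorers)
    return lambda prefs: sum(s(prefs) * w for s, w in weighted_scorers) / total


def prior_to(*scorers):
    """Preferring operator: lexicographical ordering, first scorer decides first."""
    return lambda prefs: tuple(s(prefs) for s in scorers)


def top_k(objects, scorer, k):
    """Names of the k objects with the highest score under the given scorer."""
    ranked = sorted(objects, key=lambda name: scorer(objects[name]), reverse=True)
    return ranked[:k]


scrolling, time_on_page = factor("scrolling"), factor("time")

print(top_k(TABLE_1, prior_to(scrolling, time_on_page), 2))
print(top_k(TABLE_1, avg(scrolling, time_on_page), 1))
# The 3:1 weights are hypothetical; the paper derives weights from CTR placement.
print(top_k(TABLE_1, weighted_avg((scrolling, 3), (time_on_page, 1)), 2))
```

With these values, Scrolling PRIOR TO Time ranks Object1 and Object3 first, the plain average ranks Object2 first, and the 3:1 weighted average ranks Object1 and Object2 first.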
Table 2: Description of the considered implicit factors for an arbitrary fixed user and object.
Factor          Description
PageView        Count(OnLoad() event on object detail page)
MouseActions    Count(OnMouseOver() events on object detail page)
Scroll          Count(OnScroll() events on object detail page)
TimeOnPage      Sum(time spent on object detail page)
Purchase        Count(Object was purchased)
Open            Count(Object detail page accessed via link from recommending area)
Shown           Count(Object shown in recommending area)
In both phases we have measured the success of the recommendations according to two widely used e-commerce success metrics:
- Conversion rate: #buyers / #users
- Click through rate (CTR): #click through / #shown objects by the recommending method
As we stand on the side of the e-shop owner, we consider the main task of the recommender system to be increasing the shop owner's profit. It is possible to measure the profit directly as a utility function; however, we have rejected this method for now and use only the conversion rate, which measures achievement of the overall goal (purchase). At this stage of our work we mainly focus on convincing the user to buy any product rather than convincing him/her to buy product B instead of A (see Table 3: the overall conversion rates are rather low and need to be improved prior to the other goals).
While the conversion rate evaluates the overall success of the whole system, the CTR refers directly to the success of the recommendation itself.
5.      EXPERIMENT
We have conducted an online experiment on the SLAN tour travel agency website (http://www.slantour.cz) to confirm our ideas. We have exchanged the previous random recommendations on the category pages for our methods. The experiment lasted for 2 months, in February and March 2012. We have collected data from a total of 15610 unique users (over 200 000 feedback events). We first describe in Figure 2 the simplified diagram of the travel agency e-shop. We recognize four important states of user interaction with the e-shop:
- The user is creating a conjunctive query Q (either implicitly, e.g. by viewing category pages, or explicitly via the search interface).
- The (possibly very large) set of objects OQ is the response to Q. The objects are recommended in this state. We recommend some objects from OQ to the user (membership in the OQ set is a necessary condition that each recommended object from OR has to fulfill).
- The user is viewing the detail of the selected object o. We believe that most of the interesting user feedback should be recorded in this phase.
- The user purchased the object o, which is the criterion of success for us.
Figure 2: The simplified state diagram of an e-commerce site: The user enters the site in STATE I. or II. He/she can either navigate through category or search result pages, updating query Q and receiving new recommended objects OQ and OR (STATE I.), or proceed to the detail of an object (STATE II.). The object can eventually be purchased (STATE III.).
Figure 3 depicts the schema of our experiment.
Figure 3: General schema of our experiment. When a user visits the website for the first time, he/she receives a userID; whenever he/she accesses a page with recommendations, the component selects the recommending method according to the userID. The experiment results for each method are computed from user feedback (click throughs, purchases).
5.1       UPComp recommender
UPComp (user preference component) is an independent e-commerce recommender. It consists of a database layer storing user feedback, a server side computing user preference and recommendations, and a client side which captures the user events and shows recommended objects. The main advantages of UPComp are:
- Easy deployment into various e-commerce systems regardless of the domain of objects.
- Large (extendible) set of recorded user behavior.
- Several recommending methods which can be combined together.
In the current experimental setting, we have used only a small portion of UPComp capabilities (ObjectRating and Collaborative methods, recommending objects for a known category). For a more detailed description see [13].
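The evaluation loop of Figure 3 and the two success metrics from Section 4 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code used in the experiment: the event tuples, the function names and the order of the method list are hypothetical, while the assignment rule userID mod K and the metric definitions follow Sections 4 and 5.2.

```python
from collections import defaultdict

# Method names as listed in Table 3; their order (and thus the index) is an assumption.
METHODS = [
    "Random", "PageView", "MouseActions", "TimeOnPage",
    "Scrolling", "Purchases", "ClickThrough", "ClickThrough/Shown",
]


def assign_method(user_id, methods=METHODS):
    """Fixed method for a user: userID mod K, where K is the number of methods."""
    return methods[user_id % len(methods)]


def evaluate(events, methods=METHODS):
    """Compute conversion rate and CTR per recommending method.

    `events` is an iterable of (user_id, event_type) pairs, where event_type is
    "shown" (object shown in the recommending area), "click" (click through)
    or "purchase". This event format is hypothetical.
    """
    users = defaultdict(set)
    buyers = defaultdict(set)
    shown = defaultdict(int)
    clicks = defaultdict(int)
    for user_id, event_type in events:
        method = assign_method(user_id, methods)
        users[method].add(user_id)
        if event_type == "shown":
            shown[method] += 1
        elif event_type == "click":
            clicks[method] += 1
        elif event_type == "purchase":
            buyers[method].add(user_id)
    return {
        method: {
            # Conversion rate: #buyers / #users
            "conversion_rate": len(buyers[method]) / len(users[method]),
            # CTR: #click through / #shown objects (0.0 if nothing was shown)
            "ctr": clicks[method] / shown[method] if shown[method] else 0.0,
        }
        for method in users
    }


sample_events = [
    (0, "shown"), (0, "shown"), (0, "click"), (0, "purchase"),
    (8, "shown"), (8, "shown"), (1, "shown"),
]
results = evaluate(sample_events)
print(results["Random"])
```

Because the method depends only on the userID, each user keeps the same recommending method for the whole experiment, so clicks and purchases can be attributed to exactly one method.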
5.2     Single implicit factors
For the first learning phase we have created a total of seven variants of the ObjectRating recommending method, each based on one implicit factor (PageView(), MouseActions(), Scrolling(), TimeOnPage(), Purchases(), ClickThrough() and ClickThrough()/Shown() rate). Each variant of the ObjectRating method used the same recommendation algorithm, but based on data of only one feedback type. We have also added the Random() method, recommending random objects from the current category, as a baseline. Each unique user received recommendations based on only one of these methods during all his/her visits to the website. The method is determined as userID mod K, where K is the number of possible methods.
The ObjectRating method calculates for each object (o) the object rating as the sum of feedback values of the given type (f) from all users U. The score is then normalized into [0, 1] (see pseudo SQL code below).
-- pseudo SQL: sum for object o divided by the largest per-object sum for type f
SELECT SUM(value) / (SELECT MAX(total) FROM
         (SELECT SUM(value) AS total FROM Feedback
           WHERE FeedbackType = f GROUP BY Object) AS totals) AS ObjectRating
  FROM Feedback
 WHERE Object = o AND FeedbackType = f
We have selected this simple method because we wanted to avoid the problems suffered by more complex methods (e.g. the Cold Start Problem). On the other hand, this decision decreases the variability of recommendations, so we want to use other methods as well in our future work.
Table 3 shows the results of the first phase of our experiment. An ANOVA test shows statistically significant differences in click through rate (p-value < 0.001), but the differences in the conversion rate were not statistically significant (probably due to the relatively small number of purchases: 106 buyers in total).
Rather surprising is the top position of the Scrolling() method compared to e.g. Claypool et al. [1]. However, in contrast to Claypool et al., most of our object detail pages overflow the typical visible area of a browser. Although important controls such as the purchase button are visible at the top of the page, scrolling is necessary to see additional information such as accommodation details, all hotel pictures, the trip program, etc. On sites with a bookmark-style design with little or no need for scrolling, opening an in-page bookmark should be considered an action similar to our scrolling event. Time spent on page also seems to improve recommendations (despite the results of e.g. Kelly and Belkin [9]).
Table 3. Results of the experiment's first phase. * significant improvement over Random() (TukeyHSD, 95% confidence).
Method                    Conversion rate    Click through rate (CTR)
Random() (baseline)       0.97%              3.02%
PageView()                1.34%              4.11%*
MouseActions()            0.96%              4.15%*
TimeOnPage()              1.71%              4.50%*
Scrolling()               1.98%              4.94%*
Purchases()               1.39%              4.06%
ClickThrough()            0.84%              4.32%*
ClickThrough/Shown()      1.70%              4.38%*
5.3     Combining implicit factors
Following the first phase, we have defined three main tasks and performed experiments to obtain at least initial answers for them:
T1. Measure whether combined methods produce better recommendations than the single-factor ones.
T2. Measure whether various combination functions affect recommendation effectiveness.
T3. How to use our results in more complex recommending methods.
Table 4: Results of combined methods: AVG stands for average; in Weighted_AVG we use the factor's placement in the CTR results as its weight; similarly, Prior_to prioritizes the first factor over the second, etc. * significant improvement over Random() (TukeyHSD). ** significant improvement over AVG(best 3 factors) (TukeyHSD). *** significant improvement over Scrolling() (t-test).
Method                                        Conversion rate    CTR
Random() (baseline1)                          0.97%              3.19%
Scrolling() (baseline2)                       1.07%              4.36% *
AVG(all factors)                              1.41%              4.54% *
AVG(best 3 factors)                           1.35%              3.95%
Weighted_AVG(best 3 factors)                  1.49%              4.95% *,**
Prior_to(best 3 factors)                      1.05%              5.12% *,**,***
Collaborative+ Weighted_AVG (all factors)     0.95%              4.64% *
Again, the conversion rate unfortunately did not provide us with any significant results, so we have focused on the CTR. The combined methods overall achieved better results than Scrolling(), but only Prior_to() was significantly better. Almost every method outperforms the Random() recommendation.
For Task 2, we have compared the Weighted average, Prioritization and Average methods on the best three implicit factors. Both Prior_to and Weighted_AVG achieved significantly better click through rate than AVG, from which we can conclude that there are important differences in the performance of single implicit factors and that the combination function should weight the factors according to their performance. However, even though the Prior_to CTR results were better than those of Weighted_AVG, the difference was not significant, so we cannot yet conclude which combination method is the best.
For the third task, we have slightly changed our experiment schema (see Figure 3): we have exchanged the ObjectRating() method for UserObjectRating(User, Object, Feedback type), which calculates the object rating separately for each relevant user (see pseudo SQL code below).
-- pseudo SQL: as above, restricted to user u (normalization per user is assumed)
SELECT SUM(value) / (SELECT MAX(total) FROM
         (SELECT SUM(value) AS total FROM Feedback
           WHERE User = u AND FeedbackType = f GROUP BY Object) AS totals) AS ObjectRating
  FROM Feedback
 WHERE User = u AND Object = o AND FeedbackType = f
UPComp then calculated standard user-to-user collaborative filtering. The results of this method (see Table 4, Collaborative+ Weighted_AVG) were rather moderate. The method outperforms AVG, Scrolling and Random in CTR; however, the difference was not significant and other simple methods
(e.g. Prior_to) achieved better results. One of the possible problems was the higher computational complexity of this method, resulting in a higher response time which could reduce the user's interest in the objects presented in the recommending area. In the future, this method can be compared with or replaced by e.g. object-to-object collaborative filtering with precomputed similarity as described in [12].
6.    CONCLUSIONS AND FUTURE WORK
In this paper, we have discussed the problem of using multiple implicit factors and how to formulate the user's preference from them. We have adapted the Preference algebra model to this task, selected several promising implicit factors and organized a small online experiment to verify our ideas. The experiment results showed that most of our proposed factors outperform the baseline recommendation and that it is important to use multiple implicit factors combined according to their performance.
The usage of e-commerce success metrics (especially CTR) to determine the success of recommendations provided us with interesting results, so we plan to continue using click through rate as a success metric (and conversions only in large-scale experiments, because of the relatively small number of purchases).
Our research in this field is in its early stage, so there is space both for more experiments (e.g. with negative implicit feedback, dependencies between various factors, the temporal aspect of user's preference and behavior, etc.) and for possible improvements in our experimental settings (e.g. replacing recommending methods, extending the set of implicit factors, etc.).
However, our main task should be to move from such experiments to a working recommender system based on implicit preferences with various (dynamic) importances.
7.    ACKNOWLEDGMENTS
The work on this paper was supported by Czech projects SVV-2012-265312, MSM 0021620838 and GACR 202-10-0761.
REFERENCES
[1] Mark Claypool, Phong Le, Makoto Wased, and David Brown. 2001. Implicit interest indicators. In Proceedings of the 6th international conference on Intelligent user interfaces (IUI '01). ACM, New York, NY, USA, 33-40.
[2] Eckhardt A., Horváth T., Vojtáš P.: PHASES: A User Profile Learning Approach for Web Search. In Proc. of WI 2007, Silicon Valley, CA, IEEE Computer Society, pp. 780-783.
[3] Alan Eckhardt, Peter Vojtáš: Combining Various Methods of Automated User Decision and Preferences Modelling. MDAI '09 172-181. Springer-Verlag Berlin, Heidelberg, 2009.
[4] Alan Eckhardt, Peter Vojtáš. 2009. How to learn fuzzy user preferences with variable objectives. In proc. of IFSA/EUSFLAT Conf. 2009: 938-943.
[5] Jill Freyne, Shlomo Berkovsky, and Gregory Smith. 2011. Recipe recommendation: accuracy and reasoning. In Proceedings of the 19th international conference on User modeling, adaption, and personalization (UMAP'11). Springer-Verlag, Berlin, Heidelberg, 99-110.
[6] Yifan Hu, Yehuda Koren, and Chris Volinsky. 2008. Collaborative Filtering for Implicit Feedback Datasets. In Proc. of ICDM '08. IEEE Computer Society, Washington, DC, USA, 263-272.
[7] Gawesh Jawaheer, Martin Szomszor, and Patty Kostkova. 2010. Comparison of implicit and explicit feedback from an online music recommendation service. In Proc. of HetRec'10. ACM, New York, NY, USA, 47-51.
[8] Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, Filip Radlinski, and Geri Gay. 2007. Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search. ACM Trans. Inf. Syst. 25, 2, Article 7 (April 2007). DOI=10.1145/1229179.1229181
[9] Kelly, D. & Belkin, N. J. Display time as implicit feedback: understanding task effects. Proceedings of the 27th ACM SIGIR conference on Research and development in information retrieval, ACM, 2004, 377-384.
[10] Kießling, W.; Endres, M. & Wenzel, F. The Preference SQL System - An Overview. IEEE Data Eng. Bull., 2011, 34, 11-18.
[11] Lee, D. H. & Brusilovsky, P. Reinforcing Recommendation Using Implicit Negative Feedback. In Proc. of UMAP 2009, Springer, LNCS, 2009, 422-427.
[12] Linden, G.; Smith, B. & York, J. Amazon.com recommendations: item-to-item collaborative filtering. Internet Computing, IEEE, 2003, 7, 76-80.
[13] Ladislav Peska, Alan Eckhardt, and Peter Vojtas. 2011. UPComp - A PHP Component for Recommendation Based on User Behaviour. In Proceedings of WI-IAT '11, IEEE Computer Society, Washington, DC, USA, 306-309.
[14] Ladislav Peska and Peter Vojtas. 2012. Estimating Importance of Implicit Factors in E-commerce. To appear on WIMS 2012, http://ksi.mff.cuni.cz/~peska/wims12.pdf
[15] Pizzato, L.; Rej, T.; Chung, T.; Koprinska, I. & Kay, J. RECON: a reciprocal recommender for online dating. Proc. of RecSys'10, ACM, 2010, 207-214.
[16] Symeonidis, P.; Tiakas, E. & Manolopoulos, Y. Product recommendation and rating prediction based on multi-modal social network. Proc. of RecSys'11, ACM, 2011, 61-68.
[17] Bo Xiao and Izak Benbasat. 2007. E-commerce product recommendation agents: use, characteristics, and impact. MIS Q. 31, 1 (March 2007), 137-209.