=Paper= {{Paper |id=Vol-512/paper-1 |storemode=property |title=Demonstration of Improved Search Result Relevancy Using Real-Time Implicit Relevance Feedback |pdfUrl=https://ceur-ws.org/Vol-512/paper01.pdf |volume=Vol-512 |dblpUrl=https://dblp.org/rec/conf/sigir/CramerWH09 }} ==Demonstration of Improved Search Result Relevancy Using Real-Time Implicit Relevance Feedback== https://ceur-ws.org/Vol-512/paper01.pdf
       Demonstration of Improved Search Result Relevancy
         Using Real-Time Implicit Relevance Feedback

                   David Hardtke                        Mike Wertheim                       Mark Cramer
                    Surf Canyon                          Surf Canyon                        Surf Canyon
                   Incorporated                          Incorporated                       Incorporated
                     274 14th St.                         274 14th St.                       274 14th St.
                  Oakland, CA 94612                    Oakland, CA 94612                  Oakland, CA 94612
          hardtke@surfcanyon.com                 mikew@surfcanyon.com               mcramer@surfcanyon.com

ABSTRACT                                                           to infer the relevance or non-relevance of documents. Many
Surf Canyon has developed real-time implicit personaliza-          different user behavior signals can contribute to a proba-
tion technology for web search and implemented the tech-           bilistic evaluation of document relevance. Explicit docu-
nology in a browser extension that can dynamically mod-            ment relevance determinations are more accurate, but im-
ify search engine results pages (Google, Yahoo!, and Live          plicit relevance determinations are more easily obtained as
Search). A combination of explicit (queries, reformulations)       they require no additional user effort.
and implicit (clickthroughs, skips, page reads, etc.) user
signals are used to construct a model of instantaneous user        2.   IMPLICIT SIGNALS AND USER INFOR-
intent. This user intent model is combined with the ini-                MATION NEED
tial search result rankings in order to present recommended
                                                                      With the large, open nature of the World Wide Web it is
search results to the user as well as to reorder subsequent
                                                                   very difficult to evaluate the quality of search engine algo-
search engine results pages after the initial page. This pa-
                                                                   rithms using explicit human evaluators. Hence, there have
per will use data from the first three months of Surf Canyon
                                                                   been numerous investigations into using implicit user sig-
usage to show that a user intent model built from implicit
                                                                   nals for evaluation and optimization of search engine quality.
user signals can dramatically improve the relevancy of search
                                                                   Several studies have investigated the extent to which a click-
results.
                                                                   through on a specific search engine result can be interpreted
                                                                   as a user indication of document relevancy (for a review see
Keywords                                                           [3]). The primary issue involving clickthrough data is that
Implicit Relevance Feedback, Personalization, Adaptive Search      users are most likely to click on higher ranked documents
System                                                             because they tend to read the SERP (search engine results
                                                                   page) from top to bottom. Additionally, users trust that
                                                                   a search engine places the most relevant documents at the
1. INTRODUCTION                                                    highest positions on the SERP.
   It has long been demonstrated that explicit relevance              Joachims et al. used eye tracking studies combined with
feedback can improve both precision and recall in informa-         manual relevance judgements to investigate the accuracy of
tion retrieval[1]. An initial query is used to retrieve a set of   clickthrough data for implicit relevance feedback [4]. They
documents. The user is then asked to manually rate a sub-          conclude that clickthrough data can be used to accurately
set of the documents as relevant or not relevant. The terms        determine relative document relevancies. If, for instance,
appearing in the relevant documents are then added to the          a user clicks on a search result after skipping other search
initial query to produce a new query. Additionally,                results, subsequent evaluation by human judges shows that
non-relevant documents can be used to remove or de-emphasize       in ∼80% of cases the clicked document is more relevant to
terms for the reformulated query. This process can be re-          the query than the documents that were skipped.
peated iteratively, but it was found that after a few iterations      In addition to clickthroughs, other user behaviors can be
very few new relevant documents are found [2].                     related to document relevancy. Fox et al. used a browser
   Explicit relevance feedback as described above requires ac-     add-in to track user behavior for a volunteer sample of of-
tive user participation. An alternative method that does not       fice workers[5]. In addition to tracking their search and web
require specific user participation is pseudo relevance feed-      usage, the browser add-in would prompt the user for spe-
back. In this scheme, the top N documents from the initial         cific relevance evaluations for pages they had visited. Using
query are assumed to be relevant. The important terms in           the observed user behavior and subsequent relevance evalu-
these documents are then used to expand the original query.        ations, they were able to correlate implicit user signals with
   Implicit Relevance Feedback aims to improve the precision       explicit user evaluations and determine what user signals
and recall of information retrieval by utilizing user actions      are most likely to indicate document relevance. For pages
                                                                   clicked by the user, the user indicated that they were either
                                                                   satisfied or partially satisfied with the document nearly 70%
                                                                   of the time. In the study, two other variables were found
SIGIR ’09, July 19-23, 2009, Boston, USA.                          to be most important for predicting user satisfaction with
Copyright is held by the author/owner(s).                          a result page visit. The first was the duration of time that
the user spent away from the SERP before returning – if           4.     TECHNOLOGICAL DETAILS
the user was away from the SERP for a short period of time           Surf Canyon’s technology can be used as both a tradi-
they tended to be dissatisfied with the document. The other       tional web search engine and as a browser extension that dy-
important variable for predicting user satisfaction was the       namically modifies the search results page from commercial
“Exit type” – users that closed the browser on a result page      search engines (currently Google, Yahoo!, and Live Search).
tended to be satisfied with that result page. The impor-          The underlying algorithms in the two cases are mostly iden-
tant outcome of this and other studies is that implicit user      tical. As the data presented was gathered using the browser
behavior can be used instead of explicit user feedback to         extension, we will describe that here.
determine the user’s information need.                               Surf Canyon’s browser extension was publicly launched
                                                                  on February 19, 2008. From that point forward visitors to
                                                                  the Surf Canyon website2 were invited to download a small
3. IMPLICIT REAL-TIME PERSONALIZA-                                piece of free software that is installed in their browser. The
   TION                                                           software works with both Internet Explorer and Firefox. Al-
   As discussed in the previous section, it has been shown        though the implementation differs for the two browsers, the
that implicit user behavior can often indicate satisfaction with  functionality is identical.
visited results pages. The goal of the Surf Canyon technol-          Internet Explorer leads in all current studies of web browser
ogy is to use implicit user behavior to predict which unseen      market share with March 2008 market share estimated be-
documents in a collection are most relevant to the user and       tween 60% and 90%. Among users of the Surf Canyon
to recommend these documents to the user.                         browser extension, however, about 75% use Firefox. Among
   Shen, Tan, and Zhai1 have investigated context-sensitive       users who merely visit the extension download page, the
adaptive information retrieval systems [6]. They use both         breakdown by browser type is nearly 50/50. Part of the
clickthrough information and query history information to         skew towards Firefox in both website visitors and users of the
update the retrieval and ranking algorithm. A TREC collec-        product can be attributed to the fact that marketing of the
tion was used since manual relevancy judgements are avail-        product has been mainly via technology blogs. Readers of
able. They built an adaptive search interface to this collec-     technology blogs are more likely to use operating systems for
tion, and had 3 volunteers conduct searches on 30 relatively      which Internet Explorer is not available (e.g. Mac, Linux).
difficult TREC topics. The users could query, re-query, ex-       Additionally, we speculate that Firefox may be more preva-
amine document summaries, and examine documents. To               lent among readers of technology blogs. The difference be-
quantify the retrieval algorithms, they used Mean Average         tween the fraction of visitors to the site using Firefox (∼50%)
Precision (MAP) or Precision at 20 documents. As these            and the fraction of people who install and use the product
were difficult TREC topics, users submitted multiple queries      using Firefox (∼75%) is likely due to the more widespread
for each topic. They found that including query history           acceptance of browser extensions in the Firefox
produced a marginal improvement in MAP, while use of              community. The Firefox browser was specifically designed to
clickthrough information produced dramatic increases (up          have minimal core functionality augmented by browser add-
to nearly 100%) in MAP.                                           ons submitted by the developer community. The technolo-
   Shen et al. also built an experimental adaptive search in-     gies used to implement Internet Explorer browser extensions
terface called UCAIR (User-Centered Adaptive Information          are also often used to distribute malware so there may be a
Retrieval) [7]. Their client-side search agent has the capabil-   higher level of distrust among IE users.
ity of automatic query reformulation and active reranking of         Once the browser extension is installed, the user never
unseen search results based on a context driven user model.       needs to visit the company web site again to use the prod-
They evaluated their system by asking 6 graduate students         uct. The user enters a Google, Yahoo!, or Live Search web
to work on TREC topic distillation tasks. At the end of           search query just as they would for any search (using either
each topic, the volunteers were asked to manually evaluate        the search bar built into the browser or by navigating to
the relevance of 30 top ranked search results displayed by the    the URL of the search engine). After the initial query, the
system. The top results shown are mixed between Google            search engine results page is returned exactly as it would be
rankings and UCAIR rankings (some results overlap), and           were Surf Canyon not installed (for most users who have not
the evaluators could not distinguish the two. UCAIR rank-         specified otherwise, the default number of search results is
ings show a 20% increase in precision for the top 20 results.     10). Two minor modifications are made to the SERP. Small
   The Surf Canyon browser extension represents the first         bull’s eyes are placed next to the title hyperlink for each
attempt to integrate implicit relevance feedback directly into    search result (see Figure 1). Also, the numbered links to
the major commercial search engines. Hence, we are able to        subsequent search engine results pages at the bottom of the
evaluate this technology outside of controlled studies. From      SERP are replaced by a single “More Results” link.
a research perspective, this is the first study to investigate       The client side browser extension is used to communicate
this technology in the context of normal searches by normal       with the central Surf Canyon servers and to dynamically
users. The drawback is that we have no chance to collect          update the search engine results page. The personalization
a posteriori relevancy judgements from the searchers or to        algorithms currently reside on the Surf Canyon servers. This
conduct surveys to evaluate the user experience. We can,          client-server architecture is used primarily to facilitate op-
however, quickly collect large amounts of user data in order      timization of the algorithm and to support active research
to evaluate the technology.                                       studies. Since web search patterns vary widely by user, the
                                                                  best way to evaluate personalized search algorithms is to
1                                                                 vary the algorithms on the same set of users while main-
 Shen, Tan, and Zhai are co-authors on one Surf Canyon
patent application but were not actively involved in the work
                                                                  2
presented here                                                        http://www.surfcanyon.com
[Figure 1 screenshot: Google results page for the query “implicit relevance feedback” (Results 1 - 10 of about 1,180,000), captured 03/20/2008. The first three organic results are shown:
  1. Relevance feedback - Wikipedia, the free encyclopedia (en.wikipedia.org/wiki/Relevance_feedback)
  2. Implicit Relevance Feedback from Eye Movements (ResearchIndex) (citeseer.ist.psu.edu/730378.html)
  3. Click data as implicit relevance feedback in web search (portal.acm.org/citation.cfm?id=1224561.1224720)
Indented below the third result, under the heading “Surf Canyon recommends 3 search results:”:
  - Using Implicit Relevance Feedback in a Web (ResearchIndex) (citeseer.ist.psu.edu/572595.html)
  - Implicit relevance feedback in interactive music (from page 2) (portal.acm.org/citation.cfm?id=1164845)
  - Scalable Relevance Feedback Using Click-Through Data for Web Image ... (from page 2) (research.microsoft.com/users/leizhang/Paper/ACMMM06-Cheng.pdf)]

Figure 1: A screenshot of the Google search result page with Surf Canyon installed. The third link was
selected by the user, leading to three recommended search results.
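
The behavior illustrated in Figure 1 can be sketched in a few lines of Python. This is only a minimal illustration and not Surf Canyon's proprietary algorithm: the `Result` type, the `intent_score` field, the `blend` weight, and the scoring formula are all hypothetical. Only the structure is taken from the paper: candidates come from the larger result set beyond the ten displayed links, a candidate must outscore the displayed results below the last selected link, and at most three recommendations are shown per selection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    initial_rank: int    # rank assigned by the search engine (1 = top)
    intent_score: float  # hypothetical real-time relevance estimate in [0, 1]


def combined_score(r: Result, blend: float = 0.5) -> float:
    """Hypothetical blend of the initial rank and the real-time estimate."""
    rank_prior = 1.0 / r.initial_rank
    return (1.0 - blend) * rank_prior + blend * r.intent_score


def recommend(results, selected_rank, displayed=10, max_recs=3, blend=0.5):
    """Return up to max_recs undisplayed results that outscore every
    displayed result listed below the selected link."""
    below = [r for r in results if selected_rank < r.initial_rank <= displayed]
    bar = max((combined_score(r, blend) for r in below), default=0.0)
    hidden = [r for r in results if r.initial_rank > displayed]
    better = [r for r in hidden if combined_score(r, blend) > bar]
    better.sort(key=lambda r: combined_score(r, blend), reverse=True)
    return better[:max_recs]
```

Calling `recommend(results, selected_rank=3)` mirrors the situation in the figure, where the user selected the third organic link. The linear blend is just one simple way to combine the two signals; any monotone combination would fit the same skeleton.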
taining an identical user interface. With the client-server         generated immediately below this search result.
architecture, the implicit relevance feedback algorithms can           At the bottom of the 10 organic search results, there is a
be modified without alerting the user to any changes. Noth-         link to get “More Results”. If the user requests the next page
ing fundamental prevents the technology from becoming ex-           of results, all results shown on the second and subsequent
clusively client side.                                              pages are determined using Surf Canyon’s instantaneous rel-
   In addition to the ten results displayed by the search en-       evancy algorithm. Unlike the default search engine behavior,
gine to the user, a larger set of results (typically 200) for       subsequent pages of results are added to the existing page.
the same query is gathered by the server. With few exceptions,      After selecting “More Results”, links 1-20 are displayed in the
the top 10 links in the larger result set are identical             browser, with link 11 focused at the top of the window (the
to the results displayed by the search engine. While the            user needs to scroll up to see links 1-10).
user reads the search result page, the back-end servers parse
the larger result set and prepare to respond to user actions.       5.   ANALYSIS OF USER BEHAVIOR
Each user action on the search result page is sent to the
                                                                       Most previous studies of Interactive Information Retrieval
back-end server (note that we are only using the user’s ac-
                                                                    systems have used post-search user surveys to evaluate the
tions on the SERP for personalization and do not follow the
                                                                    efficacy of the systems. These studies also tended to re-
user after they leave the SERP). For certain actions (select
                                                                    cruit test subjects and use closed collections and/or spe-
a link, select a Surf Canyon bull’s eye, ask for more results)
                                                                    cific research topics. The data presented here was collected
the back end server sends recommended search results to
                                                                    from an anonymous (but not necessarily representative) set
the browser. The Surf Canyon real-time implicit personal-
                                                                    of web surfers during the course of their interactions with
ization algorithm incorporates both the initial rank of the
                                                                    the three leading search engines (Google, Yahoo!, and Live
result and personalized instantaneous relevancies. The im-
                                                                    Search). The majority of searches were conducted using
plicit feedback signals used to calculate the real-time search
                                                                    Google. Where possible, we have analyzed the user data
result ranks are cumulative across all recent related queries
                                                                    independently for each of the search engines and have not
by that user. The algorithm does not, however, utilize any
                                                                    found any cases where the conclusions drawn from this study
long-term user profiling or collaborative filtering. The pre-
                                                                    would differ depending on the user’s choice of search en-
cise details of the Surf Canyon algorithm are proprietary
                                                                    gine. The total number of unique search queries analyzed
and are not important for the evaluation of the technology
                                                                    was ∼700,000.
presented below. If an undisplayed result from the larger set
                                                                       Since the users in this study were acquired primarily from
of results is deemed by Surf Canyon’s algorithm to be more
                                                                    technology web blogs, their search behavior can be expected
relevant than other results displayed below the last selected
                                                                    to be significantly different from that of the average web surfer.
link, it is shown as an indented recommendation below the
                                                                    Thus, we cannot evaluate the real-time personalization tech-
last selected link.
                                                                    nology by comparing to previous studies of web user be-
   The resulting page is shown in Figure 1. Here, the user
                                                                    havior. Also, since we have changed the appearance of the
entered a query for “implicit relevance feedback” on Google3 .
                                                                    SERP and also dynamically modify the SERP, any metrics
Google returned 10 organic search results (only three of
                                                                    calculated from our data cannot be directly compared to
which are displayed in Figure 1) of the 1,180,000 documents
                                                                    historical data due to the different user interface.
in their web index that satisfy the query. The user then
                                                                       Surf Canyon only shows recommendations after a bull’s
selected the third organic search result, a paper from an
                                                                    eye or search result is selected. It is therefore interesting
ACM conference entitled “Click data as implicit relevance
                                                                    to investigate how many actions a user makes for a given
feedback in web search”. Based on the implicit user signals
                                                                    query as this tells us how frequently implicit personalization
(which include interactions with this SERP, recent similar
                                                                    within the same query can be of benefit. Jansen and Spink
queries, and interactions with those results pages) the Surf
                                                                    [8] found from a meta-analysis of search engine log studies
Canyon algorithm recommends three search results. These
                                                                    that user interaction with the search engine results pages is
links were given a numerically higher initial rank (> 10) by
                                                                    decreasing. In 1997, 71% of searchers viewed beyond the first
the Google algorithm in response to the query “implicit rel-
                                                                    page of search results. In 2002 only 27% of searchers looked
evance feedback”. The real-time personalization algorithm
                                                                    past the first page of search results. There is a paucity of
has determined, however, that the three recommended links
                                                                    data on the number of web pages visited per search. Jansen
are more pertinent to this user’s information need at this
                                                                    and Spink [9] reported the mean number of web pages vis-
particular time than the results displayed by Google with
                                                                    ited per query to be 2.5 for AllTheWeb searches in 2001,
initial ranks 4-10.
                                                                    but they exclude queries where no pages were visited in this
   Recommendations are also generated when a user clicks
                                                                    estimate. Analysis of the AOL query logs from 2006 [10]
on the small bull’s eyes next to the link title. We assume
                                                                    gives a mean number of web pages viewed per unique query
that a selection of a bull’s eye indicates that the linked doc-
                                                                    of 0.97. For the current data sample, the mean number of
ument is similar to but not precisely what the user is looking
                                                                    search results visited is 0.56. The comparatively low num-
for. For the analysis below, up to three recommendations
                                                                    ber of search results that were selected in the current study
are generated for each link selection or bull’s eye selection.
                                                                    has multiple partial explanations. The search results page
Unless the user specifically removes recommended search re-
                                                                    now contains multiple additional links (news, videos) that
sults by clicking on the bull’s eye or by clicking the close box,
                                                                    are not counted in this study. Additionally, the information
they remain displayed on the page. Recommendations can
                                                                    that the user is looking for is often on the SERP (e.g. a
nest up to three levels deep – if the user clicks on the first
                                                                    search for a restaurant often produces the map, phone num-
recommended result then up to three recommendations are
                                                                    ber, and address). Search engines have replaced bookmarks
                                                                    and direct URL typing for re-visiting web sites. For such
3
    http://www.google.com                                           navigational searches the user will have either one or zero
clicks depending on whether the specific web page is listed
on the SERP. Additionally, it may be that the current sample
of users is biased towards searchers who are less likely to
click on links.

[Figure 2 plot: Fraction of Queries (%) versus Number Of Search Results Selected (NONE, 1, 2, 3, 4, 5+).]

Figure 2: Distribution of total number of selections per query.

the fact that users do not often click on more than one search
result as discussed above. The important point, however, is
that the Surf Canyon implicit relevance feedback technology
increases the click frequency by ∼80% compared to the
links presented without any real-time user-intent modelling.
The relative increase in clickthrough rate is constant (within
statistical errors) for all display positions even though the
absolute clickthrough rates rapidly drop as a function of
display position.

[Figure 3 plot: Click Probability (%) versus Recommended Link Position (1, 2, 3), with and without implicit feedback.]
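
The comparison reported for Figure 3 is a relative increase in click probability against a control at each display position. A minimal sketch of that calculation follows. The function name is hypothetical, and the counts in the usage lines are made-up placeholders chosen for illustration, not data from this study.

```python
def relative_lift(clicks_treat, shown_treat, clicks_ctrl, shown_ctrl):
    """Relative increase in clickthrough rate of treatment over control.

    A return value of 0.8 corresponds to an 80% increase. Raises
    ValueError when a rate cannot be computed or the control rate is zero.
    """
    if shown_treat <= 0 or shown_ctrl <= 0:
        raise ValueError("impression counts must be positive")
    ctr_treat = clicks_treat / shown_treat
    ctr_ctrl = clicks_ctrl / shown_ctrl
    if ctr_ctrl == 0:
        raise ValueError("control clickthrough rate is zero")
    return ctr_treat / ctr_ctrl - 1.0


# Made-up (clicks, impressions) pairs for display positions 1, 2, and 3.
treatment = [(27, 1000), (18, 1000), (9, 1000)]
control = [(15, 1000), (10, 1000), (5, 1000)]
lifts = [relative_lift(ct, st, cc, sc)
         for (ct, st), (cc, sc) in zip(treatment, control)]
```

In this toy input the absolute rates fall with position while the relative lift stays the same, which is the pattern the text describes for the measured data.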
Figure 3: Probability (%) that a recommended search result will be
clicked as a function of display position relative to the last selected
search result. The red circles are for recommendations selected using
Surf Canyon’s instantaneous relevancy algorithm, while the black
triangles are for the random control sample that does not incorporate
relevance feedback.

   Figure 2 shows the distribution of the total number of selections per
query. 62% of all queries lead to the selection of zero search results.
Since Surf Canyon does nothing until after the first selection, this
number is intrinsic to the current users interacting with these
particular search engines. A recent study by Downey, Dumais and Horvitz
also showed that after a query the user’s next action is to re-query or
end the search session about half the time [11]. In our study, only 12%
of queries lead to more than one user selection. A goal of implicit
real-time personalization would be to decrease direct query
reformulation and to increase the number of informational queries that
lead to multiple selections. The current data sample is insufficient to
study whether this goal has been achieved.
   In order to evaluate the implicit personalization technology
developed by Surf Canyon, we chose to compare the actions of the same
set of users with and without the implicit personalization technology
enabled. Our baseline control sample was created by randomly replacing
recommended search results with random search results selected from
among the results with initial ranks 11-200. These “Random
Recommendations” were only shown for 5% of the cases where
recommendations were generated. The position (1, 2, or 3) in the
recommendation list was also random. These random recommendations were
not necessarily poor, as they do come from the list of results
generated by the search engine in response to the query.
   Figure 3 shows the click frequency for Surf Canyon recommendations as
a function of the position of the recommendation relative to the last
selected search result. Position 1 is immediately below the last
selected search result. Also shown are the click frequencies for
“Random Recommendations” placed at the same positions. In both cases,
the frequency is relative to the total number of recommendations shown
at that position. The increase in click rate (∼60%) is constant within
statistical uncertainties for all recommended link positions. Note that
the recommendations are generated each time a user selects a link and
are considered to be shown even if the user does not return to the
SERP. The low absolute click rates (3% or less) are due to

   Figure 4 shows the per query distribution of initial search result
ranks for all selected search links in the current data sample. The top
10 links are selected most frequently. Search results beyond 10 are all
displayed using Surf Canyon’s algorithm (either through a bull’s eye
selection, a link selection, or when the user selects more results).
For the results displayed by Surf Canyon (initial ranks > 10), the
selection frequency follows a power-law distribution with
$P(IR) = 38\% \cdot IR^{-1.8}$, where $IR$ is the initial rank.
   As Surf Canyon’s algorithm favors links with higher initial rank, the
click frequency distribution does not fully reflect the relevancy of
the links as a function of initial rank. Figure 5 shows the probability
that a shown recommendation is clicked as a function of the initial
rank. This is only for recommendations shown in the first position
below the last selected link. After using Surf Canyon’s instantaneous
relevancy algorithm, this probability shows at most a weak dependence
on the initial rank of the search result. The dotted line shows the
result of a linear regression to the data,
$P(IR) = 3.2 - (0.0025 \pm 0.00101) \cdot IR$. When sufficient data is
available we will repeat the same analysis for “Random Recommendations”
as that will give us a user-interface-independent estimate of the
relative relevance for deep links in the search result set before the
application of the implicit feedback algorithms.
   For the second and subsequent results pages, the browser extension
has complete control over all displayed search results. For a short
period of time we produced search results pages that mixed Surf
Canyon’s top ranked results with results having the top initial ranks
from the search
engine. This procedure was proposed by Joachims as a way to use
clickthrough data to determine relative user preference between two
search engine retrieval algorithms [12]. Each time a user requests
“More Results”, two lists are generated. The first list (SC) contains
the remaining search results as ranked by Surf Canyon’s instantaneous
relevancy algorithm. The second list (IR) contains the same set of
results ranked by their initial display rank from the search engine.
The list of results shown to the user is constructed so that it
contains the top $k_{SC}$ results from the first list and the top
$k_{IR}$ results from the second, with $|k_{SC} - k_{IR}| \le 1$.
Whenever $k_{SC} = k_{IR}$ the next search result is taken from one of
the lists chosen at random. Thus, the topmost search result on the
second page will reflect Surf Canyon’s ranking half the time and the
initial search result order half the time. By mixing the search results
this way, the user will see, on average, an equal number of search
results from each ranking algorithm in each position on the page. The
users have no way of determining which algorithm produced each search
result. If the users select more search results from one ranking
algorithm than from the other, it demonstrates an absolute user
preference for the retrieval function that led to more selections.

[Figure 4 plot. Y-axis: Click Frequency (%), logarithmic scale. X-axis: Initial SERP Rank (0 to 200). Series: Google w/ Surf Canyon.]

Figure 4: Frequency per non-repeated search query for link selection as
a function of initial search result rank.

   Figure 6 shows the ratio of link clicks for the two retrieval
functions. IR is the retrieval function based on the result rank
returned from the search engine. SC is the retrieval function
incorporating Surf Canyon’s implicit relevance feedback technology. The
ratio is plotted as a function of the number of links selected
previously for that query. Previously selected links are generally
considered to be positive content feedback. If, on the other hand, no
links were selected then the algorithm bases its decision exclusively
on negative feedback indications (skipped links) and on the user intent
model that may have been developed for similar recent related queries.
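
The list-mixing procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Surf Canyon's implementation: the function name `mix_rankings`, the "SC"/"IR" source labels, and the coin flip at every tie are hypothetical choices made to follow the description in the text (both lists hold the same results, the numbers of results consumed from the two lists never differ by more than one, and ties are broken at random).

```python
import random
from typing import List, Optional, Sequence, Tuple


def mix_rankings(
    sc: Sequence[str],
    ir: Sequence[str],
    rng: Optional[random.Random] = None,
) -> List[Tuple[str, str]]:
    """Mix two rankings of the same results into one display list.

    Returns (result, source) pairs, where source is "SC" or "IR" and
    records which list contributed the result (hypothetical labels).
    """
    rng = rng or random.Random()
    mixed: List[Tuple[str, str]] = []
    seen = set()
    k_sc = k_ir = 0  # number of results consumed from each list so far

    while k_sc < len(sc) and k_ir < len(ir):
        # Take from the list that is behind; flip a coin when they are level.
        take_sc = k_sc < k_ir or (k_sc == k_ir and rng.random() < 0.5)
        if take_sc:
            result, source = sc[k_sc], "SC"
            k_sc += 1
        else:
            result, source = ir[k_ir], "IR"
            k_ir += 1
        # A result already shown via the other list is not shown twice.
        if result not in seen:
            seen.add(result)
            mixed.append((result, source))
    return mixed


if __name__ == "__main__":
    sc_list = ["d", "c", "b", "a"]  # hypothetical SC ordering
    ir_list = ["a", "b", "c", "d"]  # hypothetical initial-rank ordering
    for result, source in mix_rankings(sc_list, ir_list, random.Random(0)):
        print(result, source)
```

Keeping the source label next to each displayed result is one simple way to attribute later clicks to SC or IR when the click counts are compared.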
[Figure 6 plot. Y-axis: Link Selection Ratio [SC/IR]. X-axis: # Previous Search Results Selected (NONE, 1, 2-4, 5+).]

Figure 6: Ratio of click frequency for second and subsequent search
results page links ordered by Surf Canyon’s Implicit Relevance Feedback
algorithm (SC) compared to links ordered by the initial search engine
result rank (IR).

[Figure 5 plot. Y-axis: Rec. Link Click Prob. (%). X-axis: Initial SERP Rank (0 to 200).]

Figure 5: Probability that a displayed recommended link is selected as
a function of the initial search result rank. These data include only
links from the first position immediately below the last selected
search result.
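
The two fits quoted in the text for Figures 4 and 5 can be evaluated directly. The sketch below only plugs the reported coefficients into the reported functional forms; the function names are hypothetical, and reading the quoted slope uncertainty as one standard error is an assumption.

```python
def selection_frequency_pct(initial_rank: int) -> float:
    """Power-law fit quoted for Figure 4 (in %), stated for initial ranks > 10."""
    return 38.0 * initial_rank ** -1.8


def recommendation_click_pct(initial_rank: int, slope: float = 0.0025) -> float:
    """Linear fit quoted for Figure 5 (in %), first recommended position only."""
    return 3.2 - slope * initial_rank


# Reported slope and its uncertainty (assumed to be one standard error).
SLOPE = 0.0025
SLOPE_ERR = 0.00101
slope_significance = SLOPE / SLOPE_ERR  # how many standard errors from zero

if __name__ == "__main__":
    for rank in (11, 50, 200):
        print(
            f"rank {rank:3d}: "
            f"selection frequency {selection_frequency_pct(rank):.3f}%, "
            f"recommendation click probability {recommendation_click_pct(rank):.2f}%"
        )
    print(f"slope differs from zero by {slope_significance:.1f} standard errors")
```

Under these assumptions the fitted slope sits roughly 2.5 standard errors from zero, which is in line with the "at most a weak dependence" on initial rank described in the text.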
   We observe that, independent of the number of previous user link
selections in the same query, the number of clicks on links from the
relevance feedback algorithm is higher than the number of clicks on
links displayed because of their higher initial rank. This demonstrates
an absolute user preference for the ranking algorithm that utilizes
implicit relevance feedback. Remarkably, the significant user
preference for search results retrieved using the implicit feedback
algorithm is also apparent when the user had zero positive clickthrough
actions on the first 10 results. After skipping the first 10 results
and asking for a subsequent set of search links, the users are ∼35%
more likely to click on the top ranked Surf Canyon result compared to
result #11 from Google. Clearly, the searcher is not so interested in
search results produced by the identical algorithm that produced the 10
skipped links, and an update of the user intent model for this query is
appropriate.

6. CONCLUSIONS AND FUTURE DIRECTIONS
   Surf Canyon is an interactive information retrieval system that
dynamically modifies the SERP from major search engines based on
implicit relevance feedback. It was built to relieve the growing user
frustration with the search experience and to help searchers “find what
they need right now”. The system presents recommended search results
based on an instantaneous user-intent model. By comparing clickthrough
rates, it was shown that real-time implicit personalization can
dramatically increase the relevancy of presented search results.
   Users of web search engines learn to think like the search engines
they are using. As an example, searchers tend to select words with high
IDF (inverse document frequency) when formulating queries: they
naturally select the rarest terms that they can think of that would be
in all documents they desire. Excellent searchers can often formulate
sufficiently specific queries after multiple iterations such that they
eventually find what they need. Properly implemented implicit relevance
feedback would reduce the need for query reformulations, but it should
be noted that in the current study most users had not yet adjusted
their browsing habits to the modified behavior of the search engine. By
tracking the current users in the future we hope to see changes in user
behavior that can further improve the utility of this technology. As
the user-intent model is cumulative, more interaction will produce
better recommendations once the users learn to trust the system.

7. REFERENCES
 [1] J.J. Rocchio. The Smart Retrieval System Experiments in Automatic
     Document Processing. Prentice Hall, 1971.
 [2] D. Harman. Relevance feedback revisited. In Proceedings of the
     Fifteenth International ACM SIGIR Conference, pages 1–10, 1992.
 [3] D. Kelly and J. Teevan. Implicit feedback for inferring user
     preference: A bibliography. SIGIR Forum, 37(2):18–28, 2003.
 [4] Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, and
     Geri Gay. Accurately interpreting clickthrough data as implicit
     feedback. In SIGIR ’05, 2005.
 [5] Steve Fox, Kuldeep Karnawat, Mark Mydland, Susan Dumais, and
     Thomas White. Evaluating implicit measures to improve web search.
     ACM Transactions on Information Systems, 23(2):147–168, April 2005.
 [6] Xuehua Shen, Bin Tan, and ChengXiang Zhai. Context-sensitive
     information retrieval using implicit feedback. In SIGIR ’05, 2005.
 [7] Xuehua Shen, Bin Tan, and ChengXiang Zhai. Implicit user modelling
     for personalized search. In CIKM ’05, 2005.
 [8] B. Jansen and A. Spink. How are we searching the world wide web?:
     a comparison of nine search engine transaction logs. Information
     Processing and Management, 42(1):248–263, 2006.
 [9] B. Jansen and A. Spink. An analysis of web documents retrieved and
     viewed. In The 4th International Conference on Internet Computing,
     pages 65–69, 2003.
[10] G. Pass, A. Chowdhury, and C. Torgeson. A picture of search. In
     The First International Conference on Scalable Information
     Systems, 2006.
[11] D. Downey, S. Dumais, and E. Horvitz. Studies of web search with
     common and rare queries. In SIGIR ’07, 2007.
[12] T. Joachims. Unbiased evaluation of retrieval quality using
     clickthrough data. In SIGIR Workshop on Mathematical/Formal
     Methods in Information Retrieval, 2002.