<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Dynamics of Search Engine Rankings - A Case Study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Judit Bar-Ilan</string-name>
          <email>judit@cc.huji.ac.il</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mark Levene</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mazlita Mat-Hassan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science and Information Systems, Birkbeck, University of London</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>The Hebrew University of Jerusalem and Bar-Ilan University, Israel</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>The objective of this study was to characterize the changes in the rankings of the top-n results of major search engines over time and to compare the rankings between these engines. We considered only the top-ten results, since users usually inspect only the first page returned by the search engine, which normally contains ten results. In particular, we compare rankings of the top ten results of the search engines Google and AlltheWeb on identical queries over a period of three weeks. The experiment was repeated twice, in October 2003 and in January 2004, in order to assess changes to the top ten results of some of the queries over a three-month period. Results show that the rankings of AlltheWeb were highly stable over each period, while the rankings of Google underwent constant yet minor changes, with occasional major ones. Changes over time can be explained by the dynamic nature of the Web or by fluctuations in the search engines' indexes (especially when frequent switches in the rankings are observed). The top ten results of the two search engines have surprisingly low overlap. With such small overlap (occasionally only a single URL) the task of comparing the rankings of the two engines becomes extremely challenging, and additional measures are needed to assess rankings in such situations.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The Web is growing continuously; new pages are published on the Web every day. However, it is not enough to publish a Web page: the page must also be locatable. Currently the primary tools for locating information on the Web are the search engines, and by far the most popular search engine is Google (Nielsen/NetRatings, 2003; Sullivan &amp; Sherman, 2004). Google reportedly covers over 4.2 billion pages as of mid-February 2004 (Google, 2004; Price, 2004), a considerable jump from the more than 3.3 billion pages reported between August 2003 and mid-February 2004. Some of the pages indexed by Google are not from the traditional “publicly indexable Web” (Lawrence &amp; Giles, 1999), for example records from OCLC's WorldCat (Quint, 2003). Currently the second largest search engine in terms of the reported number of indexed pages is AlltheWeb, with over 3.1 billion pages (AlltheWeb, 2004). At the time of our data collection, the two search engines were of similar size. There are no recent studies on the coverage of Web search engines, but the 1999 study of Lawrence and Giles found that the then-largest search engine (NorthernLight) covered only about 16% of the Web.</p>
      <p>
        Today, authors of Web pages can influence the inclusion of their pages through paid-inclusion services. AlltheWeb has a paid-inclusion service, and even though Google does not, one's chances of being crawled are increased if the pages appear in major directories (which do have paid-inclusion services) (Sullivan, 2003a). However, it is not enough to be included in the index of a search engine; placement is also crucial, since most Web users do not browse beyond the first ten or twenty results (Silverstein et al., 1999; Spink et al., 2002). Paid inclusion is not supposed to influence the placement of the page. SEOs (Search Engine Optimizers) offer their services to increase the ranking of your pages on certain queries (see for example Search Engine Optimization, Inc., http://www.seoinc.com/) – Google
        <xref ref-type="bibr" rid="ref19 ref4 ref5">(Google, 2003a)</xref>
        warns against careless use of such services. Thus it is clear to all that the top ten results retrieved on a given query have the best chance of being visited by Web users. This was the main motivation for the research we present herein, in addition to examining the changes over time in the top ten results for a set of queries on the currently two largest search engines, Google and AlltheWeb. In parallel to this line of enquiry, we also studied the similarity (or rather dissimilarity) between the top ten results of these two tools.
      </p>
      <p>
        For this study, we could not analyze the ranking algorithms of the search engines, since these
are kept secret, both because of the competition between the different tools and in order to
avoid misuse of the knowledge of these algorithms by users who want to be placed high on
specific queries. For example, Google is willing to disclose only that its ranking algorithm
involves more than 100 factors, but “due to the nature of our business and our interest in
protecting the integrity of our search results, this is the only information we make available to
the public about our ranking system”
        <xref ref-type="bibr" rid="ref12 ref19 ref4 ref5">(Google, 2003b)</xref>
        . Thus we had to use empirical methods
to study the differences in the ranking algorithms and the influence of time on the rankings of
search engines.
      </p>
      <p>
        The usual method of evaluating rankings is through human judgment. In an early study by
        <xref ref-type="bibr" rid="ref17">Su
et al. (1998)</xref>
        , users were asked to choose and rank the five most relevant items from the first
twenty results retrieved for their queries. In their study, Lycos performed better on this
criterion than the other three examined search engines.
        <xref ref-type="bibr" rid="ref8">Hawking et al. (1999)</xref>
        compared
precision at 20 of five commercial search engines with precision at 20 of six TREC systems.
The results for the commercial engines were retrieved from their own databases, while the
TREC engines’ results came from an 18.5 million pages test collection of Web pages.
Findings showed that the TREC systems outperformed the Web search engines, and the
authors concluded that “the standard of document rankings produced by public Web search
engines is by no means state-of-the-art.” On the other hand,
        <xref ref-type="bibr" rid="ref14">Singhal and Kaszkiel (2001)</xref>
        compared a well-performing TREC system with four Web search engines and found that “for
finding the web page/site of an entity, commercial web search engines are notably better than
a state-of-the-art TREC algorithm.” They were looking for home pages of the entity and
evaluated the search tool by the rank of the URL in the search results that pointed to the
desired site. In Fall 1999,
        <xref ref-type="bibr" rid="ref7">Hawking et al. (2001)</xref>
        evaluated the effectiveness of twenty public
Web search engines on 54 queries. One of the measures used was the reciprocal rank of the
first relevant document – a measure closely related to ranking. The results showed significant
differences between the search engines and high intercorrelation between the measures.
        <xref ref-type="bibr" rid="ref2">Chowdhury and Soboroff (2002)</xref>
        also evaluated search effectiveness based on the reciprocal
rank – this time of the URL of a known item.
      </p>
      <p>
        Evaluations based on human judgments are unavoidably subjective.
        <xref ref-type="bibr" rid="ref22">Voorhees (2000)</xref>
        examined this issue, and found very high correlations among the rankings of the systems
produced by different relevance judgment sets. The paper considers rankings of the different
systems and not rankings within the search results, and despite the fact that the agreement on
the ranking performance of the search tools was high, the mean overlap between the relevance
judgments on individual documents of two judges was below 50% (binary relevance
judgments were made).
        <xref ref-type="bibr" rid="ref16">Soboroff et al. (2001)</xref>, based on the finding that differences in human
judgments of relevance do not affect the relative evaluated performance of the different
systems, proposed a method for ranking systems by randomly selecting “pseudo-relevant”
documents. In a recent study, Vaughan (to appear) compared human rankings of 24
participants with those of three large commercial search engines, Google, AltaVista and
Teoma on four search topics. The highest average correlation between the human-based
rankings and the rankings of the search engines was for Google, where the average correlation
was 0.72. The average correlation for AltaVista was 0.49.
        <xref ref-type="bibr" rid="ref3">Fagin et al. (2003)</xref>
        proposed a method for comparing the top-k results retrieved by different
search engines. One of the applications of the metrics proposed by them was comparing the
rankings of the top 50 results of seven public search tools (AltaVista, Lycos, AlltheWeb,
HotBot, NorthernLight, AOLSearch and MSNSearch - some of them received their results
from the same source, e.g., Lycos and AlltheWeb) on 750 queries. The basic idea of their
method was to assign some reasonable, virtual placement to documents that appear in one of
the lists but not in the other. The resulting measures were proven to be metrics, which is a
major point they stress in their paper.
      </p>
      <p>The studies we have mentioned concentrate on comparing the search results of several
engines at one point in time. In contrast, this study examines the temporal changes in search
results over a period of time within a single engine and between different engines. In
particular, we concentrate on the results of two of the largest search engines, Google and
AlltheWeb, using the three measures described below.</p>
    </sec>
    <sec id="sec-2">
      <title>Methodology</title>
      <sec id="sec-2-1">
        <title>Data Collection</title>
        <p>The data for this study was collected during two time periods, each approximately three
weeks long: the first during October 2003 and the second during January 2004. The data
collection for the first period was a course assignment at Birkbeck, University of London. Each
student was required to choose a query from a list of ten queries and also to choose an
additional query of his/her own liking. These two queries were to be submitted to Google
(google.com) and AlltheWeb (alltheweb.com) twice a day (morning and evening) during a
period of three weeks. The students were to record the ranked list of the top ten retrieved
URLs for each search point. Overall, 34 different queries were tracked by twenty-seven
students (some of the queries were tracked by more than one student). The set of all queries
that were processed, with the numbering assigned to them, appears in Table 1. For the first
period, queries q01-q05 were analyzed.</p>
        <p>
          The process was repeated at the beginning January 2004. We picked 10 queries from the list
of 34 queries. This time we queried Google.com, Google.co.uk, Google.co.il and Alltheweb
in order to assess the differences between the different Google sites as well. In this
experiment, at each data collection point all the searches were carried out within a 20-minute
timeframe. The reason for rerunning the searches was to study the effect of time on the top
ten results. Between the two parts of the experiment, Google most likely introduced a major
change into its ranking algorithm (called the “Florida Google Dance” -
          <xref ref-type="bibr" rid="ref12 ref18 ref19 ref20">(Sullivan, 2003b)</xref>
          ),
and we were interested in studying the effects of this change. For the second period, queries
q01-q10 were analyzed. The search terms were not submitted as phrases at either stage.
        </p>
        <p>Query ID and query (Table 1): q01 - Modern architecture; q02 - Web data mining; q03 - world rugby; q04 - Web personalization; q05 - Human Cloning; q06 - Internet security; q07 - Organic food; q08 - Snowboarding; q09 - dna evidence; q10 - internet advertising techniques.</p>
        <p>We used three measures in order to assess the changes over time in the rankings of the search
engines and to compare the results of Google and AlltheWeb. The first and simplest measure
is the size of the overlap between two top-ten lists.</p>
        <p>The second measure was Spearman’s rho. Spearman’s rho is applied to two rankings of the
same set, thus if the size of the set is N, all the rankings must be between 1 and N (ties are
allowed). Since the top ten results retrieved by two search engines on a given query, or
retrieved by the same engine on two consecutive days are not necessarily identical, the two
lists must be transformed before Spearman’s rho can be computed. First the non-overlapping
URLs were eliminated from both lists, and then the remaining lists were reranked, each URL
was given its relative rank in the set of remaining URLs in each list. After these
transformations Spearman’s rho could be computed:
r = 1 − 6Σdᵢ² / (n(n² − 1))
where dᵢ is the difference between the rankings of URLᵢ in the two lists. The value of r is
between -1 and 1, where -1 indicates that the two lists have opposite rankings, and 1 indicates
perfect correlation. Note that Spearman’s rho is based on the reranked lists; thus, for
example, if the original ranks of the URLs that appear in both lists (the overlapping pairs) are
(1,8), (2,9) and (3,10), the reranked pairs will be (1,1), (2,2) and (3,3) and the value of
Spearman’s rho will be 1 (perfect correlation).</p>
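The eliminate-and-rerank procedure described above can be sketched in Python (a hypothetical helper for illustration, not the authors' code); the example reproduces the (1,8), (2,9), (3,10) case from the text:

```python
def rerank_spearman(list_a, list_b):
    """Spearman's rho on the overlapping URLs of two top-10 lists.

    Non-overlapping URLs are dropped and the survivors are re-ranked
    1..n by their relative order within each list, as described above.
    """
    common = [u for u in list_a if u in list_b]   # overlap, in list_a order
    n = len(common)
    if n < 2:
        return None                               # rho is undefined for n < 2
    # relative rank of each overlapping URL within each original list
    rank_a = {u: r for r, u in enumerate(common, start=1)}
    order_b = sorted(common, key=list_b.index)
    rank_b = {u: r for r, u in enumerate(order_b, start=1)}
    d2 = sum((rank_a[u] - rank_b[u]) ** 2 for u in common)
    return 1 - 6 * d2 / (n * (n * n - 1))

# u1, u2, u3 ranked 1, 2, 3 by one engine and 8, 9, 10 by the other:
a = ["u1", "u2", "u3"]
b = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "u1", "u2", "u3"]
print(rerank_spearman(a, b))  # → 1.0 (perfect correlation after reranking)
```

After reranking, the (1,8), (2,9), (3,10) pairs become (1,1), (2,2), (3,3), so all dᵢ vanish and rho is 1, exactly as the text notes.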
        <p>
          The third measure utilized by us was one of the metrics introduced by
          <xref ref-type="bibr" rid="ref3">Fagin et al. (2003)</xref>
          . It is
relatively easy to compare two rankings of the same list of items – for this well-known
statistical measures such as Kendall’s tau or Spearman’s rho can be easily utilized. The
problem arises when the two search engines that are being compared rank non-identical sets
of documents. To cover this case (which is the usual case when comparing top-k lists created
by different search engines),
          <xref ref-type="bibr" rid="ref3">Fagin et al. (2003)</xref>
          extended the previously mentioned metrics.
Here we discuss only the extension of Spearman’s footrule (a variant of Spearman’s rho
which, unlike Spearman’s rho, is a metric); the extensions of Kendall’s tau are shown in
the paper to be equivalent to the extension of Spearman’s footrule. A major point of their
method was to develop measures that are either metrics or “near” metrics. Spearman’s
footrule is the L1 distance between two permutations (rankings on identical sets
can be viewed as permutations): F(σ₁, σ₂) = Σᵢ |σ₁(i) − σ₂(i)|. This metric is extended to the
case where the two lists are not identical: documents appearing in one of the lists but not in
the other are assigned an arbitrary placement (larger than the length of the list) in the
second list; when comparing lists of length k this placement can be k+1 for all the
documents not appearing in the list. The rationale for this extension is that the ranking of
those documents must be k+1 or higher – Fagin et al. do not take into account the possibility
that those documents are not indexed at all by the other search engine. The extended metric
becomes:
        </p>
        <p>F^(k+1)(τ₁, τ₂) = 2(k − z)(k + 1) + Σ_{i∈Z} |τ₁(i) − τ₂(i)| − Σ_{i∈S} τ₁(i) − Σ_{i∈T} τ₂(i)
where Z is the set of overlapping documents, z is the size of Z, S is the set of documents
that appear only in the first list and T is the set of documents that appear only in the second list. A
problem with the measures proposed by Fagin et al. is that when the two lists have little in
common, the non-common documents have a major effect on the measure. Our experiments
show that usually the overlap between the top ten results of two search engines for an
identical query is very small, and so the non-overlapping elements have a major effect.
F^(k+1) was normalized by Fagin et al. so that the values lie between 0 and 1. For k = 10 the
normalization factor is 110. Since F^(k+1) is a distance measure, the smaller the value, the more
similar the two lists; for Spearman’s rho, by contrast, the more similar the two lists are, the
nearer the value of the measure is to 1. In order to be able to compare the
two measures, we computed</p>
        <p>G^(k+1) = 1 − F^(k+1) / max F^(k+1)</p>
        <p>which we refer to as the G metric.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Data analysis</title>
        <p>For a given search engine and a given query we computed these measures on the results for
consecutive data collection points. When comparing two search engines we computed the
measures on the top ten results retrieved by both engines on the given data collection point.
The two periods were compared on five queries: we calculated the overlap between the
two periods and assessed the changes in the rankings of the overlapping elements based on
their average rankings.</p>
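The per-query analysis over consecutive data collection points amounts to a simple pairwise loop; a minimal sketch (hypothetical data layout, not the authors' scripts):

```python
def consecutive_overlaps(snapshots):
    """Given a list of top-10 URL lists (one per data collection point),
    return the overlap size between each pair of consecutive points."""
    return [
        len(set(prev) & set(curr))
        for prev, curr in zip(snapshots, snapshots[1:])
    ]

# three capture points for one query on one engine (toy data)
snaps = [["a", "b", "c"], ["a", "b", "d"], ["d", "e", "f"]]
print(consecutive_overlaps(snaps))  # → [2, 1]
```

The same loop structure applies to the Spearman and G computations: each measure is evaluated on a (previous, current) pair of top-ten lists.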
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results and Discussion</title>
      <sec id="sec-3-1">
        <title>A Single Engine over Time</title>
        <p>AlltheWeb was very stable during both phases on all queries, as can be seen in Table 2. There
were almost no changes either in the set of URLs retrieved or in the relative placement of
these URLs in the top ten results. Some of the queries were monitored by several students,
thus the number of comparisons (comparing the results of consecutive data collection
points) was high. For each query we present the total number of URLs identified during the
period, and the average and minimum number of URLs that were retrieved at both of two
consecutive data collection points (overlap). The maximum overlap was 10 for each of the
queries, and an overlap of 10 was rather frequent; thus we computed the percentage of the
comparisons where the set of URLs was not identical at the two points being
compared (% of points with overlap less than 10). In addition, the table displays the percentage
of comparisons where the relative ranking of the overlapping URLs changed, and the minimal
values of Spearman’s rho and of G (the maximal values were 1 in all cases). Finally, in order
to assess the changes in the top-ten URLs over a longer period of time, we also present the
number of URLs that were retrieved at both the first and the last data collection points.
When considering the data for Google we see somewhat larger variability, but still the
changes between two consecutive data points are rather small. Note that for query
3 (world rugby), there were frequent changes in the placement of the top ten URLs.</p>
        <sec id="sec-3-1-1">
          <title>Google - Web Personalization - First Period</title>
          <p>[Figure: rankings of the top ten URLs for this query over the data capture points of the first period.]</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>Second Period</title>
          <p>Similar analysis was carried out for the queries during the second period. The results appear
in Tables 4 and 5. Also during the second period the results and the rankings of AlltheWeb
were highly stable. Google.com exhibited considerable variability, even though the average
overlap was above 9 for all ten queries. Unlike AlltheWeb, quite often the relative placements
of the URLs changed.</p>
          <p>Perhaps the most interesting case for Google.com was query 10 (internet advertising
techniques), where all except two of the previous hits were replaced by completely new ones
(and the relative rankings of the two remaining URLs were swapped), and from this point on
the search engine presented this new set of results. This was not accidental: the same behavior
was observed on Google.co.uk and Google.co.il as well. We do not display the results for
Google.co.uk and Google.co.il here, since the descriptive statistics are very similar, even
though there are slight differences between the result sets. We shall discuss this point more
extensively when we compare the results of the different engines.</p>
          <p>Comparing Two Engines</p>
          <p>At the time of the data collection the two search engines reportedly indexed approximately
the same number of documents (approximately 3 billion). In spite of this, the
results show that the overlap between the top ten results is extremely small (see Tables 6 and
7). The small positive and the negative values of Spearman’s rho indicate that the relative
rankings of the overlapping elements are considerably different – thus even for those URLs
that are considered highly relevant for the given topic by both search engines, the agreement
on the relative importance of these documents is rather low.</p>
          <p>
There are two possible reasons why a given URL does not appear in the top ten results of a
search engine: either it is not indexed by the search engine or the engine ranks it after the first
ten results. We checked whether the URLs identified by the two search engines during the
second period are indexed by the search engine
            <xref ref-type="bibr" rid="ref1 ref6">(we ran this check in February 2004)</xref>
            . We
defined three cases: the URL was in the top-ten list of the engine at some time during the period
(called “top-ten”); it was not in the top ten, but is indexed by the search engine (“indexed”);
or it is not indexed at all (“not indexed”). The results for queries 1-5 appear in Table 8. The
results for these five queries show that both engines index most of the URLs located (between
67.6% and 96.6% of the URLs – top-ten and indexed combined), thus it seems that the
ranking algorithms of the two search engines are highly dissimilar.
          </p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Comparing Two Periods</title>
        <p>The second period of data collection took place about three months after the first one. We
tried to assess the changes in the top ten lists of the two search engines. The findings are
summarized in Table 11. Here we see again that AlltheWeb is less dynamic than Google,
except for query 4 (web personalization), where considerable changes were recorded for
AlltheWeb as well.</p>
        <p>[Table 11: for each query, the number of URLs identified over the two periods, the overlap between the periods, the number of URLs missing from the second set, and the minimum and maximum change in average ranking, for AlltheWeb and for Google.]</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Discussion and Conclusions</title>
      <p>In this paper, we computed a number of measures in order to assess the changes that occur
over time to the rankings of the top ten results on a number of queries for two search engines.
We used several measures, since none of them is satisfactory as a standalone
measure for such assessment. Overlap does not assess rankings at all, while Spearman’s rho
ignores the non-overlapping elements and takes into account relative placement only.
Moreover, Fagin’s measure gives too much weight to the non-overlapping elements. The
three measures together provide a better picture than any of these measures alone. Since none
of these measures are completely satisfactory, we recommend experimenting with additional
measures in the future.</p>
      <p>The results indicate that the top ten results usually change gradually. Abrupt changes were
observed only very occasionally. Overall, AlltheWeb seems to be much less dynamic than
Google. The ranking algorithms of the two search engines seem to be highly dissimilar: even
though both engines index most of the URLs that appeared in the top ten lists, the differences
in the top ten lists are large (the overlap is small and the correlations between the rankings of
the overlapping elements are usually small, sometimes even negative). One reason for Google
being more dynamic may be that its search indexes are unsynchronised while they are
being updated, together with the non-deterministic nature of query processing due to its
distributed architecture.</p>
      <p>An additional area for further research, along the lines of the research carried out by Vaughan
(to appear), is comparing the rankings provided by the search engines with human judgments
placed on the value of the retrieved documents.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>AlltheWeb</surname>
          </string-name>
          (
          <year>2004</year>
          ).
          <source>Retrieved February 18</source>
          ,
          <year>2004</year>
          from http://www.alltheweb.com
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Chowdhury</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Soboroff</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          (
          <year>2002</year>
          ).
          <article-title>Automatic evaluation of World Wide Web Search Services</article-title>
          .
          <source>In Proceedings of the 25th Annual International ACM SIGIR Conference</source>
          ,
          <volume>421</volume>
          -
          <fpage>422</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Fagin</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Sivakumar</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>Comparing top k lists</article-title>
          .
          <source>SIAM Journal on Discrete Mathematics</source>
          ,
          <volume>17</volume>
          (
          <issue>1</issue>
          ),
          <fpage>134</fpage>
          -
          <lpage>160</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Google.</surname>
          </string-name>
          (
          <year>2003a</year>
          ).
          <article-title>Google information for Webmasters</article-title>
          .
          <source>Retrieved February 18</source>
          ,
          <year>2004</year>
          , from http://www.google.com/webmasters/seo.html
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Google.</surname>
          </string-name>
          (
          <year>2003b</year>
          ).
          <article-title>Google information for Webmasters</article-title>
          .
          <source>Retrieved February 18</source>
          ,
          <year>2004</year>
          , from http://www.google.com/webmasters/4.html
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Google.</surname>
          </string-name>
          (
          <year>2004</year>
          ) Retrieved February 18,
          <year>2004</year>
          from http://www.google.com
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Hawking</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Craswell</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bailey</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Griffiths</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          (
          <year>2001</year>
          ).
          <article-title>Measuring search engine quality</article-title>
          .
          <source>Information Retrieval</source>
          ,
          <volume>4</volume>
          ,
          <fpage>33</fpage>
          -
          <lpage>59</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Hawking</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Craswell</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thistlewaite</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Harman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>1999</year>
          ).
          <article-title>Results and challenges in Web search evaluation</article-title>
          .
          <source>In Proceedings of the 8th International World Wide Web Conference</source>
          , May
          <year>1999</year>
          , Computer Networks,
          <volume>31</volume>
          (
          <fpage>11</fpage>
          -
          <lpage>16</lpage>
          ),
          <fpage>1321</fpage>
          -
          <lpage>1330</lpage>
          , Retrieved February 18,
          <year>2004</year>
          , from http://www8.org/w8-papers/2c-search-discover/results/results.html
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Lawrence</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Giles</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (
          <year>1999</year>
          ).
          <article-title>Accessibility of information on the Web</article-title>
          .
          <source>Nature</source>
          ,
          <volume>400</volume>
          ,
          <fpage>107</fpage>
          -
          <lpage>109</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          Nielsen/NetRatings (
          <year>2003</year>
          ).
          <article-title>NetView usage metrics</article-title>
          .
          Retrieved February 18
          ,
          <year>2004</year>
          , from http://www.netratings.com/news.jsp?section=dat_to
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Price</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2004</year>
          ).
          <article-title>Google ups total page count</article-title>
          .
          In
          <source>Resourceshelf</source>
          . Retrieved February 18,
          <year>2004</year>
          , from http://www.resourceshelf.com/archives/2004_02_01_resourceshelf_archive.html#107702946623981034
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Quint</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>OCLC Project Opens WorldCat Records to Google</article-title>
          . In
          <source>Information Today</source>
          . Retrieved February 18
          ,
          <year>2004</year>
          , from http://www.infotoday.com/newsbreaks/nb031027-2.shtml
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Silverstein</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henzinger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marais</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Moricz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>1999</year>
          ).
          <article-title>Analysis of a very large Web search engine query log</article-title>
          .
          <source>ACM SIGIR Forum</source>
          ,
          <volume>33</volume>
          (
          <issue>1</issue>
          ).
          Retrieved February 18
          ,
          <year>2004</year>
          from http://www.acm.org/sigir/forum/F99/Silverstein.pdf
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Singhal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Kaszkiel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2001</year>
          ).
          <article-title>A case study in Web search using TREC algorithms</article-title>
          .
          <source>In Proceedings of the 10th International World Wide Web Conference</source>
          , May
          <year>2001</year>
          ,
          <fpage>708</fpage>
          -
          <lpage>716</lpage>
          . Retrieved February 18,
          <year>2004</year>
          from http://www10.org/cdrom/papers/pdf/p317.pdf
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Spink</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ozmutlu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ozmutlu</surname>
            ,
            <given-names>H. C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Jansen</surname>
            ,
            <given-names>B. J.</given-names>
          </string-name>
          (
          <year>2002</year>
          ).
          <article-title>U.S. versus European Web searching trends</article-title>
          .
          <source>SIGIR Forum</source>
          , Fall 2002.
          Retrieved February 18
          ,
          <year>2004</year>
          from http://www.acm.org/sigir/forum/F2002/spink.pdf
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Soboroff</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nicholas</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Cahan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2001</year>
          ).
          <article-title>Ranking retrieval systems without relevance judgments</article-title>
          .
          <source>In Proceedings of the 24th annual international ACM SIGIR conference</source>
          ,
          <fpage>66</fpage>
          -
          <lpage>72</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Su</surname>
            ,
            <given-names>L. T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>H.L.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Dong</surname>
            ,
            <given-names>X. Y.</given-names>
          </string-name>
          (
          <year>1998</year>
          ).
          <article-title>Evaluation of Web-based search engines from the end-user's perspective: A pilot study</article-title>
          .
          <source>In Proceedings of the ASIS Annual Meeting</source>
          ,
          <volume>35</volume>
          ,
          <fpage>348</fpage>
          -
          <lpage>361</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Sullivan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2003a</year>
          ).
          <article-title>Buying your way in: Search engine advertising chart</article-title>
          .
          Retrieved February 18
          ,
          <year>2004</year>
          , from http://www.searchenginewatch.com/webmasters/article.php/2167941
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Sullivan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2003b</year>
          ).
          <article-title>Florida Google dance resources</article-title>
          .
          Retrieved February 18
          ,
          <year>2004</year>
          from http://www.searchenginewatch.com/searchday/article.php/3285661
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Sullivan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Sherman</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2004</year>
          ).
          <article-title>4th Annual Search Engine Watch 2003 Awards</article-title>
          . Retrieved February 18
          ,
          <year>2004</year>
          , from http://www.searchenginewatch.com/awards/article.php/3309841
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Vaughan</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (to appear).
          <article-title>New measurements for search engine evaluation proposed and tested</article-title>
          . To appear in
          <source>Information Processing &amp; Management</source>
          . doi:10.1016/S0306-4573(03)00043-8
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Voorhees</surname>
            ,
            <given-names>E. M.</given-names>
          </string-name>
          (
          <year>2000</year>
          ).
          <article-title>Variations in relevance judgments and the measurement of retrieval effectiveness</article-title>
          .
          <source>Information Processing and Management</source>
          ,
          <volume>36</volume>
          ,
          <fpage>697</fpage>
          -
          <lpage>716</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>