<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Run-time Management Policies for Data Intensive Web sites</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Christos Bouras</string-name>
          <email>bouras@cti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Agisilaos Konidaris</string-name>
          <email>konidari@cti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Engineering and Informatics Department, University of Patras</institution>
          ,
          <addr-line>26500 Rion, Patras, Greece</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Computer Technology Institute-CTI</institution>
          ,
          <addr-line>Kolokotroni 3, 26221 Patras</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Web developers have been concerned with the issues of Web latency and Web data consistency for many years. These issues have become even more important today, since the accurate and timely dissemination of information is vital to the businesses and individuals that rely on the Web. In this paper we evaluate different run-time management policies against real Web site data. We first define the meaning of data intensive Web sites and categorize them according to their hit patterns. Our research relies on real-world Web data collected from various popular Web sites and proxy log files. We propose a Web site run-time management policy that may apply to various real Web site hit patterns and Web data update frequencies.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Today the WWW is the most popular application on the Internet, because it is easy to use
and keeps all the network functions executed during a browsing session transparent to the user. Users
have the notion that they request information and that this information is somehow brought to their
computer; they do not need to know how this happens. Even though most users do not know how
the WWW works, almost all of them experience a phenomenon formally known as Web latency [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Web latency is nothing
more than the delay between the time that a user requests a Web page (by clicking or typing a URL)
and the time that the Web page actually reaches his or her computer. This intuitive and informal
definition of Web latency has urged Web site developers and Web equipment developers to try to
reduce the waiting time of users, in order to make browsing sessions as productive as possible.
      </p>
      <p>
        A solution that has been proposed for reducing web latency, while at the same time keeping web
data consistent with database changes, is run-time management policies [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ]. The issue of
implementing run-time management policies for data intensive web sites is as old as data
intensive web sites themselves. In order to keep up with user demand and deliver efficient response
times, web sites have resorted to run-time management policies. A run-time management policy
may be viewed as a web server component that executes in parallel with the web server's
functions and has total (or, in some cases, partial) control over those functions. A run-time
management policy receives inputs such as user request rates, database update frequencies and
page popularity values. The outputs produced may be the pages that should be pre-computed, the
pages that should be kept in the server's cache and the pages that should be invalidated from
the cache.
      </p>
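<p>The input/output contract described above can be sketched as follows. This is a minimal illustration with hypothetical names and a toy decision rule, not the policy evaluated in this paper.</p>
<preformat>
```python
# Hypothetical sketch of a run-time management policy component: inputs are
# request rates, update frequencies and popularity values; outputs are the
# pages to pre-compute, to keep cached and to invalidate.
from dataclasses import dataclass, field

@dataclass
class PolicyDecision:
    precompute: list = field(default_factory=list)   # pages to materialize ahead of requests
    keep_cached: list = field(default_factory=list)  # pages worth keeping in the cache
    invalidate: list = field(default_factory=list)   # cached pages that went stale

def run_policy(request_rate, update_freq, popularity):
    """Toy rule: invalidate pages whose updates outpace their requests;
    pre-compute pages that are popular and stable."""
    decision = PolicyDecision()
    for page, pop in popularity.items():
        if update_freq.get(page, 0) > request_rate.get(page, 0):
            decision.invalidate.append(page)   # updates outpace requests
        elif pop > 0.5:
            decision.precompute.append(page)   # popular and stable
        else:
            decision.keep_cached.append(page)
    return decision
```
</preformat>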
      <sec id="sec-1-1">
        <title>A run-time management policy for web servers is a dynamic element that can be handled in different ways. Not all policies are suitable for all web sites. A lot of work has been carried out in the fields of specifying, executing and optimizing run-time management policies for data intensive web sites [11, 14].</title>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. A definition of Data intensive Web sites</title>
      <p>In order to propose run-time management policies for data intensive Web sites we must first
determine what "data intensive Web site" means. The term is used for quite a few different
"types" of web sites that share a common characteristic: a large amount
of data is demanded of these web sites at certain points in time. What makes a web
site "data intensive" therefore differs from site to site. We argue that data intensive Web sites must be further analyzed in
order to determine a suitable run-time management policy for each one.</p>
      <p>
        In order to break down data intensive web sites into categories, we consider two criteria. The
first criterion is user demand (meaning the number of requests) over time, and the second is the number
and change rate of the dynamic elements of a Web page in these web sites [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. At this point we
must clarify that, for simplicity, we will refer to the default pages1 of sites from now on
in our analysis. Our research focuses on the default page because it is the page to which most of a
site's requests are directed (according to the Zipfian distribution of requests). With this fact
in mind, we analyze data intensive web site default pages according to the following criteria:
      </p>
      <sec id="sec-2-1">
        <title>1. The number of requests in time. This criterion is closely related to the geographical</title>
        <p>topology of every web site and its target group. An obvious example is that of a Greek portal.
Many Greek portals can be considered Data intensive web sites but only for a limited (and in
general well defined) time slot, every day. The peak web access time in Greece is considered
to be 09:00 to 14:00 Central European Time (CET). The Greek portals' peak request times
are the same. The requests for pages decline after 14:00 CET and come to a minimum
overnight. As opposed to Greek portals, that show a peak request time of about 5 hours, and
then a substantial reduction in requests, major US portals such as cnn.com show significant
demand all through the day. The request threshold of major US portals is much higher than
that of Greek portals. Even though they experience similar request curves, as those shown in
Figure 1, due to the fact that US users outnumber users from all other countries, their curves
do not decline as much as those of Figure 1. In simple terms this means that US portals
experience more intense prime-time periods and show much higher request thresholds while
not at prime time. In coming years, with the expected expansion of the Internet in other
countries (in Europe, Asia and Africa), the curves of Figure 1 (especially for portals of
Global interest), will tend to become straight lines. The deference between Greek and US
portals has to do with language (Greek versus English) and information importance. It is
obvious at this point that a run-time management policy for a Greek portal should be very
different from a policy for a major US portal. The problem becomes even more complex
when we consider web sites such as the site of the Olympics or web sites set-up for election
days, that show peak request demand, all through the day, but only for a limited number of
days (or hours). These sites should be treated as an entirely different category in relevance to
their run-time management policies.
2. The number and rate of change of dynamic elements in the default page. This criterion is
directly related to the database involvement in the construction of the Web page. More
dynamic web page elements (also referred to as dynamic components in this paper), demand
more interaction between the web page and the database. The rate of change of dynamic
elements in a default web page, is mainly related to the nature of the information that they
contain. For example a dynamic element that contains stock market information must be
updated frequently. The in.gr portal has a maximum of 4 dynamic elements in its default
page, of which only 2 are updated frequently (in October 2000). The cnn.com site contains 6
dynamic elements in its default page of which 3 are updated frequently (in October 2000).</p>
      </sec>
      <sec id="sec-2-2">
        <title>Considering the first criteria that we have mentioned above, we may break down data intensive</title>
        <p>web sites to the following general categories:
• Web sites that show limited peak demand (e.g. Greek portals)
• Web sites that experience peak and heavily tailed demand (e.g. US portals)
• Web sites that experience continuos peak demand for a limited number of days ( e.g.</p>
        <p>election sites and sites of sporting events)
All three categories of web sites may contain various numbers of dynamic elements in their
default pages, and these dynamic elements may be updated in various time intervals. By
combining the two criteria we end-up with virtually unlimited categories of web sites. All of
1 The default page in this study is the page that is transferred to the user when he/she requests a base URL
such as http://www.cnn.com. It is also referred to as the initial web site page.
these web site categories must be efficiently serviced by run-time management policies. It is
obvious that each web site needs a self-tailored run-time management policy. In Figure 1 we have
included four different request distributions in time. These distributions come from diverse Web
sites. It is obvious that the peak request period is much more intense and demanding in the case
of the Greek portal in.gr. The curves in Figure 1 reveal that almost all Web servers show peak
response times (limited or extended less or more demanding).</p>
      </sec>
      <sec id="sec-2-3">
        <title>In this paper we propose a general cost estimation for a hybrid run-time management policy that</title>
        <p>can be tailored to different web site needs and we then present its performance evaluation.</p>
      </sec>
      <sec id="sec-2-4">
        <title>According to the categorization made in this paragraph, we determine that the main aim of a runtime management policy is to efficiently handle the different request demand of web sites, by servicing different numbers of dynamic elements that have different change rates.</title>
        <p>Averafgoe(rhNtthtpuem:/U/bwnewirvweor.fsuRiptyeaqtoruafesPs.agttrPr/a)esr Hour Average Nfourmthb(eehrtGtoprf:e/R/ewekwqSuwce.hssotcohPl.egNrre/H)twouorrkof the Day</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Run time management policies</title>
      <p>There are two extreme approaches for handling dynamic data on the web. The first is the
on-the-fly creation of dynamic web pages on every user request, and the second is the materialization and
caching of all dynamic web pages before any request (or at frequent time intervals). The first
approach has the obvious drawback of large response times and the second approach has the
drawback of possibly serving "stale" data to users. In this paper we will not refer to these
approaches since, in our opinion, they do not qualify as run-time management policies. We will
only refer to run-time management policies that follow a hybrid approach. These policies
implement two basic functions at run-time:
• Adapt to varying request demand. This means that they can adapt to match the
changes of user requests in time.
• Satisfy varying numbers of default web page dynamic elements with
varying update times. This means that they can perform well when handling pages
that contain different numbers of dynamic elements, which must be updated at different time
intervals.</p>
      <p>
        The issue of materialization and caching of dynamic web pages is one that has been thoroughly
discussed in the literature [
        <xref ref-type="bibr" rid="ref14 ref4">4, 14</xref>
        ]. The basic idea behind materializing and caching dynamic
web pages is that a web server responds much more quickly when serving static, rather than
dynamic, pages. This is easily understood, since in the case of static pages the web server does
not have to query the database for the data contained in the web pages. A run-time management
policy is basically about dynamic web page materialization and caching. There are several issues
related to these functions of the policies:
      </p>
      <sec id="sec-3-1">
        <title>Cache consistency and invalidation policy. The basic problem behind caching dynamic</title>
        <p>
          data is the consistency between the data that are stored in the database and the data that are
contained in the cached web pages. The problem of keeping caches consistent is a very
"popular" problem that has been addressed not only in the context of Web server caches, but
mostly in the context of proxy servers. Some of these policies can be applied to the Web
server caches [
          <xref ref-type="bibr" rid="ref10 ref8">10, 8</xref>
          ].
        </p>
        <sec id="sec-3-1-1">
          <title>Object caching. Another interesting issue (directly related to the previous) in Web server</title>
          <p>
            caches, is the selection of objects that must be cached. Throughout this paper we have
considered that the objects that can be kept in a Web server's cache are only "whole" Web
pages. This is not true. The issue of caching fragments of Web pages has also been presented
in the bibliography [
            <xref ref-type="bibr" rid="ref1 ref2 ref3 ref7 ref9">1, 2, 3, 7, 9</xref>
            ] and is very interesting, in relevance to our approach that is
presented in the following section.
          </p>
        </sec>
        <sec id="sec-3-1-2">
          <title>Cache size and type. The size of the cache is one of the most important parameters in</title>
          <p>caching of dynamic web pages on server. In the extreme case that an infinite (or a very large)
cache was available, many of the problems related to caching would be eliminated. Since, in
our days, large disks are available and are fairly inexpensive, the issue of cache size should
not be considered very important when referring to disk caches. In the case of memory
caches the issue of capacity is still considered very important.</p>
        </sec>
        <sec id="sec-3-1-3">
          <title>Caches and web server dependency. A caching scheme for dynamic web pages should be applicable to many different web server implementations. To make this possible, caching schemes should be implemented as a standalone application that should be able to exchange information with various web servers.</title>
          <p>Methods of Materialization triggering. A materialization policy may be implemented with
the use of various techniques. Many frequently updated portals update their data through web
forms. A news site for example must rely on Web forms, because it relies on journalists to
update the site and thus requires a user friendly update environment. The use of web forms
causes the triggering of the materialization policy and the consequent update of Web site
pages.</p>
        </sec>
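<p>Form-triggered materialization, as described above, can be sketched in a few lines. The function and data names here are our own illustration, not taken from any real system.</p>
<preformat>
```python
# Hypothetical sketch: an editor's form submission persists the change and
# returns the pages that must be re-materialized as static files.
def handle_update_form(form_data, database, pages_by_component):
    component = form_data["component"]        # e.g. "headline"
    database[component] = form_data["value"]  # persist the editor's change
    # The submission itself triggers materialization of the affected pages.
    return pages_by_component.get(component, [])
```
</preformat>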
      </sec>
      <sec id="sec-3-2">
        <title>3.1. Data intensive Web site study results</title>
        <p>In order to record the "behavior" of web sites with respect to their default web page update
policies, we performed a study that included two well-known Greek portals (www.in.gr and
www.naftemporiki.gr) and one "global" portal (europe.cnn.com). The first Greek portal is mainly
a news and search engine portal and the second is a stock market web site that includes a stock
market quote on its default page. In order to determine each default web page update policy in
relation to request demand, we performed an analysis by issuing default web page requests
every 3 minutes. From these requests we were able to measure the response times and whether
the default pages had changed since our previous request.</p>
        <sec id="sec-3-2-1">
          <title>The results show a high rate of change in all three sites. The www.in.gr site shows a slight</title>
          <p>decline in update frequency during early morning hours, but no substantial changes occur during
peak times. The www.naftemporiki.gr site shows a decline in update frequency during peak times
but on average the update frequency remains the same. The europe.cnn.com site shows a standard
update frequency at all times.</p>
          <p>The update frequency results may be interpreted as the run-time management policies of the default
web pages of the sites. The advertisement policies of the sites may have played a substantial role
in these results. A more frequent request policy on our part might have shown different results
(since the update period of a site may be shorter than three minutes), but we did not
want to load the sites with our requests. The first column of Figure 2 shows the response times of the
three sites after issuing requests every 3 minutes through a web client that we
implemented in Java. The second column contains the corresponding number of changes that
occurred to the default page per ten requests to a web site. A change of the web page was
detected through its checksum.</p>
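<p>The probing client described above can be sketched as follows. The original client was implemented in Java; this Python rendering and its function names are our own illustration of the measurement idea.</p>
<preformat>
```python
# Sketch of the measurement client: fetch the default page, record the
# response time, and detect a change by comparing checksums of successive
# responses. A study like the one above would call probe() every 180 seconds.
import hashlib
import time
import urllib.request

def page_checksum(body):
    """Checksum of a fetched page; equal checksums mean 'no change'."""
    return hashlib.md5(body).hexdigest()

def probe(url, last_checksum):
    """Fetch the default page once; return the response time in ms, the
    page checksum, and whether it differs from the previous probe."""
    start = time.monotonic()
    body = urllib.request.urlopen(url).read()
    elapsed_ms = (time.monotonic() - start) * 1000.0
    checksum = page_checksum(body)
    return elapsed_ms, checksum, checksum != last_checksum
```
</preformat>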
          <p>Figure 2. Whole-day response times for www.in.gr and peak response times for www.naftemporiki.gr (response times plotted against the time of day, 0:00 to 0:00).</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. The proposed Run-time management policy</title>
      <sec id="sec-4-1">
        <title>In this section we will propose a run-time management policy based on characteristics that we</title>
        <p>have observed in real world data intensive web sites. We will propose a general model for
constructing and validating a run-time management policy. Our run-time management policy is
based on the assumption that a policy must be based on the current web server conditions and the
current web site request demand.</p>
        <sec id="sec-4-1-1">
          <title>4.1. Proposed policies and cost estimation</title>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>The cost estimation of a run-time management policy is the basis of its performance evaluation.</title>
        <p>In order to have a theoretical view of the efficiency of a policy, one must perform a cost
estimation corresponding to the policy. In this paragraph we perform a cost estimation for a
default web page of a data intensive web site. First we will refer to a policy, that we will call
ondata-change-materialize policy, that materializes every dynamic component of a web page,
every time it changes. This policy has the advantage that web pages are always consistent with
database changes but imposes a substantial load to servers in the case of frequently changing web
pages.</p>
        <p>In data intensive web sites, and especially during the period of intensity, the policy described
above might not be a very good proposal. Consider the example of a web page that contains three
dynamic components. The data contained in the database that concern the first component
change very frequently (e.g. as in the case of stock quotes). The data of the other two dynamic
components change periodically, but with a much smaller frequency. Under this policy, the
default page would be re-computed and materialized very frequently because of
the frequent changes in the data of the first component, even though the other two components do
not need refreshing. The cost of materialization in this case equals the cost of materializing the
web page every time any included dynamic component needs refreshing. As one can
conclude, this is not a very good proposal, especially at peak request times.</p>
        <p>The other proposal consists of a hybrid model. Our basic concern is peak request demand
periods; those are the periods when a web site should perform best. Our hybrid model combines
two run-time management policies: the on-data-change-materialize policy, which has
already been presented, and an on-data-change-batch-compromise policy that we explain in the
following paragraphs. The basic idea is that the on-data-change-materialize policy should be
executed until the Web server load reaches a certain threshold, after which the
on-data-change-batch-compromise policy is executed.</p>
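<p>The load-threshold switch between the two policies can be sketched as follows; the threshold value and function name are illustrative assumptions, not parameters from the paper.</p>
<preformat>
```python
# Illustrative sketch of the hybrid switch: run on-data-change-materialize
# under light load and fall back to on-data-change-batch-compromise once
# the server load crosses a chosen threshold.
LOAD_THRESHOLD = 0.8  # assumed fraction of maximum capacity

def choose_policy(current_request_rate, max_request_handling):
    load = current_request_rate / max_request_handling
    if load >= LOAD_THRESHOLD:
        return "on-data-change-batch-compromise"
    return "on-data-change-materialize"
```
</preformat>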
        <p>The on-data-change-batch-compromise policy attempts to estimate the optimal wait time
between default web page materializations. We will attempt to clarify this by presenting the
example of three web page dynamic elements, the first of which is a stock quote. The stock quote element
has a change frequency f1 = 6 changes per minute. The second element has a change frequency
f2 = 2 changes per minute and the third has a change frequency f3 = 0.2 changes per minute. The
on-data-change-materialize policy considers these three frequencies independent variables. In
simple terms this means that the default page would be materialized 8.2 times per minute, on
average. This is very costly and cannot be considered good practice. In our
on-data-change-batch-compromise policy, we relate these three change frequencies by implementing a
batch materialization model that relies on accepted compromises in data "freshness". In our
example the stock quote dynamic element should be refreshed every 10 seconds, the second
element every 30 seconds and the third every 5 minutes (300 seconds). The problem here is to
combine the web page materializations that are caused by one dynamic element's change frequency
with those caused by another. The obvious solution is to divide all dynamic element change
periods by the smallest period. In our case, since the smallest change period is 10 seconds, we
compute 10 seconds / 10 seconds = 1, 30 seconds / 10 seconds = 3 and 300 seconds / 10
seconds = 30. We name the smallest value (e.g. 1) the Minimum Compromise Factor (CFMIN), the
second result (e.g. 3) the First Compromise Factor (FCF) and the third (e.g. 30) the Maximum
Compromise Factor (CFMAX).</p>
      </sec>
      <sec id="sec-4-3">
        <p>The on-data-change-batch-compromise policy may be
implemented by choosing any of the derived compromise factors. The smaller the compromise factor, the
"more" consistent the data on the web page are with database changes. The
compromise factor is a measure of the web page developer's or administrator's willingness to
compromise on data "freshness" in order to achieve better Web server performance. After a
compromise factor has been chosen, the materialization of the web page is executed in batches.</p>
      </sec>
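<p>The derivation of the compromise factors can be sketched directly from the periods in the example above:</p>
<preformat>
```python
# Compromise factors for the example above: divide every element's change
# period by the smallest period, so 10s, 30s and 300s give 1, 3 and 30.
def compromise_factors(periods_seconds):
    smallest = min(periods_seconds)
    return [p // smallest for p in periods_seconds]

# With the example periods, factors[0] is CFMIN, factors[1] the FCF and
# factors[-1] CFMAX.
factors = compromise_factors([10, 30, 300])  # [1, 3, 30]
```
</preformat>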
      <sec id="sec-4-4">
        <title>Let’s consider the case that the FCF has been chosen. This means that a web page materialization</title>
        <p>takes place every 3 changes of the stock quote dynamic element value in the database, which will
include a change in the second element value and so on.</p>
        <p>The concept of the CF (Compromise Factor) can also be very beneficial to a run-time
management policy. Its value does not have to be the result of the division that we mentioned
above. The value of the CF may significantly differ from this result. This may happen in cases
where the administrator believes that the result of the division is unacceptable because the
services demanded by users (web site Quality of Service) require a different approach. Our
proposal aims at relating the value of the CF with the Maximum Request Handling value (MRH)
on the server, which is the maximum requests that the web server can fulfill at any point in time,
and the Current Request Rate (CRR) which is the current amount of requests to the web server. It
is obvious that the value of the CF should be greater on a server with a MRH of 100000 that has a</p>
      </sec>
      <sec id="sec-4-5">
        <title>CRR of 90000 than the value of the CF on a server with an MRH of 100000 and a CRR 20000. In our proposal the CF value is computed at run-time in frequent time intervals.</title>
        <p>Client1
Client2
.
.
.</p>
        <p>Client20
SubNet 1
SubNet 2
Switch
Server</p>
      </sec>
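<p>One possible way to compute the CF at run-time from the MRH and CRR is a linear interpolation between CFMIN and CFMAX. The interpolation itself is our assumption; the paper only requires that the CF grows as server load rises.</p>
<preformat>
```python
# Assumed rule: CF grows linearly from cf_min (idle server) to cf_max
# (server at its maximum request handling value, MRH).
def runtime_cf(crr, mrh, cf_min, cf_max):
    load = min(crr / mrh, 1.0)  # CRR as a fraction of MRH, capped at 1
    return round(cf_min + load * (cf_max - cf_min))
```
</preformat>
With the example factors, a server at CRR = 90000 of MRH = 100000 would pick a CF near CFMAX, while a server at CRR = 20000 would pick one near CFMIN, matching the behaviour argued for above.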
    </sec>
    <sec id="sec-5">
      <title>5. Performance Evaluation</title>
      <sec id="sec-5-1">
        <title>5.1. Experimental set-up and methodology</title>
        <sec id="sec-5-1-1">
          <title>The basic requirement that we wanted to ensure in the topology that we used to evaluate the hybrid run-time management policy, was the non-intervention of conditions not related with the policies. In order to ensure this as much as possible we used the topology shown in Figure 3.</title>
        </sec>
        <sec id="sec-5-1-2">
          <title>We tested our policies by implementing the client-server model on campus at the University of</title>
        </sec>
        <sec id="sec-5-1-3">
          <title>Patras. We used a Pentium III Computer with 256MB of RAM and 30GB Hard disk running</title>
        </sec>
        <sec id="sec-5-1-4">
          <title>Windows NT as a server and an Ultra Sparc 30 with 128MB of RAM and 16GB Hard Disk running Solaris 2.6 to simulate our clients. Our client program was executed on the Ultra Sparc and was able to simulate simultaneous client requests to the server running as independent UNIX processes.</title>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Evaluation of different types of dynamic web components</title>
        <sec id="sec-5-2-1">
          <title>In our performance evaluation we tested six different categories of dynamic web components.</title>
        </sec>
        <sec id="sec-5-2-2">
          <title>The categories are shown in Table 1. Only three of the categories are dynamic components. The other three are the corresponding static components. We included them in the evaluation in order to show the big difference between static and dynamic components of web pages.</title>
        </sec>
      </sec>
      <sec id="sec-5-3">
        <title>Category ID</title>
      </sec>
      <sec id="sec-5-4">
        <title>Category Name Description</title>
        <p>1
2
3
4
5
6</p>
        <sec id="sec-5-4-1">
          <title>Simple</title>
          <p>(SQS)</p>
        </sec>
        <sec id="sec-5-4-2">
          <title>Query Static component A static component that is a result of a simple query</title>
        </sec>
        <sec id="sec-5-4-3">
          <title>Simple Query Dynamic component A dynamic component that includes</title>
          <p>(SQD) the same simple query of 1</p>
        </sec>
        <sec id="sec-5-4-4">
          <title>Multiple Queries Static component A static component that is the result</title>
          <p>(MQS) of multiple queries to the database</p>
        </sec>
        <sec id="sec-5-4-5">
          <title>Multiple Queries component (MQD)</title>
        </sec>
        <sec id="sec-5-4-6">
          <title>Simple Query Static component with large result size (LDS)</title>
        </sec>
        <sec id="sec-5-4-7">
          <title>Dynamic</title>
        </sec>
        <sec id="sec-5-4-8">
          <title>Simple Query with large result size</title>
        </sec>
        <sec id="sec-5-4-9">
          <title>Dynamic component (LDD)</title>
        </sec>
        <sec id="sec-5-4-10">
          <title>A dynamic component that contains the same multiple queries as 3</title>
        </sec>
        <sec id="sec-5-4-11">
          <title>A static component containing the data of a simple query that returned a substantial amount of data</title>
        </sec>
        <sec id="sec-5-4-12">
          <title>A dynamic component that contains the same simple query as 5 that returns a substantial amount of data</title>
          <p>It is obvious from Figure 4 that the dynamic component that contains a lot of data (LDD) has the largest
response times. The second-worst response times are shown by the dynamic component that
contains multiple queries (MQD), and the third-worst by the dynamic component that contains a simple
query (SQD). The static components follow, with smaller response times.</p>
          <p>Figure 4. Mean response times (in ms) against number of users (0 to 40) for the SQS/SQD, MQS/MQD and LDS/LDD component pairs.</p>
        </sec>
        <sec id="sec-5-4-13">
          <title>The basic conclusion that may be extracted from this first step of evaluation is that web developers should implement dynamic web components that do not contain a lot of data and are not the result of complex database queries. Thus the use of views and indexes in web database design can be very useful in relevance to dynamic web page elements.</title>
        </sec>
      </sec>
      <sec id="sec-5-5">
        <title>5.3. Evaluation of the hybrid run-time management policy</title>
        <p>In this section we evaluate our hybrid run-time management policy. We constructed a default
web page that consisted of 3 dynamic elements that needed refreshing every 5, 10 and 20
seconds. We issued 11 requests per second for that web page
through client processes. Twenty clients were monitored and their responses were recorded. We
also issued a number of requests to the web server through other client processes, in order to
create a background load on the web server. The evaluation figures contain equalities such as
CF=10sec, which must be interpreted as the CF corresponding to a 10-second default web page update
frequency.</p>
        <sec id="sec-5-5-1">
          <title>The rationale behind the experiments was to show that different CF values may improve performance under specific server loads.</title>
          <p>Mean response times for 11 requests per second and</p>
          <p>CF=10sec</p>
          <p>Mean response times for 11 requests per second</p>
          <p>and CF=20sec</p>
          <p>The situation becomes more complex when CF=5 since the response times are not clearly better
for a lower server load. This is logical because the rate of default page materialization itself,
imposes a load to the server. The overall conclusion of Figure 6 is that the CF is directly related
to the response times of the server, under specific server loads. Thus, the CF should be calculated
in a way that it would result in better server response times. It is obvious that the value of CF
should increase as the load of the Web server (which is the result of greater request demand)
rises.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Future Work and Conclusions</title>
      <p>
        Our future work will aim at improving the hybrid run-time management policy described in this
paper. We believe that the policy presented here can be improved considerably, in particular
by using what we call Web page component caching. We would like to evaluate
our policy together with the schemes presented in [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref7 ref9">1, 2, 3, 7, 9</xref>
        ], which introduce the idea of
fragment caching instead of whole web page caching. In this way we would like to evaluate
schemes that store fragments or materialized web page elements in a cache and integrate them on
user request.
      </p>
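      <p>The fragment-caching idea can be sketched as follows (class and fragment names are hypothetical): each dynamic element is materialized and cached separately with its own expiry, and the page is assembled from fragments on request instead of being regenerated as a whole.</p>
      <preformat>
```python
import time

# Hypothetical sketch of fragment (component) caching: each dynamic element
# is cached with its own time-to-live, mirroring the per-element update
# frequencies of the default page, and the page is assembled on request.
class FragmentCache:
    def __init__(self):
        self._store = {}  # name -> (html, expires_at)

    def get(self, name, ttl, materialize):
        html, expires_at = self._store.get(name, (None, 0.0))
        if time.monotonic() >= expires_at:       # stale or missing
            html = materialize()                  # regenerate this fragment only
            self._store[name] = (html, time.monotonic() + ttl)
        return html

def assemble_page(cache):
    # Each fragment carries its own refresh period (cf. the 5/10/20 s elements).
    header = cache.get("header", ttl=20, materialize=lambda: "<div>header</div>")
    ticker = cache.get("ticker", ttl=5, materialize=lambda: "<div>ticker</div>")
    return header + ticker

cache = FragmentCache()
print(assemble_page(cache))  # '<div>header</div><div>ticker</div>'
```
</preformat>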
      <sec id="sec-6-1">
        <title>We would also like to enhance the computation of the CF with other parameters. In this paper we have only related the CF to the Current Request Rate, the Maximum Request Handling value and the update frequencies of components. In the future we would like to relate it to other parameters, such as the general network state and web page popularity.</title>
      </sec>
      <sec id="sec-6-2">
        <title>The results are interesting, since they show that by adopting a compromise policy the performance of the web server may improve even under peak conditions. The basic problem is to determine the compromise factor, which we have named CF. We believe that our hybrid policy can be enhanced much further in order to reduce the web latency attributable to web servers.</title>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Jim</given-names>
            <surname>Challenger</surname>
          </string-name>
          , Paul Dantzig, Daniel Dias, and
          <string-name>
            <given-names>Nathaniel</given-names>
            <surname>Mills</surname>
          </string-name>
          ,
          <article-title>"Engineering Highly Accessed Web Sites for Performance"</article-title>
          , Web Engineering,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Deshpande</surname>
          </string-name>
          and S. Murugesan, editors, Springer-Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Challenger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Iyengar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Witting</surname>
          </string-name>
          ,
          <article-title>"A Publishing System for Efficiently Creating Dynamic Web Content"</article-title>
          ,
          <source>In Proceedings of IEEE INFOCOM 2000, March 26-30, 2000</source>
          , Tel Aviv, Israel
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Jim</given-names>
            <surname>Challenger</surname>
          </string-name>
          , Arun Iyengar and
          <string-name>
            <given-names>Paul</given-names>
            <surname>Dantzig</surname>
          </string-name>
          ,
          <article-title>"A Scalable System for Consistently Caching Dynamic Web Data"</article-title>
          ,
          <source>In Proceedings of IEEE INFOCOM'99</source>
          , New York, New York,
          <year>March 1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Mecca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Atzeni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Masci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Merialdo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sindoni</surname>
          </string-name>
          ,
          <article-title>"The Araneus Web-Based Management System"</article-title>
          - In
          <source>Exhibits Program of ACM SIGMOD '98</source>
          ,
          <year>1998</year>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Md Ahsan</given-names>
            <surname>Habib</surname>
          </string-name>
          and
          <string-name>
            <given-names>Marc</given-names>
            <surname>Abrams</surname>
          </string-name>
          ,
          <article-title>"Analysis of Sources of Latency in Downloading Web Pages"</article-title>
          ,
          <source>WebNet 2000, October 30 - November 4, 2000</source>
          , San Antonio, Texas, USA
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Brian E.</given-names>
            <surname>Brewington</surname>
          </string-name>
          and
          <string-name>
            <given-names>George</given-names>
            <surname>Cybenko</surname>
          </string-name>
          ,
          <article-title>"Keeping Up with the Changing Web"</article-title>
          ,
          <source>IEEE Computer Magazine</source>
          , May 2000
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Craig E.</given-names>
            <surname>Wills</surname>
          </string-name>
          and
          <string-name>
            <given-names>Mikhail</given-names>
            <surname>Mikhailov</surname>
          </string-name>
          ,
          <article-title>"Studying the impact of more complete server information on web caching"</article-title>
          ,
          <source>5th International Web caching and Content delivery Workshop</source>
          , Lisbon, Portugal,
          <fpage>22</fpage>
          -
          <lpage>24</lpage>
          May 2000
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Khaled</given-names>
            <surname>Yagoub</surname>
          </string-name>
          , Daniela Florescu, Valérie Issarny, Patrick Valduriez,
          <article-title>"Caching Strategies for Data-Intensive Web Sites"</article-title>
          ,
          <source>Proc. of the Int. Conf. on Very Large Data Bases (VLDB) Cairo</source>
          , Egypt ,
          <fpage>10</fpage>
          -
          <lpage>14</lpage>
          September,
          <year>2000</year>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Iyengar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Challenger</surname>
          </string-name>
          .
          <article-title>"Improving Web Server Performance by Caching Dynamic Data"</article-title>
          ,
          <source>In Proceedings of the USENIX Symposium on Internet Technologies and Systems</source>
          , Monterey, California, December 1997
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Martin F.</given-names>
            <surname>Arlitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Richard J.</given-names>
            <surname>Friedrich</surname>
          </string-name>
          and
          <string-name>
            <given-names>Tai Y.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <article-title>"Performance Evaluation of Web Proxy Cache Replacement Policies"</article-title>
          ,
          <source>HP Technical Report HPL-98-97</source>
          , 980601 Available at http://www.hpl.hp.com/techreports/98/HPL-98-97.html
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Alexandros</given-names>
            <surname>Labrinidis</surname>
          </string-name>
          and
          <string-name>
            <given-names>Nick</given-names>
            <surname>Roussopoulos</surname>
          </string-name>
          ,
          <article-title>"On the Materialization of WebViews"</article-title>
          ,
          <source>CSHCN Technical Report 99-14 (ISR T.R. 99-26), in Proceedings of the ACM SIGMOD Workshop on the Web and Databases (WebDB '99)</source>
          , Philadelphia, Pennsylvania, USA, June 1999
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Yi</given-names>
            <surname>Li</surname>
          </string-name>
          and
          <string-name>
            <given-names>Kevin</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <article-title>"Performance Issues of a Web Database"</article-title>
          ,
          <source>In Proc. of the Eleventh International Workshop on Database and Expert Systems Applications, 4-8 September</source>
          ,
          <year>2000</year>
          , Greenwich, London, UK
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Daniela</given-names>
            <surname>Florescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Alon</given-names>
            <surname>Levy</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alberto</given-names>
            <surname>Mendelzon</surname>
          </string-name>
          ,
          <article-title>"Database Techniques for the World-Wide Web: A Survey"</article-title>
          .
          <source>SIGMOD Record</source>
          ,
          <volume>27</volume>
          (
          <issue>3</issue>
          ):
          <fpage>59</fpage>
          -
          <lpage>74</lpage>
          ,
          <year>1998</year>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Birgit</given-names>
            <surname>Proll</surname>
          </string-name>
          , Heinrich Starck, Werner Retschitzegger and Harald Sighart,
          <article-title>"Ready for Prime Time: Pre-Generation of Web Pages in TIScover"</article-title>
          ,
          <source>WebDB '99</source>
          , Philadelphia, Pennsylvania, 3-4 June,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>