=Paper=
{{Paper
|id=None
|storemode=property
|title=Run-time Management Policies for Data Intensive Web Sites
|pdfUrl=https://ceur-ws.org/Vol-701/paper7.pdf
|volume=Vol-701
|dblpUrl=https://dblp.org/rec/conf/icdt/BourasK01
}}
==Run-time Management Policies for Data Intensive Web Sites==
Christos Bouras1,2 , Agisilaos Konidaris1,2
1 Computer Technology Institute-CTI, Kolokotroni 3, 26221 Patras, Greece
2 Computer Engineering and Informatics Department, University of Patras, 26500 Rion, Patras,
Greece
e-mails: {bouras, konidari}@cti.gr
Abstract
Web developers have been concerned with the issues of Web latency and Web data consistency for many years. These issues have become even more important today, since the accurate and timely dissemination of information is vital to the businesses and individuals that rely on the Web. In this paper, we evaluate different run-time management policies against real Web site data. We first define data intensive Web sites and categorize them according to their hit patterns. Our research relies on real-world Web data collected from various popular Web sites and proxy log files. We propose a Web site run-time management policy that can be applied to various real Web site hit patterns and Web data update frequencies.
1. Introduction
Today the WWW is the most popular application on the Internet because it is easy to use and keeps all network functions executed during a browsing session transparent to the user. The user has the notion that he or she is requesting information and that this information is somehow brought to his or her computer, without having to know how this happens. Even though most users do not know how the WWW works, almost all of them experience a phenomenon formally known as Web latency [5]. Web latency is simply the delay between the time that a user requests a Web page (by clicking a link or typing a URL) and the time that the page actually reaches his or her computer. This intuitive and informal definition of Web latency has urged Web site developers and Web equipment developers to try to reduce users' waiting time, in order to make browsing sessions as productive as possible.
A solution that has been proposed for reducing Web latency, while at the same time keeping Web data consistent with database changes, is run-time management policies [12, 13]. The issue of implementing run-time management policies for data intensive Web sites is as old as data intensive Web sites themselves. In order to keep up with user demand and maintain efficient response times, Web sites have resorted to run-time management policies. A run-time management policy may be viewed as a Web server component that is executed in parallel with the Web server's functions and has total (or, in some cases, partial) control over those functions. A run-time management policy receives inputs such as user request rates, database update frequencies and page popularity values. Its outputs may be the pages that should be pre-computed, the pages that should be kept in the server's cache and the pages that should be invalidated from the cache.
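To make these inputs and outputs concrete, such a policy can be sketched as a small decision routine. The threshold values, dictionary layout and function names below are illustrative assumptions on our part, not part of any real server:

```python
from dataclasses import dataclass

@dataclass
class PolicyDecision:
    precompute: list   # pages worth materializing ahead of requests
    keep_cached: list  # pages to retain in the server cache
    invalidate: list   # pages to evict as stale

def run_time_policy(request_rate, update_freq, popularity,
                    popularity_cutoff=0.1, staleness_limit=0.5):
    """Toy run-time management policy (illustrative thresholds).

    request_rate: requests/second per page id
    update_freq:  database updates/second per page id
    popularity:   relative popularity (0..1) per page id
    """
    precompute, keep, invalidate = [], [], []
    for page in request_rate:
        # Popular pages whose data change slowly pay off when pre-computed.
        if popularity[page] > popularity_cutoff and update_freq[page] < staleness_limit:
            precompute.append(page)
        # Pages updated faster than they are requested go stale before
        # they are ever served from cache, so invalidate them.
        if update_freq[page] > request_rate[page]:
            invalidate.append(page)
        else:
            keep.append(page)
    return PolicyDecision(precompute, keep, invalidate)
```

A popular, rarely updated default page would thus be scheduled for pre-computation, while a rapidly changing stock-quote page would be dropped from the cache.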
A run-time management policy for web servers is a dynamic element that can be handled in
different ways. Not all policies are suitable for all web sites. A lot of work has been carried out in
the fields of specifying, executing and optimizing run-time management policies for data
intensive web sites [11, 14].
2. A definition of data intensive Web sites
In order to propose run-time management policies for data intensive Web sites we must first determine what a data intensive Web site is. The term is used for quite a few different "types" of Web sites that share a common characteristic: a large amount of data is demanded from these Web sites at certain points in time. Beyond this, what "data intensive" means differs from site to site. We argue that data intensive Web sites must be further analyzed in order to determine a suitable run-time management policy for each one.
To break data intensive Web sites down into categories, we consider two criteria. The first criterion is user demand (i.e. the number of requests) over time, and the second is the number and change rate of the dynamic elements of a Web page in these sites [6]. At this point we must clarify that, for simplicity, we will refer to the default pages (see footnote 1) of sites from now on in our analysis. Our research focuses on the default page because it is the page to which most of a site's requests are directed (according to the Zipfian distribution of requests). With this in mind, we analyze data intensive Web site default pages according to the following criteria:
1. The number of requests over time. This criterion is closely related to the geographical topology of every Web site and its target group. An obvious example is that of a Greek portal. Many Greek portals can be considered data intensive Web sites, but only for a limited (and generally well defined) time slot every day. The peak Web access time in Greece is considered to be 09:00 to 14:00 Central European Time (CET), and the Greek portals' peak request times are the same. Requests for pages decline after 14:00 CET and reach a minimum overnight. As opposed to Greek portals, which show a peak request period of about 5 hours followed by a substantial reduction in requests, major US portals such as cnn.com show significant demand throughout the day. The request threshold of major US portals is much higher than that of Greek portals. Even though they experience request curves similar to those shown in Figure 1, their curves do not decline as much, because US users outnumber users from all other countries. In simple terms this means that US portals experience more intense prime-time periods and show much higher request thresholds outside prime time. In coming years, with the expected expansion of the Internet in other countries (in Europe, Asia and Africa), the curves of Figure 1 (especially for portals of global interest) will tend to flatten. The difference between Greek and US portals has to do with language (Greek versus English) and the importance of the information. It is obvious at this point that a run-time management policy for a Greek portal should be very different from a policy for a major US portal. The problem becomes even more complex when we consider Web sites such as the site of the Olympics, or Web sites set up for election days, that show peak request demand throughout the day but only for a limited number of days (or hours). These sites should be treated as an entirely different category with respect to their run-time management policies.
2. The number and rate of change of dynamic elements in the default page. This criterion is directly related to the database involvement in the construction of the Web page. More dynamic Web page elements (also referred to as dynamic components in this paper) demand more interaction between the Web page and the database. The rate of change of the dynamic elements in a default Web page is mainly related to the nature of the information that they contain. For example, a dynamic element that contains stock market information must be updated frequently. The in.gr portal has a maximum of 4 dynamic elements in its default page, of which only 2 are updated frequently (as of October 2000). The cnn.com site contains 6 dynamic elements in its default page, of which 3 are updated frequently (as of October 2000).
Considering the first criterion mentioned above, we may break data intensive Web sites down into the following general categories:
• Web sites that show limited peak demand (e.g. Greek portals)
• Web sites that experience peak and heavy-tailed demand (e.g. US portals)
• Web sites that experience continuous peak demand for a limited number of days (e.g. election sites and sites of sporting events)
All three categories of Web sites may contain various numbers of dynamic elements in their default pages, and these dynamic elements may be updated at various time intervals. By combining the two criteria we end up with virtually unlimited categories of Web sites. All of
Footnote 1: The default page in this study is the page that is transferred to the user when he/she requests a base URL such as http://www.cnn.com. It is also referred to as the initial Web site page.
these Web site categories must be efficiently serviced by run-time management policies. It is obvious that each Web site needs a self-tailored run-time management policy. In Figure 1 we have included four different request distributions over time. These distributions come from diverse Web sites. It is obvious that the peak request period is much more intense and demanding in the case of the Greek portal in.gr. The curves in Figure 1 reveal that almost all Web servers show peak request periods, whether limited or extended, more or less demanding.
In this paper we propose a general cost estimation for a hybrid run-time management policy that can be tailored to different Web site needs, and we then present its performance evaluation. According to the categorization made in this paragraph, we determine that the main aim of a run-time management policy is to efficiently handle the differing request demand of Web sites, by servicing different numbers of dynamic elements that have different change rates.
[Figure 1 omitted: four panels plotting requests against hour of the day (0-24): average requests per hour for the University of Patras (http://www.upatras.gr/), average requests per hour for the Greek School Network (http://www.sch.gr/), average accesses per hour for the proxy server of the University of Patras (proxy.upatras.gr), and average requests per hour in May 2000 for the largest Greek portal in.gr (http://www.in.gr/).]
Figure 1: Request distribution for two popular Greek sites, one university proxy server and the most popular Greek portal
3. Run-time management policies
There are two extreme approaches to handling dynamic data on the Web. The first is the on-the-fly creation of dynamic Web pages on every user request, and the second is the materialization and caching of all dynamic Web pages before any request (or at frequent time intervals). The first approach has the obvious drawback of large response times, and the second approach has the drawback of possibly serving "stale" data to users. In this paper we will not refer to these approaches since, in our opinion, they do not qualify as run-time management policies. We will only refer to run-time management policies that follow a hybrid approach. These policies implement two basic functions at run-time:
• Adapt to varying request demand. This means that they can adapt to match the changes in user requests over time.
• Satisfy varying numbers of default Web page dynamic elements with varying update times. This means that they can perform well when handling pages that contain different numbers of dynamic elements, which must be updated at different time intervals.
The issue of materialization and caching of dynamic Web pages is one that has been thoroughly discussed in the literature [4, 14]. The basic idea behind materializing and caching dynamic Web pages is that a Web server responds much more quickly when serving static rather than dynamic pages. This is easily understood, since in the case of static pages the Web server does not have to query the database for the data contained in the Web pages. A run-time management policy is basically about dynamic Web page materialization and caching. There are several issues related to these functions of the policies. These are:
• Cache consistency and invalidation policy. The basic problem behind caching dynamic
data is the consistency between the data that are stored in the database and the data that are
contained in the cached web pages. The problem of keeping caches consistent is a very
"popular" problem that has been addressed not only in the context of Web server caches, but
mostly in the context of proxy servers. Some of these policies can be applied to the Web
server caches [10, 8].
• Object caching. Another interesting issue (directly related to the previous one) in Web server caches is the selection of the objects that must be cached. Throughout this paper we have considered that the objects that can be kept in a Web server's cache are only "whole" Web pages. This need not be the case. The issue of caching fragments of Web pages has also been presented in the literature [1, 2, 3, 7, 9] and is very interesting with respect to our approach, which is presented in the following section.
• Cache size and type. The size of the cache is one of the most important parameters in server-side caching of dynamic Web pages. In the extreme case where an infinite (or very large) cache were available, many of the problems related to caching would be eliminated. Since large disks are nowadays available and fairly inexpensive, the issue of cache size should not be considered very important for disk caches. For memory caches, the issue of capacity is still considered very important.
• Caches and web server dependency. A caching scheme for dynamic web pages should be
applicable to many different web server implementations. To make this possible, caching
schemes should be implemented as a standalone application that should be able to exchange
information with various web servers.
• Methods of materialization triggering. A materialization policy may be implemented with the use of various techniques. Many frequently updated portals update their data through Web forms. A news site, for example, must rely on Web forms, because it relies on journalists to update the site and thus requires a user-friendly update environment. The use of a Web form triggers the materialization policy and the consequent update of the Web site's pages.
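Form-triggered materialization can be sketched roughly as follows. The cache directory, the `fetch_data` callable standing in for the database query, and the page naming are all hypothetical; the point is only that a form submission re-builds the affected page to disk so the server can serve it statically until the next change:

```python
import pathlib

CACHE_DIR = pathlib.Path("materialized")  # hypothetical cache directory

def render_page(page_id, fetch_data):
    """Stand-in for the dynamic build: query the database via
    fetch_data() and produce the page's HTML."""
    items = "".join(f"<li>{row}</li>" for row in fetch_data(page_id))
    return f"<html><body><ul>{items}</ul></body></html>"

def materialize(page_id, fetch_data):
    """Write the freshly built page to disk; the web server then
    serves it as a static file."""
    CACHE_DIR.mkdir(exist_ok=True)
    html = render_page(page_id, fetch_data)
    (CACHE_DIR / f"{page_id}.html").write_text(html)
    return html

def on_form_submit(page_id, fetch_data):
    # A web-form update changes the database, which triggers
    # re-materialization of the affected page.
    return materialize(page_id, fetch_data)
```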
3.1. Data intensive Web site study results
In order to record the "behavior" of Web sites with respect to their default Web page update policies, we performed a study that included two well-known Greek portals (www.in.gr and www.naftemporiki.gr) and one "global" portal (europe.cnn.com). The first Greek portal is mainly a news and search engine portal, and the second is a stock market Web site that includes a stock market quote on its default page. In order to determine their default Web page update policy in relation to their request demand, we performed an analysis by issuing default Web page requests every 3 minutes. From these requests we were able to measure their response times and whether their default pages had changed since our previous request.
The results show a high rate of change in all three sites. The www.in.gr site shows a slight
decline in update frequency during early morning hours, but no substantial changes occur during
peak times. The www.naftemporiki.gr site shows a decline in update frequency during peak times
but on average the update frequency remains the same. The europe.cnn.com site shows a standard
update frequency at all times.
The update frequency results may be interpreted as the run-time management policies for the default Web pages of the sites. The advertisement policies of the sites may have played a substantial role in these results. A more frequent request policy on our part might have shown different results (since the update period of a site may be less than three minutes), but we did not want to load the sites with our requests. The first column of Figure 2 shows the response times of the three sites after issuing requests every 3 minutes through a Web client that we implemented in Java. The second column contains the corresponding number of changes that occurred to the default page per ten requests to a Web site. A change to a Web page was detected through its checksum.
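The checksum-based change detection can be sketched as follows; `fetch` here is a stand-in for the HTTP request our Java client issued, so the polling interval and body contents are illustrative:

```python
import hashlib
import time

def checksum(body: bytes) -> str:
    """MD5 digest of a page body, used only to detect change."""
    return hashlib.md5(body).hexdigest()

def count_changes(fetch, rounds=10, interval_seconds=0):
    """Call fetch() `rounds` times, `interval_seconds` apart, and count
    how many fetches returned a body whose checksum differs from the
    previous fetch's (the study polled every 180 seconds)."""
    changes = 0
    previous = checksum(fetch())
    for _ in range(rounds - 1):
        if interval_seconds:
            time.sleep(interval_seconds)
        current = checksum(fetch())
        if current != previous:
            changes += 1
        previous = current
    return changes
```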
[Figure 2 omitted: three rows of paired plots. Row 1: whole-day response times for www.in.gr alongside the default page changes counted per ten requests. Row 2: peak-time response times for www.naftemporiki.gr alongside the corresponding default page changes. Row 3: response times for europe.cnn.com during the Greek peak period alongside the corresponding default page changes.]
Figure 2: Response times of three popular sites and the corresponding default Web page update frequencies
The main result of the statistics in Figure 2 is that Web sites follow update policies that rely on fixed time intervals: a Web site is updated every n minutes at all times (peak or not). This gave us further motivation to construct an efficient run-time management policy.
4. The proposed run-time management policy
In this section we propose a run-time management policy based on characteristics that we have observed in real-world data intensive Web sites. We propose a general model for constructing and validating a run-time management policy. Our run-time management policy is based on the assumption that a policy must reflect the current Web server conditions and the current Web site request demand.
4.1. Proposed policies and cost estimation
The cost estimation of a run-time management policy is the basis of its performance evaluation. In order to have a theoretical view of the efficiency of a policy, one must perform a cost estimation corresponding to the policy. In this paragraph we perform a cost estimation for a default Web page of a data intensive Web site. First we refer to a policy, which we call the on-data-change-materialize policy, that materializes every dynamic component of a Web page every time it changes. This policy has the advantage that Web pages are always consistent with database changes, but it imposes a substantial load on servers in the case of frequently changing Web pages.
In data intensive Web sites, and especially during the period of intensity, the policy described above might not be a very good proposal. Consider the example of a Web page that contains three dynamic components. The data in the database that concern the first component change very frequently (e.g. as in the case of stock quotes). The data of the other two dynamic components change periodically but with a much smaller frequency. Under this policy, the default page would be re-computed and materialized very frequently because of the frequent changes in the data of the first component, even though the other two components do not need refreshing. The cost of materialization in this case equals the cost of materializing the Web page every time any dynamic component that it includes needs refreshing. As one can conclude, this is not a very good proposal, especially at peak request times.
The other proposal consists of a hybrid model. Our basic concern is peak request demand periods; those are the periods when a Web site should perform best. Our hybrid model combines two run-time management policies: the on-data-change-materialize policy, which has already been presented, and an on-data-change-batch-compromise policy that we explain in the following paragraphs. The basic idea is that the on-data-change-materialize policy should be executed until the Web server load reaches a certain threshold, after which the on-data-change-batch-compromise policy is executed.
The on-data-change-batch-compromise policy attempts to estimate the optimal wait time between default Web page materializations. We will attempt to clarify this by returning to the example of a Web page with three dynamic elements, the first of which is a stock quote. The stock quote element has a change frequency f1 = 6 changes per minute. The second element has a change frequency f2 = 2 changes per minute, and the third has a change frequency f3 = 0.2 changes per minute. The on-data-change-materialize policy considers these three frequencies independent variables. In simple terms this means that the default page would be materialized 8.2 times per minute, on average. This is very costly and cannot be considered good practice. In our on-data-change-batch-compromise policy, we relate these three change frequencies by implementing a batch materialization model that relies on accepted compromises in data "freshness". In our example the stock quote dynamic element should be refreshed every 10 seconds, the second element every 30 seconds and the third every 5 minutes (300 seconds). The problem here is to combine the Web page materializations that are caused by one dynamic element's change frequency with those caused by another. The obvious solution is to divide all dynamic element change periods by the smallest period. In our case, since the smallest change period is 10 seconds, we would compute 10 seconds / 10 seconds = 1, 30 seconds / 10 seconds = 3 and 300 seconds / 10 seconds = 30. We name the smallest value (here 1) the Minimum Compromise Factor (CFMIN), the second result (here 3) the First Compromise Factor (FCF) and the third (here 30) the Maximum Compromise Factor (CFMAX). The on-data-change-batch-compromise policy may be implemented by choosing any of the compromise factors that were derived. The data on the Web page are "more" consistent with database changes as the compromise factor becomes smaller. The compromise factor is a measure of the Web page developer's or administrator's willingness to compromise on data "freshness" in order to achieve better Web server performance. After a compromise factor has been chosen, the materialization of the Web page is executed in batches. Let's consider the case where the FCF has been chosen. This means that a Web page materialization takes place every 3 changes of the stock quote dynamic element's value in the database, which will include a change in the second element's value, and so on.
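The division described above can be written out directly (a sketch; the function name is ours):

```python
def compromise_factors(periods_seconds):
    """Divide each dynamic element's refresh period by the smallest
    period. Sorted ascending, the first value is CF_MIN, the last is
    CF_MAX, and intermediate values (the FCF in the three-element
    example) are further compromise levels."""
    base = min(periods_seconds)
    return sorted(round(p / base) for p in periods_seconds)

# The example above: elements refreshed every 10 s, 30 s and 300 s.
print(compromise_factors([10, 30, 300]))  # [1, 3, 30]
```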
The concept of the CF (Compromise Factor) can also be very beneficial to a run-time management policy. Its value does not have to be the result of the division mentioned above; it may differ significantly from that result. This may happen in cases where the administrator believes that the result of the division is unacceptable because the services demanded by users (Web site quality of service) require a different approach. Our proposal aims at relating the value of the CF to the Maximum Request Handling value (MRH) of the server, which is the maximum number of requests that the Web server can fulfill at any point in time, and the Current Request Rate (CRR), which is the current rate of requests to the Web server. It is obvious that the value of the CF should be greater on a server with an MRH of 100000 that has a CRR of 90000 than on a server with an MRH of 100000 and a CRR of 20000. In our proposal the CF value is computed at run-time at frequent time intervals.
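One possible mapping from server load to CF is linear interpolation; the exact relation is left open in the text, so the formula below is our illustration only:

```python
def current_cf(crr, mrh, cf_min=1, cf_max=30):
    """Pick a compromise factor from the current load CRR/MRH:
    a lightly loaded server can afford frequent materialization
    (CF close to cf_min), while a server near its request-handling
    limit should compromise more (CF close to cf_max). Linear
    scaling is an assumption, not a prescription."""
    load = max(0.0, min(1.0, crr / mrh))
    return round(cf_min + load * (cf_max - cf_min))
```

For the example in the text, a server with MRH = 100000 and CRR = 90000 gets a much larger CF than the same server at CRR = 20000.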
5. Performance Evaluation
5.1. Experimental set-up and methodology
The basic requirement that we wanted to ensure in the topology used to evaluate the hybrid run-time management policy was that conditions unrelated to the policies would not interfere with the results. To ensure this as far as possible, we used the topology shown in Figure 3.
Client 1
SubNet 1 SubNet 2
Switch
Client 2 Server
...
Client 20
Figure 3 The experimental set-up
We tested our policies by implementing the client-server model on campus at the University of Patras. We used a Pentium III computer with 256 MB of RAM and a 30 GB hard disk running Windows NT as the server, and an Ultra SPARC 30 with 128 MB of RAM and a 16 GB hard disk running Solaris 2.6 to simulate our clients. Our client program was executed on the Ultra SPARC and was able to simulate simultaneous client requests to the server, running as independent UNIX processes.
5.2. Evaluation of different types of dynamic web components
In our performance evaluation we tested six different categories of dynamic web components.
The categories are shown in Table 1. Only three of the categories are dynamic components. The
other three are the corresponding static components. We included them in the evaluation in order
to show the big difference between static and dynamic components of web pages.
Category ID | Category Name | Description
1 | Simple Query Static component (SQS) | A static component that is the result of a simple query
2 | Simple Query Dynamic component (SQD) | A dynamic component that includes the same simple query as 1
3 | Multiple Queries Static component (MQS) | A static component that is the result of multiple queries to the database
4 | Multiple Queries Dynamic component (MQD) | A dynamic component that contains the same multiple queries as 3
5 | Simple Query Static component with large result size (LDS) | A static component containing the data of a simple query that returned a substantial amount of data
6 | Simple Query with large result size, Dynamic component (LDD) | A dynamic component that contains the same simple query as 5, returning a substantial amount of data
Table 1: The categories of Web components used in our performance evaluation
[Figure 4 omitted: mean response times versus number of users (0-40) for all six component categories: LDD, LDS, MQD, MQS, SQD and SQS.]
Figure 4: The mean response times of all categories of components (the MQS and LDS static components are not clearly distinguishable because their response times are very similar to SQS)
It is obvious from Figure 4 that the dynamic component that contains a lot of data has the largest response times. The second-worst response times are shown by the dynamic component that contains multiple queries, and the third-worst by the dynamic component that contains a simple query. The static components follow with smaller response times.
[Figure 5 omitted: three panels of mean response times versus number of users (0-40), pairing each dynamic component with its static counterpart: SQD vs. SQS, LDD vs. LDS, and MQD vs. MQS.]
Figure 5: Mean response times for corresponding static and dynamic pages
Figure 5 illustrates the large differences between dynamic and static Web page components. The difference becomes more obvious as the number of clients grows and, consequently, response time rises.
The basic conclusion that may be extracted from this first step of the evaluation is that Web developers should implement dynamic Web components that do not contain a lot of data and are not the result of complex database queries. Thus, the use of views and indexes in Web database design can be very useful with respect to dynamic Web page elements.
5.3. Evaluation of the hybrid run-time management policy
In this section we evaluate our hybrid run-time management policy. We constructed a default Web page that consisted of 3 dynamic elements needing refreshing every 5, 10 and 20 seconds. We issued 11 requests per second for that Web page through client processes. Twenty clients were monitored and their responses were recorded. We also issued a number of requests to the Web server through other client processes in order to create a background load on the Web server. The evaluation figures contain equalities such as CF=10sec, which must be interpreted as the CF corresponding to a 10-second default Web page update frequency.
The rationale behind the experiments was to show that different CF values may improve performance under specific server loads.
[Figure 6 omitted: three panels of mean response times (ms) per client (clients 1-20) for 11 requests per second, under ~30% and ~60% server load, with CF=10sec, CF=20sec and CF=5sec respectively.]
Figure 6: Mean response times of the default page with different values of CF and different server loads
Figure 6 shows the mean response times of the default page across the 20 clients, calculated for different values of CF and different server loads. In the case of CF=10, meaning that the default Web page was materialized every 10 seconds, we see a clear improvement in the default Web page response times when the server load is close to 30% of the maximum. This improvement is even clearer in the case of CF=20. We would expect such a result, since it means that a Web page is transferred faster to a client when it is materialized every 20 seconds than when it is materialized every 10 seconds, under specific server loads.
The situation becomes more complex when CF=5, since the response times are not clearly better for a lower server load. This is logical, because the rate of default page materialization itself imposes a load on the server. The overall conclusion of Figure 6 is that the CF is directly related to the response times of the server under specific server loads. Thus, the CF should be calculated in a way that results in better server response times. It is obvious that the value of the CF should increase as the load of the Web server (which is the result of greater request demand) rises.
6. Future work and conclusions
Our future work will aim at improving the hybrid run-time management policy described in this paper. We believe that the policy presented here can be improved much further, in particular by using what we call Web page component caching. We would like to evaluate our policy together with the schemes presented in [1, 2, 3, 7, 9], which propose fragment caching instead of whole Web page caching. In this way we would like to evaluate schemes that store fragments or materialized Web page elements in a cache and integrate them on user request.
We would also like to further enhance the computation of the CF with other parameters. In this paper we have only related the CF to the Current Request Rate, the Maximum Request Handling value and the update frequencies of components. In the future we would like to relate it to other parameters such as the general network state and Web page popularity.
The results are interesting, since they show that by adopting a compromise policy the performance of the Web server may improve even under peak conditions. The basic problem is to define the compromise factor that we named CF. We believe that our hybrid policy may be enhanced much further in order to reduce the Web latency that is related to Web servers.
7. References
[1] Jim Challenger, Paul Dantzig, Daniel Dias and Nathaniel Mills, "Engineering Highly Accessed Web Sites for Performance", in Web Engineering, Y. Deshpande and S. Murugesan (eds.), Springer-Verlag.
[2] J. Challenger, A. Iyengar and K. Witting, "A Publishing System for Efficiently Creating Dynamic Web Content", in Proceedings of IEEE INFOCOM 2000, Tel Aviv, Israel, March 26-30, 2000.
[3] Jim Challenger, Arun Iyengar and Paul Dantzig, "A Scalable System for Consistently Caching Dynamic Web Data", in Proceedings of IEEE INFOCOM '99, New York, New York, March 1999.
[4] G. Mecca, P. Atzeni, A. Masci, P. Merialdo and G. Sindoni, "The Araneus Web-Base Management System", in Exhibits Program of ACM SIGMOD '98, 1998.
[5] Md Ahsan Habib and Marc Abrams, "Analysis of Sources of Latency in Downloading Web Pages", in WebNet 2000, San Antonio, Texas, USA, October 30 - November 4, 2000.
[6] Brian E. Brewington and George Cybenko, "Keeping Up with the Changing Web", IEEE Computer Magazine, May 2000.
[7] Craig E. Wills and Mikhail Mikhailov, "Studying the Impact of More Complete Server Information on Web Caching", in 5th International Web Caching and Content Delivery Workshop, Lisbon, Portugal, May 22-24, 2000.
[8] Khaled Yagoub, Daniela Florescu, Valérie Issarny and Patrick Valduriez, "Caching Strategies for Data-Intensive Web Sites", in Proceedings of the International Conference on Very Large Data Bases (VLDB), Cairo, Egypt, September 10-14, 2000.
[9] A. Iyengar and J. Challenger, "Improving Web Server Performance by Caching Dynamic Data", in Proceedings of the USENIX Symposium on Internet Technologies and Systems, Monterey, California, December 1997.
[10] Martin F. Arlitt, Richard J. Friedrich and Tai Y. Jin, "Performance Evaluation of Web Proxy Cache Replacement Policies", HP Technical Report HPL-98-97, available at http://www.hpl.hp.com/techreports/98/HPL-98-97.html.
[11] Alexandros Labrinidis and Nick Roussopoulos, "On the Materialization of WebViews", CSHCN Technical Report 99-14 (ISR T.R. 99-26), in Proceedings of the ACM SIGMOD Workshop on the Web and Databases (WebDB '99), Philadelphia, Pennsylvania, USA, June 1999.
[12] Yi Li and Kevin Lu, "Performance Issues of a Web Database", in Proceedings of the Eleventh International Workshop on Database and Expert Systems Applications, Greenwich, London, UK, September 4-8, 2000.
[13] Daniela Florescu, Alon Levy and Alberto Mendelzon, "Database Techniques for the World-Wide Web: A Survey", SIGMOD Record, 27(3):59-74, 1998.
[14] Birgit Pröll, Heinrich Starck, Werner Retschitzegger and Harald Sighart, "Ready for Prime Time: Pre-Generation of Web Pages in TIScover", WebDB '99, Philadelphia, Pennsylvania, June 3-4, 1999.