=Paper=
{{Paper
|id=Vol-239/paper-5
|storemode=property
|title=An Efficient Implementation of a Rule-based Adaptive Web Information System
|pdfUrl=https://ceur-ws.org/Vol-239/paper5.pdf
|volume=Vol-239
|dblpUrl=https://dblp.org/rec/conf/caise/ValerianoVTF06
}}
==An Efficient Implementation of a Rule-based Adaptive Web Information System==
WISM'06 1077
An Efficient Implementation of a Rule-based
Adaptive Web Information System
Davide Valeriano, Roberto De Virgilio,
Riccardo Torlone, and Daniele Di Federico
Dipartimento di Informatica e Automazione
Università degli studi Roma Tre
{valeriano,devirgilio,torlone,difederico}@dia.uniroma3.it
Abstract. Mobile devices provide a variety of ways to access informa-
tion resources available on the Web and a high level of adaptability to
different aspects (e.g., device capabilities, network QoS, user preferences,
location, an so on) is strongly required in this scenario. In this paper we
propose a technique for the efficient adaptation of Web Information Sys-
tems. The approach relies on special rules that allows us to specify, in
a declarative way, how to generate, in an automatic way, the adaptation
specifications that satisfy the requirements of a given context. Rules are
classified in a set of clusters and a matching relation between clusters
and context requirements is defined. The technique of rule evaluation and
clustering guarantee that different contexts and orthogonal requirements
of adaptation, possibly not fixed in advance, can be efficiently organized
and taken into account in the adaptation process.
1 Introduction
Modern Web Information Systems (WIS) are challenged to generate (automat-
ically) a hypermedia presentation built from information that is gathered from
different, possibly heterogeneous, sources that are distributed over the Web.
Nowadays, given the spread of mobile devices, a novel and fundamental require-
ment of these systems is the ability to adapt and personalize content delivery
according to the context of the client. An important point that needs to be taken
into account is that this scenario is very dynamic: some aspects of the context
can dynamically change. For instance new users with new preferences can be
enabled to access the information system, possibly unpredicted types of access
devices can be used, alternative network connections can be made available, fur-
ther aspects of the context (like the location or others environmental factors)
need to be incorporated. Also, the technology used to implement the adaptation
can be changed and this should have a limited impact on the overall application.
It follows that the adaptation process should guarantee a high level of flexibility
in terms of responsiveness to rapidly changing requirements of adaptation and
suitable actions to be undertaken to meet these requirements.
In the literature, some adaptation approaches rely on the definition of rules
to generate suitable adaptation specifications [2, 3, 10]. In [5] we have presented
1078 Web Information Systems Modeling
a general methodology for context adaptation, based on special rules that model
the adaptation at an abstract level, inspired by the approaches cited above. The
rules specify, in a declarative way, how to generate, in an automatic way, the
adaptation specifications that satisfy the requirements of a given context. A
relevant problem is to rapidly select and activate the rules that best fit the re-
quirements of adaptation of the context (in particular when the context changes
frequently). In this paper we present an efficient implementation to optimize the
activation process of adaptation rules. To this aim, we introduce a clustering
technique to classify the adaptation specifications.
The rest of the paper is organized as follows. Section 2 introduces the rule-
based approach. In Section 3, we illustrate how to define the set of clusters
and the process to efficiently activate a rule, using the clusters. In Section 4
we present a practical implementation and experimental results and finally, in
Section 5 we draw some conclusions and sketch future works.
2 A Rule-based approach to model adaptation
requirements
In this section we present special rules to specify, in a declarative way, how to
generate, in an automatic way, adaptation specifications. The rules are based on
two basic notions: the generic profile and the abstract configuration.
2.1 Generic profiles
A (generic) profile is a description of an autonomous aspect of the context in
which the Web site is accessed (and that should influence the presentation of
its contents). Examples of profiles are the user, the device, the location, and so
on. A dimension is property that characterizes a profile. Each dimension can
be conveniently described by means of a set of attributes. Attributes can be
simple or composite. A simple attribute has a value associated with it, whereas a
complex attribute has a set of (simple or complex) attributes associated with it.
In this model, a context is a collection of profiles. Figure 1 reports a graphical
example of profiles. In our model, different profiles over the same dimensions can
P1 P2
P3
Hardware Software Category Hardware Software Category Category
cpu display memory os version type family model display type type
OS
width height width height
profile dimension complex attribute simple attribute
Fig. 1. An example of profiles
WISM'06 1079
be compared making use of a subsumption relationship C. Intuitively, given two
profiles P1 and P2 , if P1 C P2 then P2 is more detailed than P1 since it includes
the attributes of P2 at the same or at coarser level of detail. More precisely, we
first say that an attribute A covers another attribute B if either they are simple
and A = B or they are composite and for each sub-attribute Ai of A there is a
sub-attribute Bj of B such that Ai covers Bj . The subsumption relationship is
then defined as follows.
Definition 1 (Subsumption) Given two profiles P1 and P2 , we say that P1 is
subsumed by P2 , P1 C P2 , if for each dimension d of P1 there is a dimension d’
of P2 such that for each attribute A of d there is an attribute A’ of d’ covered by
A.
As an example, given the profiles reported in Figure 1 we have that P2 C P1
and P3 C P2 .
2.2 Abstract configurations
We introduce the notion of (abstract) configuration as a triple C = (q, h, s)
where:
– q is a query over the underlying database expressed in relational calculus, a
declarative and abstract language for relational databases ([1]);
– h is an abstract hypertext definition expressed in WebML ([4]), a conceptual
model for Web application which allows us to describe the organization of
Web pages in a tight and concise way, by abstracting their logical features;
– s is presentation specification expressed in terms of an original notion of
logical style sheet, which is a set of tuples over a predefined collection of
Web Object Types (WOTs) like text, image, video, form and so on; each
WOT τ is associated with a set of presentation attributes: they identify
possible styles (e.g. font, color, spacing, position) that can be specified for τ .
An example of configuration is reported below.
q = {T : x1 , S : x2 , D : x3 , C : x4 , N : x5 |
NewsItems(N : x0 , T : x1 , S : x2 , D : x3 , G : x6 , J : x7 ),
Details(NK : x0 , P : x8 , C : x4 ), Newspapers(J : x7 , N : x5 , C : x9 ), x6 = Sport}
h= IndexUnit NewsIndex
( source News Items; attributes Title, Date; orderby Date )
DataUnit ContentData
( selector Item = CurrentItem; attributes Title, Date, Content)
link NewsToDetails
( from NewsIndex to ContentData
parameters CurrentItem = NK)
link ContentToRestOfContent
( from ContentData to ContentData
parameters CurrentItem = NK)
s= Text( Font: Arial, ..., Border: 0pt)
Link( Note: False, ..., Color: Blue)
Image( Format: jpeg, ..., Alignment: left)
1080 Web Information Systems Modeling
It is important to note that a configuration is indeed a logical notion that
can be represented and implemented in several ways and with different syntaxes.
This property guarantees the generality of the approach with respect to actual
languages and tools used to implement the adaptive application. For instance,
we can implement a configuration using SQL at the content level, XHTML at
the navigation level and a set of CSS files at the presentation level.
2.3 Adaptation Rule
The matching relationship between configurations and profiles is represented by
means of a novel notion of adaptation rule. An adaptation rule has the form
Pr : Cd ⇒ Cf where:
• Pr is a parametric profile, that is, a profile in which parameters can appear
in place of values,
• Cd is a condition, made out of a conjunction of atoms of the form AθB or
Aθc, where A and B are parameters occurring in Pr , c is a constant value,
and θ is a comparison operator,
• Cf is a parametric configuration in which parameters occurring in Pr can
appear in place of values.
The intuitive semantics of an adaptation rule is the following: if the client has a
profile Pr and the condition Cd is verified then generate the configuration Cf .
An example of adaptation rule for a device profile with the hardware dimen-
sion H is the following:
H (ImgCapable : X , ScreenSize : Y , ColorCapable : Z ) :
X = false, Y < 1500 ⇒ {q, h, s}
where X, Y , and Z are the parameters of H and {q, h, s} is the following para-
metric configuration:
q = { T : x1 , S : x2 , D : x3 , C : x4 |
NewsItems(N : x0 , T : x1 , S : x2 , D : x3 , G : x5 ,
J : x6 ), Details(NK : x0 , P : x7 , C : x4 )}
h = Page NewsPage ( units NewsIndex )
Page ContentPage ( units ContentData )
IndexUnit NewsIndex
( source News Items; attributes Title, Date;
orderby Date )
DataUnit ContentData
( selector Item = CurrentItem;
attributes Title, Date, Content)
link NewsToDetails
( from NewsIndex to ContentData
parameters CurrentItem = NK)
s = Text( Color: Black, Size: 8pt)
Link( Style: Underline, Color: Black)
Image( Size: Y × 0, 5, Color: Z)
WISM'06 1081
Note the use of the parameters Y and Z of H to resize the images and to
eliminate/maintain colors.
A profile P activates a rule Pr : Cd → Cf if Pr C P . Hence, a profile P can
activate a rule for a profile that is more general than P . Now, let R be a rule
Pr : Cd → Cf activated by a profile P and let C be the configuration obtained
from Cf by substituting the parameters of Pr with the actual values occurring
in P : we say that C is generated by P using R.
3 An optimization strategy
We assume to have at disposal an initial set of rules SR . In an adaptability
scenario, context changes generate events that produce one or more profiles
(instances). Each profile (instance) can activate a rule of SR . We should optimize
the research to select and activate efficiently the most appropriate rules and
generate the most suitable configuration. So we organize SR classifying the rules
in a set of clusters. Each cluster represents a class of adaptation that matches
the requirements of a class of context.
3.1 A distance between profiles
We define a distance that allows to compare profiles. We take advantage of the
tree structure of the generic profile. We get inspiration from the Tree Edit Dis-
tance [9], where the distance between two trees T1 and T2 , that we indicate as
dted (T1 , T2 ), depend on the sequence of operations (chosen in a limited prede-
fined set) that transforms T1 into T2 . We assign a fixed cost to each operation:
the distance between the considered trees is the sum of essential costs for the
transformation. For our case, let’s consider the set of operations edit, remove
and add, respectively to modify the content of a node, to delete a node and to
create a new node. In our tree structures, we distinguish two principal types of
node: dimension and attribute. We have operations to modify, remove or create
dimensions and attributes and we indicate the cost of an operation with γ.
Definition 2 (S-derivation) Given two trees T1 and T2 , S = s1 , ... , sk the
sequence of operations to transform T1 in T2 , an S-derivation between T1 and
T2 is a sequence U0 , ... , Uk of trees resulting by applying the single
Pi=k operation,
where U0 = T1 and Uk = T2 , and the cost γ(S − derivation) is i=0 γ(si ).
So we can intuitively define the distance between two profiles P1 and P2 , the
minimum cost for a S-derivation between them.
Definition 3 (Distance between profiles) Given two profiles P1 and P2 , we
define the distance between them as dted (P1 , P2 ) = min{γ(S − derivation) | S-
derivation from P1 to P2 }
1082 Web Information Systems Modeling
3.2 Definition of Clusters
Let’s represent a cluster as a tree where the nodes are profiles. The root repre-
sents the minimum set of information that nodes of the tree share. Basically a
client is identified by the profile sent to the system. Let’s remember that a profile
is principally described by dimensions, that characterize the main information,
articulated in attributes. So, the root of a cluster is a profile characterized only
by dimensions. It will contain the minimum information that several profiles has
to include to be in the same cluster. Given a profile P, pruned by all attributes,
we can build several beginner clusters, rooted with profiles generated from P,
on the number and type of dimensions. However the designer can decide the
detail level for the roots. For instance a designer can decide to articulate some
dimensions of a root with attributes, to extend the level of detail of the minimum
information to share.
We define a cluster as follows
Definition 4 (Cluster Tree) A cluster CL is a couple {NCL , ECL } where NCL
is the set of nodes (profiles) and ECL is the set of edges (ni , nj ) such that ni / nj
We can relate a node Pr of a cluster with the rule Pr : Cd → Cf in an Hash
table using Pr as access key.
Given the set of roots Rt = {Rt1 , ... ,Rtn } and the set of rules SR , we initialize
a set of clusters CL = {CL1 , ... ,CLn } as follows:
• we set CLi = {{Rti },∅},
• ∀ (Pr : Cd → Cf ) ∈ SR , (i) we search a CLi such that Rti C Pr , and
we navigate the cluster in depth until level k where doesn’t exist any node
subsumed by Pr , (ii) we search at level k −1 the node N d such that N d C Pr
and if there are more nodes N d such that N d C Pr we choose the node with
lowest distance, (iii) we add Pr to NCLi and (N d, Pr ) to ECLi , (iv) if there
are one or more nodes N di such that Pr C N di then we delete any edge of
type (N dj ,N di ) in ECLi and add to it the edge (Pr , N di ),
• for each node of a cluster we add the relative rule to the Hash table.
3.3 The activation process
Built the set of clusters, now we can perform an optimization strategy to activate
a rule in an efficient way. Given the set of rules SR , the set of clusters CL =
{CL1 , CL2 , ..., CLn }, the Hash table Ht and a profile instance PI of a profile
schema PS , we produce a set of configurations as follows:
1. we initialize P 0 with PS ;
2. we search the cluster CLi in CL, with a root Rti such that Rti C P 0 and
that has the lowest distance from it;
3. (i) we navigate in depth the cluster until level k where any node, that is
subsumed by P 0 , doesn’t exist; (ii) we search at level k − 1 the node N d such
that N d C P 0 and if more nodes are subsumed by P 0 we choose the one with
WISM'06 1083
lowest distance; (iii) we extract the rule Rl , related in the Ht to N d, and if
the condition is satisfied, we activate Rl and instance the configuration with
the value of PI ; (iv) we update P 0 as P 0 - N d;
4. we iterate from step (2.) until P 0 is not empty and a rule that can be activated
by P 0 exists;
5. if no rule can be activated in this way, the system will return a default
configuration.
3.4 An example of interaction between profiles and clusters
Let’s assume a client sending a profile PI (see Figure. 2(a)), instance of a profile
schema P to the system. Assume also that the system has already initialized its
own set of clusters, composed by clusters C1 and C2 (see Figure 3). We easily see
that the root of the cluster C2 is the one which subsumes P and has the lowest
distance from it. So we navigate that cluster and note that the only child N of
the root it is equal to P ; then we extract the rule Rl related in the Ht to N and
if the condition is satisfied, we activate it and instance the configuration with
the values of PI . Subsequently, assume that a new profile PI0 (see Figure 2(b)),
instance of a profile P 0 , is sent by another client to the system. At the same way
explained before, we see that the root of the cluster that subsumes P 0 and has
the lowest distance from it, is cluster C1 root. Navigating that cluster, we need
to confront P 0 with root’s two children. We easily see that no one of them is
equal to or has a subsumption relation with P 0 : being their parent a root, is not
possible to activate its rule; so, as we explained before, the system will return a
default configuration.
Device Device
Hardware Hardware Software
display display OS
width height width height Win
800 640 800 640
(a) Profile PI (b) Profile PI0
Fig. 2. Profiles sent to the system.
1084 Web Information Systems Modeling
Device Device
Hardware Software
Hardware
Device Device
Device
-
Hardware Software Hardware Software Hardware -
Pr1:Cd1 -> Cf1
-
display OS Version CPU OS Version display -
-
width height
width height
Device Device
Hardware Software
Hardware
display CPU OS
display
Version
CPU
width height
width height
(a) Cluster C1 (b) Cluster C2
Fig. 3. Cluster Set
4 A practical implementation
4.1 Benefits of Clustering
We can easily see that using this clustering algorithm we can reduce the number
of accesses to SR . In our system, we have seen that a profile given by a client can
activate a rule in the default set. So when a client sends a profile to the system it
has to search for the most suitable rule in SR . We can assume that the worst case
is when the system has the greatest possible number of default rules, that is when
the system has in SR one rule for each possible profile that a client can send.
Without using clusters we should execute a number of ordered accesses equal to
the total number of possible profiles. It is easy to see that the number of possible
profiles is the number of profiles obtained with every possible different combina-
tion of dimensions and attributes. Assuming to have a complete schema with n
dimensions and m attributes for each dimension,
Pn ¡ ¢ we can see that the number of
different combinations of dimensions is i=1 ni while the number of different
Pm ¡ ¢
combinations of attributes for each single dimension is j=1 m j . Combining
these two results and looking carefully at our structure, we see that the number
Pn hPm ¡m¢i(i−1 n
)
of different profiles is equal to i=1 j=1 j (n − i + 1). If we use the
clustering algorithm, in this case we will obtain a number of clusters equal to the
number of different combinations of dimensions, seen above. To access the right
clusters, the system has to access to all of their roots and then find
Pthe right way
n £¡ ¢¤
on the selected tree. So it can ignore a number of trees equal to i=1 ni + 1.
We can see that in this situation the worst case is when the root of the selected
WISM'06 1085
cluster has the maximum number n of dimensions, because it would be the tree
with the maximum number of nodes. The remaining clusters will contain a num-
Pn−1 hPm ¡m¢i(i−1 n
)
ber of profiles equal to i=1 j=1 j (n − i + 1). These profiles will be
ignored by the system, which has only accessed the roots of their clusters, that
are not profiles. So, counting also these accesses, we find that at least we obtain to
Pn−1 hPm ¡m¢i(i−1 n
) Pn ¡ ¢
save a number of accesses equal to i=1 j=1 j (n − i + 1)− i=1 ni .
This number doesn’t take count of the fact that the system won’t access all the
nodes of the selected cluster, but only some of them, depending on its structure.
Anyway it gives evidence that, using this clustering algorithm, it is possible to
obtain a certain benefit in terms of number of accesses.
4.2 Experimental results
We have designed and developed a practical implementation to test the our ap-
proach. The system relies on an RDF database of contents and the selection of
data is done by performing RDF queries expressed in SPARQL [8]. We use an
RDF(S) representation for schema level and RDF for instance level. The im-
plementation makes use of Jena [6], a Java framework for building RDF based
applications. Each rule is implemented in terms of Inference Rule of Jena [7].
Using the Inference Engine of Jena, the system generates a configuration that
infers the adaptation on RDF(S) documents. Finally the system returns several
response as output, using XSLT transformations on RDF instances. We have
tested our system to experiment the effectiveness and the efficiency of the ap-
proach illustrated in this paper in several practical cases. On the server side, we
used an IBM computer xSeries 225, equipped with a Xeon2 2.8Ghz processor,
a 4 GB RAM, and a 120 GB HDD SCSI. The Web site have been accessed by
three different types of devices: a mid-range desktop, several PDAs with differ-
ent capabilities, and some cellular phones. Figure 4 reports the average response
time (vertical axis) for each device type (horizontal axis). The Figure shows
promising times to produce the adapted response. The Desktop client presents
an average time of 72 ms, the PDA client 97 ms and the Wap client 121 ms.
The diagram compares the medium response time in absence and in presence
of clusters. In particular the system, using clusters, capitalizes the approach for
mobile devices, almost halving the response time. For instance the Wap client
benefits from 121 ms to 59 ms.
5 Conclusions and future works
In this paper we have presented a novel rule-based approach supporting the
automatic adaptation of content delivery in Web Information Systems. This
problem arises in common scenarios where the information system is accessed,
in different contexts, by a variety of mobile devices. The adaptation is achieved
automatically by means of special rules that specify, in a declarative way, how
a configuration can be generated to meet the requirements of adaptation of a
1086 Web Information Systems Modeling
List of Rules Clusters of Rules
140
medium response time (ms)
120
100
80
60
40
20
0
desktop pda wap
devices
Fig. 4. Results of experiments
given profile. Finally the organization of rules in a set of clusters guarantees the
responsiveness of the system in a dynamic scenario. Different contexts and or-
thogonal requirements of adaptation, possibly not fixed in advance, can be taken
into account in this process. The results presented in this paper are subject of
further conceptual and practical investigation. From a conceptual point of view,
we are currently investigating the expressive power of rules and the generality
of clusters. In particular we are improving the distance between profiles, using
a comparison between common pathes. From a practical point of view we are
improving the efficiency of the system and we are extending its functionality.
References
1. S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley,
1995.
2. C. Bettini, D. Maggiorini, and D. Riboni. Distributed context monitoring for con-
tinuous mobile services. In IFIP TC8 Working Conference on Mobile Information
Systems (MOBIS), 2005.
3. S. Ceri, F. Daniel, V. Demaldé, and F. M. Facca. An approach to user-behavior-
aware web applications. In 5th International Conference on Web Engineering
(ICWE), pages 417–428, 2005.
4. S. Ceri, P. Fraternali, A. Bongio, M. Brambilla, S. Comai, and M. Matera. Design-
ing Data-Intensive Web Applications. Morgan Kaufmann Publishers Inc., 2002.
5. R. DeVirgilio, R. Torlone, and G.-J. Houben. A rule-based approach to content
delivery adaptation in web information systems. In 7th International Conference
on Mobile Data Management (MDM’06) - to appear -, 2006.
6. HP-Labs. Jena. http://jena.sourceforge.net/, 2004.
7. HP-Labs. Jena 2 inference support. http://jena.sourceforge.net/inference/, 2005.
8. Hp-Labs. Sparql, query language for rdf. www.w3.org/TR/rdf-sparql-query/, 2005.
9. C. Isert. The editing distance between trees. Technical report, Ferienakademie, for
course 2: Bume: Algorithmik Und Kombinatorik, 1999.
10. H. Kiyomitsu, A. Takeuchi, and K. Tanaka. Activeweb: Xml-based active rules for
web view derivations and access control. In International workshop on Information
technology for virtual enterprises (ITVE01), pages 31–39, 2001.