WISM'06                                                                                   1077


          An Efficient Implementation of a Rule-based
              Adaptive Web Information System

                         Davide Valeriano, Roberto De Virgilio,
                        Riccardo Torlone, and Daniele Di Federico

                        Dipartimento di Informatica e Automazione
                             Università degli studi Roma Tre
              {valeriano,devirgilio,torlone,difederico}@dia.uniroma3.it


           Abstract. Mobile devices provide a variety of ways to access informa-
           tion resources available on the Web and a high level of adaptability to
           different aspects (e.g., device capabilities, network QoS, user preferences,
           location, an so on) is strongly required in this scenario. In this paper we
           propose a technique for the efficient adaptation of Web Information Sys-
           tems. The approach relies on special rules that allows us to specify, in
           a declarative way, how to generate, in an automatic way, the adaptation
           specifications that satisfy the requirements of a given context. Rules are
           classified in a set of clusters and a matching relation between clusters
           and context requirements is defined. The technique of rule evaluation and
           clustering guarantee that different contexts and orthogonal requirements
           of adaptation, possibly not fixed in advance, can be efficiently organized
           and taken into account in the adaptation process.


     1    Introduction
     Modern Web Information Systems (WIS) are challenged to generate (automat-
     ically) a hypermedia presentation built from information that is gathered from
     different, possibly heterogeneous, sources that are distributed over the Web.
     Nowadays, given the spread of mobile devices, a novel and fundamental require-
     ment of these systems is the ability to adapt and personalize content delivery
     according to the context of the client. An important point that needs to be taken
     into account is that this scenario is very dynamic: some aspects of the context
     can dynamically change. For instance new users with new preferences can be
     enabled to access the information system, possibly unpredicted types of access
     devices can be used, alternative network connections can be made available, fur-
     ther aspects of the context (like the location or others environmental factors)
     need to be incorporated. Also, the technology used to implement the adaptation
     can be changed and this should have a limited impact on the overall application.
     It follows that the adaptation process should guarantee a high level of flexibility
     in terms of responsiveness to rapidly changing requirements of adaptation and
     suitable actions to be undertaken to meet these requirements.
         In the literature, some adaptation approaches rely on the definition of rules
     to generate suitable adaptation specifications [2, 3, 10]. In [5] we have presented
1078                                                                                                                 Web Information Systems Modeling


       a general methodology for context adaptation, based on special rules that model
       the adaptation at an abstract level, inspired by the approaches cited above. The
       rules specify, in a declarative way, how to generate, in an automatic way, the
       adaptation specifications that satisfy the requirements of a given context. A
       relevant problem is to rapidly select and activate the rules that best fit the re-
       quirements of adaptation of the context (in particular when the context changes
       frequently). In this paper we present an efficient implementation to optimize the
       activation process of adaptation rules. To this aim, we introduce a clustering
       technique to classify the adaptation specifications.
           The rest of the paper is organized as follows. Section 2 introduces the rule-
       based approach. In Section 3, we illustrate how to define the set of clusters
       and the process to efficiently activate a rule, using the clusters. In Section 4
       we present a practical implementation and experimental results and finally, in
       Section 5 we draw some conclusions and sketch future works.


       2         A Rule-based approach to model adaptation
                 requirements

       In this section we present special rules to specify, in a declarative way, how to
       generate, in an automatic way, adaptation specifications. The rules are based on
       two basic notions: the generic profile and the abstract configuration.


       2.1          Generic profiles

       A (generic) profile is a description of an autonomous aspect of the context in
       which the Web site is accessed (and that should influence the presentation of
       its contents). Examples of profiles are the user, the device, the location, and so
       on. A dimension is property that characterizes a profile. Each dimension can
       be conveniently described by means of a set of attributes. Attributes can be
       simple or composite. A simple attribute has a value associated with it, whereas a
       complex attribute has a set of (simple or complex) attributes associated with it.
       In this model, a context is a collection of profiles. Figure 1 reports a graphical
       example of profiles. In our model, different profiles over the same dimensions can


                                                           P1                                                                            P2
                                                                                                                                                                  P3


                   Hardware                          Software                     Category                     Hardware               Software      Category   Category


           cpu           display        memory     os           version    type       family   model               display                            type      type
                                                                                                                                        OS


                 width             height                                                                  width             height


                                                 profile                  dimension                complex attribute                   simple attribute


                                                                     Fig. 1. An example of profiles
WISM'06                                                                                         1079


     be compared making use of a subsumption relationship C. Intuitively, given two
     profiles P1 and P2 , if P1 C P2 then P2 is more detailed than P1 since it includes
     the attributes of P2 at the same or at coarser level of detail. More precisely, we
     first say that an attribute A covers another attribute B if either they are simple
     and A = B or they are composite and for each sub-attribute Ai of A there is a
     sub-attribute Bj of B such that Ai covers Bj . The subsumption relationship is
     then defined as follows.
     Definition 1 (Subsumption) Given two profiles P1 and P2 , we say that P1 is
     subsumed by P2 , P1 C P2 , if for each dimension d of P1 there is a dimension d’
     of P2 such that for each attribute A of d there is an attribute A’ of d’ covered by
     A.
     As an example, given the profiles reported in Figure 1 we have that P2 C P1
     and P3 C P2 .

     2.2     Abstract configurations
     We introduce the notion of (abstract) configuration as a triple C = (q, h, s)
     where:
      – q is a query over the underlying database expressed in relational calculus, a
        declarative and abstract language for relational databases ([1]);
      – h is an abstract hypertext definition expressed in WebML ([4]), a conceptual
        model for Web application which allows us to describe the organization of
        Web pages in a tight and concise way, by abstracting their logical features;
      – s is presentation specification expressed in terms of an original notion of
        logical style sheet, which is a set of tuples over a predefined collection of
        Web Object Types (WOTs) like text, image, video, form and so on; each
        WOT τ is associated with a set of presentation attributes: they identify
        possible styles (e.g. font, color, spacing, position) that can be specified for τ .
           An example of configuration is reported below.
      q = {T : x1 , S : x2 , D : x3 , C : x4 , N : x5 |
            NewsItems(N : x0 , T : x1 , S : x2 , D : x3 , G : x6 , J : x7 ),
            Details(NK : x0 , P : x8 , C : x4 ), Newspapers(J : x7 , N : x5 , C : x9 ), x6 = Sport}
     h=      IndexUnit NewsIndex
               ( source News Items; attributes Title, Date; orderby Date )
             DataUnit ContentData
               ( selector Item = CurrentItem; attributes Title, Date, Content)
             link NewsToDetails
               ( from NewsIndex to ContentData
                 parameters CurrentItem = NK)
             link ContentToRestOfContent
               ( from ContentData to ContentData
                 parameters CurrentItem = NK)
     s=      Text( Font: Arial, ..., Border: 0pt)
             Link( Note: False, ..., Color: Blue)
             Image( Format: jpeg, ..., Alignment: left)
1080                                                      Web Information Systems Modeling


          It is important to note that a configuration is indeed a logical notion that
       can be represented and implemented in several ways and with different syntaxes.
       This property guarantees the generality of the approach with respect to actual
       languages and tools used to implement the adaptive application. For instance,
       we can implement a configuration using SQL at the content level, XHTML at
       the navigation level and a set of CSS files at the presentation level.

       2.3   Adaptation Rule
       The matching relationship between configurations and profiles is represented by
       means of a novel notion of adaptation rule. An adaptation rule has the form
       Pr : Cd ⇒ Cf where:
        • Pr is a parametric profile, that is, a profile in which parameters can appear
          in place of values,
        • Cd is a condition, made out of a conjunction of atoms of the form AθB or
          Aθc, where A and B are parameters occurring in Pr , c is a constant value,
          and θ is a comparison operator,
        • Cf is a parametric configuration in which parameters occurring in Pr can
          appear in place of values.
       The intuitive semantics of an adaptation rule is the following: if the client has a
       profile Pr and the condition Cd is verified then generate the configuration Cf .
           An example of adaptation rule for a device profile with the hardware dimen-
       sion H is the following:
                   H (ImgCapable : X , ScreenSize : Y , ColorCapable : Z ) :
                   X = false, Y < 1500 ⇒ {q, h, s}
       where X, Y , and Z are the parameters of H and {q, h, s} is the following para-
       metric configuration:
                   q = { T : x1 , S : x2 , D : x3 , C : x4 |
                       NewsItems(N : x0 , T : x1 , S : x2 , D : x3 , G : x5 ,
                       J : x6 ), Details(NK : x0 , P : x7 , C : x4 )}
                  h = Page NewsPage ( units NewsIndex )
                        Page ContentPage ( units ContentData )
                        IndexUnit NewsIndex
                          ( source News Items; attributes Title, Date;
                            orderby Date )
                        DataUnit ContentData
                          ( selector Item = CurrentItem;
                            attributes Title, Date, Content)
                        link NewsToDetails
                          ( from NewsIndex to ContentData
                            parameters CurrentItem = NK)
                  s = Text( Color: Black, Size: 8pt)
                        Link( Style: Underline, Color: Black)
                        Image( Size: Y × 0, 5, Color: Z)
WISM'06                                                                            1081


     Note the use of the parameters Y and Z of H to resize the images and to
     eliminate/maintain colors.
         A profile P activates a rule Pr : Cd → Cf if Pr C P . Hence, a profile P can
     activate a rule for a profile that is more general than P . Now, let R be a rule
     Pr : Cd → Cf activated by a profile P and let C be the configuration obtained
     from Cf by substituting the parameters of Pr with the actual values occurring
     in P : we say that C is generated by P using R.


     3     An optimization strategy

     We assume to have at disposal an initial set of rules SR . In an adaptability
     scenario, context changes generate events that produce one or more profiles
     (instances). Each profile (instance) can activate a rule of SR . We should optimize
     the research to select and activate efficiently the most appropriate rules and
     generate the most suitable configuration. So we organize SR classifying the rules
     in a set of clusters. Each cluster represents a class of adaptation that matches
     the requirements of a class of context.


     3.1   A distance between profiles

     We define a distance that allows to compare profiles. We take advantage of the
     tree structure of the generic profile. We get inspiration from the Tree Edit Dis-
     tance [9], where the distance between two trees T1 and T2 , that we indicate as
     dted (T1 , T2 ), depend on the sequence of operations (chosen in a limited prede-
     fined set) that transforms T1 into T2 . We assign a fixed cost to each operation:
     the distance between the considered trees is the sum of essential costs for the
     transformation. For our case, let’s consider the set of operations edit, remove
     and add, respectively to modify the content of a node, to delete a node and to
     create a new node. In our tree structures, we distinguish two principal types of
     node: dimension and attribute. We have operations to modify, remove or create
     dimensions and attributes and we indicate the cost of an operation with γ.

     Definition 2 (S-derivation) Given two trees T1 and T2 , S = s1 , ... , sk the
     sequence of operations to transform T1 in T2 , an S-derivation between T1 and
     T2 is a sequence U0 , ... , Uk of trees resulting by applying the single
                                                                        Pi=k operation,
     where U0 = T1 and Uk = T2 , and the cost γ(S − derivation) is i=0 γ(si ).

        So we can intuitively define the distance between two profiles P1 and P2 , the
     minimum cost for a S-derivation between them.

     Definition 3 (Distance between profiles) Given two profiles P1 and P2 , we
     define the distance between them as dted (P1 , P2 ) = min{γ(S − derivation) | S-
     derivation from P1 to P2 }
1082                                                          Web Information Systems Modeling


       3.2   Definition of Clusters
       Let’s represent a cluster as a tree where the nodes are profiles. The root repre-
       sents the minimum set of information that nodes of the tree share. Basically a
       client is identified by the profile sent to the system. Let’s remember that a profile
       is principally described by dimensions, that characterize the main information,
       articulated in attributes. So, the root of a cluster is a profile characterized only
       by dimensions. It will contain the minimum information that several profiles has
       to include to be in the same cluster. Given a profile P, pruned by all attributes,
       we can build several beginner clusters, rooted with profiles generated from P,
       on the number and type of dimensions. However the designer can decide the
       detail level for the roots. For instance a designer can decide to articulate some
       dimensions of a root with attributes, to extend the level of detail of the minimum
       information to share.
           We define a cluster as follows

       Definition 4 (Cluster Tree) A cluster CL is a couple {NCL , ECL } where NCL
       is the set of nodes (profiles) and ECL is the set of edges (ni , nj ) such that ni / nj

       We can relate a node Pr of a cluster with the rule Pr : Cd → Cf in an Hash
       table using Pr as access key.
           Given the set of roots Rt = {Rt1 , ... ,Rtn } and the set of rules SR , we initialize
       a set of clusters CL = {CL1 , ... ,CLn } as follows:
        • we set CLi = {{Rti },∅},
        • ∀ (Pr : Cd → Cf ) ∈ SR , (i) we search a CLi such that Rti C Pr , and
          we navigate the cluster in depth until level k where doesn’t exist any node
          subsumed by Pr , (ii) we search at level k −1 the node N d such that N d C Pr
          and if there are more nodes N d such that N d C Pr we choose the node with
          lowest distance, (iii) we add Pr to NCLi and (N d, Pr ) to ECLi , (iv) if there
          are one or more nodes N di such that Pr C N di then we delete any edge of
          type (N dj ,N di ) in ECLi and add to it the edge (Pr , N di ),
        • for each node of a cluster we add the relative rule to the Hash table.

       3.3   The activation process
       Built the set of clusters, now we can perform an optimization strategy to activate
       a rule in an efficient way. Given the set of rules SR , the set of clusters CL =
       {CL1 , CL2 , ..., CLn }, the Hash table Ht and a profile instance PI of a profile
       schema PS , we produce a set of configurations as follows:

        1. we initialize P 0 with PS ;
        2. we search the cluster CLi in CL, with a root Rti such that Rti C P 0 and
           that has the lowest distance from it;
        3. (i) we navigate in depth the cluster until level k where any node, that is
           subsumed by P 0 , doesn’t exist; (ii) we search at level k − 1 the node N d such
           that N d C P 0 and if more nodes are subsumed by P 0 we choose the one with
WISM'06                                                                               1083


         lowest distance; (iii) we extract the rule Rl , related in the Ht to N d, and if
         the condition is satisfied, we activate Rl and instance the configuration with
         the value of PI ; (iv) we update P 0 as P 0 - N d;
      4. we iterate from step (2.) until P 0 is not empty and a rule that can be activated
         by P 0 exists;
      5. if no rule can be activated in this way, the system will return a default
         configuration.


     3.4   An example of interaction between profiles and clusters


     Let’s assume a client sending a profile PI (see Figure. 2(a)), instance of a profile
     schema P to the system. Assume also that the system has already initialized its
     own set of clusters, composed by clusters C1 and C2 (see Figure 3). We easily see
     that the root of the cluster C2 is the one which subsumes P and has the lowest
     distance from it. So we navigate that cluster and note that the only child N of
     the root it is equal to P ; then we extract the rule Rl related in the Ht to N and
     if the condition is satisfied, we activate it and instance the configuration with
     the values of PI . Subsequently, assume that a new profile PI0 (see Figure 2(b)),
     instance of a profile P 0 , is sent by another client to the system. At the same way
     explained before, we see that the root of the cluster that subsumes P 0 and has
     the lowest distance from it, is cluster C1 root. Navigating that cluster, we need
     to confront P 0 with root’s two children. We easily see that no one of them is
     equal to or has a subsumption relation with P 0 : being their parent a root, is not
     possible to activate its rule; so, as we explained before, the system will return a
     default configuration.


                          Device                                  Device


                          Hardware                  Hardware               Software


                              display                   display              OS


                       width       height         width      height          Win


                        800         640           800         640

                       (a) Profile PI                     (b) Profile PI0

                                 Fig. 2. Profiles sent to the system.
1084                                                                                               Web Information Systems Modeling


                                             Device                                                      Device

                                  Hardware             Software
                                                                                                         Hardware


                         Device                                                                                     Device
                                                               Device
                                                                                                                                         -
              Hardware            Software         Hardware                 Software                                Hardware             -
                                                                                                                                   Pr1:Cd1 -> Cf1
                                                                                                                                         -
               display        OS       Version         CPU             OS        Version                             display             -
                                                                                                                                         -
             width   height
                                                                                                              width       height


                                                 Device                                             Device

                                       Hardware            Software
                                                                                                   Hardware

                                   display       CPU      OS
                                                                                               display
                                                                  Version
                                                                                                             CPU
                              width     height
                                                                                           width    height


                              (a) Cluster C1                                                                 (b) Cluster C2

                                                                            Fig. 3. Cluster Set


       4      A practical implementation
       4.1      Benefits of Clustering
       We can easily see that using this clustering algorithm we can reduce the number
       of accesses to SR . In our system, we have seen that a profile given by a client can
       activate a rule in the default set. So when a client sends a profile to the system it
       has to search for the most suitable rule in SR . We can assume that the worst case
       is when the system has the greatest possible number of default rules, that is when
       the system has in SR one rule for each possible profile that a client can send.
       Without using clusters we should execute a number of ordered accesses equal to
       the total number of possible profiles. It is easy to see that the number of possible
       profiles is the number of profiles obtained with every possible different combina-
       tion of dimensions and attributes. Assuming to have a complete schema with n
       dimensions and m attributes for each dimension,
                                                  Pn ¡ ¢ we can see that the number of
       different combinations of dimensions is i=1 ni while the number            of different
                                                                     Pm ¡ ¢
       combinations of attributes for each single dimension is j=1 m         j  . Combining
       these two results and looking carefully at our structure, we see that the number
                                         Pn hPm ¡m¢i(i−1      n
                                                                )
       of different profiles is equal to i=1       j=1 j          (n − i + 1). If we use the
       clustering algorithm, in this case we will obtain a number of clusters equal to the
       number of different combinations of dimensions, seen above. To access the right
       clusters, the system has to access to all of their roots and then find
                                                                            Pthe    right way
                                                                               n £¡ ¢¤
       on the selected tree. So it can ignore a number of trees equal to i=1 ni + 1.
       We can see that in this situation the worst case is when the root of the selected
WISM'06                                                                               1085


     cluster has the maximum number n of dimensions, because it would be the tree
     with the maximum number of nodes. The remaining clusters will contain a num-
                              Pn−1 hPm ¡m¢i(i−1     n
                                                      )
     ber of profiles equal to i=1        j=1 j          (n − i + 1). These profiles will be
     ignored by the system, which has only accessed the roots of their clusters, that
     are not profiles. So, counting also these accesses, we find that at least we obtain to
                                          Pn−1 hPm ¡m¢i(i−1     n
                                                                  )              Pn ¡ ¢
     save a number of accesses equal to i=1          j=1 j          (n − i + 1)− i=1 ni .
     This number doesn’t take count of the fact that the system won’t access all the
     nodes of the selected cluster, but only some of them, depending on its structure.
     Anyway it gives evidence that, using this clustering algorithm, it is possible to
     obtain a certain benefit in terms of number of accesses.

     4.2   Experimental results
     We have designed and developed a practical implementation to test the our ap-
     proach. The system relies on an RDF database of contents and the selection of
     data is done by performing RDF queries expressed in SPARQL [8]. We use an
     RDF(S) representation for schema level and RDF for instance level. The im-
     plementation makes use of Jena [6], a Java framework for building RDF based
     applications. Each rule is implemented in terms of Inference Rule of Jena [7].
     Using the Inference Engine of Jena, the system generates a configuration that
     infers the adaptation on RDF(S) documents. Finally the system returns several
     response as output, using XSLT transformations on RDF instances. We have
     tested our system to experiment the effectiveness and the efficiency of the ap-
     proach illustrated in this paper in several practical cases. On the server side, we
     used an IBM computer xSeries 225, equipped with a Xeon2 2.8Ghz processor,
     a 4 GB RAM, and a 120 GB HDD SCSI. The Web site have been accessed by
     three different types of devices: a mid-range desktop, several PDAs with differ-
     ent capabilities, and some cellular phones. Figure 4 reports the average response
     time (vertical axis) for each device type (horizontal axis). The Figure shows
     promising times to produce the adapted response. The Desktop client presents
     an average time of 72 ms, the PDA client 97 ms and the Wap client 121 ms.
     The diagram compares the medium response time in absence and in presence
     of clusters. In particular the system, using clusters, capitalizes the approach for
     mobile devices, almost halving the response time. For instance the Wap client
     benefits from 121 ms to 59 ms.


     5     Conclusions and future works
     In this paper we have presented a novel rule-based approach supporting the
     automatic adaptation of content delivery in Web Information Systems. This
     problem arises in common scenarios where the information system is accessed,
     in different contexts, by a variety of mobile devices. The adaptation is achieved
     automatically by means of special rules that specify, in a declarative way, how
     a configuration can be generated to meet the requirements of adaptation of a
1086                                                                                             Web Information Systems Modeling


                                                                     List of Rules    Clusters of Rules

                                                     140


                         medium response time (ms)
                                                     120
                                                     100
                                                     80
                                                     60
                                                     40
                                                     20
                                                      0
                                                           desktop                   pda                  wap
                                                                                devices


                                                           Fig. 4. Results of experiments


       given profile. Finally the organization of rules in a set of clusters guarantees the
       responsiveness of the system in a dynamic scenario. Different contexts and or-
       thogonal requirements of adaptation, possibly not fixed in advance, can be taken
       into account in this process. The results presented in this paper are subject of
       further conceptual and practical investigation. From a conceptual point of view,
       we are currently investigating the expressive power of rules and the generality
       of clusters. In particular we are improving the distance between profiles, using
       a comparison between common pathes. From a practical point of view we are
       improving the efficiency of the system and we are extending its functionality.

       References
        1. S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley,
           1995.
        2. C. Bettini, D. Maggiorini, and D. Riboni. Distributed context monitoring for con-
           tinuous mobile services. In IFIP TC8 Working Conference on Mobile Information
           Systems (MOBIS), 2005.
        3. S. Ceri, F. Daniel, V. Demaldé, and F. M. Facca. An approach to user-behavior-
           aware web applications. In 5th International Conference on Web Engineering
           (ICWE), pages 417–428, 2005.
        4. S. Ceri, P. Fraternali, A. Bongio, M. Brambilla, S. Comai, and M. Matera. Design-
           ing Data-Intensive Web Applications. Morgan Kaufmann Publishers Inc., 2002.
        5. R. DeVirgilio, R. Torlone, and G.-J. Houben. A rule-based approach to content
           delivery adaptation in web information systems. In 7th International Conference
           on Mobile Data Management (MDM’06) - to appear -, 2006.
        6. HP-Labs. Jena. http://jena.sourceforge.net/, 2004.
        7. HP-Labs. Jena 2 inference support. http://jena.sourceforge.net/inference/, 2005.
        8. Hp-Labs. Sparql, query language for rdf. www.w3.org/TR/rdf-sparql-query/, 2005.
        9. C. Isert. The editing distance between trees. Technical report, Ferienakademie, for
           course 2: Bume: Algorithmik Und Kombinatorik, 1999.
       10. H. Kiyomitsu, A. Takeuchi, and K. Tanaka. Activeweb: Xml-based active rules for
           web view derivations and access control. In International workshop on Information
           technology for virtual enterprises (ITVE01), pages 31–39, 2001.