A Heterogeneous Multi-Agent System for Adaptive Web Applications

Andrea Bonomi, Giuseppe Vizzari
Department of Informatics, Systems and Communication
University of Milan–Bicocca
Via Bicocca degli Arcimboldi 8, 20126 Milano, Italy
{andrea.bonomi, vizzari}@disco.unimib.it

Marcello Sarini
Department of Psychology
University of Milan–Bicocca
Piazza dell’Ateneo Nuovo 1, 20126 Milan - Italy
sarini@disco.unimib.it

Abstract: A web site presents an intrinsic graph-like spatial structure composed of pages connected by hyperlinks. This structure may represent an environment in which agents related to visitors of the web site are positioned and moved in order to track their navigation. Considering this structure and keeping track of these movements allows the monitoring of the site and of its visitors, in order to support the enhancement of the site itself through forms of adaptivity, carried out by specific interface agents. This paper presents a heterogeneous multi-agent system supporting the collection of information related to users’ behaviour in a web site by specific situated reactive agents. The acquired information is then exploited by an application supporting the proposal of hyperlinks based on the history of the user’s movement in the web site environment.

I. INTRODUCTION

A web site presents an intrinsic graph-like spatial structure composed of pages connected by hyperlinks. However, this structure is generally not considered by web servers, which essentially act as a sort of extended and specific File Transfer Protocol servers [1], receiving requests for specific contents and supplying the related data. Several web-based applications instead exploit the structure of the site itself to support users in their navigation, generating awareness of their position. For instance, many e-commerce sites emphasize the hierarchical structure linking pages related to categories (and possibly subcategories), included products and their specific views, and remind users of their relative position (i.e. links to higher level nodes in the tree structure). Some specific web-based applications, mainly bulletin boards and forums (see, e.g., phpBB¹), are also able to inform users about the presence of other visitors of the web site or even, more precisely, of the specific area of the site that they are currently viewing. Web site structure and users’ context thus represent pieces of information that can be exploited to supply visitors with a more effective presentation of site contents.

¹ http://www.phpbb.com/

Different visitors, however, may have very different goals and needs, especially with reference to large web sites made up of several categories and subcategories. This consideration is the main motivation for the research in the area of adaptive web sites [2]. The various forms of adaptation may provide a customization of the site’s presentation for an individual user or even an optimization of the site for all users. There are various approaches supporting these adaptation activities, but they are generally based on the analysis of log files which store low-level requests to the web server: this kind of file is generally made up of entries including the address of the machine that originated the request, the indication of the time and the resource associated to the request. In order to obtain meaningful information on users’ activities these raw data must be processed (see, e.g., [3]), for instance in order to collapse requests related to various elements of a single web page (e.g. composing frames and images) into a single entry. Moreover, this kind of information must be further processed to detect groups of requests that indicate the path (web pages connected by hyperlinks) that a user followed in the navigation. Recent results [4] show that this kind of analysis, also referred to as web usage mining, could benefit from the consideration of site contents and structure.

This paper proposes to exploit the graph-like structure of a web site as a Multi-Agent System (MAS) environment [5] on which agents representing visitors of the web site (hereafter user agents) are positioned and moved according to their navigation. In particular, in this case, the environment is a virtual structure which allows the gathering of information on users’ activities in a more structured way, simplifying subsequent phases of analysis and adaptation of site contents. Furthermore, part of the adaptivity could be carried out without the need of an off-line analysis, as the result of a more dynamic monitoring of users’ activities. In particular, the paths that are followed by users are often related to recurrent patterns of navigation which may indicate that the user could benefit from the proposal of additional links providing shortcuts to the terminal web pages, as a sort of suggestion to the web site visitor. Index pages may thus be enhanced by the inclusion of links representing shortcuts to the typical destinations of the user in the navigation of the web site. Moreover, links between terminal content pages that are not provided by the static structure of the site can also be identified and exploited. Users without a relevant history (and also anonymous or unrecognized ones) may instead exploit the paths that are most commonly followed by site visitors. Moreover, such information could also be communicated to the webmaster, suggesting possible modifications to the static predefined structure of the site. This approach thus provides support both for site optimization and for the customization to specific visitors’ needs and preferences.

The metaphor of a web site as an environment on which users move in search of information is not new (see, e.g., [6] but also more recent approaches such as [7]), and its application to web site adaptation resembles the emergent, collective phenomenon of trail formation [8] which can be identified in several biological systems. However, this proposal does more than gather information on users’ behaviours for the sake of web page adaptation or navigation support: it exploits the MAS environment to provide users with a means for mutual perception and interaction. In fact, information related to users’ positions on the environment representing the web site can also be used to supply them with awareness information on other visitors which are currently browsing the same page or area of the site. Moreover, keeping track of this information enables a form of interaction among users that is based on their positions on the site. Essentially, more than just showing a user the other registered visitors that are “nearby” (i.e. viewing the same page or adjacent ones), the system could also allow the user to communicate with them. This form of interaction, in addition to the web page adaptation function, requires the adoption of a supporting technology that goes beyond the request/response model.

The overall system architecture thus requires proper interface agents, able to interact with user agents situated in the previously introduced environment in order to exploit the acquired information on users’ behaviours. This second type of agent is totally different from user agents, both from a modelling point of view and with reference to the supporting technology. In fact, the web interface agent must be active as long as the related web page is being viewed by a visitor and it must be able, in collaboration with the rest of the system, to proactively modify the page to improve the user’s browsing experience. The overall system architecture thus includes heterogeneous agents collaborating to achieve this goal.

The following section describes the general framework of this approach, the mapping between the web site structure and agents’ environment, while Section III introduces the gathered information on agents’ movement in their environment. Section IV describes an application providing the exploitation of this information for the adaptation of web pages, both for customization and optimization. The adopted technology supporting the design and development of the related interface agent is introduced, and discussed with reference to existing alternatives. A brief comparison of this approach and related work can be found in Section V, and finally concluding remarks and future developments will end the paper.

II. SITE STRUCTURE AND REACTIVE USER AGENTS

A web site is made up of a set of HTML pages (generally including multimedia contents) connected by means of hyperlinks. It is possible to obtain a graph-like structure mapping pages to nodes and hyperlinks to edges interconnecting these nodes. This kind of spatial structure could be exploited as an environment on which user agents related to site visitors are placed and move according to the related users’ activities. A sample mapping between a web site and this kind of structure is shown in Figure 1.

Fig. 1. The diagram shows a mapping between a web site structure and an agent environment.

This structure can be either static or dynamic: for instance it could vary according to specific rules and information stored in a database (i.e. database driven web sites). However, this kind of structure (both for static and dynamic web sites) can generally be obtained by means of a crawler (see, e.g., Sphinx [9] and the related WebSphinx project²); then it could be maintained by means of periodic updates.

² http://www-2.cs.cmu.edu/~rcm/websphinx/

Fig. 2. A diagram showing how user actions influence the related reactive user agent through the capture of requests by the Tracker module.

Given this spatial structure, a multi-agent model allowing an explicit representation of this aspect of agents’ environment is needed to represent and exploit this kind of information. Environments for Multi Agent Systems [10] and situated agents represent promising topics in the context of MAS research, aimed at providing first class abstractions for agents’ environment (which can be more than just a message transport system), towards a clearer and more concrete definition of concepts such as locality and perception. There are not many models for situated agents which provide an explicit representation of agents’ environment. Some of them are mainly focused on providing mechanisms for coordinating situated agents’ actions [11], others provide the interaction among agents through a modification of the shared environment (see, e.g., [12], [13]). An interesting approach that we adopted for this work is represented by the Multilayered Multi Agent Situated Systems (MMASS) [14] model. MMASS allows the explicit representation of agents’ environment through a set of interconnected layers whose structure is an undirected graph of nodes (also referred to as sites in the model terminology; from now on we will use the term node to avoid confusion with web sites). The model was adopted given the similarity between the defined spatial structure of the environment and the structure underlying a web site. Moreover, the model defines a set of allowed actions for agents’ behavioural specification
(including a primitive for agents’ movement); for this specific application, however, the constraint which limits the number of agents positioned in a node was relaxed. In fact there is no limit to the number of users that are viewing the same web page.

Moreover, a platform for the specification and execution of simulations based on the MMASS model [15] was exploited to implement the part of the system devoted to the management of agents in their environments. The definition of the spatial structure of the environment was supplied by the previously introduced crawler, while agents’ movement is guided by external inputs generated by the requests issued by the related web site visitor. The general architecture of the system is shown in Figure 2: the Agent server module is implemented through the MMASS platform, while the Web server is a Tomcat servlet container hosting SnipSnap³, a Java-based weblog and wiki software. The highlighted Tracker module is implemented through a Java Servlet, which is invoked by every page of the site but does not produce a visible effect on the related web page. The Tracker is responsible for triggering the creation and the movement of agents related to visitors in the environment related to the web site structure. In particular, when a user makes his/her first page request the Tracker is invoked by the interface agent associated to the page. Then the Tracker tries to set a cookie on the client including the session information. If the cookie is accepted, it is possible to use the session information to identify the user; on the other hand, requests from clients not accepting cookies will not be monitored.

³ http://snipsnap.org

The management of agent creation and movement is not as simple as its intuitive description might indicate. In fact, the same user could be using different browser pages or tabs to simultaneously view distinct pages of the site. In other words, a user might be simultaneously following different trajectories in his/her web site navigation. In order to manage these situations, a user can be related to different agents, and his/her requests must be associated to the correct agent (possibly a new one). Finally, agents related to finished (or interrupted) user navigation should be eliminated by the system, storing the relevant part of their state in a persistent way, until the related user requests a page of the site again. In particular, remote users’ requests may be divided into two main classes, according to their effects on the Tracker and Agent server:

• creating a new agent: whenever a new user requests a web page, the Tracker will invoke the Agent Server requiring the creation of an agent whose starting position is the node related to the requested page; the same effect is generated by a request coming from an already registered user which was not present in the system, but in this case information related to previous user agents must be retrieved in order to determine the new agent’s state; finally, when an already registered and active user requests a page that is not adjacent to its current one, a new agent related to the new browsing activity must also be created;

• generating the movement of an agent: when the viewer of a page follows one of the provided links, the related web browser will generate a request for a page that is adjacent to one of the related agents, which must be moved to the node related to the requested page; whenever there are two or more agents in positions that are adjacent to the requested page, in order to solve the ambiguity and choose the agent to be moved, the Tracker will consult the Session object, in which it stores the current URL related to the viewed page.

The following section will describe how the raw information that can be gathered thanks to the above described framework can be processed in order to obtain higher level indications on users’ behaviours. Since the interface agent collaborates in the user monitoring process, more details on this topic will instead be given in Section IV-B.

III. GATHERED INFORMATION: BROWSING TRACES

This system allows gathering and exploiting two kinds of information: first of all, situated agents related to web site visitors have a perception of their local context, in terms of relative position, adjacent nodes and presence of other visitors; second, agents may gather information related to the paths defined by the browsing activities of the related user in the site itself.

There are inherent issues in determining in a precise way the actual users’ activities on the web site, due to the underlying request/response model: the only available indications on these activities can be obtained by requests captured by the Tracker. In particular, we have an indication of the page that was requested by a user and the time-stamp of the request. Starting from this raw information the system can try to detect emerging links, which are hyperlinks that are not provided by the structure of the site but can be derived from the behaviour of specific visitors. To this purpose, the concept of trace was introduced as a higher level piece of information describing the behaviour of a user. A trace synthesizes a path followed by a user, from the web page representing his/her entry point, to a different point of the environment (i.e. another web page) which may represent an interesting destination. Every agent related to a visiting user is associated to a temporary trace, and it may generate several actual traces (also called closed traces) in the course of its movement in the environment.

Formally a trace is a three-tuple $\langle \mathit{AId}, \mathit{Start}, \mathit{Dest} \rangle$, where AId represents the identifier of the agent to which the trace is related, while Start and Dest indicate the starting and destination node related to the browsing sequence which generated the trace. A new trace is generated when a user enters the site, triggering the creation of a related agent. The starting trace has a null value for the destination node. Subsequent requests generated by the user following hyperlinks will bring the related agent to an adjacent node, and the Dest field of the corresponding trace will be modified in order to reflect the user’s current position. Non trivial traces provide Start and Dest nodes that are not directly connected by means of a hyperlink.

There are two relevant exceptions to the basic rule for trace update, related respectively to the duplication of a trace and to its closing. According to the previously introduced informal definition, a trace should be coherent in time and space. In fact, whenever the same user simultaneously requests two or more different pages he/she is probably following distinct search trajectories, possibly even related to different goals. In this case, as previously introduced, the Tracker will detect this situation and create additional agents that refer to the same user. Figure 3 shows two sample situations providing respectively trace duplication and closing: in (a) the user has chosen to open a hyperlink in a new browser page (request 1) and then has followed another link in the first browser page (request 2). According to the previously described Tracker behaviour, two agents are now associated to the user, and they are associated to different traces sharing the Start field.

In (b), instead, the user has followed links 1 and 2 from the starting page, then he/she made a step back (request 3) and eventually moved to the last known position (request 4). The step back causes the closure of the temporary trace associated to the agent (Trace 1 in the Figure), and the creation of a new temporary one with the same Start field (Trace 2). In this case the step back may have different interpretations: it could refer to a negative evaluation of the page contents but it could also indicate the fact that the user has found what he/she was searching for. A piece of information that could be exploited to determine if the Dest field of the trace was interesting for the user is the time interval between requests 2 and 3: for instance, given a threshold $\Delta t_d$ indicating the minimum time required to reasonably inspect the content of a specific web page, if $timestamp(3) - timestamp(2) < \Delta t_d$ then Trace 1 could be ignored. However, the mere interval between the two requests is not a safe indicator of the fact that the page was actually viewed and considered interesting.

Fig. 3. A diagram describing two traces that are derived by a sequence of user requests.

The time spent on a web page is also important in order to determine when a temporary trace must be closed. In fact, whenever a user does not issue requests for a certain time we could consider that his/her browsing activity has stopped, possibly because he/she is reading the page related to the Dest field of the trace associated to the related agent. In other words, every agent has a timer, set to the previously introduced threshold $\Delta t_d$, which is started when the agent is created and reset whenever it moves. The action associated to this timer specifies that the agent’s temporary trace becomes closed, and a new timer is set: the action associated to this second timer causes the disappearance of the agent from the system, and the storage of the related state.

It is important to note that even anonymous visitors (i.e. non authenticated ones) whose clients accept cookies can be tracked and can thus generate traces, although anonymous ones. The latter can be exploited for the sake of web optimization but are not relevant for the sake of user specific site customization.

User agents thus provide support to interface agents by monitoring users’ behaviours and, in this specific case, selecting relevant traces. Figure 4 shows how the user agents interact with the interface agents to provide them with relevant information for page adaptation, but more details on this topic will be provided by the following section.

IV. THE WEB INTERFACE AGENT

The aim of the Interface Agent is to improve the browsing experience of a user by adapting the page he/she is currently viewing to his/her preferences, needs or habits. To do so, it must be active during the time-span in which the page is visualized by the browser, and it must be able to dynamically alter its appearance. To this end, it must also be able to interact with the previously introduced system to be informed about the user’s past behaviour. In other words the interface agent is a client-side component, “living” in the web browser and interacting with it in a proactive way, as shown in Figure 4.

Fig. 4. A diagram showing the interaction among an interface agent, the user agent (in the MMASS environment) and the users’ behaviors database.

In the following sections, we describe the technology adopted to implement the interface agent, comparing it with other currently available technologies that could have been selected to develop this kind of client-side web application. Then the behaviour of the interface agent is briefly introduced, focusing on its setting in the overall architecture and on the adopted strategy for page adaptation.

A. Technologies for Web Interface Agents: Java Applet, Flash and AJAX

Today there are several technologies suitable to develop rich client-side web applications, and in particular interface agents able to “live” in a common web browser. The most common are Java Applet, Macromedia (now Adobe) Flash and AJAX. We intentionally chose not to consider recent browser extensions and plug-ins for the visualization of 3D virtual environments, and to focus on more traditional forms of web browser interfaces.

Java Applet⁴ is the oldest technology used to provide interactive features to web applications. An applet is a Java software component that runs in a Web browser using the Java Virtual Machine. Applets can be included in HTML (or XHTML) pages in the same way as an image or another multimedia content, and they are executed in a sandbox, an infrastructure preventing them from accessing the client’s local data (though there may be exceptions to this principle, and in particular trusted applets). This kind of approach is very powerful because applets can exploit all the Java API: they can, for example, generate complex user interfaces, with a rich multimedia support (e.g. 3D graphics, sound, movies), or they can interact with server-side applications via Web Services, Java RMI (Remote Method Invocation) or CORBA. It is possible to develop very complex applications using common Open Source Java IDEs (like Eclipse or Netbeans) and run them in a web browser as applets. Though Java Applet can be a suitable technology for many complex web applications, it is difficult to implement an interface agent with an applet because of its lack of integration with the web browser. An applet is in fact confined in a sandbox and cannot manipulate the data of the page in which it is being executed. For example, an applet cannot be used to extract all the links of the current user page. Another disadvantage of Java Applet is represented by the requirements of the Java Runtime Environment: first of all it is not available by default on all web browsers, moreover it has a large memory occupation (around 20 MB) and applets cannot start until the Java Virtual Machine is running.

⁴ http://www.sun.com/applets/

Flash⁵ is a multimedia technology commonly used to create animations, to build interactive web pages and to develop client-side web applications. The Flash files (called Flash Movies) run in a virtual machine called Flash Player, which is available for a wide variety of different browsers, platforms and devices. The Flash Player is smaller than the Java runtime (less than 1 MB) and it is installed on over 500 million devices and more than 97% of Internet-enabled desktops⁶. Moreover, a Flash Player is embedded in many consumer electronics devices, like the Kodak EasyShare-One digital camera: the user interface, built using Flash, enables simple navigation during picture taking and sharing, and includes rich graphical scene modes. Flash Movies can be programmed with a scripting language called ActionScript, an ECMAScript⁷-based programming language that is object oriented, loosely-typed and has a syntax quite similar to C. In contrast with JavaScript (which is also ECMAScript compliant), ActionScript is compiled into bytecode which is interpreted by a virtual machine. ActionScript has a rich API supporting the elaboration of numbers, strings, XML and graphical elements (vectorial and raster); it allows playing sounds and movies and interacting with server-side applications through a fast proprietary protocol (Flash Remoting⁸) or the slower SOAP (Simple Object Access Protocol).

⁵ http://www.adobe.com/products/flash/
⁶ NPD Online survey, conducted in April 2006
⁷ http://en.wikipedia.org/wiki/ECMAScript
⁸ http://www.adobe.com/products/flashremoting/

AJAX (shorthand for Asynchronous JavaScript and XML) is not a technology in itself, but a term that refers to the use of a group of technologies together [16]. In fact, AJAX is a combination of JavaScript, DHTML (Dynamic HTML)⁹, XML and Remote Scripting (also described in [16]). Remote Scripting is used to deliver content dynamically without the need to refresh the page, and DHTML is a method for creating interactive web pages by using a combination of a markup language (HTML) and a client-side scripting language (JavaScript): one major use of JavaScript is to write functions that are embedded in or included from HTML pages and interact with the Document Object Model (DOM). Other typical examples of JavaScript usage are: validating web form input, opening popup windows, playing sounds, changing image sizes and performing text conversion operations. The scripts can be embedded in HTML pages or contained in .js files linked to the web pages. The overall AJAX web application model, compared to traditional web applications, is shown in Figure 5. Since JavaScript is an interpreted language, errors are not detected until the faulty program line is executed. Another problem of AJAX (and JavaScript in general) is represented by the differences between different JavaScript engine implementations, so applications must be tested systematically on the different target browsers and platforms. Nonetheless, AJAX not only supports a rapid prototyping of web applications but is also suitable for industry-strength systems (from WebGIS applications like Google Maps¹⁰, to complex enterprise messaging and collaboration systems like Zimbra¹¹).

⁹ http://www.w3.org/DOM/faq.html#DHTML–DOM, http://www.w3schools.com/dhtml/
¹⁰ http://maps.google.com/
¹¹ http://www.zimbra.com/

Fig. 5. The traditional web applications compared to the AJAX model. Figure by J. J. Garrett taken from [16].

To compare the different technologies, several sample applications that are available and freely accessible online can be evaluated. In particular several instant messengers have been implemented adopting Java Applet, Flash and AJAX technologies: for instance ICQ2Go!¹² is available both as a Java Applet and as a Flash application and Meebo¹³ is developed with AJAX. Although all are instant messenger applications, the user experience is very different: the Java version of ICQ2Go! has a very long startup time and it requires a huge amount of memory but it has most functions of the stand-alone ICQ client application and it is able to communicate with the server adopting the common ICQ protocol. The new Flash version of ICQ2Go! and Meebo are comparable in terms of user experience: both of them start much faster than the ICQ2Go! applet, but they still have a very good look and feel and an extensive set of functionalities. However, both the Flash and the AJAX version required a special server-side wrapper because they can communicate only with an XML protocol.

¹² http://go.icq.com/
¹³ http://www.meebo.com/

After the analysis of the various technologies, we have chosen to adopt AJAX in order to develop the Interface Agent. With AJAX, it is possible to create an agent hosted in the web browser that remains alive and active during the visualization of a web page. So it is possible to go beyond the classic web request/response model and develop proactive interface agents.

We chose AJAX instead of Flash because it is possible to develop AJAX applications with Open Source tools (in fact, only a common text editor is needed). Today, a commercial IDE is required to build Flash web applications; although there is an Open Source ActionScript compiler¹⁴, the lack of a proper full-featured Open Source IDE and mature tools for user interface drawing is a major drawback. Compared to Java Applet, instead, AJAX is lightweight and better integrated in the browsing environment: JavaScript functions have complete control over the page content while applets are confined in a sandbox. This is a very important feature because the aim of an interface agent is to interact with the user, so an agent with more freedom of action over the interface can perform its task more effectively.

¹⁴ http://www.mtasc.org/

B. The Interface Agent in the Overall Architecture

The interface agent starts its activity when a web page of the site is loaded into the client Web Browser. The first action performed by the agent is adding to every link of the page a parameter (called linkfrom) with the URL of the current page as value. This action makes it possible to identify the source page of every subsequent request. For example, assuming that the current page address is http://host/index.html, the link Events included in the page will be rewritten as Events, with the linkfrom parameter added to its target. Similarly, Events will be rewritten as Events.

The content of the page is dynamically changed at client-side through the JavaScript DOM (Document Object Model), so the original page on the server remains intact. The DOM allows scripts to dynamically access and update the content, structure and style of the current page. The document can be further processed and the results of that processing can be incorporated back into the presented page. The agent doesn’t update every link of the page, but only the HTTP links to the current site. So links to other sites, or links to an FTP repository or mail address, remain unchanged.

The next action performed by the interface agent is to call the tracker. If the current page is called with the linkfrom parameter, this parameter is passed to the tracker. The tracker uses this parameter to build the traces. For example, if the URL of the current page is events.html?linkfrom=index.html the user’s last page was index.html. The tracker can add a trace for the current user from index.html to events.html (or update an existing one). The tracker doesn’t perform this operation itself, instead it informs the user agent on the MMASS environment, which is responsible for adding the trace. Then the interface agent can [...] the user. Suggestions are in fact generated on the server-side and are published as an RSS¹⁵ (Really Simple Syndication) feed. The agent suggestion request is managed by the user agent (analogously as for traces). We chose RSS instead of a proprietary format because this allows foreign interface agents (other than our interface agent) to interact with the system.

Fig. 6. Interface Agent and Foreign Agent interaction with the MMASS user agent are performed through the Suggestion Servlet.

The interface agent loads the RSS by using the XMLHttpRequest¹⁶ class, which allows performing an asynchronous request to the web server hosting the current web page and storing the response in a local variable. The response could be an XML document or plain text. In the first case, XMLHttpRequest stores the retrieved data in a DOM-structured object, which can be navigated using the standard JavaScript DOM access methods and properties, such as getElementsByTagName() and childNodes[]. The following code is an example of using XMLHttpRequest to asynchronously request the server side page suggestions.jsp:

    var req = new XMLHttpRequest();
    req.onreadystatechange = processReqChange;
    req.open("GET", "suggestions.jsp", true);
    req.send(null);

In order to find out when the method has finished retrieving data, a specific event listener must be defined: in this case the method is processReqChange, reported in the following code snippet:

    function processReqChange() {
      // readyState 4: request completed; status 200: HTTP OK
      if ((req.readyState == 4) && (req.status == 200)) {
        // Gets the items from the XML document
        var xml = req.responseXML;
        var items = xml.getElementsByTagName("item");
        // Builds new suggestions
        var html = "";
        // Iterates by index: for...in would yield property names, not nodes
        for (var i = 0; i < items.length; i++) {
          var item = items[i];
          // getValue is a helper not shown here
          var title = getValue(item, "title");
          var link = getValue(item, "link");
          // Adds a link and a carriage return
          html += "<a href='" + link + "'>" + title + "</a>";
          html += "

This method of the interface agent parses the RSS document and displays the suggestions in a box in the web page. This operation is done by using DHTML: the agent searches for the suggestion box (sBox) in the DOM of the page (which is a tree representation of the page HTML source) and then replaces the content of the suggestion box with the freshly generated one. The latter is based on RSS suggestions: for each suggested page (represented as an item in the RSS) the Interface Agent adds a link to the page and uses the title of the page as label for the link. The following RSS is a suggestion example:

    Suggested contents for index.html
    http://example.com/index.html
    en
    Wed, 28 Jun 2006 02:28:19 +0200
    1

    Events
    http://example.com/events.html
    http://example.com/events.html
    75
    3

    [ ... more items ... ]

In this example, the first suggested element is the Events page, whose URL is http://example.com/events.html. The tags in the lintar namespace are our extension to the basic RSS: one tag identifies the number of users currently viewing the page and another represents the intensity of footprints on the page, in the spirit of [6]. Footprints are signs that one or more users have recently viewed the page. This information is also displayed by the interface agent in the suggestion box: the number of online users is displayed as a picture of a little red man and the presence of users’ traces is represented by a corresponding icon. The number of online users and the intensity of footprints are displayed in a tip box that is shown when the mouse arrow is over the picture. It must be noted that the interface agent does not just provide a “one shot” behaviour. In fact, when initialized, it sets a timeout for a cyclical invocation of its main execution cycle by the web browser. In this specific application, in particular, it is thus able to update and refresh the indication on the presence of other visitors and footprints on suggested pages. The overall cycle
"; of interaction between the interface agent and the back end of } the system is illustrated in Figure 6 and a screenshot of the // Replaces the content of the suggestions box document.getElementById("sBox").innerHTML = html; web page enriched by the interface agent is shown in Figure 7. } C. The Adaptation Strategy } Every MAS agent of the implemented system provide 15 http://www.rssboard.org/rss-specification personalized suggestions about items that user will find in- 16 http://www.w3.org/TR/XMLHttpRequest/ teresting, according to the history of the user and to the other 72 !"#$%&#'$(&$'#)*+,(-#.$/,.$"(&$/)0(*,& /'#$)/10%'#.$23$0"#$0'/)4#'$5"()"$(,$0%', 5(66$7*8#$0"#$'#6/0#.$/+#,0$/))*'.(,+639 !"(&$/'#/$(&$/./10#.$/))*'.(,+$0*$0"#$0'/)#&$0"/0 5#'#$1'#8(*%&63$+#,#'/0#.$23$/+#,0&$'#6/0#.$0*$0"(& %&#'&$/,.$*0"#'$8(&(0*'&9 !"#$:,0#';/)#$<+#,0$%1./0#&$0"(&$/'#/$#8#'3$;#5$&#)*,.&9 !"#$=;**01'(,0&=$7#/,&$&*7#*,#$'#)#,063$8(&(0$0"#$1/+#> 0"#$=6(006#$7/,=$7#/,&$/$%&#'$(&$8(&(0(,+$0"#$1/+#9 Fig. 7. A screenshot of a web page adapted according to gathered traces. users path. These suggesting links have relationship with the An example of page adaptation refers to the adoption of a previously introduced traces, which represent behaviors and recurrent trace leading from the index of the web site to a con- movements of a user in a web site: the strategy which is tent page, that is not directly connected to the index but that is adopted to select the most relevant traces to be presented to visited very frequently. This kind of “vertical”17 emerging link a given user considers the occurrence of trace generation and is frequently observed in the prototypal implemetation of the the success rate of the traces that were proposed. system, which is installed in a web site presenting information about a research laboratory as well as information on courses A first element of this strategy is adopted when new users held by members of the group18 . 
Since the number of students (or non authenticated ones) enter the site. In this case the user of some of these courses is very high, they frequently generate has no previous history (or it is not possible to correlate the traces connecting the index to the page related to those courses. user with his/her history), and the adopted strategy considers These traces represent effective shortcuts allowing to bypass all stored traces, not considering the user which generated intermediate index pages related to education activities and them. An additional information that is stored with traces university courses. However, emerging links can also connect is the number of times that the related trace was effectively pages deep in the site structure. For example, a page related selected and shown to a user and the number of times that to a project might not be explicitly connected to another page the related link was effectively exploited by a user. This kind describing a particular modeling approach adopted in that of information allows to obtain an indication of the success project, but a user might browse the web site and effectively rate of the suggestions that were chosen by the agent, and can discover that page, causing the generation by the system of a be exploited to select the traces to be shown in the adaptive correspondant trace connecting the project and the modeling block. When the agent has an indication of the user which approach. This trace might not be extremely relevant to all issued the request, it may focus the selection activity to those visitors of the web site, due to the fact that this navigation traces that compose the history of user’s activities in the path will probably be not very frequent, but if the visitor is a web site, in a web customization framework. 
In fact traces registered user the trace could be stored and suggested anyway, include an indication of the agent which generated them, and since a number of slots in the adaptive area of the page is in turn agents are related to registered users. Moreover, in reserved to user–generated emerging links. order to focus on a specific user’s history but do not waste This strategy for the exploitation of the gathered and stored the chance to exploit other users’ experiences, just two of the traces, based on users’ behaviours and movement in the web three available slots for emergent links are devoted to traces site environment, represents a very simple way of exploiting that were generated by that user and one is selected according this kind of information without requiring an off-line analysis to the strategy adopted for anonymous or new users. Because the time spent on a page had a strong correlation with explicit 17 Here vertical is intended as describing the typical navigation path starting interest [17], the adopted strategy uses this information to from an index page and going deeper into the web site. refine the proposed suggestions. 18 http://www.lintar.disco.unimib.it 73 of the logs generated by the web server. The design, imple- A different approach to web site adaptation provides the mentation and test of more complex strategies, for instance adoption of a learning network to model the evolution of based on details of the outcomes of emerging link proposals a distributed hypertext network, such as a web site [22]. (e.g. which user effectively followed the suggested adaptive Also in this case the adaptation provides a modification in hyperlink) are object of future works. the structure of a web site, and the concept of emergent link and the underlying mechanisms present a similarity with V. R ELATED W ORK the learning rules adopted for that kind of learning network. 
However that approach also provides a deep modification There are several different approaches and relevant ex- in the architecture of the site and modifications in the web periences in the area of web site adaptation, and some of protocols, while this work aims at providing a solution that them are also related to agent technologies. In particular, can be easily integrated with a traditional web architecture. a relevant approach provides the adoption of information Moreover, recent developments of that line of research were agents supporting users in their navigation [18]. These agents aimed at identifying analogies and relations among words by generally consider both the specific behaviour of the user and means of web mining [23], rather than realizing adaptive web the actions of other visitors, and adopt multiple strategies for systems. making recommendations (e.g. similarity, proximity, access The introduced system supporting web site adaptation seems frequency to specific documents). more similar to a recommendation system. A relevant type The Footprints system [6] instead provides a site optimiza- of recommender exploiting users’ behaviours to decide which tion through the metaphor of site visitors leaving traces in contents could be interesting for a certain visitor is represented their navigation. These signals accumulate in the environ- by the collaborative filter approach [24]. The latter has been ment, generating awareness information on the most frequently adopted in different recommendation systems, filtering mail visited areas of the web site. No user profile is needed, messages, newsgroup articles and web contents in general, as visitors are essentially provided this information which but typically requires users to rate these items. Moreover, could represent an indicator of the most interesting pages it generally provides a concept of explicit users descriptions to visit. 
The metaphor of the structure of the web site as through profiles which can be compared to determine similar- an environment on which visitors move in their search for ity among them. The idea is that contents that received a high information is very similar to the one on which the proposed rating by a certain user could be considered interesting by a framework is based, but we also propose the exploitation of similar user. The introduced system instead does not require an the gathered information on users’ paths for user specific explicit rating of contents, but it rather observes the frequency customization. Another interesting recent work [19] represents of specific navigation paths, and exploits emergent links for an attempt to integrate interaction mechanisms similar to the customization or optimization of site structure. However, the one adopted by Footprints, often referred to as stigmergic adaptive block of the page can include emerging links that are interaction mechanisms [20], and cognitive agents. This line not related to the specific visitor who is currently browsing of research could represent an interesting way to integrate the that page, but were generated by other users which frequently proposed approach, which is able to generate and manage followed paths that the current one still did not follow. awareness contextual information, with higher level mecha- From this point of view, the system provides a very basic nisms and strategies of adaptation. collaborative browsing scheme, but a more through analysis of Other approaches provide instead the generation of index a possible integration with this approach is object of current pages [3], that are pages containing links to other pages and future works. covering a specific topic. These pages, resulting from an analysis of access logs aimed at finding clusters grouping VI. 
C ONCLUSIONS AND F UTURE D EVELOPMENTS together pages related to a topic, are proposed to web masters This paper introduced a general framework providing the in a computer-assisted site optimization scheme. A differ- adoption of a web site as an environment on which agents ent approach provides the real-time generation of shortcut related to visitors move and possibly interact. This approach links [21], through a predictive model of web usage based allows the gathering of a structured form of information on on statistical techniques and the concept of expected saving users’ behaviours and activities in the web site. The concept of a shortcut, which considers both the probability that the of emerging links and traces have been introduced in order generated link will be effectively used and the amount of to support an application exploiting information on users’ effort saved (i.e. intermediate links to follow). In particular, browsing history for sake of web pages adaptation. The intro- this framework is very similar to the one proposed here with duced framework and the application to web site adaptation reference to the aims of the overall system, but it incorporates have been designed and implemented, exploiting a platform a complex algorithm for off-line analysis of logs, while the supporting systems based on the MMASS model. proposed approach provides a light and dynamic generation of A campaign of tests aimed at evaluating the effectiveness most probable useful links and the storage of these proposals of the adaptation approach, and also for sake of tuning and high level information on site usage for a possible further the involved parameters (e.g. timings, number of presented off-line analysis. possible emerging links) is under way. This evaluation will 74 be based on user interviews and also on the exploitation [9] R. C. Miller and K. 
Bharat, “Sphinx: a Framework for Creating Personal, of the gathered information of the success rate of proposed Site-specific Web Crawlers,” Computer Networks and ISDN Systems, vol. 30, no. 1–7, pp. 119–130, 1998. adaptive hyperlinks. Such an indicator might be obtained as a [10] D. Weyns, F. Michel, and H. V. D. Parunak, Eds., Environments for ratio between the number of times an emerging link has been Multi-Agent Systems, First International Workshop (E4MAS 2004), ser. actually selected by a user and the total number of times its has Lecture Notes in Artificial Intelligence, vol. 3374. Springer–Verlag, 2005. been shown. However, it must be noted that we currently do [11] D. Weyns and T. Holvoet, “Model for Simultaneous Actions in Situated not have an indication of threshold to discriminate successful Multi-Agent Systems,” in First International German Conference on suggestions from unsatisfactory ones; a further analysis of Multi-Agent System Technologies, MATES, ser. Lecture Notes in Com- puter Science, vol. 2831. Springer–Verlag, 2003, pp. 105–119. methods adopted to evaluate related approaches is currently [12] M. Mamei, F. Zambonelli, and L. Leonardi, “Co-fields: Towards a being carried out. The results of this evaluation might also lead Unifying Approach to the Engineering of Swarm Intelligent Systems,” to consider the modelling, design and implementation of more in Engineering Societies in the Agents World III: Third International Workshop (ESAW2002), ser. Lecture Notes in Artificial Intelligence, vol. complex trace selection strategies, and thus a more complex 2577. Springer–Verlag, 2002, pp. 68–81. behaviour for the interface agent. [13] K. Hadeli, P. Valckenaers, C. Zamfirescu, H. V. Brussel, B. S. Germain, Future works will be focused on the introduction and T. Hoelvoet, and E. 
Steegmans, “Self-organising in Multi-Agent Coor- dination and Control Using Stigmergy,” in Engineering Self-Organising exploitation of higher level semantic information related to Systems: Nature-Inspired Approaches to Software Engineering, ser. the site structure and contents, and thus agents’ environment, Lecture Notes in Computer Science, vol. 2977. Springer–Verlag, 2004, aimed at providing additional forms of adaptation, including pp. 105–123. [14] S. Bandini, S. Manzoni, and C. Simone, “Dealing with Space in Multi– images and multimedia contents. While in [25] an analysis Agent Systems: a Model for Situated MAS,” in Proceedings of the first on how a conceptual view on the topics may be used as international joint conference on Autonomous agents and multiagent an additional level of description of the environment, another systems. ACM Press, 2002, pp. 1183–1190. [15] S. Bandini, S. Manzoni, and G. Vizzari, “Towards a Platform for aspect that will be considered is the possibility to improve the Multilayered Multi Agent Situated System Based Simulations: Focusing effectiveness of web–based applications supporting processes on Field Diffusion,” Applied Artificial Intelligence, vol. 20, no. 4–5, pp. with adaptive functionalities. Finally, a further development 327–351, 2006.. [16] J. J. Garrett, “AJAX: a New Approach to Web Applications,” provides also the design and implementation of a prototype Adaptive Path Essay, Tech. Rep., 2005. [Online]. Available: supporting the context-aware interaction among web site vis- http://www.adaptivepath.com/publications/essays/archives/000385.php itors. In this framework, the environment related to the web [17] M. Claypool, P. Le, M. Waseda, and D. Brown, “Implicit Interest Indicators.” in Intelligent User Interfaces, 2001, pp. 33–40. site also supports the mutual perception of the agents situated [18] M. J. Pazzani and D. 
Billsus, “Adaptive Web Site Agents,” Autonomous in it and it also supports a form of interaction among them Agents and Multi-Agent Systems, vol. 5, no. 2, pp. 205–218, 2002. depending on their relative positions. The latter can be thus [19] A. Ricci, Omicini, M. Viroli, L. Gardelli, and E. Oliva, “Cognitive Stigmergy: a Framework Based on Agents and Artifacts,” in 3rd Inter- considered as a form of context–dependant interaction. A more national Workshop “Environments for Multi-Agent Systems” (E4MAS thorough analysis of the possible applications of this approach 2006), D. Weyns, H. V. D. Parunak, and F. Michel, Eds., 2006, pp. can be found in [25], and a prototypal implemetation of these 44–60. [20] G. Theraulaz and E. Bonabeau, “A Brief History of Stimergy,” Artificial interaction mechanisms is currently under way. Life, vol. 5, no. 2, pp. 97–116, 1999. [21] C. R. Anderson, P. Domingos, and D. S. Weld, “Adaptive Web R EFERENCES Navigation for Wireless Devices,” in Proceedings of the Seventeenth [1] A. S. Tanenbaum, Computer Networks - third edition. Prentice Hall, International Joint Conference on Artificial Intelligence (IJCAI 2001), 1996. 2001, pp. 879–884. [2] M. Perkowitz and O. Etzioni, “Adaptive Web Sites: an AI Challenge.” in [22] J. Bollen and F. Heylighen, “Algorithms for the Self-Organisation of Proceedings of the Fifteenth International Joint Conference on Artificial Distributed, Multi-User Networks. Possible Application to the Future Intelligence (IJCAI 1997), 1997, pp. 16–23. World Wide Web,” in Proceedings of the 13th European Meeting on [3] ——, “Adaptive Web Sites,” Communications of the ACM, vol. 43, no. 8, Cybernetics and Systems Research, R. Trappl, Ed. Austrian Society pp. 152–158, 2000. for Cybernetic Studies, 1996, pp. 911–916. [4] R. Cooley, “The Use of Web Structure and Content to Identify Subjec- [23] F. 
Heylighen, “Mining Associative Meanings from the Web: from tively Interesting Web Usage Patterns,” ACM Transactions on Internet Word Disambiguation to the Global Brain,” in Proceedings of the Technology, vol. 3, no. 2, pp. 93–116, 2003. International Colloquium: Trends in Special Language & Language [5] D. Weyns, H. V. D. Parunak, F. Michel, T. Holvoet, and J. Ferber, Technology, R. Temmerman and M. Lutjeharms, Eds. Standaard “Environments for Multiagent Systems State-of-the-art and Research Editions, Antwerpen, 2001, pp. 15–44. Challenges.” in Environments for Multi-Agent Systems, First Inter- [24] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl, “Grou- national Workshop (E4MAS 2004), ser. Lecture Notes in Computer plens: an Open Architecture for Collaborative Filtering of Netnews,” Science, vol. 3374. Springer–Verlag, 2005, pp. 1–47. in CSCW ’94: Proceedings of the 1994 ACM conference on Computer [6] A. Wexelblat and P. Maes, “Footprints: History-Rich Tools for Infor- supported cooperative work. ACM Press, 1994, pp. 175–186. mation Foraging,” in Proceedings of the SIGCHI conference on Human [25] S. Bandini, M. Sarini, C. Simone, and G. Vizzari, “WWW in the Small: factors in computing systems. ACM Press, 1999, pp. 270–277. Towards Sustainable Adaptivity,” World Wide Web Journal, 2006 (to [7] J. Liu, S. Zhang, and J. Yang, “Characterizing Web Usage Regularities appear). with Information Foraging Agents,” IEEE Transactions Knowledge and Data Engineering, vol. 16, no. 5, pp. 566–584, 2004. [8] D. Helbing, F. Schweitzer, J. Keltsch, and P. Molnár, “Active Walker Model for the Formation of Human and Animal Trail Systems,” Physical Review E, vol. 56, no. 3, pp. 2527–2539, January 1997. 75
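As an illustration of the trace selection described in the subsection on the adaptation strategy, the following is a minimal sketch and not the implemented system. It assumes a hypothetical trace record with the fields from, to, owner, count (how many times the trace was generated), shown and followed; none of these names comes from the paper. It also assumes that candidates are ordered by generation count with the success rate as a tie-breaker, since the text does not say how the two criteria are combined, and it leaves out the refinement based on the time spent on a page [17].

```javascript
// Hypothetical trace record (field names are assumptions, not taken from the paper):
// { from, to, owner, count, shown, followed }

// Success rate of a suggestion: times followed / times shown (0 if never shown).
function successRate(trace) {
  return trace.shown > 0 ? trace.followed / trace.shown : 0;
}

// Assumed ordering: most frequently generated first, success rate breaks ties.
function rank(traces) {
  return traces.slice().sort(function (a, b) {
    return (b.count - a.count) || (successRate(b) - successRate(a));
  });
}

// Returns up to `slots` target pages to propose on `page`.
// With a known user, all slots but one go to that user's own traces;
// the remaining slots are filled as for anonymous users (all stored traces).
function selectEmergingLinks(traces, page, userId, slots) {
  slots = slots || 3;
  var candidates = traces.filter(function (t) { return t.from === page; });
  var picked = [];
  var seen = Object.create(null);

  function take(list, limit) {
    list.forEach(function (t) {
      if (picked.length < limit && !seen[t.to]) {
        seen[t.to] = true;
        picked.push(t.to);
      }
    });
  }

  if (userId) {
    var own = candidates.filter(function (t) { return t.owner === userId; });
    take(rank(own), slots - 1);
  }
  take(rank(candidates), slots);
  return picked;
}

// Usage with made-up data
var exampleTraces = [
  { from: "index.html", to: "courses.html", owner: "u2", count: 9, shown: 10, followed: 7 },
  { from: "index.html", to: "events.html", owner: "u1", count: 5, shown: 10, followed: 4 },
  { from: "index.html", to: "people.html", owner: "u1", count: 2, shown: 4, followed: 1 },
  { from: "index.html", to: "news.html", owner: "u3", count: 7, shown: 2, followed: 1 }
];
console.log(selectEmergingLinks(exampleTraces, "index.html", null));
// -> [ 'courses.html', 'news.html', 'events.html' ]
console.log(selectEmergingLinks(exampleTraces, "index.html", "u1"));
// -> [ 'events.html', 'people.html', 'courses.html' ]
```

In this sketch a user with fewer than two traces of their own simply leaves more slots to the anonymous strategy; whether the implemented system behaves this way is not stated in the text.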