An Active Web-based Distributed Database System for E-Commerce

Hiroshi Ishikawa, Manabu Ohta
Tokyo Metropolitan University, Dept. of Electronics & Information Eng.

Abstract

EC business models like e-brokers on the Web use WWW-based distributed XML databases. To flexibly model such applications, we need a modeling language for EC businesses, specifically for their dynamic aspects, or business processes. To this end, we have adopted a query language approach to modeling, extended it by integrating active database functionality, and designed an active query language for WWW-based XML databases, called XBML. In this paper, we explain and validate the functionality of XBML by specifying e-broker and auction business models, and describe the implementation of the XBML server, focusing on distributed query processing in the WWW context.

1 Introduction

XML data are widely used in Web information systems and EC applications. In particular, e-broker [12] business models on the Internet, like Amazon.com, use a large amount of XML data such as product, customer, and order data. To model such applications flexibly and realize them agilely, we need a modeling language for EC businesses, in particular for business processes. To this end, we adopt a query language approach to modeling EC businesses, extended by integrating active database [8] functionality with it, and provide an active query language for the XML data central to business models, tentatively called XBML (Xml-based Business Modeling Language), by extending the earlier version [10]. As an active query language approach, it needs to maintain continuity with nonprocedural database standards such as SQL.

We justify the necessity of a modeling language for EC businesses as follows. First, the modeling language must be able to integrate the components by reducing their complexity and to make the integrated system understandable. Second, the modeling language must be able to do more than model EC businesses. Indeed, we can use XML as the interface of each component to make integration easy. However, this models only the static aspects of the components. We must be able to model the dynamic aspects of business models, that is, business processes. For example, the author of [4] discusses the necessity of modeling Web-based applications, although he takes an HTML/JavaScript approach in the context of extending UML. But this approach would, on the contrary, increase the complexity of modeling the business logic and the overhead of client-server interaction. Instead, we need a nonprocedural language approach to modeling business processes, with SQL as an analogous solution, although just applying SQL to XML is inadequate because relational and XML data have different data models. So we take a nonprocedural query language approach to modeling XML-based businesses. Further, we extend the query language approach by integrating ECA rules [8] with it for modeling the control flow of business processes. Thus, we call XBML an active query language approach to modeling EC businesses. At the same time, we make XBML efficiently executable on the server side to implement the business models agilely.

This paper does not propose a new query language for XML to be submitted to W3C, although XBML contains query functionality as a basic part for specifying business processes. XBML integrates the query facility with ECA rules for controlling business processes. In Section 2, we describe the functionality of XBML and validate its usability by specifying an example business model. In Section 3, we describe the current implementation of an XBML server, focusing on distributed query processing.

2 Approach

2.1 Database Schemas and Business Model

We use the following database schemas or DTD fragments for illustrating the functionality of XBML:

    [DTD fragments not preserved in this copy]

The following is a part of XML data conforming to the above DTD:

    [XML markup not preserved; surviving element values: Hiroshi, Ishikawa, L2, S210, Object-Oriented Database System, Springer Verlag, 69.00]

We take an ordered directed graph as the logical model for the active query language XBML as a modeling language of EC businesses. That is, the data model of XBML can be represented as data structures consisting of nodes (i.e., elements) and directed edges (i.e., containment, or parent-child relationships), which are ordered.

We also use e-broker business models based on XML data for describing the XBML functionality. Here we provide a working definition of EC business models in general. EC business models consist of business processes and revenue sources based on IT such as the Web and XML. We assume that e-broker business models acting on behalf of customers consist of at least the following business processes:

(1) The customer searches products by issuing either precisely- or approximately-conditioned queries against one or more suppliers and/or navigating through the related links.
(2) The customer takes recommendations from suppliers into account, if any.
(3) The customer compares and selects products and puts them into the shopping cart.
(4) The customer checks out by placing a purchase order with registration.
(5) The customer tracks the order to check the shipping status.

The revenue source in e-broker models is sales.

2.2 Business Model Specification

We have adapted the design of XBML to the requirements for supporting the EC business models discussed in the previous section.

(1) Searching Products

XBML provides the following functions for describing product search processes:

- XBML allows product search by selecting products based on their attributes, such as titles and authors, and constructing search results based on them.
- XBML allows ambiguous search by allowing partially-specified strings and path expressions.
- XBML supports data join, used in related search to promote cross-sell and up-sell.
- XBML configures search results by sorting and grouping them based on product attributes.
- XBML supports a “comparison model” of similar products by allowing a search to be bound to multiple shopping sites.
- XBML provides localized views (e.g., prices) of global products by introducing namespaces (i.e., contexts).

Note that we do not describe sorting, grouping, and namespaces here due to space limitations.

Data selection and construction

The basic function of XBML is to select arbitrary elements from XML data by specifying search conditions, which accommodates flexible product searches in e-broker business models. XBML allows any combination of retrieved elements to produce new element constructs for further services. The following query produces new elements consisting of the titles and authors of books that are published by Prentice-Hall, whose first author is Ullman, and whose year attribute is greater than “1995”:

(Query1)
    select result {$book.title, $book.author}
    from dlib URI “www.a.b.c/dlib.xml”, book $dlib.book
    where $book.publisher.name = “Prentice-Hall” and
          $book.author[0].lastname = “Ullman”
          and $book.@year gt “1995”

The basic unit of XBML is a path expression, that is, an element variable followed by a series of tag names, such as “$dlib.book”. The user must declare at least one element variable in a from-clause. In particular, the user can bind XML data specified by a URI as input to element variables such as dlib. Note that a URI in our context must have the form “www.x.y.z/d.xml”, not “www.x.y.z”. This declares the context in which an XBML query is evaluated. Element variables are referenced by prefixing “$” to them. The user specifies a selection condition in a where-clause. Element values are compared in alphabetical order. Compare operators include “=”, “!=”, “lt” for “<”, “le” for “<=”, “gt” for “>”, and “ge” for “>=”. XBML allows indexed access to ordered elements by specifying an index [i]. Attributes are referenced by prefixing “@” to them.

“{}” in a select-clause, enclosing elements delimited by “,”, creates new XML elements of a specified construct, such as author and title tags. The result of an XBML query is XML data, which can be queried in the same way as existing data. In our current design, the resultant XML data have no DTD, that is, they are well-formed XML data. For example, the result of the above query has the following structure, automatically wrapped by a tag “XBML:result”:

    [XML markup not preserved; surviving element values: A First Course in Database Systems, Jeff, Ullman, Gates Building, …]

Here, we define the basic syntax of XBML as follows:

    query = select target from context-list [where-clause] [orderby-clause] [groupby-clause]
    target = expression | tag ‘{’ expression-list ‘}’
    expression-list = expression ‘,’ expression-list | expression
    expression = [tag] ‘$’ variable | [tag] ‘$’ variable ‘.’ path
    path = ‘%’ | tag | ‘@’ attribute | path ‘.’ path | ‘(’ path ‘|’ path ‘)’ | text
    context-list = context ‘,’ context-list | context
    context = variable URI uri-list | variable expression
    uri-list = uri uri-list | uri
    where-clause = where condition
    condition = term | condition or term
    term = factor | term and factor
    factor = predicate | not predicate
    predicate = expression compare expression
    orderby-clause = orderby expression-list
    groupby-clause = groupby expression-list

Partially-specified path expression

XBML allows regular path expressions for flexibly retrieving products represented as slightly heterogeneous elements (i.e., semi-structured XML data), which depend on the data and suppliers in e-broker business models. Here we define semi-structured XML data as follows:

(1) Elements with the same tag are repeated zero or more times, depending on parent elements, such as authors of books.
(2) Elements with the same tag have variant sub-structures, depending on parent elements, such as offices of authors.

As these characteristics cannot be determined in advance, we allow partially-specified path expressions. The following query retrieves authors named Ishikawa of any material, such as books and articles:

(Query2)
    select result {$anyauthor}
    from dlib URI “www.a.b.c/dlib.xml”,
         anyauthor $dlib.%.author
    where $anyauthor.lastname = “Ishikawa”

Here “%” denotes a wild card in path expressions, which also allows approximate searches in e-broker business models. “$dlib.%.author” matches both “book.author” and “article.author”.

Data join

XBML joins different elements by comparing their values in a where-clause. The following query joins books and articles within the same XML data, using authors as the join key:

(Query3)
    select result {$article, $book}
    from dlib URI “www.a.b.c/dlib.xml”, article $dlib.article,
         book $dlib.book
    where $book.author.firstname = $article.author.firstname
          and $book.author.lastname = $article.author.lastname
          and $book.title = “%Electronic Commerce%”

In e-broker business models, this helps increase cross-sell and up-sell. Here the customers can do approximate searches over XML data by using the wild card “%” in strings, that is, partially-specified strings, as is often the case with search in e-broker business models. The query result has the following structure:
    …

Multiple binding

The user can have universal access to multiple data sources by binding a single element variable to multiple URIs (i.e., a URI list) in a from-clause. The following example retrieves books by the same author from two online bookstores (both bound to dlib) with a single query:

(Query4)
    select result {$book.title, $book.author}
    from dlib URI “www.a.b.c/dlib.xml” “www.x.y.z/dlib.xml”,
         book $dlib.book
    where $book.author.lastname = “Ishikawa”

The user needs to declare partially-specified path expressions to accommodate the heterogeneity of data sources. This function is necessary for comparing similar products or searching for the lowest price across multiple stores.

(2) Recommendation

Related search as a recommendation process is crucial in promoting cross-sell and up-sell. It is classified into three categories according to the extent to which the customer in session is involved.

(1) Non-personalized recommendation
The customer is not involved. The e-broker recommends some products as general trends, independently of the customer. Alternatively, the e-broker shows the customer products highly rated by other customers.
(2) Personalized recommendation
Only the customer is involved. The e-broker recommends some products based on the customer’s psycho-graphic data, such as interests, or historical data, such as purchase records.
(3) Collaboratively filtered recommendation [17]
Both the customer and other customers are involved. The e-broker recommends products purchased by those customers who purchased the products selected by the customer.

The facility for function definition and the query transformation technique play an important role in recommendation, as follows.

Function definition

Functions correspond to “parameterized views”. Functions modularize recurring queries in EC business models to increase their reuse. The user defines a function by specifying an XBML query in its body. The syntax has the following form:

    function-definition = function name ‘(’ parameter-list ‘)’ as ‘(’ query ‘)’
    parameter-list = parameter ‘,’ parameter-list | parameter

As a personalized recommendation, the following function recommends products based on the keywords that the customer (specified by the identifier customerid) has registered in advance as psycho-graphic data:

    function personalized-Recommendation (customerid) as
    (select result {$book.title, $book.price}
     from dlib URI “www.a.b.c/dlib.xml”, book $dlib.book,
          r URI “www.a.b.c/registration.xml”, customer $r.register.customer
     where $book.keyword = $customer.keyword and $customer.id = customerid)

The next example, in the collaboratively filtered recommendation category, recommends products based on the fact that other customers purchased the product selected by the customer (indicated by selected):

    function collaboratively-filtered-Recommendation (selected) as
    (select result {$book.title, $book.price}
     from dlib URI “www.a.b.c/dlib.xml”, book $dlib.book,
          r URI “www.a.b.c/registration.xml”, customer $r.register.customer
     where $book = $customer.purchased and
           $customer.purchased = selected)

Query transformation

Until now, we have treated recommendation and search as separate processes. However, when the customer specifies search keywords, the search result can be expanded to include recommended products by transforming the original search query. Query transformation is classified into two rules as follows:

(1) Keyword addition rule
This rule has the general form:

    keyword1 ==> keyword1 | keyword2

For example, the originally specified keyword “Electronic Commerce” adds a new keyword “Internet Business”, and the disjunctive condition is added to the end of the query as follows:

(Query5)
    select result {$book}
    from dlib URI “www.a.b.c/dlib.xml”, book $dlib.book
    where $book.keyword = “Electronic Commerce” or
          $book.keyword = “Internet Business”

This technique is similar to query expansion [3] used in information retrieval. Note that this type of transformation keeps data sources unchanged.

(2) Data source addition rule
This rule uses set operations on queries to modify the original one. The rule has the following general form:

    query1 ==> query1 set-operator query2

Here set-operator includes union, intersection, and difference. For example, when the customer searches books on EC, he will search articles on EC at the same time by combining the original query with a second query through union, as follows:

(Query6)
    select result {$book}
    from dlib URI “www.a.b.c/dlib.xml”, book $dlib.book
    where $book.keyword = “Electronic Commerce”
    union
    select result {$article}
    from dlib URI “www.a.b.c/dlib.xml”, article $dlib.article
    where $article.keyword = “Electronic Commerce”

We analyze application-based Web access patterns [5] to create the transformation rules; this is not discussed here due to space limitations.

(3) Moving to Carts

In general, EC business models involve temporary data valid only within sessions, such as search results and shopping carts, as well as permanent data, such as books and customers. XBML handles such temporary data as first-class citizens.

Use of query results

XBML allows a query against intermediate query results as well. The customer checks items in the result of a product search or recommendation in order to place an order. The following XBML query moves only the customer-checked items in the search result to the shopping cart:

(Query7)
    select cart {item $result.book}
    from XBML:result URI “www.a.b.c/XBML:result.xml”,
         result $XBML:result.result
    where $result.checked = “yes”

(4) Placing Orders

Selected items in the shopping cart remain to be added to ordering databases. Thus, addition of new elements is a mandatory function for constructing practical e-broker models. Adding new elements often requires making them unique by invoking a dedicated function defined in a programming language such as Java. To this end, XBML also allows function invocation in a query.

Insertion and function invocation

We provide the syntax for insertion by using an XBML query as follows:

    insertion = insert into target query

The following query places a purchase order in e-broker business models by consulting the current shopping cart and customer data and invoking a function:

(Query8)
    insert into $order
    select order {@id = OrderID($customer.id, date()),
                  item $cart.item}
    from r URI “www.a.b.c/registration.xml”,
         customer $r.register.customer,
         XBML:result URI “www.a.b.c/XBML:result.xml”,
         cart $XBML:result.cart, o URI “www.a.b.c/ordering.xml”,
         order $o.order
    where $customer.lastname = “Kanemasa”

Here, in the select-clause, the function call “OrderID($customer.id, date())” generates a unique order number. Ordering initiates internal processes, such as payment and shipment, which are hidden from the customers. Please note that “$order” in the into-clause is permanent in “www.a.b.c/ordering.xml”, while “order” in the select-clause is temporarily constructed in this query.

(5) Tracking Orders

Ordering and shipping constitute a supply chain in EC business models. Further, shipping is often outsourced. Thus, the involved data are managed at separate sites, whether on an intranet or on an extranet. To this end, XBML allows data join across different sites in addition to join within one site.

Join of data from multiple data sources

The user can join heterogeneous XML data from different data sources indicated by different URIs. In e-broker business models, the following query produces a set of ordered items and their shipping status by joining order entry data and order shipping data on order identifiers. The data reside at different sites, indicated by separate variables bound to different URIs, such as o and s:

(Query9)
    select result {$order.item, $ship.status}
    from o URI “www.a.b.c/ordering.xml”, s URI “www.d.e.f/shipping.xml”,
         order $o.order, ship $s.ship
    where $order.id = $ship.id and $order.id = “cidymd”

In general, there are two approaches to resolving heterogeneity in the schemas of different databases: schema translation based on ontologies and schema relaxation based on query facilities. XBML takes the latter approach. That is, XBML uses regular path expressions and element variables to enable the user to retrieve multiple databases with heterogeneous schemas by a single query, because regular path expressions can match more than one path and element variables can be bound to more than one path. Further, we allow well-formed XML data containing a set of heterogeneous elements as a query result. Of course, we admit that a simple solution to schema translation between heterogeneous DTDs is based on XSL (i.e., XSL Transformations).

2.3 Applicability to Other Models and Extension

In the previous subsection, we discussed the applicability of XBML to e-broker models. Now we ascertain its applicability to business models other than the e-broker model. Indeed, there are rather novel EC business models, such as the reverse auction model. However, new business models are often created by mutation of the business processes of existing models. We take the auction model [12] as an example. The auction model consists of the following processes:

The selling customer registers auction items.
(1) The buying customer searches auction items.
(2) The buying customer takes recommendations into account, if any.
(3) The buying customer bids.
(4) The winning customer checks out by placing a purchase order with registration.
(5) The winning customer tracks the order to check the shipping status.

We can observe similarity between the auction model and the e-broker model. Registration of auction items by sellers corresponds to registration of products by suppliers, which is only implicit in the e-broker model. Searching and recommendation of auction items are very close to those of products in the e-broker model. Bidding is indeed a new process, but it can be viewed as a series of tentative orders until the buying customer wins the auction. In other words, the event that the customer wins the auction moves auction items to the shopping cart. The winner’s placing of a purchase order is very close to that in the e-broker model. Order tracking in the auction model is analogous to that in the e-broker model, although it may require a new business model, such as e-escrow, to guarantee the bargain contract. The revenue source in the auction model is a part of the contract price, taken as fees. Thus, we would say that XBML can be applied to the auction model as well.

However, it is also true that some way of controlling business processes, or modeling events, is necessary. The auction model requires triggering business processes at a specified time or on database events such as insert. Active databases, or ECA (Event-Condition-Action) rules [8], can specify such event-driven business processes more elegantly than procedural programming languages combined with the current version of XBML. Therefore, we extend current XBML by introducing the following construct for ECA rules:

    on event if condition then action

Events include operations of XBML (e.g., select and insert) and a specified time. Conditions are specified as conditions of XBML. Actions are also specified by XBML.

For example, consider the following situation: when the highest bidding price of the auction specified by id1 is updated, if the current time is before the closing time of the auction, then the auctioneer specified by id2 increases his bid by a specified value value3. The corresponding ECA rule can be specified as follows:

    on insert into $auction.price
    if now() lt $auction.closing-time
    then insert into $auction.auctioneer.price
         select increase($auction.auctioneer.price, “value3”)
         from actn URI “www.a.b.c/actn”,
              auction $actn.auction
         where $auction.id = “id1”
               and $auction.auctioneer.id = “id2”

Here, now() returns the current time and increase(var, val) increments the variable var by a value val.
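As a rough illustration of the "on event if condition then action" construct, the following Python sketch dispatches events to registered rules and runs an action only when its condition holds. It is not the XBML server's mechanism: the names EcaRule, RuleEngine, and signal, the event-type string, and the toy in-memory auction record are all assumptions made for this example, and the action deliberately does not raise further events, so cascading rules are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EcaRule:
    """One 'on event if condition then action' rule (hypothetical structure)."""
    event_type: str                     # the "on" part, e.g. an insert on a path
    condition: Callable[[Dict], bool]   # the "if" part
    action: Callable[[Dict], None]      # the "then" part


class RuleEngine:
    """Keeps rules defined in advance and invokes them when events are signaled."""

    def __init__(self) -> None:
        self.rules: List[EcaRule] = []

    def register(self, rule: EcaRule) -> None:
        self.rules.append(rule)

    def signal(self, event_type: str, event: Dict) -> int:
        """Run every matching rule whose condition holds; return how many fired."""
        fired = 0
        for rule in self.rules:
            if rule.event_type == event_type and rule.condition(event):
                rule.action(event)
                fired += 1
        return fired


# Toy stand-in for the auction data (assumed layout, not the paper's schema).
auction = {"id": "id1", "closing_time": 100, "price": 50, "bids": {"id2": 40}}


def before_closing(event: Dict) -> bool:
    # Mirrors "now() lt $auction.closing-time".
    return event["now"] < auction["closing_time"]


def raise_bid(event: Dict) -> None:
    # Mirrors increase(var, val): bump the bid of participant "id2".
    auction["bids"]["id2"] += event["increment"]


engine = RuleEngine()
engine.register(EcaRule("insert:auction.price", before_closing, raise_bid))

# A price update before the closing time fires the rule; one after it does not.
fired_open = engine.signal("insert:auction.price", {"now": 90, "increment": 5})
fired_closed = engine.signal("insert:auction.price", {"now": 120, "increment": 5})
```

In this sketch the condition is checked at the moment the event is signaled, which corresponds loosely to evaluating the rule as part of processing the triggering operation.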
The ECA rules are defined in advance and invoked on events. ECA rules can also elegantly implement recommendation (e.g., Query5):

    on select result {$book}
    if $book.keyword = “Electronic Commerce”
    then select result {$book}
         from dlib URI “www.a.b.c/dlib.xml”,
              book $dlib.book
         where $book.keyword = “Electronic Commerce” or
               $book.keyword = “Internet Business”

Note that in this case the result of the “event query” (i.e., the first “select”) is replaced by that of the “action query” (i.e., the second “select”).

3 Implementation

XBML is intended not only for modeling EC business models, but also for realizing them agilely. XBML must therefore be efficiently implemented, too. XBML queries containing URIs intrinsically require distributed query processing. So we construct the XBML server as follows:

(1) We construct local XBML servers as a basis.
(2) We construct global XBML servers by extending the local servers with server-side scripting techniques.

3.1 Local Server

We describe the basic architecture and implementation of a local XBML server. First, we describe the storage schema for XML data. We have explored approaches to mapping DTDs to databases (an RDBMS, i.e., Oracle, and an ODBMS, i.e., Jasmine [9]) and to implementing an XBML processing system [11]. If any DTD or schema information is available, we basically map elements to tables and tags to fields, respectively. We call this approach DTD-dependent mapping, where the user must specify mapping rules individually. Otherwise, we take a DTD-independent mapping, or universal mapping, approach, which divides XML data into the nodes and edges of an ordered directed graph and stores them in separate tables for nodes and edges, with neighboring data physically clustered. We provide separate tables for nonleaf and leaf nodes. The order fields of the Leaf_Node and Edge tables are necessary for providing access to ordered elements by index numbers. Identifiers, such as ID and IDREF, which realize internal links between elements, are declared as attributes and are stored as Value in the separate Attribute_Node table. So references through identifiers are efficiently resolved by searching node identifiers in Attribute_Node.

We cluster data in the node and edge tables on a breadth-first tree search basis. We have found that this way of clustering contributes greatly to reducing I/O cost. Further, our preliminary experiments indicate that the DTD-dependent mapping approach is roughly two times more efficient than the universal one. However, we have focused more of our implementation efforts on the universal mapping approach for the following reasons:

(1) The approach frees the users from the burden of defining idiosyncratic mappings.
(2) The approach can store XML data whose DTDs are unknown in advance.
(3) The approach can store heterogeneous XML data, in particular semi-structured XML data, in the same database.

Next, we describe the system architecture of a local XBML server, or XBML processing system. We build appropriate indices on tag values, element-subelement relationships, and tag paths in advance.

We describe how the XBML processing system works. The XBML language processor parses an XBML query, and the XBML query processor generates and optimizes a sequence of access methods for efficient execution. The primitive access methods are basic operations on node sets, implemented by using an RDBMS or ODBMS. They include get_NodeId_by_Path&Val, get_ParentId_by_Child, get_ChildId_by_Parent, get_Value_by_Id, get_NodeId_by_Path, and get_LabelId_by_LabelText, in addition to node set operators such as union, intersection, and difference. We illustrate the translation by using the query:

    select $book.title
    from dlib URI “www.a.b.c/dlib.xml”, book $dlib.book
    where $book.publisher.name = “Prentice-Hall”

This is parsed into an internal form, which denotes a logical query plan represented as an ordered graph:

    (Proj (Sel $book (Op_EQ $book.publisher.name “Prentice-Hall”)) $book.title)

Here, Sel, Proj, and Join (not in the above example) denote selection, projection, and join of XML data, respectively. Op_EQ denotes “=”. This internal form is reorganized in a pattern-directed manner, such as placing Sel before Join, and is transformed into the following primitive operations:

(1) get_NodeId_by_Path&Val (Op_EQ “$book.publisher.name” “Prentice-Hall”) returns a node set set1 (i.e., $book.publisher).
(2) get_ParentId_by_Child (set1 “$book”) returns a node set set2.
(3) get_ChildId_by_Parent (set2 “$book.title”) returns a node set set3.
(4) get_Value_by_Id (set3) returns a value set as the result.

By virtue of the above primitive operators, both an RDBMS and an ODBMS can be used as the database system of the XBML processing system with the upper layers unchanged.

We describe the implementation of ECA rules. First, we define an event query by using the event and the condition in a rule, and define an action query by using the action in the rule. Further, we maintain a dedicated ECA rule database, each entry of which consists of a pair of such an event query and an action query. Now we consider the following approaches to ECA rules:

(1) Monitor-based approach
The monitor checks each ordinary query against the event query patterns in the ECA rule database and issues the action query corresponding to the matched event query if the condition is satisfied. The monitor usually keeps a queue of events generated by matched event queries and invokes the action queries by looking them up in the queue.
(2) Query rewriting approach
We modify the query processing. The parser checks each query against the event query patterns in the ECA rule database and recursively processes the action query corresponding to the matched event query, adding a check on the condition. That is, the “event query” and “action query” in the rules are translated into a sequence of queries (i.e., primitive access methods) with a condition check.

Next we consider the merits and demerits of the above approaches:

(1) Monitor-based approach
The monitor can control all the processes in a centralized manner. However, we need the monitor itself as an extra mechanism. It is not trivial to provide the facility for executing the event and action queries as a single transaction.
(2) Query rewriting approach
We need no extra mechanism for controlling processes. It is rather straightforward to execute the event and action queries as a single transaction. However, the generated queries tend to be long, in particular for cascading events.

For the moment, we adopt the query rewriting approach because it is easier to implement.

3.2 Global Server

Now we construct the global XBML server by extending the above local XBML servers with server-side scripting techniques. We provide preliminary definitions of queries. First, we categorize queries as follows:

(A) Single-URI query
This type of query contains only one XML data source, specified by a single URI in the query, such as Query1 (selection) and Query3 (join).
(B) Multiple-URI query
This type of query contains multiple XML data sources, specified by multiple URIs in the query. This type is further categorized into two as follows:
    (B1) Decomposable query
    This type of query can be decomposed into a combination of single-URI queries with set operators, such as Query4 (multiple binding) and Query6 (set operators).
    (B2) Non-decomposable query
    This type of query cannot be decomposed into a combination of single-URI queries alone. This type of query contains join queries over multiple URIs, such as Query9 (join of multiple data sources).

Second, we categorize queries in another way:

(a) Local query
The XML data sources specified by URI are inside the relevant XBML server.
(b) Global query
The XML data sources specified by URI are outside the relevant XBML server.

Now we show that a non-decomposable (i.e., intrinsically global) query can be transformed into a series of single-URI local or global queries followed by local join queries. We assume that the original query contains n URIs. We translate a non-decomposable query in two steps:

(1) Create a single-URI (local or global) query for each of the n URIs, with the insertion of the query result into the local server.
(2) Create single-URI queries performing the join of the results stored in the local server, which are local queries, by reducing all URIs to a single URI.

Queries generated by step (1) localize single-URI global queries. Of course, single-URI local queries remain local. We call them localized single-URI queries. After that, queries generated by step (2) simulate the join of multiple data sources by a join of local data sources. We call them localized join queries. For example, assuming that the global server is resident at the “shipping site”, consider again the following query (Query9):

    select result {$order.item, $ship.status}
    from o URI “www.a.b.c/ordering.xml”, s URI “www.d.e.f/shipping.xml”,
         order $o.order, ship $s.ship
    where $order.id = $ship.id and $order.id = “cidymd”

The query is translated into the following localized single-URI query, whose result is fetched into the global server:

    select $order
    from o URI “www.a.b.c/ordering.xml”, order $o.order
    where $order.id = “cidymd”

and into the following localized join query, which produces the result of the original query:

    select result {$order.item, $ship.status}
    from XBML:result URI “www.d.e.f/XBML:result.xml”,
         order $XBML:result.order, s URI “www.d.e.f/shipping.xml”,
         ship $s.ship
    where $order.id = $ship.id

Now we describe the global query processing, assuming that a query Q with a URI is specified as the input:

    if Q is a single-URI query then
        process-or-dispatch (Q);
    else { /* i.e., Q is a multiple-URI query */
        if Q is a decomposable query then {
            for each sub-query Qsub in Q
                process-or-dispatch (Qsub);
            merge the result by the local server;
        }
        else { /* i.e., Q is not a decomposable query */
            decompose Q into localized single-URI queries Qloc-s
                and localized join queries Qloc-j;
            for each sub-query Qsub in Qloc-s
                process-or-dispatch (Qsub);
            process Qloc-j by the local server;
        }
    }

    process-or-dispatch (Q) /* for single-URI query */
    {
        if URI is local to the server then
            process Q by the local server;
        else { /* i.e., Q is not local to the server */
            dispatch Q to the relevant remote server;
            store the result into the local server;
        }
    }

The above query processing has some room for improvement in performance. If the non-decomposable query has no selection conditions, the whole of each remote data source specified by the generated single-URI queries must be copied to the local server. For example, consider Query9 when the global server is resident at the “ordering site” or at a third site. Of course, if there is any selection condition on the join key, the condition is propagated to all the single-URI queries. We call this technique simple selection condition propagation. It is a kind of static query rewriting.

However, we want further improvement. So we refine the process-or-dispatch scheme to sort the result of the query and return the value range of the join key (i.e., the MIN and MAX values) by adding “order-by” to the query. Then, the conditions “join-key ge min-value and join-key le max-value” are dynamically added to the subsequently generated single-URI query. In turn, that query is evaluated to produce a new value range of the join key (i.e., min-value’ and max-value’). The following property holds: min-value’ >= min-value and max-value’ <= max-value. From this, we can conclude that the expected selectivity is better than that of the original algorithm. If a single-URI query Q has selection conditions “key_Q ge min-value_Q” and “key_Q le max-value_Q”, then we take MAX_Q(min-value_Q) and MIN_Q(max-value_Q) as the initial min-value and max-value, respectively. The single-URI query to be processed first is chosen from those with a selection condition on a non-key element, because all the single-URI queries now effectively have the same condition on the key (i.e., the initial value range). If there is no selection condition, whichever local query is processed first will produce the initial value range. This refinement has two merits: it avoids extra data transfer by just issuing modified queries, and it avoids an extra protocol by just accommodating the min/max values in the results.

XBML works as server-side scripting with database access, like CFML [2] and ASP [15], and provides universal access to distributed XML data. If XBML queries are embedded in XML-based scripts, the global XBML server can provide more direct and universal interfaces for representing and accessing distributed XML data than the other approaches. That is, XML pages containing the element XBML-query are interpreted as scripts.

4 Conclusion

We have proposed and validated XBML as an XML active query language approach to specifying EC business models. We compare our work with related work. There are no high-level language approaches to modeling EC business processes, in particular, no other work on validating the modeling language by applying it to EC business models.

… businesses. We extend the query optimization in relational databases [14] to the distributed context.

References

1 Abiteboul, S. et al.: Active Views for Electronic Commerce, Proc. Intl. Conf. VLDB 1999
2 Allaire Corporation: CFML,
XBML can http://www.allaire.com/documents/cf4/CFML_Langua ge_Reference/contents.htm, 2000 provide a more direct and universal tool for modeling 3 Chang, C.H., et al.: Enabling Concept-Based distributed XML data applications than server-side Relevance Feedback for Information Retrieval on the scripting tools such as CFML [2], ASP [15]. WWW, IEEE Trans. Knowledge and Data Eng., Now we will compare our XBML with other query vol.11, no. 4, pp.595-609, 1999 language proposals from the viewpoint of process 4 Conallen, J.: Modeling Web Application Architectures specification since XBML contains the query with UML, Comm. ACM, vol.42, no.10, pp.63-70, language functionality as a basic part. XML-QL [6] 1999 has comprehensive functionality and has much in 5 Cooley, R., et al.: Web Mining: Information and common with our XBML. However, condition Pattern Discovery on the World Wide Web, Proc. the specification in XML-QL is rather verbose. If applied 9th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'97), 1997. to business modeling, XML-QL would make query 6 Deutsch, A., et al.: XML-QL: A Query Language for formation rather complex. XML, XQL [16] has compactly-specified functionality http://www.w3.org/TR/1998/NOTE-xml-ql-19980819, and has common functionality with our XBML. XQL 1998 focuses more on filtering a single XML document by 7 Goldman, R., McHugh, J., and Widom, J.: From flexible pattern match conditions similar to XSL. If Semistructured Data to XML: Migrating the Lore Data applied to specifying EC business models involving Model and Query Language, Proc. the 2nd Intl. multiple sites, XQL would require the user to write Workshop on the Web and Databases (WebDB '99), extra application logic in addition to query formation. 1999. Lore [7] provides a powerful query language for 8 Ishikawa, H., et al.: An Active Object-Oriented Database: A Multi-Paradigm Approach to Constraint retrieving and updating semi-structured data based on Management, Proc. 
Intl. Conf. VLDB, 1993 its specific data model OEM, but it lacks some 9 Ishikawa, H., et al.: An Object-Oriented Database functionality such as multiple binding. System Jasmine: Implementation, Application, and So far we have compared XBML with the other Extension, IEEE Trans. Knowledge and Data works only from the viewpoint of query languages. Engineering, vol. 8, no. 2, pp.285-304,1996 However, the above languages are largely different 10 Ishikawa, H., et al.: from XBML for the following reasons. First, we http://www.w3.org/TandS/QL/QL98/pp/flab.doc, 1998 focus our efforts on the distributed query processing 11 Ishikawa, H., et al.: Document Warehousing Based on in the Web context. However, the above works don’t a Multimedia Database System, Proc. IEEE 15th Intl. cover such a topic. Second, we think that the Conference on Data Engineering, pp.168-173, 1999 12 Jutla, D., et al.: Making Business Sense of Electronic functionality of ECA rules is mandatory in order to Commerce, IEEE Computer, pp.67-75, Mar. 1999 model control flow of E-businesses. However, all of 13 Konopnicki, D. et al.: W3QS: A Query System for the the above query languages lack ECA rules. World-Wide Web. Proc. Intl. Conf. VLDB, 1995 Web query languages, such as W3QL [13], view 14 Makinouchi, A. et al.: The Optimization Strategy for the Web as a single huge database and enable to Query Evaluation in RDB/V1, Proc. Intl. Conf. VLDB, address the structures and contents. XBML views a 1981 single Web source as a database and allows queries 15 Microsoft: ASP, http://www.activeserverpages.com, over Web-based distributed databases. 2000 The active views [1] focuse on the comprehensive 16 Robie, J., et al.: XML Query Language (XQL), functionality of ECA rules. On the other hand, we http://www.w3.org/TandS/QL/QL98/pp/xql.html, 1998 have concluded the necessity of ECA rules from the 17 Special Section: Recommender Systems, Comm. experiences of applying XBML to concrete ACM, vol.40, no.3, pp.56-89, 1997 36
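
The dynamic value-range propagation described in Section 3 can be illustrated with a minimal Python sketch. It is not the XBML server implementation: in-memory lists stand in for the remote XML sources, the names `single_uri_query` and `join_with_range_propagation` are hypothetical, join keys are assumed to be integers, and each shipping record is assumed to have a unique id. The first single-URI query carries a selection condition on a non-key attribute; its sorted result yields the (min-value, max-value) range that is added to the next single-URI query before the localized join runs.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Record = Dict[str, Any]
Range = Optional[Tuple[int, int]]  # (min-value, max-value) of the join key

# Hypothetical stand-ins for the ordering and shipping data sources.
ORDERS: List[Record] = [
    {"id": 3, "item": "book"},
    {"id": 5, "item": "book"},
    {"id": 9, "item": "pen"},
]
SHIPS: List[Record] = [
    {"id": 1, "status": "sent"},
    {"id": 3, "status": "sent"},
    {"id": 4, "status": "held"},
    {"id": 5, "status": "packed"},
    {"id": 8, "status": "sent"},
]


def single_uri_query(source: List[Record], key: str, rng: Range,
                     cond: Callable[[Record], bool] = lambda r: True
                     ) -> Tuple[List[Record], Range]:
    """Evaluate one single-URI query with the propagated range added.

    Returns the matching records sorted on the join key (the "order-by")
    and the value range observed in that result (None if it is empty).
    """
    rows = [r for r in source
            if cond(r) and (rng is None or rng[0] <= r[key] <= rng[1])]
    rows.sort(key=lambda r: r[key])
    new_rng = (rows[0][key], rows[-1][key]) if rows else None
    return rows, new_rng


def join_with_range_propagation(orders: List[Record], ships: List[Record],
                                item: str) -> Tuple[List[Record], int]:
    """Join orders and ships on id; also report how many ship rows were fetched."""
    # Process first the query that has a condition on a non-key attribute.
    order_rows, rng = single_uri_query(
        orders, "id", None, lambda r: r["item"] == item)
    if rng is None:
        return [], 0  # empty first result: the join is empty, nothing to fetch
    # The range from the first result restricts the next single-URI query.
    ship_rows, new_rng = single_uri_query(ships, "id", rng)
    # The new range can only shrink: min' >= min and max' <= max.
    assert new_rng is None or (rng[0] <= new_rng[0] and new_rng[1] <= rng[1])
    # Localized join over the results now held by the local server.
    status = {s["id"]: s["status"] for s in ship_rows}
    result = [{"item": o["item"], "status": status[o["id"]]}
              for o in order_rows if o["id"] in status]
    return result, len(ship_rows)
```

With the sample data, calling `join_with_range_propagation(ORDERS, SHIPS, "book")` fetches only the shipping records whose id lies in the range produced by the first query rather than the whole shipping source, which is the data-transfer saving the refinement aims at.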