An Active Web-based Distributed Database System for E-Commerce
Hiroshi Ishikawa Manabu Ohta
Tokyo Metropolitan University, Dept. of Electronics & Information Eng.
Abstract EC business models like e-brokers on the Web use WWW-based distributed XML databases. To flexibly model
such applications, we need a modeling language for EC businesses, specifically, its dynamic aspects or business processes.
To this end, we have adopted a query language approach to modeling, extended by integrating active database functionality
with it, and have designed an active query language for WWW-based XML databases, called XBML. In this paper, we
explain and validate the functionality of XBML by specifying e-broker and auction business models and describe the
implementation of the XBML server, focusing on the distributed query processing in the WWW context.
1 Introduction
XML data are widely used in Web information systems and EC applications. In particular, e-broker
[12] business models on the Internet, like Amazon.com, use a large amount of XML data such as
product, customer, and order data. In order both to flexibly model and agilely realize such
applications, we need a modeling language for EC businesses, in particular, for business processes.
To this end, we adopt a query language approach to modeling EC businesses, extended by integrating
active database [8] functionality with it, and provide an active query language for the XML data
that are central to business models, tentatively called XBML (Xml-based Business Modeling
Language), by extending the earlier version [10]. As an active query language approach, we need to
consider its continuity with nonprocedural database standards such as SQL.
We rationalize the necessity of a modeling language for EC businesses. First, the modeling
language must be able to integrate the components by reducing their complexity and to make the
integrated system understandable. Second, the modeling language must be able to do more than model
EC businesses. Indeed, we can use XML as interfaces of each component to make integration easy.
However, this models only the static aspects of the components. We must be able to model the
dynamic aspects of business models, that is, business processes. For example, the author of [4]
discusses the necessity of modeling Web-based applications, although he takes an HTML/JavaScript
approach in the context of extending UML. But this approach would, on the contrary, increase the
complexity of modeling the business logic and the overhead of the client-server interaction.
Instead, we need a nonprocedural language approach to modeling the business processes, with SQL as
an analogous solution, although just applying SQL to XML is inadequate because RDB and XML data
have different data models. So we take a nonprocedural query language approach to modeling
XML-based businesses. Further, we extend the query language approach by integrating ECA rules [8]
with it for modeling control flow of the business processes. Thus, we call XBML an active query
language approach to modeling EC businesses. At the same time, we make XBML efficiently executable
on the server side to agilely implement the business models.
This paper does not propose a new query language for XML to be submitted to W3C, although XBML
contains the query functionality as a basic part for specifying business processes. XBML
integrates the query facility with ECA rules for controlling business processes. We describe the
functionality of XBML and validate its usability by specifying the example business model with
XBML in Section 2. We describe the current implementation of an XBML server, focusing on
distributed query processing, in Section 3.

2 Approach
2.1 Database Schemas and Business Model
We use the following database schemas or DTD fragments for illustrating the functionality of XBML:
The following is a part of XML data conforming to the above DTD:

Hiroshi
Ishikawa
L2
S210
Object-Oriented Database System
Springer Verlag
69.00

We take an ordered directed graph as the logical model for XBML, an active query language that
serves as a modeling language for EC businesses. That is, the data model of XBML can be
represented as data structures consisting of nodes (i.e., elements) and directed edges (i.e.,
containment, or parent-child relationships), which are ordered.
We also use e-broker business models based on XML data for describing the XBML functionality.
Here we provide a working definition of EC business models in general. EC business models consist
of business processes and revenue sources based on IT such as the Web and XML. We assume that
e-broker business models, which act on behalf of customers, consist of at least the following
business processes:
(1) The customer searches products by issuing either precisely- or approximately-conditioned
queries against one or more suppliers and/or navigating through the related links.
(2) The customer takes recommendations from suppliers into account, if any.
(3) The customer compares and selects products and puts them into the shopping cart.
(4) The customer checks out by placing a purchase order with registration.
(5) The customer tracks the order to check the status for shipping.

The revenue source in e-broker models is sales.

2.2 Business Model Specification
We have adapted the design of XBML to the requirements for supporting EC business models
discussed in the previous section.

(1) Searching Products
XBML provides the following functions for describing product search processes:
- XBML allows product search by selecting products based on their attributes, such as titles and
authors, and constructing search results based on them.
- XBML allows ambiguous search by allowing partially-specified strings and path expressions.
- XBML supports data join used in related search to promote cross-sell and up-sell.
- XBML configures search results by sorting and grouping them based on product attributes.
- XBML supports a “comparison model” of similar products by allowing a search to be bound to
multiple shopping sites.
- XBML provides localized views (e.g., prices) of global products by introducing namespaces (i.e.,
contexts).
Note that we cannot describe sorting, grouping, and namespaces here due to space limitations.

Data selection and construction
The basic function of XBML is to select arbitrary elements from XML data by specifying search
conditions for accommodating flexible product searches in e-broker business models. XBML allows
any combination of retrieved elements to produce new element constructs for further services. The
following query produces new elements consisting of titles and authors of books published by
Prentice-Hall whose first author is Ullman:
(Query1)
select result {$book.title, $book.author}
from dlib URI “www.a.b.c/dlib.xml”, book $dlib.book
where $book.publisher.name = “Prentice-Hall” and
$book.author[0].lastname = “Ullman”
and $book.@year gt “1995”

The basic unit of XBML is a path expression, that is, an element variable followed by a series of
tag names, such as “$dlib.book”. The user must declare at least one element variable in a
from-clause. In particular, the user can bind XML data specified by a URI as input to element
variables such as dlib. Note that a URI in our context must have the form “www.x.y.z/d.xml”, not
“www.x.y.z”. This declares a context in which an XBML query is evaluated. References of element
variables are done
by prefixing “$” to them. The user specifies a condition for selection in a where-clause. Element
values are compared in alphabetical order. Comparison operators include “=”, “!=”, “lt” for “<”,
“le” for “<=”, “gt” for “>”, and “ge” for “>=”. XBML allows indexed access to ordered elements by
specifying an index [i]. Attributes are referenced by prefixing “@” to them.
“{}” in a select-clause, enclosing elements delimited by “,”, creates new XML elements of a
specified construct, such as author and title tags. The result of an XBML query is XML data, which
can be retrieved in the same way as existing data. In our current design, the resultant XML data
have no DTD, that is, they are well-formed XML data. For example, the result of the above query
has the following structure, automatically wrapped by a tag “XBML:result”:

A First Course in Database Systems
Jeff
Ullman
Gates Building
…
…
…

Here, we define the basic syntax of XBML as follows:

query = select target from context-list [where-clause] [orderby-clause] [groupby-clause]
target = expression | tag ‘{’ expression-list ‘}’
expression-list = expression ‘,’ expression-list | expression
expression = [tag] ‘$’ variable | [tag] ‘$’ variable ‘.’ path
path = ‘%’ | tag | ‘@’ attribute | path ‘.’ path | ‘(’ path ‘|’ path ‘)’ | text
context-list = context ‘,’ context-list | context
context = variable URI uri-list | variable expression
uri-list = uri uri-list | uri
where-clause = where condition
condition = term | condition or term
term = factor | term and factor
factor = predicate | not predicate
predicate = expression compare expression
orderby-clause = orderby expression-list
groupby-clause = groupby expression-list

Partially-specified path expression
XBML allows regular path expressions for flexibly retrieving products represented as slightly
heterogeneous elements (i.e., semi-structured XML data), which depend on data and suppliers in
e-broker business models. Here we define semi-structured XML data as follows:
(1) Elements with the same tag are repeated zero or more times, depending on parent elements, such
as authors of books.
(2) Elements with the same tag have variant sub-structures, depending on parent elements, such as
offices of authors.

As these characteristics cannot be determined in advance, we allow partially-specified path
expressions. The following query retrieves authors named Ishikawa of any material, such as books
and articles.
(Query2)
select result {$anyauthor}
from dlib URI “www.a.b.c/dlib.xml”,
anyauthor $dlib.%.author
where $anyauthor.lastname = “Ishikawa”

Here “%” denotes a “wild card” in path expressions, which also allows approximate searches in
e-broker business models. “$dlib.%.author” matches both “book.author” and “article.author”.

Data join
XBML joins different elements by comparing their values in a where-clause. The following query
joins books and articles by authors as a join key within the same XML data:
(Query3)
select result {$article, $book}
from dlib URI “www.a.b.c/dlib.xml”, article $dlib.article,
book $dlib.book
where $book.author.firstname = $article.author.firstname
and $book.author.lastname = $article.author.lastname
and $book.title = “%Electronic Commerce%”

In e-broker business models, this helps increase cross-sell and up-sell. Here the customers can do
approximate searches over XML data by using the wild card “%” in strings, that is,
partially-specified strings, as is often the case with search in e-broker business models. The
query result has the following structure:

…
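
The two wildcard mechanisms in this subsection, “%” as a step in a path expression (Query2) and “%” inside a partially-specified string (Query3), can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the XBML implementation: the sample document, the helper names `match_path`, `like`, and `text`, and the reading of “%” as matching exactly one tag step are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample data; tag names follow the paths used in Query2 and Query3.
SAMPLE = """
<dlib>
  <book>
    <title>Introduction to Electronic Commerce Systems</title>
    <author><firstname>Hiroshi</firstname><lastname>Ishikawa</lastname></author>
  </book>
  <article>
    <title>Active XML Databases</title>
    <author><firstname>Hiroshi</firstname><lastname>Ishikawa</lastname></author>
  </article>
  <article>
    <title>Query Optimization</title>
    <author><firstname>Manabu</firstname><lastname>Ohta</lastname></author>
  </article>
</dlib>
"""

def match_path(root, path):
    """Follow a dotted path from root; '%' matches any single tag (an assumption)."""
    nodes = [root]
    for step in path.split("."):
        nodes = [child for node in nodes for child in node
                 if step == "%" or child.tag == step]
    return nodes

def like(value, pattern):
    """Partially-specified string match: '%' stands for any, possibly empty, substring."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value) is not None

def text(elem, path):
    """Text of the first element reached by path, or None if nothing matches."""
    found = match_path(elem, path)
    return found[0].text if found else None

dlib = ET.fromstring(SAMPLE)

# Query2-like: authors of any material whose lastname is Ishikawa.
authors = [a for a in match_path(dlib, "%.author") if text(a, "lastname") == "Ishikawa"]

# Query3-like: join books and articles on author name, with a title pattern on books.
pairs = [(text(b, "title"), text(a, "title"))
         for b in match_path(dlib, "book")
         for a in match_path(dlib, "article")
         if text(b, "author.firstname") == text(a, "author.firstname")
         and text(b, "author.lastname") == text(a, "author.lastname")
         and like(text(b, "title"), "%Electronic Commerce%")]
```

The join here is a plain nested loop over in-memory elements, chosen for brevity; it says nothing about how an XBML server would evaluate or optimize the same query.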
Multiple binding
The user can have universal access to multiple data sources by binding a single element variable
to multiple URIs (i.e., a URI list) in a from-clause. The following example retrieves books
authored by the same author from two online bookstores (bound to dlib) with only a single query:
(Query4)
select result {$book.title, $book.author}
from dlib URI “www.a.b.c/dlib.xml” “www.x.y.z/dlib.xml”,
book $dlib.book
where $book.author.lastname = “Ishikawa”

The users need to declare partially-specified path expressions to accommodate the heterogeneity of
data sources. This function is necessary for comparing similar products or searching for the
lowest price in multiple stores.

(2) Recommendation
Related search as a recommendation process is crucial in promoting cross-sell and up-sell. It is
classified into three categories according to the extent to which the customer in session is
involved.
(1) Non-personalized recommendation
The customer is not involved. The e-broker recommends some products as general trends,
independently of the customer. Or, the e-broker shows the customer products highly rated by the
other customers.
(2) Personalized recommendation
Only the customer is involved. The e-broker recommends some products based on the customer’s
psycho-graphic data, such as interests, or historical data, such as purchase records.
(3) Collaboratively filtered recommendation [17]
Both the customer and the others are involved. The e-broker recommends products purchased by those
customers who purchased the products selected by the customer.
The facility for function definition and the query transformation technique play an important
role in recommendation, as follows.

Function definition
Functions correspond to “parameterized views”. Functions modularize recurring queries in EC
business models to increase their reuse. The user defines a function by specifying an XBML query
in its body. The syntax has the following form:
function-definition = function name ‘(’ parameter-list ‘)’ as ‘(’ query ‘)’
parameter-list = parameter ‘,’ parameter-list | parameter

As a personalized recommendation, the following function recommends products based on the keywords
that the customer (specified by the identifier customerid) has registered in advance as
psycho-graphic data:
function personalized-Recommendation (customerid) as
(select result {$book.title, $book.price}
from dlib URI “www.a.b.c/dlib.xml”, book $dlib.book, r URI
“www.a.b.c/registration.xml”, customer $r.register.customer
where $book.keyword = $customer.keyword and $customer.id
= customerid)

The next example, in the collaboratively-filtered recommendation category, recommends products
based on the similarity that other customers purchased the product selected by the customer (i.e.,
indicated by selected).
function collaboratively-filtered-Recommendation (selected) as
(select result {$book.title, $book.price}
from dlib URI “www.a.b.c/dlib.xml”, book $dlib.book, r URI
“www.a.b.c/registration.xml”, customer $r.register.customer
where $book = $customer.purchased and
$customer.purchased = selected)

Query transformation
Until now, we have treated recommendation and search as separate processes. However, when the
customer specifies search keywords, the search result can be expanded to include recommended
products by transforming the original search query. Query transformation is classified into two
rules as follows:
(1) Keyword addition rule
This rule has the general form:
keyword1 ==> keyword1 | keyword2

For example, the originally specified keyword “Electronic Commerce” adds a new keyword
“Internet Business” and the disjunctive condition is added to the end of the query as follows:
(Query5)
select result {$book}
from dlib URI “www.a.b.c/dlib.xml”, book $dlib.book
where $book.keyword = “Electronic Commerce” or
$book.keyword = “Internet Business”

This technique is similar to query expansion [3] used in information retrieval. Note that this
type of transformation keeps data sources unchanged.

(2) Data source addition rule
This rule uses set operations on queries to modify the original one. The rule has the following
general form:
query1 ==> query1 set-operator query2

Here set-operator includes union, intersection, and difference. For example, when the customer
searches for books on EC, he will search for articles on EC at the same time by modifying the
original query with a disjunctive query as follows:
(Query6)
select result {$book}
from dlib URI “www.a.b.c/dlib.xml”, book $dlib.book
where $book.keyword = “Electronic Commerce”
union
select result {$article}
from dlib URI “www.a.b.c/dlib.xml”, article $dlib.article
where $article.keyword = “Electronic Commerce”

We analyze the application-based Web access patterns [5] to create the transformation rules, which
are not discussed here due to space limitations.

(3) Moving to Carts
In general, EC business models involve temporary data, such as search results and shopping carts,
valid only within sessions, as well as permanent data such as books and customers. XBML handles
such temporary data as first-class citizens.

Use of query results
XBML allows a query against the intermediate query results as well. The customer checks the result
of searching products or recommendation to place an order. The following XBML query moves only the
customer-checked items in the search result to the shopping cart:
(Query7)
select cart {item $result.book}
from XBML:result URI “www.a.b.c/XBML:result.xml”,
result $XBML:result.result
where $result.checked = “yes”

(4) Placing Orders
Selected items in the shopping cart remain to be added to ordering databases. Thus, addition of
new elements is a mandatory function for constructing practical e-broker models. Adding new
elements often requires making them unique by invoking a dedicated function defined in a
programming language such as Java. To this end, XBML also allows function invocation in a query.

Insertion and function invocation
We provide the syntax for insertion by using an XBML query as follows:
insertion = insert into target query

The following query places a purchase order in e-broker business models by consulting the current
shopping cart and customer data and invoking a function:
(Query8)
insert into $order
select order {@id = OrderID($customer.id, date()),
item $cart.item}
from r URI “www.a.b.c/registration.xml”,
customer $r.register.customer,
XBML:result URI “www.a.b.c/XBML:result.xml”,
cart $XBML:result.cart, o URI “www.a.b.c/ordering.xml”,
order $o.order
where $customer.lastname = “Kanemasa”

Here, in the select-clause, the function call “OrderID($customer.id, date())” generates a unique
order number. Ordering initiates internal processes, such as payment and shipment, hidden from the
customers. Please note that “$order” in the into-clause is permanent in
“www.a.b.c/ordering.xml” while “order” in the select-clause is temporarily constructed in this
query.

(5) Tracking Orders
Ordering and shipping constitute a supply chain in EC business models. Further, shipping is often
outsourced. Thus, the involved data are managed at separate sites, whether on an intranet or on an
extranet. To this end, XBML allows data join across different sites in addition to that within one
site.

Join of data from multiple data sources
The user can join heterogeneous XML data from
different data sources indicated by different URIs. In e-broker business models, the following
query produces a set of ordered items and shipping status by joining order identifiers of order
entry data and order shipping data at different sites, indicated by separate variables bound to
different URIs, such as o and s:
(Query9)
select result {$order.item, $ship.status}
from o URI “www.a.b.c/ordering.xml”, s URI
“www.d.e.f/shipping.xml”, order $o.order, ship $s.ship
where $order.id=$ship.id and $order.id=“cidymd”

In general, there are two approaches to resolving heterogeneity in schemas of different databases:
schema translation based on ontologies and schema relaxation based on query facilities. XBML takes
the latter approach, that is, XBML uses regular path expressions and element variables to enable
the user to retrieve multiple databases with heterogeneous schemas by a single query at one time,
because the regular path expressions can match more than one path and the element variables can be
bound to more than one path. Further, we allow well-formed XML data containing a set of
heterogeneous elements as a query result. Of course, we admit that a simple solution to schema
translation between heterogeneous DTDs is based on XSL (i.e., XSL Transformations).

2.3 Applicability to Other Models and Extension
In the previous subsection, we have discussed the applicability of XBML to the e-broker models.
Now we ascertain its applicability to business models other than the e-broker model. Indeed, there
are rather novel EC business models, such as the reverse auction model. However, new business
models are often created by mutation of business processes of existing models. We take the auction
model [12] as an example. The auction model consists of the following processes:
The selling customer registers auction items.
(1) The buying customer searches auction items.
(2) The buying customer takes recommendations into account, if any.
(3) The buying customer bids.
(4) The winner customer checks out by placing a purchase order with registration.
(5) The winner customer tracks the order to check the status for shipping.
We can observe similarity between the auction model and the e-broker model. Registry of auction
items by sellers corresponds to registry of products by suppliers, which is just implicit in the
e-broker model. Searching and recommendation of auction items are very close to those of products
in the e-broker model. Indeed, bidding is a new process, but it can be viewed as a series of
tentative orderings until the buying customer wins the auction. In other words, the event that the
customer wins the auction moves auction items to the shopping cart. The winner’s placing a
purchase order is very close to that in the e-broker model. Order tracking in the auction model is
analogous to that in the e-broker model, although it may require a new business model, such as
e-escrow, to guarantee the bargain contract. The revenue source is a part of the contract price as
fees in the auction model. Thus, we would say that our XBML can apply to the auction model as
well.
However, it is also true that controlling business processes, or modeling events in some way, is
necessary. Thus, the auction model requires triggering business processes at a specified time or
on some database events such as insert. Active databases or ECA (Event-Condition-Action) rules [8]
will be able to specify such business processes on events more elegantly than procedural
programming languages plus the current version of XBML. Therefore, we extend current XBML by
introducing the following construct for ECA rules:
on event if condition then action

Events include operations of XBML (e.g., select and insert) and a specified time. Conditions are
specified as conditions of XBML. Actions are also specified by XBML.
For example, consider the situation in which, when the highest bidding price of the auction
specified by id1 is updated, if the current time is before the closing time of the auction, then
the auctioneer specified by id2 increases his bid by a specified value value3. The corresponding
ECA rule can be specified as follows:
on insert into $auction.price
if now() lt $auction.closing-time
then insert into $auction.auctioneer.price
select increase ($auctioneer.price, “value3”)
from actn URI “www.a.b.c/actn”,
auction $actn.auction
where $auction.id = “id1”
and $auction.auctioneer.id = “id2”
Here, now() returns the current time and increase(var, val) increments the variable var by a value
val.
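
The construct “on event if condition then action” can be pictured with a minimal Python sketch of the bidding rule above. This is an illustration under stated assumptions, not the XBML server’s mechanism: the `Rule` and `RuleEngine` classes, the in-memory `auction` dictionary, the event name string, and the numeric times and prices are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Rule:
    event: str                         # name of the triggering operation
    condition: Callable[[dict], bool]  # evaluated against the event context
    action: Callable[[dict], None]     # executed only when the condition holds

@dataclass
class RuleEngine:
    rules: List[Rule] = field(default_factory=list)

    def register(self, rule: Rule) -> None:
        self.rules.append(rule)

    def signal(self, event: str, context: dict) -> int:
        """Run every rule registered for `event` whose condition holds; return the count."""
        fired = 0
        for rule in self.rules:
            if rule.event == event and rule.condition(context):
                rule.action(context)
                fired += 1
        return fired

# Hypothetical in-memory stand-in for the data addressed by $auction.
auction = {"id": "id1", "price": 100, "closing_time": 1000,
           "bidders": {"id2": {"price": 90}}}

engine = RuleEngine()

def raise_bid(ctx: dict) -> None:
    # Counterpart of increase(var, val): add the increment to the bidder's price.
    ctx["auction"]["bidders"]["id2"]["price"] += ctx["increment"]

engine.register(Rule(
    event="insert into auction.price",
    condition=lambda ctx: ctx["now"] < ctx["auction"]["closing_time"],
    action=raise_bid,
))

def insert_price(new_price: int, now: int, increment: int = 5) -> int:
    """Update the highest price, then signal the event so that matching rules run."""
    auction["price"] = new_price
    return engine.signal("insert into auction.price",
                         {"auction": auction, "now": now, "increment": increment})

fired_open = insert_price(110, now=500)     # before closing: the rule fires
fired_closed = insert_price(120, now=1500)  # after closing: the condition fails
```

The action writes to the bidder’s own price rather than to the auction’s highest price, so in this sketch the rule does not re-trigger itself; cascading events would need additional care.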
ECA rules are defined in advance and invoked on events. The ECA rules can elegantly implement the
recommendation (e.g., Query5):
on select result {$book}
if $book.keyword = “Electronic Commerce”
then select result {$book}
from dlib URI “www.a.b.c/dlib.xml”,
book $dlib.book
where $book.keyword =
“Electronic Commerce” or
$book.keyword = “Internet Business”

Note that the result of the “event query” (i.e., the first “select”) is replaced by that of the
“action query” (i.e., the second “select”) in this case.

3 Implementation
XBML is intended for use not only in modeling EC business models, but also in realizing them
agilely. XBML must be efficiently implemented, too. XBML containing URIs intrinsically requires
distributed query processing. So we construct the XBML server as follows:
(1) We construct local XBML servers as a basis.
(2) We construct global XBML servers by extending the local servers with server-side scripting
techniques.

3.1 Local Server
We describe the basic architecture and implementation of a local XBML server. First, we describe
the storage schema for XML data. We have explored approaches to mapping DTDs to databases (an
RDBMS, i.e., Oracle, and an ODBMS, i.e., Jasmine [9]) and to implementing an XBML processing
system [11]. If any DTD or schema information is available, we basically map elements to tables
and tags to fields, respectively. We call this approach DTD-dependent mapping, where the user must
specify mapping rules individually. Otherwise, we take a DTD-independent mapping or universal
mapping approach, which divides XML data into nodes and edges of an ordered directed graph and
stores them into separate tables for nodes and edges with neighboring data physically clustered.
We provide separate tables for nonleaf and leaf nodes. The order fields of the Leaf_Node and Edge
tables are necessary for providing access to ordered elements by index numbers. Identifiers, such
as ID and IDREF, realizing internal links between elements are declared as attributes and are
stored as Value of the separate Attribute_Node table. So references through identifiers are
efficiently resolved by searching node identifiers in Attribute_Node.
We cluster data in node and edge tables on a breadth-first tree search basis. We have found that
this way of clustering contributes very much to reducing I/O cost. Further, we have learned from
our preliminary experiments that the DTD-dependent mapping approach is roughly two times more
efficient than the universal one. However, we have focused more of our implementation efforts on
the universal mapping approach for the following reasons:
(1) The approach frees the users from the burden of defining idiosyncratic mappings.
(2) The approach can store XML data whose DTDs are unknown in advance.
(3) The approach can store heterogeneous XML data, in particular, semi-structured XML data, in
the same database.

Next, we describe the system architecture for a local XBML server or an XBML processing system.
We make appropriate indices on tag values, element-subelement relationships, and tag paths in
advance.
We describe how the XBML processing system works. The XBML language processor parses an XBML
query, and the XBML query processor generates and optimizes a sequence of access methods for
efficient execution. The primitive access methods are basic operations on node sets, implemented
by using an RDBMS or ODBMS. They include get_NodeId_by_Path&Val, get_ParentId_by_Child,
get_ChildId_by_Parent, get_Value_by_Id, get_NodeId_by_Path, and get_LabelId_by_LabelText in
addition to node set operators, such as union, intersection, and difference.
We illustrate the translation by using the query:
select $book.title
from dlib URI “www.a.b.c/dlib.xml”, book $dlib.book
where $book.publisher.name = “Prentice-Hall”

This is parsed into an internal form, which denotes a logical query plan represented as an
ordered graph:
(Proj (Sel $book (Op_EQ $book.publisher.name
“Prentice-Hall”)) $book.title)

Here, Sel, Proj, and Join (not in the above example) denote selection, projection, and join of
XML data, respectively. Op_EQ denotes “=”. This internal form is reorganized in a pattern-directed
manner, such as placing Sel before Join, and is transformed into the following primitive
operations:
(1) get_NodeId_by_Path&Val (Op_EQ “$book.publisher.name” “Prentice-Hall”) returns a node set set1
(i.e., $book.publisher).
(2) get_ParentId_by_Child (set1 “$book”) returns a node set set2.
(3) get_ChildId_by_Parent (set2 “$book.title”) returns a node set set3.
(4) get_Value_by_Id (set3) returns a value set as a result.

Both RDBMS and ODBMS can be used as the database system of the XBML processing system with the
upper layers unchanged by virtue of the above primitive operators.
We describe the implementation of ECA rules. First, we define an event query by using the event
and the condition in the rules and define an action query by using the action in the rules.
Further, we maintain a dedicated ECA rule database whose entries each consist of a pair of such an
event query and an action query. Now we consider the following approaches to ECA rules:
(1) Monitor-based approach
The monitor checks each usual query against the event query patterns in the ECA rule database and
issues the corresponding action query of the matched event query if the condition is satisfied.
The monitor usually keeps a queue of events generated by the matched event query and invokes the
action query by looking up in the queue.
(2) Query rewriting approach
We modify the query processing. The parser checks each query against the event query patterns in
the ECA rule database and recursively processes the corresponding action query of the matched
event query by adding a check on the condition. That is, the “event query” and “action query” in
the rules are translated into a sequence of queries (i.e., primitive access methods) with a
condition check.

Next we consider the merits and demerits of the above approaches as follows:
(1) Monitor-based approach
The monitor can control the whole process in a centralized manner. However, we need the monitor
itself as an extra mechanism. It is not trivial to provide the facility for executing the event
and action queries as a single transaction.
(2) Query rewriting approach
We need no extra mechanism for controlling processes. It is rather straightforward to execute the
event and action queries as a single transaction. However, the generated queries tend to be long,
in particular, for cascading events.
For the moment, we adopt the query rewriting approach in favor of the ease of the implementation.

3.2 Global Server
Now we construct the global XBML server by extending the above local XBML servers with server-side
scripting techniques. We provide preliminary definitions of queries. First, we categorize queries
as follows:
(A) Single-URI query
This type of query contains only one XML data source specified by a single URI in the query, such
as Query1 (selection) and Query3 (join).
(B) Multiple-URI query
This type of query contains multiple XML data sources specified by multiple URIs in the query.
This type is further categorized into two as follows:
(B1) Decomposable query
This type of query can be decomposed into a combination of single-URI queries with set operators,
such as Query4 (multiple binding) and Query6 (set operators).
(B2) Non-decomposable query
This type of query cannot be decomposed into a combination of single-URI queries alone. This type
of query contains join queries over multiple URIs, such as Query9 (join of multiple data sources).

Second, we categorize queries in another way:
(a) Local query
XML data sources specified by URI are inside the relevant XBML server.
(b) Global query
XML data sources specified by URI are outside the relevant XBML server.
Now we show that a non-decomposable (i.e., intrinsically global) query can be transformed into a
series of single-URI local or global queries and local queries (join). We assume that the original
query contains n URIs. We translate a non-decomposable query in two steps:
(1) create a single-URI (local or global) query for each of n URIs with the insertion of the query
result into the local server.
(2) create single-URI queries performing join of
the results stored in the local server, which are local queries, by reducing all URIs to a single URI.

Queries generated by step (1) localize single-URI global queries. Of course, single-URI local queries remain local. We call them localized single-URI queries. After that, queries generated by step (2) simulate a join of multiple data sources by a join of local data sources. We call them localized join queries. For example, assuming that the global server is resident at the “shipping site”, consider again the following query (Query9):
  select result {$order.item, $ship.status}
  from o URI “www.a.b.c/ordering.xml”, s URI “www.d.e.f/shipping.xml”, order $o.order, ship $s.ship
  where $order.id=$ship.id and $order.id=“cidymd”

The query is translated into the following localized single-URI query, whose result is fetched into the global server:
  select $order
  from o URI “www.a.b.c/ordering.xml”, order $o.order
  where $order.id=“cidymd”

and into the following localized join query, which produces a result of the original query:
  select result {$order.item, $ship.status}
  from XBML:result URI “www.d.e.f/XBML:result.xml”, order $XBML:result.order, s URI “www.d.e.f/shipping.xml”, ship $s.ship
  where $order.id=$ship.id

Now we describe the global query processing, assuming that a query Q with a URI is specified as the input:
  if Q is a single-URI query then
    process-or-dispatch (Q);
  else {/* i.e., Q is a multiple-URI query; */
    if Q is a decomposable query then
    { for each sub-query Qsub in Q
        process-or-dispatch (Qsub);
      merge the result by the local server;}
    else {/* i.e., Q is not a decomposable query; */
      decompose Q into localized single-URI queries Qloc-s and localized join queries Qloc-j;
      for each sub-query Qsub in Qloc-s
        process-or-dispatch (Qsub);
      process Qloc-j by the local server;}
  }
  process-or-dispatch (Q) /* for single-URI query */
  { if URI is local to the server then
      process Q by the local server;
    else {/* i.e., Q is not local to the server; */
      dispatch Q to the relevant remote server;
      store the result into the local server;}
  }

The above query processing has some room for improvement in performance. In particular, if the non-decomposable query has no selection conditions, the whole remote data sources specified by the generated single-URI queries must be copied to the local server. For example, consider Query9 when the global server is resident at the “ordering site” or at a third site. Of course, if there is any selection condition on the join key, the condition is propagated to all the single-URI queries. We call this technique simple selection condition propagation. It is a kind of static query rewriting.

However, we want more improvement. So we refine the process-or-dispatch scheme to sort the result of the query and return the value range with respect to the join key (i.e., the MIN and MAX values) by adding “order-by” to the query. Then, the conditions “join-key ge min-value and join-key le max-value” are dynamically added to the subsequently generated single-URI query. In turn, that query is evaluated to produce a new value range of the join key (i.e., min-value’ and max-value’). The following characteristic holds: min-value’ >= min-value and max-value’ <= max-value. From this, we can conclude that the expected selectivity is better than that of the original algorithm.

If a single-URI query Q has any selection conditions “key_Q ge min-value_Q” and “key_Q le max-value_Q”, then we take MAX_Q(min-value_Q) and MIN_Q(max-value_Q), taken over all such queries Q, as the initial min-value and max-value, respectively. The single-URI query to be processed first is chosen from those with a selection condition on a non-key, because all the single-URI queries now effectively have the same condition on the key (i.e., the initial value range). If there is no selection condition, any local query processed first will produce the initial value range. This scheme has two merits: it avoids extra data transfer by just issuing modified queries, and it avoids an extra protocol by just accommodating the min/max values in the results.

XBML works as server-side scripting with database access, like CFML [2] and ASP [15], and provides universal access to distributed XML data. If XBML queries are embedded in XML-based scripts, the global XBML server can provide more direct and universal interfaces for representing and accessing distributed XML data than the other approaches. That is, XML pages containing the element XBML-query are interpreted as scripts.
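The global query processing procedure described in this section (process-or-dispatch for single-URI queries, merge for decomposable queries, localized join for non-decomposable ones) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the XBML server's implementation: the Server class, its method names, and the string results standing in for real query evaluation are all hypothetical.

```python
class Server:
    """Hypothetical stand-in for an XBML server; not the authors' implementation."""

    def __init__(self, name, local_uris):
        self.name = name
        self.local_uris = set(local_uris)  # XML sources this server owns
        self.remotes = {}                  # URI -> Server owning that URI
        self.fetched = {}                  # URI -> remote result stored locally

    def run_local(self, uri):
        # Placeholder for real query evaluation: tag the URI with the evaluator.
        return f"{self.name}:{uri}"

    def process_or_dispatch(self, uri):
        """Handle one single-URI query."""
        if uri in self.local_uris:
            return self.run_local(uri)
        result = self.remotes[uri].run_local(uri)  # dispatch to the remote server
        self.fetched[uri] = result                 # store the result locally
        return result

    def process(self, uris, decomposable=True):
        """Handle a query that names one or more URIs."""
        if len(uris) == 1:
            return self.process_or_dispatch(uris[0])
        partial = [self.process_or_dispatch(u) for u in uris]
        if decomposable:
            return partial  # merge step, modelled here as a plain list
        # Non-decomposable: the localized join runs on the local server.
        return f"join@{self.name}({', '.join(partial)})"


ORDERING = "www.a.b.c/ordering.xml"
SHIPPING = "www.d.e.f/shipping.xml"
ordering_site = Server("ordering", [ORDERING])
shipping_site = Server("shipping", [SHIPPING])
shipping_site.remotes[ORDERING] = ordering_site

# Query9 is non-decomposable and is submitted at the shipping site.
result = shipping_site.process([ORDERING, SHIPPING], decomposable=False)
print(result)
```

In this run only the ordering data is fetched into the shipping site, which mirrors the Query9 example where the global server is resident at the shipping site.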
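The value-range refinement of process-or-dispatch can likewise be illustrated with a small Python sketch. It assumes each data source is an in-memory list of records and that a single-URI query is a filter plus a sort on the join key; the function names and the sample data are hypothetical and serve only to show how the min/max range returned by one query narrows the next.

```python
def run_single_uri(rows, key, lo=None, hi=None, pred=None):
    """Evaluate one localized single-URI query.

    Applies the optional non-key predicate and the propagated key range,
    orders the hits by the join key, and reports the observed (min, max).
    """
    hits = [r for r in rows
            if (pred is None or pred(r))
            and (lo is None or r[key] >= lo)
            and (hi is None or r[key] <= hi)]
    hits.sort(key=lambda r: r[key])
    value_range = (hits[0][key], hits[-1][key]) if hits else None
    return hits, value_range


def fetch_with_range_propagation(sources, key):
    """Process (rows, pred) pairs in order, narrowing the key range each time."""
    lo = hi = None
    fetched = []
    for rows, pred in sources:
        hits, value_range = run_single_uri(rows, key, lo, hi, pred)
        fetched.append(hits)
        if value_range is None:
            break  # an empty source makes the join empty; stop fetching
        lo, hi = value_range  # new range lies within the previous one
    return fetched


orders = [{"id": 1, "item": "pen"}, {"id": 5, "item": "ink"}, {"id": 9, "item": "pad"}]
ships = [{"id": 2, "status": "sent"}, {"id": 5, "status": "sent"},
         {"id": 7, "status": "held"}, {"id": 12, "status": "sent"}]

# The query with a non-key condition is processed first, as the text suggests.
fetched = fetch_with_range_propagation(
    [(orders, lambda r: r["item"] != "pen"), (ships, None)], key="id")
order_ids = [r["id"] for r in fetched[0]]
ship_ids = [r["id"] for r in fetched[1]]
print(order_ids, ship_ids)
```

Here only two of the four shipping records are transferred. The range is a necessary but not sufficient filter: the record with id 7 still travels although it will not join, so the localized join query is still needed afterwards.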
4 Conclusion
We have proposed and validated XBML as an XML active query language approach to specifying EC business models. We compare our work with related work. There are no high-level language approaches to modeling EC business processes, in particular, no other work on validating the modeling language by applying it to EC business models. XBML can provide a more direct and universal tool for modeling distributed XML data applications than server-side scripting tools such as CFML [2] and ASP [15].

Now we will compare our XBML with other query language proposals from the viewpoint of process specification, since XBML contains the query language functionality as a basic part. XML-QL [6] has comprehensive functionality and has much in common with our XBML. However, condition specification in XML-QL is rather verbose. If applied to business modeling, XML-QL would make query formation rather complex.

XQL [16] has compactly-specified functionality and has common functionality with our XBML. XQL focuses more on filtering a single XML document by flexible pattern match conditions similar to XSL. If applied to specifying EC business models involving multiple sites, XQL would require the user to write extra application logic in addition to query formation.

Lore [7] provides a powerful query language for retrieving and updating semi-structured data based on its specific data model OEM, but it lacks some functionality such as multiple binding.

So far we have compared XBML with the other works only from the viewpoint of query languages. However, the above languages are largely different from XBML for the following reasons. First, we focus our efforts on the distributed query processing in the Web context, whereas the above works do not cover such a topic. Second, we think that the functionality of ECA rules is mandatory in order to model control flow of E-businesses. However, all of the above query languages lack ECA rules.

Web query languages, such as W3QL [13], view the Web as a single huge database and enable users to address its structures and contents. XBML views a single Web source as a database and allows queries over Web-based distributed databases.

The active views [1] focus on the comprehensive functionality of ECA rules. On the other hand, we have concluded that ECA rules are necessary from our experience of applying XBML to concrete businesses. We extend the query optimization in relational databases [14] to the distributed context.

References
1 Abiteboul, S., et al.: Active Views for Electronic Commerce, Proc. Intl. Conf. VLDB, 1999
2 Allaire Corporation: CFML, http://www.allaire.com/documents/cf4/CFML_Language_Reference/contents.htm, 2000
3 Chang, C.H., et al.: Enabling Concept-Based Relevance Feedback for Information Retrieval on the WWW, IEEE Trans. Knowledge and Data Eng., vol.11, no.4, pp.595-609, 1999
4 Conallen, J.: Modeling Web Application Architectures with UML, Comm. ACM, vol.42, no.10, pp.63-70, 1999
5 Cooley, R., et al.: Web Mining: Information and Pattern Discovery on the World Wide Web, Proc. the 9th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'97), 1997
6 Deutsch, A., et al.: XML-QL: A Query Language for XML, http://www.w3.org/TR/1998/NOTE-xml-ql-19980819, 1998
7 Goldman, R., McHugh, J., and Widom, J.: From Semistructured Data to XML: Migrating the Lore Data Model and Query Language, Proc. the 2nd Intl. Workshop on the Web and Databases (WebDB '99), 1999
8 Ishikawa, H., et al.: An Active Object-Oriented Database: A Multi-Paradigm Approach to Constraint Management, Proc. Intl. Conf. VLDB, 1993
9 Ishikawa, H., et al.: An Object-Oriented Database System Jasmine: Implementation, Application, and Extension, IEEE Trans. Knowledge and Data Eng., vol.8, no.2, pp.285-304, 1996
10 Ishikawa, H., et al.: http://www.w3.org/TandS/QL/QL98/pp/flab.doc, 1998
11 Ishikawa, H., et al.: Document Warehousing Based on a Multimedia Database System, Proc. IEEE 15th Intl. Conference on Data Engineering, pp.168-173, 1999
12 Jutla, D., et al.: Making Business Sense of Electronic Commerce, IEEE Computer, pp.67-75, Mar. 1999
13 Konopnicki, D., et al.: W3QS: A Query System for the World-Wide Web, Proc. Intl. Conf. VLDB, 1995
14 Makinouchi, A., et al.: The Optimization Strategy for Query Evaluation in RDB/V1, Proc. Intl. Conf. VLDB, 1981
15 Microsoft: ASP, http://www.activeserverpages.com, 2000
16 Robie, J., et al.: XML Query Language (XQL), http://www.w3.org/TandS/QL/QL98/pp/xql.html, 1998
17 Special Section: Recommender Systems, Comm. ACM, vol.40, no.3, pp.56-89, 1997