-

E±cient Processing Regular Queries In Shared-Nothing Parallel Database Systems Using Tree- And Structural Indexes ?

Vu Le Anh

leanhvu@inf.elte.hu 0

Attila Kiss

kiss@ullman.inf.elte.hu 0 0 Department of Information Systems, ELTE University , Hungary

2007

70 85

In this paper, we introduce and study an e±cient regular queries processing algorithm on a very large XML data set which is fragmented and stored on di®erent machines. The machines are connected by the high speed interconnection. In this system the e±ciency of a query processing algorithm depends on two main factors: the waiting time for the answer and the total query processing and communication cost over all machines of the system. In the partial processing approach, the query is sent to and partially evaluated at each server in parallel. The parallelism reduces the waiting time, but there are several redundant operations as it has to compute all possible cases for each fragment. In the stream processing approach, the query processing cost is minimized by parsing the data graph with the query. A fragment is visited if it is necessary, but there is no parallelism and the communication cost is high. To take the advantages of the shared-nothing parallel system, our algorithm is based on the partial evaluation. We describe two types of redundant operations. They are rejected by pre-computing the query on our treeand structural indexes. The sizes of the indexes and the processing costs over them are considered as constants. Our algorithm overcomes two above algorithms according both the waiting time and the total query processing and communication cost criteria both in theory and in experiment.

There are two motivational factors in the causation of the environment described in this paper: Shared-nothing Parallel Database System [ 1 ] and XML [ 13 ]. Our concept of shared-nothing parallel database system is the system of machines connected by high speed interconnection. The parallel database systems are being used in a wide variety of systems as they are extensible and are powered by the growth of RAM volumes and the speed of local network nowadays. On the other hand, XML has become the dominant standard for exchanging and querying documents over the Internet. XML o®ers its users many advantages, * q0 (a) Regular query Q * q2 //a/b//a a q1 b a q3

2 b 5 a F1 6 c

8 b especially in self-describing, extendibility and inter-operability. XML is a good choice for representing various types, dynamic schema or open data sets. In this paper, we work with the shared-nothing parallel database system which is considered as a very large XML data set stored and fragmented over machines.

To retrieve XML and semi-structured data, several query languages have been proposed (XML-QL [ 10 ], UnSQL [ 9 ], XPath [ 11 ], and XQuery [ 12 ]). The common feature of these languages are the use of regular path expressions. A data node is an element of the answer of a regular query, if the path from the root to this node matching the corresponding regular path expression. This paper introduces and analyzes e±cient techniques for processing regular queries in shared-nothing parallel database system using tree and structural indexes. The criteria of the e±ciency are: the waiting time for the answer and the total query processing and communication cost over all machines of the system. There are two approaches for processing regular queries in distributed environment: stream processing and partial processing. We will study and compare our algorithm with the partial parallel processing algorithm represented for partial processing approach [ 5, 4 ] and the tree traversal algorithm represented for stream processing approach [ 5, 4 ].

As an example, we consider the regular query E = ==a=b==a (written in XPath syntax) over the XML tree T depicted in Fig.1(b). T is fragmented by F0; : : : ; F5. E can be represented by the nondeterministic ¯nite automata Q depicted in Fig. 1(a). The answer of E over T is the set of data nodes A = f5; 13; 16; 19; 22; 25g. A is determined by traversing T and Q with following rules: (1) beginning at (1; q0) (the pair of the roots of T and Q) ; (2) From (u; q) we visit (v; p) for each v is a child of u and there exists a transition from state q to state p with the label of u. (3) u is an element of A if some (u; q) is visited and there exists a transition from state q to ¯nal state p with the label of u.

In fragmented environment, the above algorithms works di®erently as follows:

Tree traversal. The traversal algorithm is modi¯ed as follows. In rule (2), if u is the root of another fragment then we jump to and continue processing over the sub-fragment. After ¯nishing the process the sub-fragment will give back the control and the result to the parent fragment and the parent fragment will continue processing. In the example, with the depth-¯rst traversal the the list of visited fragments sorted by time is F0F1F0F1F0F2F5F2F5F2F0F3F4F3F4F3F0.

Partial Parallel Processing. The basic processing operation (F; q) is to process over a fragment F beginning at the root of F with state q without communicating other sites or operations. If the operation (F2; q3) is processed, there is no communication with the operations over F0 or F5 and the result is f17g. We process each site in parallel. At each site, we compute all possible processing operations and ¯nd out the relationship of the operations. The relationship of operations tells us about which operations are processed over the child-fragment if some operation is processed over the parent-fragment. In the example, in the site containing F2 we process four operations (F2; q0); : : : ; (F2; q3) over F2 and we will ¯nd out following facts of relationship of the operations: if (F2; q0) is processed, (F5; q0) and (F5; q1) will be processed if (F2; q1) is processed, (F5; q2) will be processed if (F2; q2) is processed, (F5; q2) and (F5; q2) will be processed

Based on all map of operations from the sites the chosen master site will decide which operations are reachable if the operation (Froot; qroot) is processed (Froot : the fragment containing the root of the tree; qroot: is the start state of the query graph). The reachable operations in the example are: (F0; q0), (F1; q0), (F1; q2), (F2; q0), (F5; q0), (F5; q1), (F3; q0), (F3; q1), (F4; q0), (F4; q3). Finally, each site only sends the sub-results of reachable operations, which has been computed in previous phases, to the master site. The result of the query is the union of them.

There is no parallelism in the tree traversal algorithm so the waiting time is not reduced. Moreover the communication cost is high as each fragment and each site may be visited by many times (F0 is visited 5 times, F1 is visited 2 times, etc). With the partial parallel processing algorithm, the waiting time is reduced by the parallelism. The communication cost is also reduced as each site is visited only one time, and the total network tra±c is independent from the size of the XML tree [ 4 ]. However, the total processing cost may be high as for each fragment we process for all possible operations. This is not good in case the system serves many clients in the same times.

Our ¯rst contribution is to introduce the indexes. The tree-index stores information of the paths connecting the roots of fragments. The DL-tree-indexes coming from the DL-indexes [ 7 ] summarize the fragment structure. The total size of the indexes depends only on the number of fragments, which is considered as constant in shared-nothing parallel database system. Our second contribution is an e±cient algorithm for processing regular queries over shared-nothing parallel database system. We de¯ne two types of redundant operations in the partial processing approach. The redundant operations of type 1 are the operations which are not reachable from the operation (Froot; qroot). The redundant operations type 2 are the operations which do not ¯nd out any matching nodes. The redundant operations type 1 are determined by processing the query over the tree index . The redundant operations type 2 are restrained by processing over some structural index of the fragment. Our algorithm ¯rstly processes on the indexes to reject redundant operations. The processing cost is considered as constant in case the total size of the index is considered constant. The pre-computing on the indexes not only decreases the total query processing in the partial evaluation phrase but also minimizes the communication cost and the waiting time as sites communicate only one time with the master site and processing with minimal number of redundant operations. Our third contribution is an experimental study evaluating our algorithm versus other approaches. Our experimental results show that our algorithm overcomes the other algorithms according both the waiting time and the total query processing and communication cost.

The remainder of the paper is organized as follows. Section 2 is the preliminary. We introduce the concepts of the data model and the regular queries and a template for processing over a fragment using in this paper. Section 3 is about stream processing vs. partial processing approaches. We study two algorithm: tree traversal algorithm represented for stream processing approach and partial parallel processing algorithm represented for partial processing approach. In section 4, we introduce and describe the technics to using tree- and structural indexes to processing regular queries over share-nothing parallel database system. In section 5, we present our experiments. Section 6 is the related works. Section 7 concludes the paper. 2 2.1

Preliminaries Data Model

The data set is modelled as a labelled tree T . § is an alphabet. Each node v of T is labelled by a label value label(v) 2 §. T is fragmented by a collection F of disjoint trees, or fragments Fi. The fragments are stored in di®erent sites (or machines). The sites are connected with each others by high speed interconnection. The numbers of sites and fragments are considered as constants.

The fragment containing the root of the tree T , is called the root fragment denoted by Froot. In Fig. 1, this is fragment F0. Given two fragments F and F 0, we say that F is a child-fragment of F 0 and F 0 is the parent-fragment of F if there exists node v 2 F 0 such that the root w of F is a child of v in the original tree T . In Fig. 1, fragment F5 is a child-fragment of F2 and F0 is a parent-fragment of F1 . site(F ) denotes the site containing fragment F . Naturally, we assume that site(F ) 6= site(F 0) if F is a child-fragment of F 0. 2.2

Regular Queries

A regular expression over § alphabet is de¯ned as follows:

R = "j a j R1:R2 j R1jR2 j R1¤ where " is the empty word; a 2 § is a letter; R1, R2, R are regular expressions; (:), (j), (¤) are the concatenation-, the alternation-, and the iteration operations respectively. Each regular query is de¯ned by a regular expression, and each regular expression also de¯nes a regular language over §, which contains all words over § marching the regular expression. We call the rooted labelled directed graph of the ¯nite nondeterministic automata, which computes the regular language de¯ned by the regular expression, the query graph of the query. The start state of the automata is denoted by qroot. A data node u is an element of the result of regular query Q over tree T , if the label of the path from the root to u matches the according path expression. You can see the example of query graph in Fig. 1. We just remark our readers that the formula writing the query in the example is in XPath syntax [ 11 ], which is converted easily to our de¯ned syntax. 2.3

Processing Over Fragment

We introduce and study basic operation processing a fragment begin at the root with some state of the query graph. The template of the operation processing fragment F with state q is as follows: Fragment-Process(F ,q) begin 1. F ragResult Ã ; 2. Pre-Process(F ,q) 3. Scan(root(F ),q) 4. Post-Process(F ,q) 5. return F ragResult end Matching(q,label) begin 1. S Ã ; 2. for each transition (q; label; p) do 3. S Ã S [ fpg 4. return S end

Scan(u,q) begin 1. if u is stored in another fragment then 2. Working-Link-Node(u,q) 3. else 4. for each state p 2 Matching(q; label(u)) do 5. if p is ¯nal state then 6. F ragResult= F ragResult [ fug 7. for each edge (u; v) do 8. Scan(v,p) end

In the generalized fragment processing algorithm, the Matching(q,label) procedure returns the set of states p where there exists a transition from state q to state p labelled by label. We traverse the data graph with the Scan procedure. If a pair (u; q) is visited by calling Scan(u,q), these pairs (v; p) are also visited for each edge (u; v) and p 2 Matching(q,label(u)). Each pair (u; q) is visited maximum one time as T is a tree. Therefore, the Scan procedure is never in in¯nity loop. The Working-Link-Node procedure is called when we meet a node stored in another fragment. The main di®erences between the query processing algorithms over distributed environment are presented by the implementation of Working-Link-Node, Pre-Process and Post-Process virtual procedures for working with nodes stored in another fragment, doing before or after processing fragment respectively.

Stream Processing vs. Partial Processing Approaches

We study two most popular approaches for processing regular queries in distributed environment: stream processing and partial processing. We study tree traversal algorithm represented for stream processing algorithm approach in subsection 3.1 and partial parallel processing algorithm represented for partial processing approach in subsection 3.2. 3.1

Tree Traversal Algorithm

The main behavior of stream processing approach when we meet a node stored in another fragment is to jump to the new fragment for continuing processing. The parent process waits until the process operation over the child-fragment ¯nishes, then it receives the result of the operation from the child-fragment and continue processing. Therefore, the Post-Process and Working-Link-Node procedures are implemented as follows: Post-Process(F ,q) begin 1. if F 6= Froot then 2. Send-Result(Result) end

Working-Link-Node(u,q) begin 1. Let F be the remote fragment containing u 2. Fragment-Process(F ,q) (Calling from remote) 3. subResult ÃReceive-Result(F ) 4. F ragResult Ã subResult [ F ragResult end

The Send-Result procedure is used for sending the result to the parentfragment. The Receive-Result(F ) procedure is used for receiving the result from the operation over child fragment F . A Receive-Result procedure calling synchronizes with some Send-Result procedure calling. Finally, the result of the query on tree T is the result of the fragment processing operation (Froot; qroot). It is determined by calling the Spider procedure.

Spider(Q) begin 1. return Fragment-Process(Froot,qroot) end

The example of the tree traversal algorithm is shown in section 1. The tree traversal algorithm can be considered as a sequence of calling the FragmentProcess procedure. There is no parallelism. Moreover each site and fragment can be visited many times and the number of the communication between sites is not constant, it depends on the query graph. Therefore, the waiting time for the answer is very high, and it does not take the advantages of the share-nothing parallel database systems.

Partial Parallel Processing Algorithm

The main behavior of the partial processing approach is to process each operation independently. There is no communication with other operation while processing. When we meet some node stored in another fragment, we just write down the relationship between the current operation and the corresponding operation over the child fragment. Therefore, the Pre-Process and Working-Link-Node procedures are implemented as follows: Pre-Process(F ,q) begin 1. bF Ã F 2. bQ Ã q end

Working-Link-Node(u,q) begin 1. Let F be the remote fragment containing u 2. M ap Ã M ap [ ((bF; bQ); (F; q)) end

The fact "if operation (bF; bQ) is processed then (F; q) will be processed" is coded by ((bF; bQ); (F; q)). (bF; bQ) is current operation. bF and bQ variables are initialized in the Pre-Process procedure. M ap is the global variable storing all relationships of operations we ¯nd out when processing over the site. The query processes over sites are independent and in parallel. At each site, all possible operations are processed ¯rstly. After that the map of operations is sent to the chosen master site. Then each site waits until the master site sends the set of operations, which are reachable when we process the operation (Froot; qroot). Finally, all sites just send the reachable results to the master site and the union of these result is the result of the answer. Master(Q) procedure is run in the master site to control the sites processing query graph Q. The partial result of query graph Q is computed in site S by using the Site-Process(S,Q) procedure. The reachable operations are determined by the Compute-Reachable-Results procedure on the master site. In the Master procedure, M ap[S] is the map of the operations we receive from site S and RO[S] is the set of reachable operations of site S.

Master(Q) begin 1. for each site S do in parallel 2. Site-Process(S,Q) (Calling from remote) 3. M ap[S]Ã Receive-Map(S) 4. joint: waiting until receiving all map result

from every site 5. Compute-Reachable-Operations() 6. for each site S do in parallel 7. Send-Reachable-Operations(S,RO[S]) 8. subResult[S] ÃReceive-Result(S) 9. joint: waiting until receiving all result

from every site 10. Result Ã ? 11. for each site S do 12. Result Ã Result [ subResult[S] end Compute-Reachable-Operations() begin 1. for each site S do 2. RO[S] Ã ; 3. Stack Ã ; 4. Stack:push((Froot; qroot)) 5. while Stack 6= ; do 6. (F; q) Ã Stack:pop() 7. Let S be the site containing F 8. RO[S] Ã RO[S] [ (F; q) 9. for each ((F; q); (F 0; q0)) 2 M ap[S] do 10. Stack:push((F 0; q0)) end Site-Process(S,Q) begin 1. M ap Ã ; 2. for each fragment F of S do 3. for each state q of Q do 4. result[F; q]ÃFragment-Process(F ,q) 5. Send-Map(M ap) 6. ROÃReceive-Reachable-Operations() 7. Result Ã ; 8. for each (F; q) 2 RO do 9. Result Ã Result [ result[F; q] 10.Send-Result(Result) end

The sequence of the activities ordered by time when processing a query graph Q is: 1. Activating the query process over each site (line 1-2 of the Master procedure). 2. Computing the partial result at each site (line 1-4 of the Site-Process procedure). 3. Synchronization: Sending the map of the result from some site to the master site (line 5 in the Site-Process procedure, line 3 in the Master procedure). 4. The master waits until receiving all map results (line 4 in the Master procedure). 5. Determining reachable operations (line 5 in the Master procedure). 6. Synchronization: Sending the reachable results to each site (line 6 in the Site

Process procedure, line 7 in the Master procedure). 7. Each site determines the site result (line 7-9 in the Site-Process procedure). 8. Synchronization: Sending the result from some site to the master site (line 10 in the Site-Process procedure, line 8 in the Master procedure). 9. The master waits until receiving all site results (line 9 in the Master procedure). 10. Determining the ¯nal result (line 10-12 in the Master procedure).

The waiting time for the answer is reduced by the parallelism. Each site and fragment is visited only one time. Moreover the number of the communication between sites is constant, and the size of communicated data is only depend on the size of the query and the result [ 4 ]. We can reduce the waiting time by expanding the system with more machines (sites). The total processing cost is higher than the necessity as we process for each fragment with all possible states. 4

Processing Queries with Tree- and Structural Indexes

We begin this section by de¯ning two types of redundant fragment processing operations in partial processing approach.

De¯nition 1. (F; q) is redundant type 1 if it is not reachable when we process the operation (Froot; qroot). (F; q) is redundant type 2 if there is no data node in the fragment F matching the query when we process the operation. In the example in Fig. 1, (F2; q2) is a redundant operation type 1 but not a redundant operation type 2 as if from all operations over F2, (F2; q0) is the only F1 a b

F0 a

ac F2 b

a F5 b (a) ε

F3 a

b F4 a a * a e,f a e,f * a,b 0-depth 1-depth 2-depth

(b) F4 Fragment operation which is reachable from (F0; q0). Furthermore, if (F2; q2) is processed, node 18 matches the query. In another case (F4,q0) is a redundant operation type 2 but not a redundant operation type 1. Obviously, we do not need to carry out an operation if it is redundant type 1 or 2. 4.1

Tree Index

De¯nition 2. Tree TI = (VI ; EI ) is the tree index of the XML tree T fragmented by F, in which: (i) VI =F and each index node F 2 VI is labelled by the root of fragment F (ii) (F; F 0) is an index edge if F is parent fragment of F 0. (F; F 0) is labelled by the label path of path p, which is the path connecting from the root of F to the root of F 0 but we reject two roots.

The tree index stores the information about the paths connecting the roots of the fragments. The index tree of the fragmented tree T in Fig 1 is depicted in Fig 2(a). In this tree, (F0; F2) is labelled by a:c which is the label path of the path < 3; 7 >; (F0; F3) is labelled by " as the root of F0 is the parent of the root of F3. The size of the index tree can be considered constant as the number of fragments is considered constant.

The (F; q) operations which are not redundant type 1 are determined by the Process-Index-I procedure as follows:

The structure of Process-Index-I procedure is similar to the ComputeReachable-Results procedure's in the partial parallel processing algorithm. The di®erence between two procedures is the way expanding new reachable operations. We use the Matching-II procedure (at line 8) in the ¯rst procedure and M ap[S] (at line 9) and in the second. Both of them give the same information that "Which operations over the child-fragment are reachable if some operation over parent-fragment is reachable.". The correctness of the Matching-II can be proven easily as it is considered as a sequence of applying the Matching procedure. Therefore, the Process-Index-I procedure will return the set of operations which are reachable when we process (Froot; qroot). These operations are not redundant type 1. The processing cost over tree index is considered as constant as the size of the tree index is considered as constant. 4.2

Structural Indexes

We restrain redundant operations type 2 by processing the query graph over structural indexes of the fragments. In general, a structural index IG of the graph G is built by the following general procedure: (1) partitioning the data nodes into classes according to some equivalence relation, (2) making an index node for each equivalence class and the index node is labelled by the union of the labels of the data node elements, and (3) adding an index edge from index node I to index node J if there exists an edge from data node u to data node v, where u is an element of I and v is an element of J . In the case G is a rooted graph, the index node root(IG ) containing the root of G is de¯ned as the root of IG .

The traversal rules for processing operation (IG ,q) over the rooted index graph IG beginning with state q as follows: (1)We begin at (root(IG ),q) .(2) From (I; p) we visit (J; p0) for each index edge (I; J ) and there exists a transition from state p to state p0 with the label l 2 label(I). (3) I matches the query if some (I; p) is visited and there exists a transition from state p to ¯nal state p0 with label l 2 label(I).

Proposition 1. Let I be a structural index of fragment F . (F; q) is a redundant operation type 2, if there is no index node matching the query when we process (I,q) operation.

Proof. We use indirect method. We assume (F; q) is not a redundant operation type 2. It implies that there exists a path u1u2 : : : un and a sequence of states q0q1 : : : qn such that (i) u1 is the root of F and q0 = q (ii) ui are data nodes of F (iii) qn is ¯nal state (iv) there is a transition from state qi¡1 to state qi with label label(ui) (i = 1; : : : ; n). Let U1; : : : ; Un be the index nodes of I containing u1; : : : ; un respectively. The construction of the index implies that (v) U1 is the root of I (vi) (Ui; Ui+1) are index edges (i = 1; : : : ; n ¡ 1). label(ui¡1) 2 label(Ui¡1) and there exists a transition from state qi¡1 to state qi with label label(ui) so we can traverse from (Ui¡1; qi¡1) to (Ui; qi). It implies that Un matching the query we traverse the query graph beginning at the root of I, U1, and state q0 = q (con°ict). The proposition gives us a necessary condition to check if (F; q) is a redundant type 2. The cost for checking the condition depends on the size of the structural index, which is not bigger than the fragment. The condition is su±cient if the structural index is the 1-index [ 14 ]. However, the size of the 1-index is big in most cases. The size of the structural index should be constant. Based on the DL-indexes [ 7 ] we introduce and study the DL-tree-indexes using for checking redundant operations type 2.

The processing cost over a structural index depends on the number of the labels of the data nodes in each index nodes. We de¯ne special label 0¤0 (¤ 2= §) for matching any label of §. If the label of an index node I is 0¤0, we can jump from (I; q) to any (J; p) where (I; J ) is an index edge and there exists a transition from state q to state p.

De¯nition 3. Let F be the fragment, h be the depth of F and k be an integer 1 · k · h. The k-depth DL-tree-index of F , Ik, is a structural index in which the partition of nodes is as follows:

(i) For each 0 · l · k we group the data nodes in the same depth l in the same index node Il.

(ii) In the case k < h we group the data nodes whose depth is more than k in the same index node and the index node Ik+1 labelled by 0¤0.

The DL-tree-indexes of fragment F4 of tree T in Fig. 1(b) is shown in Fig. 2(b). The root of Ik is I0. The Ik has k index nodes in the case k = h otherwise k + 1 index nodes. We can choose k value as bigger as the size of Ik can be considered as constant. The process over a structural index Ik of the fragment F to determine if (F; q) is a redundant type 2 is as follows: Proposition 2. If the Process-Index-II(Ik,q) returns false value, (F; q) is a redundant operation type 2. Proof. Let S0; : : : ; Sk be the values of S at the beginning of the loop at line 2 for each i = 0; : : : ; k respectively. We will show that "Si (i = 0; : : : ; k) is the set of states pi, in which (Ii; pi) is visited if (Ik; q) is processed" (1). Obviously (1) is true for i = 0 as S0 = f g

0 . The Si+1 computation based on Si at line 4-6 guarantees for the correctness of the proof for (1) by induction.

The return value is false so the condition we check at line 7 and line 11 must be false. Exists-Final-State(Si) returning true value (at line 7) implies that Si (i = 1; : : : ; k) not containing any ¯nal state. Therefore, (1) implies that Ii (i = 0; : : : ; k) not matching the query. Matching-III (Ik+1; S) returning false value (at line 11) implies that Ik+1 does not match the query either. No index node matches the query when we process (IG; q), so (F; q) is a redundant operation type 2.

The DL-indexes [ 7 ] are quite dynamic and °exible. We can use a DL-tree-index as the initial index graph and then re¯ning the index so that the size is ¯t with the system or the index supports a set of frequently queries [ 7 ]. The processing index over the re¯ning index should be modi¯ed so that avoiding the circles in the index graph. However, the processing cost is constant if the size of the index is considered as constant. 4.3

Processing Queries with Tree- and Structural Indexes

We store the tree index TI of the system and for each fragment F we also store Index(F ) which is a k-depth DL-tree-index of F on the chosen master site. The Eff-Master(Q) procedure is run in the master site to control the sites processing query graph Q. The partial result of query graph Q is computed in site S by using the Eff-Site-Process(S,Q) procedure.

Eff-Master(Q) Eff-Site-Process(S,Q) begin begin 1. States Ã Process-Tree-I(TI ,Q) 1. ROÃReceive-Required-Operations() 2. for each site S do 2. Result Ã ; 3. RO[S] Ã ; 3. for each (F; q) 2 RO do 4. for each (F; q) 2 States do 4. Result Ã Result [Fragment-Process(F; q) 5. if Process-Index-II(Index(F ),q) then 5. Send-Site-Result(SiteResult) 6. Let S be the site containing F end 7. RO[S] Ã RO[S] [ (F; q) 8. for each site S do in parallel 9. Eff-Site-Process(S,Q) (Calling from remote) 10. Send-Required-Operations(S,RO[S]) 11. subResult[S] ÃReceive-Site-Result(S) 12.joint: waiting until receiving all result from every site 13.Result Ã ; 14.for each site S do 15. Result Ã Result [ subResult[S] end

The sequence of the activities of the query process ordered by time is: 1. Rejecting the redundant operations at master site : (line 1-7 of the Master procedure). 2. Activating the query process at each site (line 9 in the Eff-Master procedure) 3. Synchronization: Sending the required operations to each site (line 10 in the Eff

Master procedure, line 1 in the Eff-Site-Process procedure). 4. Computing the required operations at each site. (Line 3-4 in the Eff-Site-Process procedure ) 5. Synchronization: Sending the site result from some site to the master site (line 5 in the Eff-Site-Process procedure, line 11 in the Eff-Master procedure). 6. The master waits until receiving all site results (line 12 in the Eff-Master procedure). 7. Determining the result (line 13-15 in the Eff-Master procedure).

Obviously, the number and the cost of communication in our algorithm is smaller than two algorithm in section 3. The total cost of query processing in each site is also smaller as the number of processing operations is fewer. If the number of fragments is considered constant we can ignore the cost of processing over tree- and structural indexes. In this case our algorithm overcomes two above algorithms according both the waiting time and the total query processing and communication cost criteria. 5

Experiments

We compared the performance of our e±cient algorithm using tree and structural indexes (EPP for short), the partial processing algorithm (PP for short) and the tree traversal algorithm (TP for short) in the waiting time and the total processing and communication criteria. We implemented the algorithms in C++ programming language.

Data set and fragmentation. We used the XMark data set containing the activities of an auction Web site is generated by using the Benchmark Data Generator [ 18 ]. The size of our data set is about 500 Mb. The number of nodes is 10 267 360 and the number of labels is 77. We uses 19 Linux machines over a local LAN. We use a program splitting the XML tree into 76 fragments. Each fragment has about 150 000 nodes. The fragments are chosen randomly to be stored in the sites.

Queries. We proposed 10 regular path queries representing for di®erent conditions of the environment for our experiments. They are: 1. Q1: //keyword. 2. Q2: //listitem/text/keyword/bold. 3. Q3: //asia/item//keyword/bold. 4. Q4: //listitem//keyword 5. Q5: //people/*/pro¯le/income 6. Q6: //bold j //emph 7. Q7: //asia//bold j //asia//emph 8. Q8: //person//name 9. Q9://item/mailbox/mail 10. Q10://namerica//description//keyword

We processed these queries with our algorithms many times and measured the average values of the waiting time and the cost of processing and communication for each query and each algorithm. Here are the results:

Queries

Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Waiting time. The ratio of the waiting time of three algorithms, EPP : PP : TP, is 1 : 1.94 : 37.52. Our experiments shows that the partial processing approach absolutely overcomes the stream processing approach according to the waiting time criterion. There are two reasons for this explanation: (1) TP does not use the parallelism like PP and EPP; (2) there are many communication between sites. In cases Q3, Q7 and Q10, although the processing and communication cost of TP is lower than PP's but the waiting time is still higher. EPP overcomes PP according the waiting time criterion as the ratio is about 50% and being better in all test cases. Especially in cases Q7 and Q10, there are a lot of redundant operations for PP but they are restrained by preprocessing over indexes in EPP.

Processing and Communication Cost. The ratio of the processing and communication cost of three algorithms, EPP : PP : TP, is 1 : 1.77 : 2.75. The communication cost makes the cost of TP is higher than PP in general. However the redundant operations types 1 make the total cost of PP is higher than TP in several cases ( Q3, Q7 and Q10). By restraining redundant operations, the cost of EPP is always the lowest in all test cases among three algorithms.

Our experiments shows that our EPP algorithm is the best among the introduced algorithms according the waiting time and the processing and communication cost criteria. 6

Related Works

The fundamental concepts of parallel database systems have been found in [ 1 ]. Architecture, data placement and query optimization are the focuses of interest. We use the shared-nothing architecture for our system with no data replacement. Some ideas and suggestions about the fragmentation (data placement) are introduced in [ 8 ].

The problem of optimizing, processing regular path expressions in the context of navigating semi-structured data in web sites has been introduced and studied in [ 2, 3 ]. The focus of [ 2 ] is on the use of local information expressed in the form of path constraints in the optimization of path expression queries. In [ 3 ] two query optimization techniques is proposed to rewrite a given regular path expression into another query that reduces the scope of navigation. [ 4, 5 ] are closely related to our work as they studied the distributed query evaluation on semi-structured data using partial evaluation. The boolean XPath queries [ 5 ] and the structural recursion queries [ 4 ] are introduced and studied. The main di®erence between these works and our work is the context of the system. All of them [2{5] assume that the data set is distributed in di®erent machines in "weak" relationship and the communication cost may be very expensive. In their context the fragmentation is not manageable and the indexes describing the relationship between sites are not suggested. The spirit of the processing algorithms described in [ 4, 5 ] is similar to our partial parallel processing algorithm.

The structural indexes [6, 7, 14{17] are introduced to speed up the query evaluation over semi-structured data by simulating the data graph by an index graph. An index graph is equivalent with the data graph over a given query if the processing the query over the index graph and the query graph bring the same result. The 1-index [ 14, 6 ] is equivalent with the data graph over all regular queries. The A(k)-index [ 15 ] is equivalent with the data graph over all primary queries not longer than k. The D(k)-index [ 16 ], the M(k)-index [ 17 ] and DLindexes [ 7 ] are equivalent with the data graph over a given set of primary queries. In our context, the size of the structural indexes simulating the fragments are considered constant so the chosen indexes must be dynamics and adaptive. Our DL-tree-index are improved from the DL-indexes, which are the most adaptive and dynamics indexes. 7

Conclusion

We have shown an e±cient algorithm processing regular queries in sharednothing parallel database systems based on partial evaluation. We de¯ne two types of redundant operations. The redundant operation type 1, which are not reachable, are determined by the tree structural. The redundant operations type 2, which have no matching node, are restrained by processing over structural indexes. The DL-tree-indexes are introduced and proposed for this purpose. The size of the tree index and the DL-tree indexes are considered as constants. The partial evaluation becomes more e®ective by restraining redundant operations. Our experimental study has veri¯ed the e®ectiveness of our techniques.

We plan to extend the current work in a number of directions. First, the e®ectiveness of our algorithm depends on the fragmentation. An e®ective fragmentation algorithm concerns with the statics system and the set of frequency queries. Some ideas are introduced in our work [ 8 ]. Second, we plan to extend our algorithms to handle more general queries in XPath and XQuery. The partial evaluation, indexes for these queries are open problems.

1. Ameet

S. Talwadker

2003 . Survey of performance issues in parallel database systems . In Journal of Computing Sciences in Colleges archive Volume 18, Issue 6 .

Abiteboul , V. Vianu 1997 . Regular path queries with constraints . In Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems.

Fernandez and D. Suciu 1998 . Optimizing regular path expressions using graph schemas . In Proceedings of the 14th Inter. Conference on Data Engineering.

4. D. Suciu 2002 . Distributed query evaluation on semistructured data . ACM Transactions on Database Systems Volume 27 , Issue 1.

Buneman ,G. Cong,

Fan , A. Kementsietsidis 2006 . Using partial evaluation in distributed query evaluation . In Proceedings of the 32nd international conference on Very large data bases Volume 32 , VLDB 2006 .

Kiss , V. L. Anh 2005 . Combining Tree Structure Indexes With Structural Indexes . In Proceedings of 9th East-European Conference on Advances in Databases and Information Systems, LNCS 3631.

Kiss , V. L. Anh 2006 . E±cient Processing SAPE Queries Using The Dynamic Labelling Structural Indexes . In Proceedings of 10th East-European Conference on Advances in Databases and Information Systems, LNCS 4512.

V. L.

Anh ,

Kiss ,

Z. Vinceller 2007. Parallel

Processing Search Engine For Very Large XML Data Sets . In ICAI 2007 .

Buneman ,

Fernandez , and D. Suciu 2000 . UNQL: A query language and algebra for semi-structured data based on structural recursion . In VLDB J.9, Issue 1 .

10.

Deutsch ,

Fernandez ,

Florescu ,

Levy , and D. Suciu 1999 . A query language for XML . In Proceedings of the Eights International World Wide Web Conference (WWW8) , Toronto.

11.

Berglund ,

Boag ,

Chamberlin ,

M. F.

Fernandez ,

Kay ,

Robie , and J. Simeon. XML path language (xpath) 2.0 . In http://www.w3.org/TR/xpath20, August 2002 .

12.

Boag ,

Chamberlin ,

M. F.

Fernandez ,

Florescu ,

Robie , and J. Simeon. XQuery 1 .0: An

XML

Query

Language . In http://www.w3.org/TR/xquery/, November 2006 .

13. Extensible Markup Language (XML) . In http://www.w3.org/XML/.

14.

Milo and D.Suciu 1999 . Index Structures for Path Expressions . In ICDT , 1999 .

15.

Kaushik ,

Shenoy , P. Bohannon and Ehud Gudes 2002 . Exploiting Local Similarity for E±cient Indexing of Paths in Graph Structured Data . In ICDE , 2002 .

16.

Chen ,

Lim ,

K. W. Ong

2003. D(K)-Index : An Adaptive structural Summary for Graph-Structured Data . In ACM SIGMOD 2003 , June 9-12.

17. Hao

, Jun Yang 2004 . Multiresolution Indexing of XML for Frequent Queries . In Proceedings of the 20th International Conference on Data Engineering.

18. XMark: The XML benchmark project . http://monetdb.cwi.nl/xml/index.html.