Algorithms

An Algorithm for Transforming XPath Expressions According to Schema Evolution

Kazuma Hasegawa

s1221595@u.tsukuba.ac.jp 1

Kosetsu Ikeda

lumely@slis.tsukuba.ac.jp 1

Nobutaka Suzuki

nsuzuki@slis.tsukuba.ac.jp 0 0 Faculty of Library , Information, and Media Science , University of Tsukuba 1 Graduate School of Library, Information and Media Studies, University of Tsukuba

2013

1008 3 10

XML is a de-fact standard format on the Web. In general, schemas of XML documents are continuously updated according to changes in real world. If a schema is updated, then query expressions have to be transformed so that they are \valid" under the updated schema, since the expressions are no longer valid under the updated schema due to the schema update. However, this is not an easy task since many of recent schemas are large and complex and thus it is becoming difficult to know how to update the query expressions correctly. In this paper, we propose an algorithm for transforming XPath expressions according to schema evolution. For an XPath expression p and a schema S, our algorithm treats both p and S as tree automata T Ap and T AS, respectively. Our algorithm rst takes the product automaton T AR of T Ap and T AS, then analyze T AR to nd the correspondence between the states of T Ap and T AS. Based on this correspondence, the algorithm transforms T Ap according to an update operation applied to T AS. We also show some preliminary experimental results.

Algorithms

XML, XPath, schema evolution, tree automaton

1. INTRODUCTION

XML[ 5 ] is a de-fact standard format on the Web. An XML document is usually stored with its schema so that the structural consistency of the document is ensured. In general, schemas are continuously updated according to changes in This work is licensed under the Creative Commons AttributionShareAlike 3.0 Unported License (CC BY-SA 3.0). To view a copy of the license, visit http://creativecommons.org/licenses/by-sa/3.0/. real world, which is called schema evolution. If a schema is updated, then query expressions have to be transformed so that they are \valid" under the updated schema, since the expressions are no longer valid against the updated schema due to the schema update. However, this is not an easy task since many of recent schemas are large and complex, and thus it is becoming very difficult to know how to update the query expressions correctly.

In this paper, we propose an algorithm for transforming XPath expressions according to schema evolution. Here, XPath[ 4 ] is the most popular query language for XML. For a given schema S, an edit operation op to S, and an XPath expression p under S, our algorithm transforms p into an XPath expression p′ \equivalent" to p whenever possible, that is, the result of p′ under op(S) coincides with that of p under S, where op(S) is the updated schema obtained by applying op to S.

To illustrate our algorithm, let us show a simple example. Let D be the DTD in Fig. 1(a), and suppose that element students is inserted as the child of school (Fig. 1(b)). According to this schema update, our algorithm transforms the following XPath expression

An XML document is modeled as a labeled ordered tree, and a schema is modeled as a tree automaton. Formally, a tree automaton is a quadruple T A = (N; ; s; P ), where /school/student[supervisor]/name into the following: /school/students/student[supervisor]/name. Then, suppose that element supervisor is deleted from the DTD in Fig. 1(b). In this case, it is impossible to transform the above XPath expression into an equivalent one. Thus, as an alternative answer our algorithm deletes supervisor from the above expression and returns the following: /school/students/student/name.

Although these DTDs are very small, schemas used in practice are much larger and complex [ 6 ]. Therefore, a user tends not to understand the entire structure of a schema exactly, and thus our algorithm is helpful to transform XPath expressions appropriately according to schema evolutions.

In this paper, schema is modeled as (unranked) tree automaton, which is equivalent to regular tree grammar. Tree automaton is the formal model of RELAX NG, and XML Schema and DTD can also be modeled by tree automaton [14]. Moreover, tree automaton is equivalent to specialized DTD [17]. Thus, our algorithm is applicable to most of formal and practical XML schema languages. As for XPath, we focus on an XPath fragment using child and descendantor-self axes with predicates. Although our XPath fragment supports no upward axes, this gives usually little problem since the majority of XPath queries uses only downward axes[10]. Thus, we believe that our algorithm is useful to correct a large number of XPath queries.

Related Work

The study most related to this paper is [13]. This study proposes an algorithm for transforming XPath expressions according to schema evolution, assuming that a schema monotonically increases (no element can be deleted from a schema). On the other hand, this paper has no such an assumption and allows more general schema updates. Since actual schema evolutions usually involve element deletions or some similar updates, our algorithm is more practical in real world situations. To the best of our knowledge, there is no study on transforming XPath expressions according to schema evolution, except [13]. However, several studies deal with update operations to schemas. For example, Ref. [18] proposes a \complete" set of update operations to DTDs. Refs. [12, 9] propose update operations schemas and algorithms for extracting \diff" between two schemas. Refs. [7, 8, 19] propose update operations that assures any updated schema contains its original schema. Recently, Ref. [16] introduces a taxonomy of possible problems induced by a schema change, and gives an algorithm to detect such problems. Ref. [11] studies query-update independence analysis, and shows that the performance of [ 3 ] can be drastically enhanced in the use of -calculus.

DEFINITIONS N is a set of states (element types),

is a nite set of element names, s 2 N is the start state, P is a set of transition rules of the form X ! a(reg) or X ! Y , where X; Y 2 N and reg is a regular expression over N .

For a transition rule X ! a(reg), we say that X is the lefthand side, a(reg) is the right-hand side, a is the label, and reg is the content model of the rule. For example, consider the tree automaton T Aa shown in Fig. 2. This is equivalent to the DTD in Fig. 1(a). \School" in N is the type of element \school" , and so on. We assume that element \pcdata" represents an arbitrary string. By L(T A) we mean the language of tree automaton T A, i.e., the set of trees \valid" against T A.

Following [15], we de ne the product of tree automata. Let TA1 = (N1; 1; s1; P1) and TA2 = (N2; 2; s2; P2) be tree automata. Without loss of generality, we assume that 1 = 2 = . First, we de ne the product reg1 reg2, where reg1 is a regular expression over N1 and reg2 is a regular expression over N2. Then reg1 reg2 is a regular expression over N1 N2 such that n11n12 ni1 matches reg1 and n21n22 ni2 matches reg2 if and only if [n11; n21][n12; n22] [ni1; ni2] matches reg1 reg2, where n11n12 ni1 is a sequence of states in N1 and n21n22 ni2 is a sequence of states in N2. reg1 reg2 is constructed as follows.

1. reg′1 is obtained from reg1 by replacing each state n by [n1; n21]j[n1; n22]j j[n1, n2k], where n21; n22; ; n2k is an enumeration of N2. Similarly, reg′2 is obtained from reg2. 2. Construct two automata A1 and A2 from reg′1 and reg′2, respectively. 3. Construct the product automaton of A1 and A2. 4. Construct a regular expression, reg1 product automaton. reg2, from the

The product automaton of TA1 and TA2 is TA3 = (N1 N2; ; s1 s2; P3), where P3 = f[n1; n2] ! a(reg1 n2 ! a(reg2) 2 P2) [f[n1; n2] ! [n1; n′2] j (n1 2 N1; n2 ! n′2 2 P2)g [f[n1; n2] ! [n′1; n2] j (n1 ! n′1 2 P1; n2 2 N2)gg: By de nition, it is immediate that for any tree t, t 2 L(T A3) if and only if t 2 L(T A1) and t 2 L(T A2).

Let be a set of element names. We de ne XPath expression p as follows.

reg2) j (n1 ! a(reg1) 2 P1; p ::= =p′ p′ ::= q ::= p′

:: l j p′=p′ j p′[q], where l 2 ::=child j descendant-or-self

In this section, we de ne tree automaton and the product of tree automata.

We call each 2 fchild; descendant-or-selfg axis, and q predicate, respectively. P =

UPDATE OPERATION TO TREE AU TOMATON

In this section, we de ne update operations to tree automaton.

Let reg be a regular expression. To de ne update operations to a tree automaton, we need to identify the positions of states and operators in reg. Thus we de ne the set pos(reg) of positions in a reg, as follows.

If reg = ϵ or reg = a (a 2

), then pos(reg) = f g.

Otherwise, pos(reg) = f g [ fu j u = vw; 1 v n; w 2 pos(regv )g, where regv is a subexpression consisting of the descendants of v in reg.

For example, let reg = (AjB) . Then pos(reg) = f ; 1; 11; 12g. As shown in Fig. 3, is the position of , 1 is the position of j, 11 is the position of A, and 12 is the position of B.

In the following, without loss of generality we assume that for any state A and any element name a, there is at most one transition rule whose left-hand side is A and label is a. If there are transition rules A ! a(reg1) and A ! a(reg2), then these can be merged into one transition rule A ! a(reg1jreg2).

We now de ne update operations to a tree automaton T A = (N; ; s; P ) as follows.

ins state(A; a; B; i): Inserts new state B at position i in the content model of r, where r is the transition rule in P such that the left-hand side of r is A and that the label of r is a. ins opr(A; a; opr; i): Inserts an operator opr ( , +, ?) at position i in the content model of r, where r is the transition rule in P such that the left-hand side of r is A and that the label of r is a. del state(A; a; i): Deletes the state at position i in the content model of r, where r is the transition rule in P such that the left-hand side of r is A and that the label of r is a. del opr(A; a; i): Deletes the operator at position i in the content model of r, where r is the transition rule in P such that the left-hand side of r is A and that the label of r is a. nest state(A; a; B; b; i): Replaces the subexpression E at position i (i.e., subtree rooted at position i) in content model of r by state B, where r is the transition rule in P such that the left-hand side of r is A and that the label of r is a. Moreover, a transition rule B ! b(E) is added to P . unnest state(A; a; b; i): This is the inverse operation of nest state, and replaces the state B at position i in the content model of r by reg, where (1) r is the transition rule in P such that the left-hand side of r is A and that the label of r is a and (2) reg is the content model of the transition rule whose left-hand side is B and label is b. replace state(A; a; B; i): Replaces the state at position i in the content model of r by B. In this paper, this operation is treated as a pair of unnest state(A; a; i) and nest state(A; a; B; b; i). This operation is used in order to \rename" element names.

For example, consider the tree automaton T Aa in Fig. 2. Applying nest state(School; school; Students; students; ) to T Aa, we obtain the tree automaton T Ab in Fig. 4. Then applying del opr(Student; student; 4) and del state(Student; student; 4) to Tb, we obtain the tree automaton T Ac in Fig. 5. 4.

TRANSFORMATION FROM XPATH EXPRESSION INTO TREE AUTOMATON

Our algorithm treats an XPath expression as a tree automaton. Thus, in this section we de ne a transformation from an XPath expression into a tree automaton. This transformation is based on [15].

An XPath expression is transformed into a tree automaton as follows. First, we transform a location step :: l into a tree automaton. Each transformed tree automaton has input and output states, corresponding to the \context node" for :: l and the \result node" selected by :: l, respectively. For an XPath expression p of the form p1=p2, p′[q], or =p′, we transform p into a tree automaton in a bottom-up manner, according to the structure of p.

We rst de ne a tree automaton (N; ; s; P ) corresponding to location step :: l. In the following, A ! ( ) denotes fA ! ( ) j 2 g.

= child, then { The initial state s is B. P = The input state and the nite set of output states of a nite tree automata are fCg and fDg, respectively. C represents the context node and D represents the node selected by the location step :: l.

We next de ne the transformation from an XPath expression p into a tree automaton. We have to consider the case where p = p1=p2, p = p[q], and p = =p. In the following, for a tree automaton T Ap of p, Np denotes the set of states of T Ap, N Ip Np denotes the set of input states of T Ap, and N Op Np denotes the set of output states of T Ap. The case of p1=p2 Let T Ap1 = (Np1 ; ; sp1 ; Pp1 ) be the tree automaton of p1 and T Ap2 = (Np2 ; ; sp2 ; Pp2 ) be the tree automaton of p2. Moreover, let T A = (N; ; s; P ) be the product automaton of T Ap1 and T Ap2 . Then the automaton T Ap1=p2 of p1=p2 is de ned as T Ap1=p2 = (Np1=p2 ; ; s; P ), where = f[np1 ; np2 ] j np1 2 N Ip1 ; np2 2 (Np2 N Ip2 )g; f[np1 ; np2 ] j np1 2 (Np1

N Op1 ); np2 2 N Op2 g:

The case of p[q] Let T Ap = (Np; ; sp; Pp) be the tree automaton of p and T Aq = (Nq; ; sq; Pq) be the tree automaton of q. Moreover, let T A = (N; ; s; P ) be the product automaton T Ap and T Aq. Then the tree automaton T Ap[q] of p[q] is de ned as T Ap[q] = (Np[q]; ; s; P ), where

Np[q] = f[np; nq] j (np 2 N Op; nq 2 N Iq) _ (np 2 (Np

N Op); nq 2 (Nq

N Iq))g:

The set N Ip[q] of input states and the set N Op[q] of output states of T Ap[q] are de ned as follows.

N Ip[q] = N Op[q] = f[np; nq] j np 2 N Ip; nq 2 (Nq N Iq)g; f[np; nq] j (np 2 N Op; nq 2 N Iq)g: The case of =p Let :: l be the rst location step in p and let T Ap = (Np; ; sp; Pp) be the tree automaton of p. Then the tree automaton T A=p of =p is de ned as T A=p = (N=p; ; R; P=p), where

N=p P=p = =

Np [ fRg;

Pp [ fR ! root(D)g; where R is the initial state of T A=p and root is the element name corresponding to the root node.

The set N I=p of input states and the set N O=p of output states of T A=p are de ned as follows.

N I=p N O=p = fRg; =

N Op; where N Op is the set of output states of T Ap. 5.

ALGORITHM FOR TRANSFORMING XPATH EXPRESSION ACCORDING TO SCHEMA EVOLUTION

In this section, we show an algorithm for transforming a given XPath expression according to schema evolution.

To describe our algorithm, we need some de nitions. For an XPath expression p, the selection path of p is the XPath expression obtained by dropping every predicate from p. Let p be an XPath expression of the form =p1[q]=p2. Then =p1=q is called a predicate path of p (the predicate path for a predicate in q can be de ned similarly). For example, Let p = =a[f ]=b[c=e]=d. Then =a=b=d is the selection path of p, and =a=f and =a=b=c=e are the predicate paths of p.

Let us rst show the \main" algorithm (Fig. 6). Let op be the update operation applied to tree automaton T A. Moreover, let p be the input XPath expression, p0 be the selection path of p and p1; ; pk be the predicate paths of p. We transform pi to p′i according to op, for each i = 0; 1; ; k, and merge the resulting k expressions p′0; p′1; ; p′k as the result. More concretely, the algorithm rst partition p into p0; p1; ; pk (step 1). Then p0 is transformed into p′0 according to op by function Transform (shown later). If the result of p0 becomes empty due to del state, then p′0 = nil and the transformation of p is terminated (steps 3 and 4). Otherwise, we transform pi for each i = 1; ; k (steps 6 to 8), by Transform. Finally, p0; p1; ; pk are merged and returned as the result. If a predicate path p′i is nil, then p′i is just ignored when merged. For example, let p = =a[f ]=b[c=e]=d. Then p0 = =a=b=c, p1 = =a=f , and p2 = =a=b=c=e. Suppose that c is deleted by unnest state. Then we obtain p′0 = =a=b=d, p′1 = =a=f , and p′2 = =a=b=e due to Transform, and merging these three expressions we have p′ = =a[f ]=b[e]=d, which is the result of the algorithm.

Let us next consider function Transform. Let T A = (N; ; s; P ) be a tree automaton, op be an edit operation to T A, and p be an XPath expression having no predicate. Our objective is to transform p into an XPath expression p′ so that the result of p′ under op(T A) coincides with that of p under T A. However, this is sometimes impossible if a state (and its corresponding element) is deleted from T A. Thus, Transform is constructed as follows.

If op is ins state, ins opr, or del opr, then p is unchanged.

If op is nest state(A; a; B; b; i), then location step child :: b is inserted to p at the position that nesting is occurred, if necessary (i.e., unless p contains a descendant-or-self axis that \masks" the nest state operation).

If op is unnest state(A; a; b; i), then a location step whose node test is b is deleted from p, if it is not \masked" by a descendant-or-self axis.

If op is del state, then p is modi ed as follows.

Input : XPath expression p, tree automaton T A, update operation op to T A.

Output : XPath expression or nil 1. Partition p into the selection path p0 and the predicate path p1; p2; ; pk. 5. else 6. for i p′ i end 2. p′0

Transform(p0; T A; op);

3. if p′0 = nil then return nil; 1 to k do

Transform(pi; T A; op);

9. Merge p′0; p′1;

; p′k into p′. 10. return p′;

Now we present function Transform (Fig.7). state(A; a; i; P ) in steps 16, 23, and 29 denotes the state at position i in the content model of r, where r is the transition rule in P such that its left-hand side is A and its label is a. If op is ins state, ins opr, or del opr, p is unchanged (steps 4 to 7). If op is del state, we examine whether the result of p becomes empty under op(T A) (steps 8 to 11). If the result of p does not become empty, p is unchanged (steps 11 to 12). If the result of p becomes empty and p is the selection path, the algorithm returns nil (steps 13 to 14). Otherwise (i.e., p is a predicate path), we delete the suffix of p that becomes invalid due to op (steps 15 to 20). In step 17, P ′′ is obtained in step 2, and we say that r is a transition rule from A to C if the left-hand side of r contains A and a state in the content model of r contains C. In step 18, we say that lsj corresponds to r 2 P ′′ if r is the intersection of (i) the transition rule corresponding to lsj in Pp and (ii) some transition rule in P . If op is nest state(A; a; B; b; i), we examine whether we need to transform p (steps 23 to 24). If we need to do, we insert a new location step child :: b to p (steps 25 and 26). If op is unnest state(A; a; b; i), then we also examine whether we need to transform p (steps 30 to 31). Moreover, if the location step lsj+1 to be deleted has a predicate q and the axis of lsj+1 is descendant-or-self, we delete q (steps 32 to 33). Otherwise, we delete lsj+1 from p (steps 34 to 38).

Finally, let us present the time complexity of the algorithm. Let T A = (N; T; s; P ) be a tree automaton (schema) p be an XPath expression that is partitioned into p0; p1; p2; ; pk. Then the algorithm runs in O(k jpj2 jT Aj), where jpj is the number of location steps in p and jT Aj is the size of T A. 6.

EXPERIMENTAL RESULTS

To verify if our algorithm transforms XPath expressions appropriately under real world schemas, we implemented our algorithm (in Ruby) and made a few experimentations. We use two pairs of schemas, MSRMEDOC DTDs (version 2.1.1 and 2.2.2)[ 2 ] and the NLM Journal Publishing Tag Set Tag Library DTDs (version 2.3 and 3.0)[ 1 ].

First, we give the evaluation of our algorithm on MSRMEDOC DTDs. Let D211 be the version 2.1.1 MSRMEDOC DTD and D222 be the version 2.2.2 MSRMEDOC DTD. The number of elements of D211 is 185 and that of D222 is 205. Table 1 shows the number of update operations between D211 and D222, where \others" are attributes insertions (not supported by our algorithm).

We generate 90 XPath expressions under D211 by XQgen[20]. The average size (i.e., number of location steps) of the XPath expressions is 5, where the minimum size is 4 and the maximum size is 7. Each XPath expression uses child axes and at most one descendant-or-self axes (78 XPath expressions use a descendant-or-self axis). There are 5 XPath expressions whose result becomes empty under D222. Since there is no predicate in these 90 XPath expressions, we D211 ! D222 D23 ! D30 choose 4 expressions and add a predicate to each of 4 expressions (given later).

We transform the above 90 XPath expressions by our algorithm. For 85 expressions, the elements retrieved under D222 coincides with the elements retrieved under D211. For the rest 5 XPath expressions, our algorithm returns nil since the results of these expressions become empty under D222. These 5 XPath expressions are shown in Table 2 (deleted elements are italicized). Since elements LIST, NOTE, FIGURE, and FORMULA (child elements of REMARK) and an element PRIVATE-CODES (child element of COMPANYDOC-INFO) are deleted, the results of these XPath expressions become empty (note that if an element is deleted from a content model, then its descendants cannot be retrieved either).

The four XPath expressions with a predicate, denoted p1; p2; p3; p4, are transformed as follows (Table 3).

Since element REMARK

COMPANY-REVISION-INFO) predicate of p1 is deleted. (child is

element deleted,

of the Since element REMARK is deleted, the last location step in the predicate of p2 is deleted.

Since element REMARK is deleted, the location step corresponding to an element REMARK in p3 is deleted, therefore our algorithm return nil.

Since element L-1 is inserted between P and FT and between P and STD, two location steps L-1 are inserted to p4.

As shown above, we can say that every XPath expression is transformed appropriately by our algorithm, even if del state's are involved in the schema evolution. The total execution time of the algorithm for the 90 XPath expression and 191 update operations is 9.58 sec, thus it takes an average 0.111 sec per one XPath expression.

We also made a similar experimentation using the NLM Journal Publishing Tag Set Tag Library. Let D23 be the version 2.3 The NLM Journal Publishing Tag Set Tag Library DTD and D30 be the version 3.0 DTD. The number of elements of D23 is 211 and that of D30 is 233. Table 1 shows the number of update operations between D23 and D30.

We generate 97 XPath expressions under D23 by XQgen. The average size of the XPath expressions is 6. Each XPath expression uses child axes and at most once descendant-orself axes, where 95 XPath expressions include descendantor-self axes. There are 10 XPath expressions whose result becomes empty under D30. There is no XPath expression with a predicate, thus we choose 3 XPath expressions and added a predicate to each expression.

We transform the 97 XPath expressions by our algorithm. For 87 expressions, the elements retrieved under D30 coincides with the elements retrieved under D23. For the rest 10 XPath expressions, our algorithm returns nil since the results of these expressions become empty under D30, which are listed in Table 4. Since elements citation, contract-num, contract-sponsor in element p are deleted, the results of these XPath expressions become empty. The three XPath expressions with a predicate, denoted p5; p6; p7, are transformed as follows (Table 5).

Since element chem-struct (child element of ack) is deleted, the predicate of p5 is deleted.

Since element chem-struct-wrapper is renamed to chem-struct-wrap, the predicate of p6 is renamed accordingly.

Since element custom-meta-wrap is renamed to custom-meta-group, the corresponding location step in p7 is renamed to custom-meta-group.

Again our algorithm seems to work well despite of del state operations. These results suggest that our algorithm can be applied to XPath expressions under real world schema evolutions. The total execution time of the algorithm for the 97 XPath expression and 733 update operations is 77.561 sec, thus it takes an average 0.800 sec per one XPath expression.

CONCLUSION

In this paper, we proposed an algorithm for transforming an XPath expression according to schema evolution allowing element deletions. However, this is just an ongoing work and we have a lot to do. First, we would like to extend our algorithm so that it can handle more general XPath expressions, e.g., supporting sibling and parent axes. Second, we use only two schema evolutions in our experimentation. Thus we would like to evaluate our algorithm under more real world schemas. Third, our algorithm supports neither attribute nor entity declaration in schemas. These should be incorporated into our algorithm.

ACKNOWLEDGEMENT 8. 9.

This work is partly supported by Grants-in-aid for Scienti c Research (23500110). XPath expression p2: //DOC-REVISION[COMPANY-REVISION-INFOS/COMPANY-REVISION

INFO/REMARK]/MODIFICATIONS/MODEFICATION Transformed result: //DOC-REVISION[COMPANY-REVISION-INFOS/COMPANY-REVISION

INFO]/MODIFICATIONS/MODEFICATION XPath expression p3: //DOC-REVISION[COMPANY-REVISION-INFOS/COMPANY-DOC-INFO/PRIVATECODES/PRIVATE-CODE]/DOC-REVISIONS/DOC-REVISION/COMPANY-REVISION

INFOS/COMPANY-REVISION-INFO/REMARK Transformed result: nil

XPath expression p4:

Transformed result: //P[FT]/STD //P[L-1/FT]/L-1/STD XPath expression p6: //license/p/preformat/named-content/supplementary-material[chem-struct-wrapper]/dispquote Transformed result: //license/p/preformat/named-content/supplementary-material[chem-struct-wrap]/disp-quote

[1]

\Journal

Publishing Tag Set" NLM Journal Archiving and Interchange Tag Suite . http://dtd.nlm.nih.gov/publishing/.

[2] \MSR Download" MSR Home . http://www.msr-wg.de/medoc/downlo.html.

[3]

Benedikt and

Cheney . Destabilizers and independence of xml updates . Proc. VLDB Endow ., 3 ( 1 -2): 906 { 917 , 2010 .

[4]

Berglund ,

Boag ,

Chamberlin ,

M. F.

Fernandez ,

Kay ,

Robie , and J. Simeon, editors. XML Path Language (XPath) 2 . 0 (Second

Edition

). http://www.w3.org/TR/xpath20/.

[5]

Bray ,

Paoli ,

C. M.

Sperberg-McQueen ,

Maler , and F. Yergeau, editors. Extensible Markup Language (XML) 1 . 0 (Fifth

Edition

). http://www.w3.org/TR/xml/.

[6]

Choi . What are real DTDs like? In Proc. WebDB , pages 43 { 48 , 2002 .

XPath expression p7: /article/front/journal-meta/custom-meta-wrap[custom-meta/meta-name]/custom-meta/metavalue