<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Algorithm for Transforming XPath Expressions According to Schema Evolution</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kazuma Hasegawa</string-name>
          <email>s1221595@u.tsukuba.ac.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kosetsu Ikeda</string-name>
          <email>lumely@slis.tsukuba.ac.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nobutaka Suzuki</string-name>
          <email>nsuzuki@slis.tsukuba.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Library</institution>
          ,
          <addr-line>Information, and Media Science</addr-line>
          ,
          <institution>University of Tsukuba</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Graduate School of Library, Information and Media Studies, University of Tsukuba</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2013</year>
      </pub-date>
      <volume>1008</volume>
      <fpage>3</fpage>
      <lpage>10</lpage>
      <abstract>
        <p>XML is a de-fact standard format on the Web. In general, schemas of XML documents are continuously updated according to changes in real world. If a schema is updated, then query expressions have to be transformed so that they are \valid" under the updated schema, since the expressions are no longer valid under the updated schema due to the schema update. However, this is not an easy task since many of recent schemas are large and complex and thus it is becoming difficult to know how to update the query expressions correctly. In this paper, we propose an algorithm for transforming XPath expressions according to schema evolution. For an XPath expression p and a schema S, our algorithm treats both p and S as tree automata T Ap and T AS, respectively. Our algorithm rst takes the product automaton T AR of T Ap and T AS, then analyze T AR to nd the correspondence between the states of T Ap and T AS. Based on this correspondence, the algorithm transforms T Ap according to an update operation applied to T AS. We also show some preliminary experimental results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Algorithms</title>
      <p>XML, XPath, schema evolution, tree automaton</p>
      <sec id="sec-1-1">
        <title>1. INTRODUCTION</title>
        <p>
          XML[
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] is a de-fact standard format on the Web. An XML
document is usually stored with its schema so that the
structural consistency of the document is ensured. In general,
schemas are continuously updated according to changes in
This work is licensed under the Creative Commons
AttributionShareAlike 3.0 Unported License (CC BY-SA 3.0). To view a copy
of the license, visit http://creativecommons.org/licenses/by-sa/3.0/.
real world, which is called schema evolution. If a schema is
updated, then query expressions have to be transformed so
that they are \valid" under the updated schema, since the
expressions are no longer valid against the updated schema
due to the schema update. However, this is not an easy task
since many of recent schemas are large and complex, and
thus it is becoming very difficult to know how to update the
query expressions correctly.
        </p>
        <p>
          In this paper, we propose an algorithm for transforming
XPath expressions according to schema evolution. Here,
XPath[
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] is the most popular query language for XML. For
a given schema S, an edit operation op to S, and an XPath
expression p under S, our algorithm transforms p into an
XPath expression p′ \equivalent" to p whenever possible,
that is, the result of p′ under op(S) coincides with that of
p under S, where op(S) is the updated schema obtained by
applying op to S.
        </p>
        <p>To illustrate our algorithm, let us show a simple example.
Let D be the DTD in Fig. 1(a), and suppose that element
students is inserted as the child of school (Fig. 1(b)).
According to this schema update, our algorithm transforms the
following XPath expression</p>
        <p>An XML document is modeled as a labeled ordered tree,
and a schema is modeled as a tree automaton. Formally, a
tree automaton is a quadruple T A = (N; ; s; P ), where
/school/student[supervisor]/name
into the following:
/school/students/student[supervisor]/name.
Then, suppose that element supervisor is deleted from the
DTD in Fig. 1(b). In this case, it is impossible to transform
the above XPath expression into an equivalent one. Thus,
as an alternative answer our algorithm deletes supervisor
from the above expression and returns the following:
/school/students/student/name.</p>
        <p>
          Although these DTDs are very small, schemas used in
practice are much larger and complex [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Therefore, a user tends
not to understand the entire structure of a schema exactly,
and thus our algorithm is helpful to transform XPath
expressions appropriately according to schema evolutions.
        </p>
        <p>In this paper, schema is modeled as (unranked) tree
automaton, which is equivalent to regular tree grammar. Tree
automaton is the formal model of RELAX NG, and XML
Schema and DTD can also be modeled by tree
automaton [14]. Moreover, tree automaton is equivalent to
specialized DTD [17]. Thus, our algorithm is applicable to most of
formal and practical XML schema languages. As for XPath,
we focus on an XPath fragment using child and
descendantor-self axes with predicates. Although our XPath fragment
supports no upward axes, this gives usually little problem
since the majority of XPath queries uses only downward
axes[10]. Thus, we believe that our algorithm is useful to
correct a large number of XPath queries.</p>
      </sec>
      <sec id="sec-1-2">
        <title>Related Work</title>
        <p>
          The study most related to this paper is [13]. This study
proposes an algorithm for transforming XPath expressions
according to schema evolution, assuming that a schema
monotonically increases (no element can be deleted from a
schema). On the other hand, this paper has no such an
assumption and allows more general schema updates. Since
actual schema evolutions usually involve element deletions
or some similar updates, our algorithm is more practical in
real world situations. To the best of our knowledge, there
is no study on transforming XPath expressions according to
schema evolution, except [13]. However, several studies deal
with update operations to schemas. For example, Ref. [18]
proposes a \complete" set of update operations to DTDs.
Refs. [12, 9] propose update operations schemas and
algorithms for extracting \diff" between two schemas. Refs. [7,
8, 19] propose update operations that assures any updated
schema contains its original schema. Recently, Ref. [16]
introduces a taxonomy of possible problems induced by a
schema change, and gives an algorithm to detect such
problems. Ref. [11] studies query-update independence analysis,
and shows that the performance of [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] can be drastically
enhanced in the use of -calculus.
        </p>
      </sec>
      <sec id="sec-1-3">
        <title>DEFINITIONS</title>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>N is a set of states (element types),</title>
      <p>is a nite set of element names,
s 2 N is the start state,
P is a set of transition rules of the form X ! a(reg)
or X ! Y , where X; Y 2 N and reg is a regular
expression over N .</p>
      <p>For a transition rule X ! a(reg), we say that X is the
lefthand side, a(reg) is the right-hand side, a is the label, and
reg is the content model of the rule. For example, consider
the tree automaton T Aa shown in Fig. 2. This is
equivalent to the DTD in Fig. 1(a). \School" in N is the type of
element \school" , and so on. We assume that element
\pcdata" represents an arbitrary string. By L(T A) we mean the
language of tree automaton T A, i.e., the set of trees \valid"
against T A.</p>
      <p>Following [15], we de ne the product of tree automata. Let
TA1 = (N1; 1; s1; P1) and TA2 = (N2; 2; s2; P2) be tree
automata. Without loss of generality, we assume that 1 =
2 = . First, we de ne the product reg1 reg2, where reg1
is a regular expression over N1 and reg2 is a regular
expression over N2. Then reg1 reg2 is a regular expression over
N1 N2 such that n11n12 ni1 matches reg1 and n21n22 ni2
matches reg2 if and only if [n11; n21][n12; n22] [ni1; ni2] matches
reg1 reg2, where n11n12 ni1 is a sequence of states in N1
and n21n22 ni2 is a sequence of states in N2. reg1 reg2 is
constructed as follows.</p>
      <p>1. reg′1 is obtained from reg1 by replacing each state n
by [n1; n21]j[n1; n22]j j[n1, n2k], where n21; n22; ; n2k
is an enumeration of N2. Similarly, reg′2 is obtained
from reg2.
2. Construct two automata A1 and A2 from reg′1 and
reg′2, respectively.
3. Construct the product automaton of A1 and A2.
4. Construct a regular expression, reg1
product automaton.
reg2, from the</p>
      <p>The product automaton of TA1 and TA2 is TA3 = (N1
N2; ; s1 s2; P3), where
P3
=
f[n1; n2] ! a(reg1
n2 ! a(reg2) 2 P2)
[f[n1; n2] ! [n1; n′2] j (n1 2 N1; n2 ! n′2 2 P2)g
[f[n1; n2] ! [n′1; n2] j (n1 ! n′1 2 P1; n2 2 N2)gg:
By de nition, it is immediate that for any tree t, t 2 L(T A3)
if and only if t 2 L(T A1) and t 2 L(T A2).</p>
      <p>Let be a set of element names. We de ne XPath
expression p as follows.</p>
      <p>reg2) j (n1 ! a(reg1) 2 P1;
p ::= =p′
p′ ::=
q ::= p′</p>
      <p>:: l j p′=p′ j p′[q], where l 2
::=child j descendant-or-self</p>
      <p>In this section, we de ne tree automaton and the product
of tree automata.</p>
      <p>We call each 2 fchild; descendant-or-selfg axis, and q
predicate, respectively.
P
=</p>
      <sec id="sec-2-1">
        <title>UPDATE OPERATION TO TREE AU</title>
      </sec>
      <sec id="sec-2-2">
        <title>TOMATON</title>
        <p>In this section, we de ne update operations to tree
automaton.</p>
        <p>Let reg be a regular expression. To de ne update
operations to a tree automaton, we need to identify the
positions of states and operators in reg. Thus we de ne the set
pos(reg) of positions in a reg, as follows.</p>
        <p>If reg = ϵ or reg = a (a 2</p>
        <p>), then pos(reg) = f g.</p>
        <p>Otherwise, pos(reg) = f g [ fu j u = vw; 1 v
n; w 2 pos(regv )g, where regv is a subexpression
consisting of the descendants of v in reg.</p>
        <p>For example, let reg = (AjB) . Then pos(reg) =
f ; 1; 11; 12g. As shown in Fig. 3, is the position of ,
1 is the position of j, 11 is the position of A, and 12 is the
position of B.</p>
        <p>In the following, without loss of generality we assume
that for any state A and any element name a, there is at
most one transition rule whose left-hand side is A and
label is a. If there are transition rules A ! a(reg1) and
A ! a(reg2), then these can be merged into one transition
rule A ! a(reg1jreg2).</p>
        <p>We now de ne update operations to a tree automaton
T A = (N; ; s; P ) as follows.</p>
        <p>ins state(A; a; B; i): Inserts new state B at position i
in the content model of r, where r is the transition rule
in P such that the left-hand side of r is A and that the
label of r is a.
ins opr(A; a; opr; i): Inserts an operator opr ( , +, ?)
at position i in the content model of r, where r is the
transition rule in P such that the left-hand side of r is
A and that the label of r is a.
del state(A; a; i): Deletes the state at position i in the
content model of r, where r is the transition rule in
P such that the left-hand side of r is A and that the
label of r is a.
del opr(A; a; i): Deletes the operator at position i in
the content model of r, where r is the transition rule
in P such that the left-hand side of r is A and that the
label of r is a.
nest state(A; a; B; b; i): Replaces the subexpression E
at position i (i.e., subtree rooted at position i) in
content model of r by state B, where r is the transition
rule in P such that the left-hand side of r is A and
that the label of r is a. Moreover, a transition rule
B ! b(E) is added to P .
unnest state(A; a; b; i): This is the inverse operation
of nest state, and replaces the state B at position i
in the content model of r by reg, where (1) r is the
transition rule in P such that the left-hand side of r is
A and that the label of r is a and (2) reg is the content
model of the transition rule whose left-hand side is B
and label is b.
replace state(A; a; B; i): Replaces the state at position
i in the content model of r by B. In this paper, this
operation is treated as a pair of unnest state(A; a; i)
and nest state(A; a; B; b; i). This operation is used in
order to \rename" element names.</p>
        <p>For example, consider the tree automaton T Aa in Fig. 2.
Applying nest state(School; school; Students; students; )
to T Aa, we obtain the tree automaton T Ab in
Fig. 4. Then applying del opr(Student; student; 4) and
del state(Student; student; 4) to Tb, we obtain the tree
automaton T Ac in Fig. 5.
4.</p>
      </sec>
      <sec id="sec-2-3">
        <title>TRANSFORMATION FROM XPATH</title>
      </sec>
      <sec id="sec-2-4">
        <title>EXPRESSION INTO TREE AUTOMATON</title>
        <p>Our algorithm treats an XPath expression as a tree
automaton. Thus, in this section we de ne a transformation
from an XPath expression into a tree automaton. This
transformation is based on [15].</p>
        <p>An XPath expression is transformed into a tree automaton
as follows. First, we transform a location step :: l into a
tree automaton. Each transformed tree automaton has input
and output states, corresponding to the \context node" for
:: l and the \result node" selected by :: l, respectively.
For an XPath expression p of the form p1=p2, p′[q], or =p′, we
transform p into a tree automaton in a bottom-up manner,
according to the structure of p.</p>
        <p>We rst de ne a tree automaton (N; ; s; P )
corresponding to location step :: l. In the following, A ! ( )
denotes fA ! ( ) j 2 g.</p>
        <p>If</p>
        <p>= child, then
{ The initial state s is B.
P
=
The input state and the nite set of output states of a nite
tree automata are fCg and fDg, respectively. C represents
the context node and D represents the node selected by the
location step :: l.</p>
        <p>We next de ne the transformation from an XPath
expression p into a tree automaton. We have to consider the case
where p = p1=p2, p = p[q], and p = =p. In the following, for
a tree automaton T Ap of p, Np denotes the set of states of
T Ap, N Ip Np denotes the set of input states of T Ap, and
N Op Np denotes the set of output states of T Ap.
The case of p1=p2
Let T Ap1 = (Np1 ; ; sp1 ; Pp1 ) be the tree automaton of p1
and T Ap2 = (Np2 ; ; sp2 ; Pp2 ) be the tree automaton of p2.
Moreover, let T A = (N; ; s; P ) be the product automaton
of T Ap1 and T Ap2 . Then the automaton T Ap1=p2 of p1=p2
is de ned as T Ap1=p2 = (Np1=p2 ; ; s; P ), where
=
f[np1 ; np2 ] j np1 2 N Ip1 ; np2 2 (Np2
N Ip2 )g;
f[np1 ; np2 ] j np1 2 (Np1</p>
        <p>N Op1 ); np2 2 N Op2 g:</p>
        <p>The case of p[q]
Let T Ap = (Np; ; sp; Pp) be the tree automaton of p and
T Aq = (Nq; ; sq; Pq) be the tree automaton of q. Moreover,
let T A = (N; ; s; P ) be the product automaton T Ap and
T Aq. Then the tree automaton T Ap[q] of p[q] is de ned as
T Ap[q] = (Np[q]; ; s; P ), where</p>
        <p>Np[q] =
f[np; nq] j (np 2 N Op; nq 2 N Iq) _
(np 2 (Np</p>
        <p>N Op); nq 2 (Nq</p>
        <p>N Iq))g:</p>
        <p>The set N Ip[q] of input states and the set N Op[q] of output
states of T Ap[q] are de ned as follows.</p>
        <p>N Ip[q] =
N Op[q] =
f[np; nq] j np 2 N Ip; nq 2 (Nq
N Iq)g;
f[np; nq] j (np 2 N Op; nq 2 N Iq)g:
The case of =p
Let :: l be the rst location step in p and let T Ap =
(Np; ; sp; Pp) be the tree automaton of p. Then the tree
automaton T A=p of =p is de ned as T A=p = (N=p; ; R; P=p),
where</p>
        <p>N=p
P=p
=
=</p>
        <p>Np [ fRg;</p>
        <p>Pp [ fR ! root(D)g;
where R is the initial state of T A=p and root is the element
name corresponding to the root node.</p>
        <p>The set N I=p of input states and the set N O=p of output
states of T A=p are de ned as follows.</p>
        <p>N I=p
N O=p
= fRg;
=</p>
        <p>N Op;
where N Op is the set of output states of T Ap.
5.</p>
      </sec>
      <sec id="sec-2-5">
        <title>ALGORITHM FOR TRANSFORMING</title>
      </sec>
      <sec id="sec-2-6">
        <title>XPATH EXPRESSION ACCORDING TO</title>
      </sec>
      <sec id="sec-2-7">
        <title>SCHEMA EVOLUTION</title>
        <p>In this section, we show an algorithm for transforming a
given XPath expression according to schema evolution.</p>
        <p>To describe our algorithm, we need some de nitions. For
an XPath expression p, the selection path of p is the XPath
expression obtained by dropping every predicate from p.
Let p be an XPath expression of the form =p1[q]=p2. Then
=p1=q is called a predicate path of p (the predicate path for
a predicate in q can be de ned similarly). For example, Let
p = =a[f ]=b[c=e]=d. Then =a=b=d is the selection path of p,
and =a=f and =a=b=c=e are the predicate paths of p.</p>
        <p>Let us rst show the \main" algorithm (Fig. 6). Let op be
the update operation applied to tree automaton T A.
Moreover, let p be the input XPath expression, p0 be the selection
path of p and p1; ; pk be the predicate paths of p. We
transform pi to p′i according to op, for each i = 0; 1; ; k,
and merge the resulting k expressions p′0; p′1; ; p′k as the
result. More concretely, the algorithm rst partition p into
p0; p1; ; pk (step 1). Then p0 is transformed into p′0
according to op by function Transform (shown later). If the
result of p0 becomes empty due to del state, then p′0 = nil
and the transformation of p is terminated (steps 3 and 4).
Otherwise, we transform pi for each i = 1; ; k (steps
6 to 8), by Transform. Finally, p0; p1; ; pk are merged
and returned as the result. If a predicate path p′i is nil,
then p′i is just ignored when merged. For example, let
p = =a[f ]=b[c=e]=d. Then p0 = =a=b=c, p1 = =a=f , and
p2 = =a=b=c=e. Suppose that c is deleted by unnest state.
Then we obtain p′0 = =a=b=d, p′1 = =a=f , and p′2 = =a=b=e
due to Transform, and merging these three expressions we
have p′ = =a[f ]=b[e]=d, which is the result of the algorithm.</p>
        <p>Let us next consider function Transform. Let T A =
(N; ; s; P ) be a tree automaton, op be an edit operation
to T A, and p be an XPath expression having no predicate.
Our objective is to transform p into an XPath expression p′
so that the result of p′ under op(T A) coincides with that of p
under T A. However, this is sometimes impossible if a state
(and its corresponding element) is deleted from T A. Thus,
Transform is constructed as follows.</p>
        <p>If op is ins state, ins opr, or del opr, then p is
unchanged.</p>
        <p>If op is nest state(A; a; B; b; i), then location step
child :: b is inserted to p at the position that
nesting is occurred, if necessary (i.e., unless p contains a
descendant-or-self axis that \masks" the nest state
operation).</p>
        <p>If op is unnest state(A; a; b; i), then a location step
whose node test is b is deleted from p, if it is not
\masked" by a descendant-or-self axis.</p>
        <p>If op is del state, then p is modi ed as follows.</p>
        <p>Input : XPath expression p, tree automaton T A, update
operation op to T A.</p>
        <p>Output : XPath expression or nil
1. Partition p into the selection path p0 and the predicate path
p1; p2; ; pk.
5. else
6. for i
p′
i
end
2. p′0</p>
        <sec id="sec-2-7-1">
          <title>Transform(p0; T A; op);</title>
          <p>3. if p′0 = nil then
return nil;
1 to k do</p>
        </sec>
        <sec id="sec-2-7-2">
          <title>Transform(pi; T A; op);</title>
          <p>9. Merge p′0; p′1;</p>
          <p>; p′k into p′.
10. return p′;</p>
          <p>Now we present function Transform (Fig.7).
state(A; a; i; P ) in steps 16, 23, and 29 denotes the
state at position i in the content model of r, where r is
the transition rule in P such that its left-hand side is A
and its label is a. If op is ins state, ins opr, or del opr, p
is unchanged (steps 4 to 7). If op is del state, we examine
whether the result of p becomes empty under op(T A) (steps
8 to 11). If the result of p does not become empty, p is
unchanged (steps 11 to 12). If the result of p becomes
empty and p is the selection path, the algorithm returns nil
(steps 13 to 14). Otherwise (i.e., p is a predicate path), we
delete the suffix of p that becomes invalid due to op (steps
15 to 20). In step 17, P ′′ is obtained in step 2, and we
say that r is a transition rule from A to C if the left-hand
side of r contains A and a state in the content model of
r contains C. In step 18, we say that lsj corresponds to
r 2 P ′′ if r is the intersection of (i) the transition rule
corresponding to lsj in Pp and (ii) some transition rule in
P . If op is nest state(A; a; B; b; i), we examine whether we
need to transform p (steps 23 to 24). If we need to do, we
insert a new location step child :: b to p (steps 25 and 26). If
op is unnest state(A; a; b; i), then we also examine whether
we need to transform p (steps 30 to 31). Moreover, if the
location step lsj+1 to be deleted has a predicate q and the
axis of lsj+1 is descendant-or-self, we delete q (steps 32 to
33). Otherwise, we delete lsj+1 from p (steps 34 to 38).</p>
          <p>Finally, let us present the time complexity of the
algorithm. Let T A = (N; T; s; P ) be a tree automaton
(schema) p be an XPath expression that is partitioned into
p0; p1; p2; ; pk. Then the algorithm runs in O(k jpj2
jT Aj), where jpj is the number of location steps in p and
jT Aj is the size of T A.
6.</p>
        </sec>
      </sec>
      <sec id="sec-2-8">
        <title>EXPERIMENTAL RESULTS</title>
        <p>
          To verify if our algorithm transforms XPath expressions
appropriately under real world schemas, we implemented
our algorithm (in Ruby) and made a few experimentations.
We use two pairs of schemas, MSRMEDOC DTDs (version
2.1.1 and 2.2.2)[
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] and the NLM Journal Publishing Tag Set
Tag Library DTDs (version 2.3 and 3.0)[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>First, we give the evaluation of our algorithm on
MSRMEDOC DTDs. Let D211 be the version 2.1.1 MSRMEDOC
DTD and D222 be the version 2.2.2 MSRMEDOC DTD. The
number of elements of D211 is 185 and that of D222 is 205.
Table 1 shows the number of update operations between
D211 and D222, where \others" are attributes insertions (not
supported by our algorithm).</p>
        <p>We generate 90 XPath expressions under D211 by
XQgen[20]. The average size (i.e., number of location steps) of
the XPath expressions is 5, where the minimum size is 4 and
the maximum size is 7. Each XPath expression uses child
axes and at most one descendant-or-self axes (78 XPath
expressions use a descendant-or-self axis). There are 5 XPath
expressions whose result becomes empty under D222. Since
there is no predicate in these 90 XPath expressions, we
D211 ! D222
D23 ! D30
choose 4 expressions and add a predicate to each of 4
expressions (given later).</p>
        <p>We transform the above 90 XPath expressions by our
algorithm. For 85 expressions, the elements retrieved under
D222 coincides with the elements retrieved under D211. For
the rest 5 XPath expressions, our algorithm returns nil since
the results of these expressions become empty under D222.
These 5 XPath expressions are shown in Table 2 (deleted
elements are italicized). Since elements LIST, NOTE,
FIGURE, and FORMULA (child elements of REMARK) and an
element PRIVATE-CODES (child element of
COMPANYDOC-INFO) are deleted, the results of these XPath
expressions become empty (note that if an element is deleted from
a content model, then its descendants cannot be retrieved
either).</p>
        <p>The four XPath expressions with a predicate, denoted
p1; p2; p3; p4, are transformed as follows (Table 3).</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Since element REMARK</title>
      <p>COMPANY-REVISION-INFO)
predicate of p1 is deleted.
(child
is</p>
      <p>element
deleted,</p>
      <p>of
the
Since element REMARK is deleted, the last location
step in the predicate of p2 is deleted.</p>
      <p>Since element REMARK is deleted, the location
step corresponding to an element REMARK in p3 is
deleted, therefore our algorithm return nil.</p>
      <p>Since element L-1 is inserted between P and FT and
between P and STD, two location steps L-1 are
inserted to p4.</p>
      <p>As shown above, we can say that every XPath
expression is transformed appropriately by our algorithm, even if
del state's are involved in the schema evolution. The total
execution time of the algorithm for the 90 XPath
expression and 191 update operations is 9.58 sec, thus it takes an
average 0.111 sec per one XPath expression.</p>
      <p>We also made a similar experimentation using the NLM
Journal Publishing Tag Set Tag Library. Let D23 be the
version 2.3 The NLM Journal Publishing Tag Set Tag
Library DTD and D30 be the version 3.0 DTD. The number
of elements of D23 is 211 and that of D30 is 233. Table 1
shows the number of update operations between D23 and
D30.</p>
      <p>We generate 97 XPath expressions under D23 by XQgen.
The average size of the XPath expressions is 6. Each XPath
expression uses child axes and at most once
descendant-orself axes, where 95 XPath expressions include
descendantor-self axes. There are 10 XPath expressions whose result
becomes empty under D30. There is no XPath expression
with a predicate, thus we choose 3 XPath expressions and
added a predicate to each expression.</p>
      <p>We transform the 97 XPath expressions by our algorithm.
For 87 expressions, the elements retrieved under D30
coincides with the elements retrieved under D23. For the rest
10 XPath expressions, our algorithm returns nil since the
results of these expressions become empty under D30, which
are listed in Table 4. Since elements citation, contract-num,
contract-sponsor in element p are deleted, the results of
these XPath expressions become empty. The three XPath
expressions with a predicate, denoted p5; p6; p7, are
transformed as follows (Table 5).</p>
      <p>Since element chem-struct (child element of ack) is
deleted, the predicate of p5 is deleted.</p>
      <p>Since element chem-struct-wrapper is renamed to
chem-struct-wrap, the predicate of p6 is renamed
accordingly.</p>
      <p>Since element custom-meta-wrap is renamed to
custom-meta-group, the corresponding location step in
p7 is renamed to custom-meta-group.</p>
      <p>Again our algorithm seems to work well despite of del state
operations. These results suggest that our algorithm can be
applied to XPath expressions under real world schema
evolutions. The total execution time of the algorithm for the 97
XPath expression and 733 update operations is 77.561 sec,
thus it takes an average 0.800 sec per one XPath expression.</p>
      <sec id="sec-3-1">
        <title>CONCLUSION</title>
        <p>In this paper, we proposed an algorithm for transforming
an XPath expression according to schema evolution
allowing element deletions. However, this is just an ongoing work
and we have a lot to do. First, we would like to extend
our algorithm so that it can handle more general XPath
expressions, e.g., supporting sibling and parent axes. Second,
we use only two schema evolutions in our experimentation.
Thus we would like to evaluate our algorithm under more
real world schemas. Third, our algorithm supports neither
attribute nor entity declaration in schemas. These should
be incorporated into our algorithm.</p>
      </sec>
      <sec id="sec-3-2">
        <title>ACKNOWLEDGEMENT 8. 9.</title>
        <p>This work is partly supported by Grants-in-aid for
Scienti c Research (23500110).
XPath expression p2:
//DOC-REVISION[COMPANY-REVISION-INFOS/COMPANY-REVISION</p>
        <p>INFO/REMARK]/MODIFICATIONS/MODEFICATION
Transformed result:
//DOC-REVISION[COMPANY-REVISION-INFOS/COMPANY-REVISION</p>
        <p>INFO]/MODIFICATIONS/MODEFICATION
XPath expression p3:
//DOC-REVISION[COMPANY-REVISION-INFOS/COMPANY-DOC-INFO/PRIVATECODES/PRIVATE-CODE]/DOC-REVISIONS/DOC-REVISION/COMPANY-REVISION</p>
        <p>INFOS/COMPANY-REVISION-INFO/REMARK
Transformed result: nil</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>XPath expression p4:</title>
      <p>Transformed result:
//P[FT]/STD
//P[L-1/FT]/L-1/STD
XPath expression p6:
//license/p/preformat/named-content/supplementary-material[chem-struct-wrapper]/dispquote
Transformed result: //license/p/preformat/named-content/supplementary-material[chem-struct-wrap]/disp-quote</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>\Journal</given-names>
            <surname>Publishing Tag</surname>
          </string-name>
          <article-title>Set" NLM Journal Archiving and Interchange Tag Suite</article-title>
          . http://dtd.nlm.nih.gov/publishing/.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <article-title>[2] \MSR Download" MSR Home</article-title>
          . http://www.msr-wg.de/medoc/downlo.html.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Benedikt</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Cheney</surname>
          </string-name>
          .
          <article-title>Destabilizers and independence of xml updates</article-title>
          .
          <source>Proc. VLDB Endow</source>
          .,
          <volume>3</volume>
          (
          <issue>1</issue>
          -2):
          <volume>906</volume>
          {
          <fpage>917</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Berglund</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Boag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chamberlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Robie</surname>
          </string-name>
          , and J. Simeon, editors.
          <source>XML Path Language (XPath) 2</source>
          .
          <fpage>0</fpage>
          <string-name>
            <surname>(Second</surname>
            <given-names>Edition</given-names>
          </string-name>
          ). http://www.w3.org/TR/xpath20/.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Bray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Paoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Sperberg-McQueen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Maler</surname>
          </string-name>
          , and F. Yergeau, editors.
          <source>Extensible Markup Language (XML) 1</source>
          .
          <fpage>0</fpage>
          <string-name>
            <surname>(Fifth</surname>
            <given-names>Edition</given-names>
          </string-name>
          ). http://www.w3.org/TR/xml/.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B.</given-names>
            <surname>Choi</surname>
          </string-name>
          .
          <article-title>What are real DTDs like?</article-title>
          <source>In Proc. WebDB</source>
          , pages
          <volume>43</volume>
          {
          <fpage>48</fpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <article-title>XPath expression p7: /article/front/journal-meta/custom-meta-wrap[custom-meta/meta-name]/custom-meta/metavalue</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>