<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Updating Views Over Recursive XML</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ming Jiang</string-name>
          <email>jiangm@cs.wpi.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ling Wang</string-name>
          <email>lingw@cs.wpi.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Murali Mani</string-name>
          <email>mmani@cs.wpi.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elke Rundensteiner</string-name>
          <email>rundenst@cs.wpi.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Worcester Polytechnic Institute</institution>
          ,
          <addr-line>100 Institute Road, Worcester MA 01609</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We study the problem of updating XML views defined over XML documents. A view update is performed by finding the base updates over the underlying data sources that achieve the desired view update. If such base updates do not exist, the view update is said to be untranslatable and rejected. In SQL, determining whether a view update is translatable is performed using schema level analysis, where the view definition and the base schema are used. XML schemas are more complex than SQL schemas, and can specify recursive types and cardinality constraints. In this paper, we propose a solution based on schema level analysis for determining whether an update over XML views is translatable and for finding the translation if one exists, while considering the features of XML schemas.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        In databases systems, a user sees a portion of the base data called a view. Therefore
he/she may need to update base data through these views (view updates). Especially
in shared databases, it is essential to provide the capacity to support view updates. In
the relational scenario, there have been many studies on determining whether a view
update is translatable [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. A common semantics used for determining whether a view
update is translatable is side-effect free semantics. In this semantics, a view update is
said to be translatable if there exists base updates that achieve the desired view update
without affecting any other portion of the view. Current relational/SQL systems use
schema level analysis for determining whether a view update is translatable, where the
view definition and the base schemas are used.
      </p>
      <p>Nowadays, as XML is becoming the standard format for data exchange, database
community is exploring its ability to store data. In fact, view updates become more
common as many XML databases are available on the internet, and a large number of
users have access to such databases. In this paper, we study how to perform XML view
updates over XML data sources, using schema level analysis. This problem is much
harder than for relational schemas because of the complex features in XML schema,
such as recursive types and cardinality constraints.</p>
      <p>Let us consider an example XML document with its schema as in Figure 1. Note
the base schema element course is recursive, as a course may have a child element
pre, which stands for pre-requisite for this course, and pre in turn can have course
elements as its children. Similarly, the base element pre is also recursive. Now consider
two queries over D, as shown in Figure 2 and Figure 3.</p>
      <p>In Figure 2, (a) is the XQuery statement which defines the view. (b) is the view
schema tree that corresponds to the XQuery. (c) is the view instance tree generated by
the XQuery and XML document D. The same goes with Figure 3. 1
1 The subscripts a, b, c in Figure 1 and 1,2,3 in Figure 2(c) and Figure 3(c) are for illustration
purpose only. They do not appear in the actual documents or views.
Fig. 3. Query Q2 and corresponding view Fig. 4. Query Q3</p>
      <p>A user may want to delete course1 in Figure 2(c). If we delete coursea in D, this
update would cause course2, course3 and their descendants to be removed in
Figure 2(c). This is a side-effect and therefore it is not a correct translation. Now let us
consider Figure 3(c) and try to delete course2. We can achieve this by deleting the base
element coursec which has the name child. However, doing so will also delete course3
in the view and therefore it is also not a correct translation.</p>
      <p>Intuitively, recursive base schemas and queries cause the above problems. However,
are the above two scenarios the only cases that recursion may have side-effects? If not,
how can we effectively check out all such side-effects? This problem has not been
studied, to the best of our knowledge.</p>
      <p>There are also other XML features that need to be considered for XML view
update problems, such as cardinality constraints in the base schema. Will these features
make the problem different from the relational scenario? Let us take a look at the query
in Figure 4(c). It indicates that each prof essor element in the base will join with
every student element. Therefore each prof essor and student may be used more than
once and we cannot delete prof -student view element. However, let us reconsider this
query, given the base schema as shown in Figure 4(a). It indicates that there is only one
prof essor in the base. We now know that each student will be used only once and we
can delete a certain prof -student by deleting the corresponding student in the base
XML document. From this example, we can observe that utilizing cardinality
information provided in the base schema may give a better translation for the view update. How
to fully handle cardinality is also discussed in this paper.</p>
      <p>Our main technical contributions include: we study how features in XML schemas,
such as recursive types and cardinality constraints, impact the XML view update
problem. We propose an algorithm for determining whether a view update over XML data
sources is translatable and for finding the translation if one exists, based on schema level
analysis. Our algorithm is sound (a translation returned by our algorithm is guaranteed
to not cause side-effects) and complete (a translation is guaranteed to be returned by our
algorithm if there exists one). We believe these results go a long way towards
understanding the XML view update problem and provide the capacity to efficiently update
XML views.</p>
      <p>
        Outline. The rest of the paper is organized as follows. Section 2 defines view update
translatability and then defines the scope of the problem we consider. Section 3
introduces notations and background. Section 4 discusses how to handle unique features in
XML schemas when solving the view update problem over XML data sources.
Section 5 proposes our three-step checking algorithm and Section 6 gives a conclusion and
discusses the future work.
2 Related Work
There are many studies on view updates in relational scenario, such as [
        <xref ref-type="bibr" rid="ref4 ref5 ref6 ref9">6, 5, 9, 4</xref>
        ]. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
introduces the concept of a complementary view. The authors argue that when changing
the data in the base corresponding to the updates on the view, the rest of the database
that is not in the view should remain unchanged. This solution tends to be too strict,
as many view updates are not translatable by this theory. In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the authors argue that
we can perform a view update by deleting base tuples that contribute to the existence
of this view element. Also such base tuples are required not to contribute to other view
elements to avoid side-effects. Similarly, in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], Keller proposes an algorithm to check
whether 1-1 mapping exists between a set of base tuples and a set of view tuples. This
mapping indicates that a certain view element can be deleted without side-effects.
      </p>
      <p>
        While [
        <xref ref-type="bibr" rid="ref5 ref6 ref9">6, 5, 9</xref>
        ] study the view update problem on the schema level, there are other
works such as [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] that study the problem on the instance level. Therefore in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], more
updates can be performed without side-effects. However, because of the large size of
the database, such data-centric algorithms tend to be more time-consuming.
      </p>
      <p>
        In order to utilize the maturity of relational database techniques and also adapt to
the current required web applications, people tend to build XML views over relational
databases, such as [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ]. There are some research that consider XML views as
compositions of flat relational views, such as [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], for the purpose of querying relational
databases. Some other work further study the updatability of XML views over
relational databases. In [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], the authors discuss how to check side-effects for updating
XML view elements over a relational database. In [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] the authors use the nested
relational algebra as the formalism for an XML view of a relational database to study the
problem of when such views are updatable. However, given an XML view over XML
data, how to check the updatability of the view elements and further give the correct,
efficient translation of this view update remains unsolved.
      </p>
      <p>
        Language for updating XML documents is being studied by [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] discusses
updates in XML scenarios. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] presents some interesting problems in XML view
updates. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] considers virtual updatable views for a query language addressing native
XML databases, including information about intents of updates into view definitions.
[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] studies type checking in XML view updates.
3
      </p>
      <p>View Update Translatability and Problem Scope</p>
      <sec id="sec-1-1">
        <title>3.1 View Update Translatability Definition</title>
        <p>A view update operation u can be a delete, an insert or a replacement. The corresponding
update on the XML base is said to be the translation of the view update.
Definition 1. Let D be an XML document and V a view defined by DEF V over D. An
XML document update sequence U R is a correct translation of a view update uV if
uV (DEF V (D))=DEF V (U R(D )).</p>
        <p>This definition is depicted in Figure 5. The update is correct if the diagram in
Figure 5 commutes.</p>
        <p>(1)
DEFv
v
D
(2) uv
(3) UR
uv(v)
(4)</p>
        <p>DEFv</p>
        <p>UR(D)</p>
      </sec>
      <sec id="sec-1-2">
        <title>3.2 Problem Scope</title>
        <p>
          Update Operations Considered As we introduced above, a view update operation can
be a delete, an insert or a replacement. Deletions are typically considered to be different
from insertions. For instance, consider an SQL view defined as a join between student
table and prof essor table, where a student row joins with at most one prof essor row.
The SQL standard [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] supports deleting a row in this view by deleting a corresponding
student row, whereas inserts are rejected as they might need to insert into student
table, or prof essor table or even both, which is more complex and hard to decide. As
the first work considering view updates over XML data sources, we consider only
deletions and inserts are out of our scope. Further, we study single view element deletion,
as opposed to deleting a set of view elements. In addition, we do not use a view update
language. As we focus on updates of a single view element, how the view element is
specified (by the view update language) is not significant.
        </p>
        <p>Base Schema Language We use DTD (Document Type Definition) as schema
language to describe the underlying databases. DTD is a very expressive and complex
language. The two most significant features in DTD that we consider are recursion and
cardinality. The cardinality information is obtained from the content model in DTD,
which uses ”*”, ”+”, ”?”, ”,” or ”|”. We will not consider other features in XML schema
languages, for doing so will make the algorithm extremely complicated and hard to
understand. More specifically, we will not consider ID/IDREF constraints in DTD, and
sub-typing and key/foreign key constraints in XML schema.</p>
        <p>
          View Definition Language We will use a subset of XQuery as the view definition
language described as follows:
1. The XQuery we consider could have FOR, WHERE and RETURN clauses and
dirElemConstructor [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] in the statement.
2. In each FOR clause, there can be multiple variable binding statements.
3. In an XPath expression, multiple ”//” and ”|” can exist. Further, a node test [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] can
be specified as a wildcard.
4. RETURN can contain nested XQuery statements.
        </p>
        <p>Even though we consider WHERE clause, the predicates specified in the WHERE
clause are not used to determine whether a view update is translatable. Though
considering such predicates might result in more view updates being translatable, it can be
handled similarly as in relational scenario and we want to focus on the unique XML
features. Also, the LET clause is not considered as an XQuery that uses LET can be
rewritten into one without the LET clause. Similar to SQL solutions, we do not
consider aggregation, user-defined functions and Orderby clauses.</p>
        <p>Restrictions on Translations Considered There are various strategies for
translating view updates. For those base XML elements corresponding to the view element
to be deleted, we can set its value to null, or delete it but keep its descendants, etc.
However, we consider only the translations where we delete an XML view element
by deleting the corresponding base elements and also the descendants. This keeps the
problem tractable, and is similar to existing solutions in SQL/relational scenarios. Now
the problem we study can be described as:</p>
        <p>Problem Statement: Let Schema(D) be an XML schema and Q a view query over
Schema(D). Given a view schema node n, n ∈ Q, does there exist a translation for
deleting a view element whose view schema node is n that is correct for every instance
of Schema(D)?</p>
        <p>
          Note that we study the problem with schema level analysis, which utilizes the view
definition and the schema of the base XML data sources. In other words, we do not
examine the base data to determine whether there exists a translation. Such schema
level analysis is similar to the approach in relational scenarios [
          <xref ref-type="bibr" rid="ref5 ref9">5, 9</xref>
          ]; data level analysis
for the view update problem has been studied in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4 Notations</title>
      <p>In this section we first introduce some concepts and notations which are the
foundation of later discussions. A summary of them can be found in Table 1 2. Let D be an
XML document(base XML data sources) with schema Schema(D). Schema(D) can
be represented as a tree called the base schema tree, denoted as STBase. The STBase
of the XML Document in Figure 1 is shown in Figure 6 3.</p>
      <p>The XML view is defined as a query Q over Schema(D). The corresponding
instance is denoted as V . Q specifies a view schema tree, denoted as STV iew, such as
Figure 2(b), Figure 3(b) and Figure 4(b).</p>
      <p>vei is a view element in V that is to be deleted. The node in STV iew corresponding
to vei is called the view schema node of vei, denoted as SNV iew(vei). Let us consider
the view element course1 in Figure 2(c), SNV iew(course1) is the node course in
Figure 2(b).
2 SNV iew stands for View Schema Node and STV iew stands for View Schema Tree. SNBase
and STBase are analogously defined for the base XML document.
3 Note there is some information not captured by STBase such as order of elements. We only
capture those information that will be utilized by our algorithm, such as cardinality constraints
and recursive types.</p>
      <p>Semantic Meaninig</p>
      <p>XML data sources
Schema(D) XML schema of D</p>
      <p>STBase sscohuercmeas tree of XML data
SNBase a node in STBase
bej a base element in D
source(vei) a base element that
contribute to the existence
of vei
Source(vei) a SNBase that contributes</p>
      <p>to the existence of vei in V
des(source) The set of base elements
that are the descendants of
source and source itself</p>
      <p>Notations</p>
      <p>Q
V
STView</p>
      <p>Semantic Meaning
XQuery Statement defining
the view
view instance defined by Q
schema tree of Q
SNView a node in STView</p>
      <p>vei a view element in V
sources(vei) All base elements that
contribute to the existence of
vei
Sources(vei) all the SNBase that contribute</p>
      <p>to the existence of vei
Des(Source) The set of schema nodes that
are the descendants of Source
and Source itself</p>
      <p>Let us examine the view element course1 in Figure 3(c) again. It exists in the view
only when the following two conditions are both satisfied:
1. In the base XML document, there exists one pre element, demonstrated as prea,
and one course element, denoted as courseb.
2. The courseb element is a descendant of the prea element.</p>
      <p>course1 in Figure 3(c) exists because of prea and courseb in base XML Document.
Deleting any one of these base elements will lead to deleting course1. Therefore, these
base elements are considered as candidates for deleting course1. Let us now define
those candidates 4.</p>
      <p>Given a SNV iew(vei) in STV iew, every XPath expression that appears on the path
from the root till SNV iew(vei) in STV iew corresponds to a base schema node, which
is called a Source and denoted as Source(vei). The name indicates that it is a way to
delete the view element. The set of all such XPath expressions is denoted as Sources(vei).</p>
      <p>For example, in Figure 7(c), let us consider the view element name1. According
to Figure 7(b), There are four path expressions from the root till name1, which are
Document(”base.xml”)//department, $dept//prof essor, $prof /student,
$student/name. Therefore, Sources(name1) = {department, prof essor, student,
namestudent}.</p>
      <p>For each Source(vei), there exists a set of base elements I(Source(vei)) in D
corresponding to it. In I(Source(vei)), there exists one base element contributing to
the existence of vei and we call this a source, denoted as source(vei). For example, in
Figure 7(c), sources(name1) is {department, prof essor, studenta, namea}.</p>
      <p>Note while we can delete a source to delete its corresponding view element, it is
possible that some other view elements got unexpectedly affected because of this
update, which are normally called side-effects. There are two kinds of side-effects. The
first kind of side-effects is a descendant of source(vei) is a source of another view
element. For example, we may want to delete coursea in Figure 1 to delete course1
4 In fact, deleting an ancestor of any of these base elements can be considered as a candidate
for deleting course1 also. Doing this, however, will delete some base elements that are not
required to get updated. Further this does not affect translatability. Therefore, we do not include
them in our candidates.</p>
      <p>name
name
name</p>
      <p>root
inst*itute</p>
      <p>+
department</p>
      <p>+
professor +</p>
      <p>*
student
namestudent
course</p>
      <p>*
name
in Figure 3(c), as coursea is a source of course1. However, courseb, which is a
descendant of coursea, is the source of course2 in Figure 2(c). Therefore, such update
will cause side-effects over view element course2,as one of its sources get deleted.
The second kind is source(vei) is also a source of another view element. For example,
courseb in Figure 1 is the source of course2 in Figure 3(c). However, it is also a source
of course3. If we want to delete courseb to delete course2, there will be side-effects
over course3, as one of its sources get deleted.</p>
      <p>Our goal is to find, given a view element vei, whether there exists a non-empty
subset of sources(vei) such that deleting any source source(vei) in this subset will
delete vei without affecting any other non-descendant view element of vei. Deleting
source(vei) does not affect vej if des(source(vei)) ∩ sources(vej) = ∅. Based on
the above concepts, the definition of correctly translating the deletion of a view element
problem can be refined as:</p>
      <p>Problem Statement: Let Schema(D) be an XML schema and Q a view query over
it. Given a view schema node n, does the following condition hold for every instance
of Schema(D) whose corresponding view instance is V : For any element vei, whose
schema node is n, does there exist source(vei) such that ∀ vej ∈ V , vei 6= vej and vej
is not descendant of vei, where des(source(vei)) ∩ sources(vej) = ∅.</p>
    </sec>
    <sec id="sec-3">
      <title>5 Algorithm Analysis</title>
      <p>5.1 A Naive Algorithm
Using the above concepts, we can observe the following. Consider deleting a view
element vei by deleting a certain base element source(vei). Let this element correspond
to the base schema node Source(vei). Consider all base schema nodes that could be
descendants of Source(vei), basically Des(Source(vei)). If none of these nodes form
a Source(vej), then deleting source(vei) will not affect vej. This is stated below.
Lemma 1. Deleting a source(vei) will not affect view element vej, if Des(Source(vei))
∩ Sources(vej) = ∅.</p>
      <p>For example, consider course2 and course3 in Figure 3(c). Suppose we want to
delete course2. As course in Figure 1 is a Source(course2), Des(course) ∩ Sources
(course3) = {course, pre} which is not empty. This implies if we delete course2,
some base elements contributing to the existence of course3 may also get deleted and
therefore there may exist side-effects on course3, which gives the same result as in our
previous analysis.</p>
      <p>Using Lemma 1, we can come up with a naive algorithm. Let sum be the union
of Sources of every non-descendant view element vej of vei, vej 6= vei. If there
exists Source(vei), such that Des(Source(vei)) ∩ sum = ∅, Source(vei) is a correct
translation of deleting vei.</p>
      <p>However, this algorithm cannot be applied for all view elements. Consider view
elements whose view schema nodes are the same. SNV iew(vei), such as student1 and
student2 in Figure 7(c). If we want to delete student1, it is easy to observe that we can
delete the studenta element in the base document, corresponding to the base schema
node student in Figure 1. However, according to the above lemma, Des(student) ∩
Sources (student2) 6= ∅ and thus student1 cannot be updated.</p>
      <p>Also, Lemma 1 cannot be applied to detect side-effects on view elements whose
schema nodes are descendants of SNV iew(vei). Because for such a view element vej ,
we have Sources(vei) ⊆ Sources(vej ), as all the base schema nodes that contribute
to the existence of vei, also contribute to the existence of every view element that is the
descendant of vei. For the above two cases, we need other strategies.</p>
      <p>Though Lemma 1 cannot be applied to the above two types of view elements, it can
still be applied to detect side-effects on nodes whose schema nodes are non-descendants
of Source(vei).
1
2
3</p>
      <p>Fig. 8. Schema Tree Structure</p>
      <p>We therefore partition the view schema tree into three parts, as shown in Figure 8.
Let n = SNV iew(vei) be the view schema node for vei. The first group, marked as 1,
is view schema nodes that are non-descendant of n. We can apply Lemma 1 to detect
side-effects on view elements whose schema nodes are in this group. The second group,
marked as 2, is view schema node n itself. We discuss how to detect side-effects on view
elements whose schema node is in this group in Section 5.2. The third group, marked as
3, is schema nodes that are descendants of n. We discuss how to detect side-effects on
view elements whose schema nodes are in this group in Section 5.3. Also, these three
groups cover all schema nodes without any overlap. Thus we check all view elements
for side-effects effectively, and a correct translation is returned if there exists one.</p>
      <sec id="sec-3-1">
        <title>5.2 Detecting Side-Effects in Group 2</title>
        <p>Here we check view elements that share the same view schema node as vei, the view
element to be updated. This is similar to the relational view update problem, and we
can utilize the solutions from the relational scenario.</p>
        <p>
          Updating Relational Views In [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], Keller proposes an algorithm to check whether
there is a 1-1 mapping between the set of tuples in the relational view and the set of
tuples in a base relation. This algorithm can be used to check whether we can delete a
tuple in the view without side-effects in the relational scenario. We use Keller’s
algorithm as the basis for studying view updates in XML scenario as well. Therefore, in this
section, we will introduce and discuss this algorithm.
Keller’s Algorithm: Given a relational database D and a relational view V , in order to
find all possible relations r1, r2, . . . , ri such that there is a 1-1 mapping between the set
of tuples in V and the set of tuples in every rp, 1 ≤ p ≤ i, construct a directed graph,
also called as a trace graph, as:
1. every relation used by the view forms a node in the graph. Suppose there are nodes
r1, r2, . . . , rn in the graph.
2. let ri, rj be two nodes (ri 6= rj ). There is an edge ri → rj iff there is a join
condition of the form ri.a = rj .k (rj .k is the key for rj . If there is a ri.k = rj .k join,
then there are two edges ri → rj and also rj → ri.).
        </p>
        <p>If there is any node r which can reach all other nodes, then there is a 1-1 mapping
from tuples in V to tuples in the relation which is denoted by node r. 2
Adapting Keller’s Algorithm to XML scenario In Keller’s Algorithm, an edge ri →
rj represents that a tuple in ri joins with at most one tuple in rj . The same intuition can
be applied to XML scenario. Given view element vei, its trace graph has a root element
and one node for every Source(vei). Let Sourcei, Sourcej ∈ Sources(vei). We draw
an edge from Sourcei to Sourcej if the XPath expression of Sourcei starts with the
variable representing Sourcej . We draw an edge from Sourcei to root if the XPath
expression of Sourcei starts with Document(”base.xml”). Let us consider element
student in Figure 7(b); Sources(student) = {department, prof essor, student}.
The corresponding XPath expressions are Document(”base”)//department, $dept
//prof essor, $prof /student respectively. Every prof essor will join with at most
one department. Similarly, every student is guaranteed to join with at most one
prof essor. According to Keller’s algorithm, there are four nodes in the trace graph:
root, department, prof essor and student. We can draw an edge from student to
prof essor, one from prof essor to department and one from department to root.
student can reach all the other nodes. This implies we can delete view element student1
by deleting base element student1 in D, as analyzed before.</p>
        <p>However there are differences between relational and XML scenarios. For instance,
a node in the trace graph that does not reach all other nodes can still be a correct
translation. Consider view schema node prof -student in Figure 4(b). A view element of
prof -student has Sources = {prof essor, student}, without any edge between them
in the trace graph. However, as base schema in Figure 4(a) implies that there is only one
prof essor element in the base, any view element whose schema node is prof -student
can be deleted by deleting a base element whose schema node is student. So
cardinality constraints should be considered to determine whether a Source can be a correct
translation.</p>
        <p>On the other hand, a node in the trace graph that reaches other nodes might not be a
correct translation. Consider course1 in Figure 3(c), Sources(course1) = {pre, course}.
In the trace graph there is an edge from course to pre. However, course1 cannot be
deleted by deleting courseb in Figure 1. This is because coursec is a descendant of
courseb and is source of both course2 and course3. Also course2 in Figure 3(c)
cannot be deleted because it shares the same source as course3. Both of these occur
because of recursive types in XML.</p>
        <p>In the rest of the section, we study how we can extend Keller’s algorithm to handle
cardinality constraints and recursive types in XML.
(a) trace graph generated from
Keller’s Algorithm
(c) situation when cardinality for each
table is revealed</p>
        <p>(b) situation when cardinality information
rrrjij...ttt112forretttri121ajch tab………le ………is not reveale…rrdii..…tt11 r…rrijj..…tt21rj
…… …… (c)
……
……
……
……
……
……
……
……
Handling Cardinality Constraints How cardinality information impacts the
translatability of view updates in relational scenario is illustrated in Figure 9, where ri and
rj can reach all other nodes except each other. Without any cardinality information, a
view tuple cannot be deleted either from ri or rj , as there can be side-effects shown
in Figure 9(b). However, if we know the cardinality information that there is only one
tuple in ri 5, then view tuples can be deleted from rj , shown in Figure 9(c).</p>
        <p>While such cardinality information cannot be specified easily in relational schema,
it does exist in XML schema, as we mentioned in section 3.2. We only capture
cardinality constraints *, 1 and 0. Note XML schema can specify more complex cardinality
constraints such as MaxOccurs and MinOccurs. However they do not affect whether a
view element can be updated or not. So we ignore them in this paper.</p>
        <p>Given two base schema nodes t and tn which are of ancestor-descendant
relationship, however, what is the cardinality between them? Here we give the formal definition:
Definition 2. Let t/a1 :: t1/a2 :: t2/ . . . /an :: tn be a path expression between two
nodes t and tn in the base schema, where ∀ai, 1 ≤ i ≤ n, can be child,
descendantor-self, or attribute. The cardinality card(t, tn) between t and tn, which can also be
denoted as card(t, /a1 :: t1/a2 :: t2/ . . . /an :: tn), is defined as:
1. if n &gt;1, card(t, /a1 :: t1/a2 :: t2/ . . . /an :: tn) = card(t, /a1 :: t1)×card(t1, /a2 ::
t2) × . . . × card(tn−1, /an :: tn). For the multiplication, please refer to Figure 10.
2. if n=1:
(a) if a1 is descendant-or-self, card(t, /a1 :: t1) = *.
(b) if a1 is attribute, card(t, /a1 :: t1) = 1.
(c) if a1 is child, and the content model of t is re. Then card(t, /a1 :: t1) =
cardRE(t1, re). cardRE(t1, re) is defined as follows:
i. if re = (re1, re2), cardRE(t, re) = cardRE(t1, re1)+ cardRE(t1, re2).
ii. if re = (re1 | re2), cardRE(t1, re) = max{
cardRE(t1, re1), cardRE(t1, re2)}.
iii. if re = (re1)∗, cardRE(t, re) = cardRE(t1, re1) × ∗.
iv. if re = ti:</p>
        <p>A. if ti = t1, then cardRE(t1, re) = 1.</p>
        <p>B. if ti 6= t1, then cardRE(t1, re) = 0.
1
0
*
x 1
1
0
*
0
0
0
0
*
*
0
*
+ 1
1
0
*
*
1
*
0
1
0
*
*
*
*
*
root
1
*
$prof
$student
Cardinality Multiplication Cardinality Addition</p>
        <p>Table Table
1. if l = n, we can delete vei from any source(vei) whose corresponding node in
trace graph has 0-indegree.
2. if l = n − 1, we can delete vei by deleting the source whose base schema node is
the 0-indegree node with cardinality as ”*”.
3. if l ≤ n − 2, there is no correct translation.</p>
        <p>Let us consider the query in Figure 4 again. Figure 11 is the trace graph of prof
student in Figure 4(b). With Definition 1, card(result,
prof essor)= 1, card(result, student) = *. Therefore, to delete the view element whose
view schema node is prof -student, we can delete from Source student.
Handling Recursive Type Let us first consider the side-effects where source(vej ) ∈
des(source(vei)), vei and vej share the same view schema node. Consider course1 in
Figure 2(c). Deleting it will have side-effects because some descendants of its source,
sourcea, also contribute to the existence of other view elements, such as course2. To
identify such side-effects, we define recursive Source as below.</p>
        <p>Definition 3. Let Schema be an XML schema and Q a view query defined over this
schema. Let S be a Source for a view element whose view schema node is n. S is said
to be a recursive Source if ∃D, an XML Document confirming to Schema, where the
conditions below are all satisfied:
1. there exist two view elements in Q(D), vei and vej , such that i 6= j but SNV iew(vei)
= SNV iew(vej ) = n.
2. I(S) contains bei and bej, bei and bej is source of vei and vej respectively, and
they have ancestor-descendant relationship.
5 This is a quite strict requirement, which will be relaxed in later discussions.</p>
        <p>One might think that if a path expression for a Source has ”//” operation, then the
Source is recursive. However, this need not be the case, such as in the XPath expression
Document(”base.xml”)//department/course. To identify recursive Source, we
define AbsoluteX P ath below.</p>
        <p>Definition 4. The path in the trace graph from Source to root is called a branch,
denoted as branchSource. The XPath expression obtained by concatenating all the XPath
expressions in branchSource is called the absolute XPath of Source.</p>
        <p>To identify whether a Source is recursive, we check its absolute XPath. If the
absolute XPath retrieves two base elements that have ancestor-descendant relationship, then
the Source is recursive.</p>
        <p>Proposition 2. Let P be the absolute XPath of a Source(vei) for view element vei. We
call Source(vei) as recursive iff the following two conditions are both satisfied:
1. P is of the form /P1//bere/P2/bel, where P1, P2 are path expressions and bere,
bel are base schema nodes.
2. the last base element bel in P can have bere as its descendant.</p>
        <p>Proposition 2 is illustrated in Figure 13(a). Here both the bel’s satisfy P and have
ancestor-descendant relationship. The Source, student, for a student view element in
Figure 7 has the absolute XPath Document(”base.xml”)//department//prof essor
/student, which does not match Proposition 2, therefore student is not recursive.
However the Source, course, for a course view element in Figure 2 has the
absolute XPath Document(”base.xml”)//course. This matches Proposition 2 where P1
is Document(”base.xml”), P2 is empty and bere = bel = course, and course has
course as descendant.</p>
        <p>P1
bere
P2
bel
bere
P2
bel</p>
        <p>P1
P2
z
y
P3
x z
P4</p>
        <p>P2 y
*
a1
c b
a
(a)
root
b1
b2
(a) base schema tree</p>
        <p>STBase
(b) base instance
(c) view query
(d) view instance tree</p>
        <p>c1
(b) c2
a1
a2
(d)
a2
b2
(c)
&lt;root&gt;
FOR $a IN Document(“base1.xml”)//a
RETURN &lt;a&gt;</p>
        <p>FOR $c IN $a/c,</p>
        <p>$b IN $c//b
RETURN $b
&lt;/a&gt;
b1
b2</p>
        <p>P3</p>
        <p>x
(a) Proposition 2
(b) Proposition 3</p>
        <p>P4
&lt;/root&gt;
Fig. 12. ST B0ase, Query Q4
Proposition 3. Consider the trace graph of a view element whose view schema node
is n. Let Source1 and Source2 be two Sources in this trace graph, with an edge from
Source2 to Source1. I(Source2) may contain a base element that is the source of two
view elements, ve1 and ve2, iff all the following conditions below are satisfied:
1. The absolute XPath of Source1 is of the form P1//z/P2/y. Let y be the variable
that Source1 binds to and Source1 is marked as recursive using Proposition 2.
2. The absolute XPath of Source2 is of the form $y/P3//x//P4.
3. z ∈ Des(x).</p>
        <p>With Proposition 1, Proposition 2 and Proposition 3, we can detect all the
possible side-effects on view elements whose schema node is in Group 2 when deleting
Source(vei). Please refer to Section 6 for how to integrate them.
5.3 Detecting Side-Effects in Group 3
In this section, we will discuss how to detect side-effects on view elements whose
schema nodes are descendants of n. Note view elements that are descendants of vei
will get deleted with vei, according to the hierarchial structure of XML view.
Therefore, we focus on whether any view element, vej , that are descendants of siblings of
vei, gets affected when deleting source(vei).</p>
        <p>b
a
(a)
c
(a) base schema tree</p>
        <p>STBase
(b) base instance
(c) view query
(d) view instance tree ba
aa
ab
(b)
bb</p>
        <p>cc
root (d)</p>
        <p>(c)
a1 a2 &lt;FRrOEoTRoUt&gt;$RaNIN&lt;aD&gt;ocument(“base1.xml”)//a</p>
        <p>Document(“base1.xml”)//b,</p>
        <p>Document(“base1.xml”)//c
ba1 bb1 cc1 ba2 bb2 cc2 &lt;/root&gt; &lt;/a&gt;</p>
        <p>Fig. 14. ST B0ase, Query Q5</p>
        <p>Figure 14 illustrates side-effects on Group 3. If we delete a1 in Figure 14(d) by
deleting aa in Figure 14(b), then the view element ba2, the descendant of a2 in
Figure 14(d) is deleted. This is a side-effect. This happens because view element ba2 has
a source, ba, which is the descendant of source(a1). On the other hand, there is no
side-effects on view element cc2.</p>
        <p>We identify such side-effects as follows. Let vej be a descendant of sibling view
element of vei. If Source(vej ) is not a descendant of Source(vei), we need not
consider it as it will never get affected. Consider cc2 in Figure 14(d) as vej and a1 as vei.
As c is Source(vej ) ∈/ Sources(vei) and also c is not a descendant of Source(vei),
a, no side-effect on view element cc2 will appear.</p>
        <p>On the other hand, if Source(vej ) is descendant of Source (vei) or itself, source
(vej ) must contribute to at most one view element that must be a descendant of vei.
This implies there should need an edge from Source(vej ) to Source(vei) in the trace
graph of vej . Consider ba2 as vej and a1 as vei. As Source(vej ), b, is a descendant of
Source(a1), there needs to be an edge from b to a in the trace graph of vej , which
actually does not exist. Therefore, there may be side-effects on ba2. The above conclusions
are formalized in the following lemma:
Lemma 2. For every descendant element ved of SNV iew(vei), get its trace graph.
Suppose there are n 0-indegree nodes that cannot reach Source(vei), say r1, r2, . . . , rn.
For some ved, if ∃ ri such that SNBase(ri) ∈ Des(SNBase(Source(vei))), Source
(vei) cannot be the correct translation of deleting vei.
6 Algorithm for Correctly Deleting Single View Element in XML</p>
        <p>Scenario
In this section, we will present the three-step algorithm for finding the correct
translation of deleting a view element vei.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Step 0:</title>
        <p>0. Candidates = Sources(vei)
Step 1:
1. Let Sources0 be the union of Sources of all non-descendant view elements of vei.
2. For every Source(vei) ∈ Candidates, if Des(Source(vei)) ∩ Sources0 6= ∅,</p>
        <p>Candidates = Candidates − Source(vei).
3. If Candidates = ∅, the algorithm terminates; else go to Step 2.</p>
        <p>Step 2:
4. Draw the trace graph of vei and let SourcesKeller be the set of 0-indegree nodes.
5. Use Proposition 1 to check SourcesKeller . Let l be the number of nodes whose
relative cardinality is ”1”.
(a) if l = n − 1, SourcesKeller = {SNrest}, where SNrest is the only schema
node in SourcesKeller whose relative cardinality is ”*”.</p>
        <p>(b) if l ≤ n − 2, Candidates = ∅; the algorithm terminates.
6. Use Proposition 2 to check if Source(vei) is recursive. If so Candidates =</p>
        <p>Candidates − Source(vei).
7. For every branch of the trace graph, find two consecutive Sources that satisfy the
condition in Proposition 3. If there exists such two Sources, Candidates = ∅; the
algorithm terminates.
8. Candidates = Candidates ∩ SourcesKeller . If Candidates = ∅, the algorithm
terminates; otherwise go to Step 3.</p>
        <p>Step 3:
9. For every Source ∈ Candidates, if deleting Source has side-effects on a
descendant according to Lemma 2, Candidates = Candidates − Source.
10. The algorithm terminates. If Candidates = ∅, there is no correct translation of
deleting vei; otherwise each Source ∈ Candidates 0is a correct translation.
Theorem 1. After the above algorithm, if Sources(vei) is empty, deleting vei is
untranslatable. Otherwise deleting ∀source ∈ sources0(vei) is a correct translation of
deleting vei.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>7 Conclusion</title>
      <p>In this paper we presented an algorithm for correctly translating the deletion of an XML
view element as deleting an element in the underlying XML base. Our algorithm uses
a schema-level analysis to efficiently find a correct translation and it is based on the
previous work for updating relational views, extending this with recursive types and
cardinality constraints in XML, and ”//” operator in XQuery. Our algorithm is sound
and complete.</p>
      <p>This paper forms a first major step in studying view updates in XML scenario.
Future work needs to consider incorporating other update operations such as insert, replace
and XML specific operations and considering updating multiple elements. Further, we
need to consider more semantics both in XML Schema and XQuery statements.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. http://www.w3.org/xml/query/.
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>S.</given-names>
            <surname>Abiteboul</surname>
          </string-name>
          .
          <article-title>On views and xml</article-title>
          .
          <source>PODS</source>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>V.</given-names>
            <surname>Braganholo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Davidson</surname>
          </string-name>
          , and
          <string-name>
            <surname>C. Heuser.</surname>
          </string-name>
          <article-title>the updatability of xml views over relational databases</article-title>
          ,
          <year>June 2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Cui</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Widom</surname>
          </string-name>
          .
          <article-title>Lineage tracing for general data warehouse transformations</article-title>
          .
          <source>In The VLDB Journal</source>
          , pages
          <fpage>471</fpage>
          -
          <lpage>480</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>U.</given-names>
            <surname>Dayal</surname>
          </string-name>
          and
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Bernstein</surname>
          </string-name>
          .
          <source>On the Correct Translation of Update Operations on Relational Views. In ACM Transactions on Database Systems</source>
          , volume
          <volume>7</volume>
          (
          <issue>3</issue>
          ), pages
          <fpage>381</fpage>
          -
          <lpage>416</lpage>
          ,
          <year>Sept 1982</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>F.</given-names>
            <surname>Bancilhon</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Spyratos</surname>
          </string-name>
          .
          <source>Update Semantics of Relational Views. ACM Transactions on Database Systems (TODS)</source>
          , pages
          <fpage>557</fpage>
          -
          <lpage>575</lpage>
          ,
          <year>1981</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>M.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Tan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Suciu</surname>
          </string-name>
          . SilkRoute:
          <article-title>Trading between Relations and XML</article-title>
          . http://www.www9.org/w9cdrom/202/202.html, May
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>International</surname>
          </string-name>
          <article-title>Organization for Standardization (ISO) &amp; American National Standards Institute (ANSI).</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>A. M.</surname>
          </string-name>
          <article-title>Keller. Algorithms for Translating View Updates to Database Updates for View Involving Selections, Projections and Joins</article-title>
          .
          <source>In Fourth ACM SIGACT-SIGMOD Symposium on Principles of Database Systems</source>
          , pages
          <fpage>154</fpage>
          -
          <lpage>163</lpage>
          ,
          <year>1985</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. H.
          <string-name>
            <surname>Kozankiewicz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Leszczyowski</surname>
            , and
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Subieta</surname>
          </string-name>
          .
          <article-title>Updatable xml views</article-title>
          .
          <source>Advances in Databases and Information Systems</source>
          , pages
          <fpage>381</fpage>
          -
          <lpage>399</lpage>
          ,
          <year>September 2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>P.</given-names>
            <surname>Lehti</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Fankhauser</surname>
          </string-name>
          .
          <article-title>Towards type safe updates in xquery</article-title>
          . http://www.ipsi.fhg.de/ lehti/Typing
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <article-title>Oracle Technologies Network. Using XML in Oracle Database Applications</article-title>
          . http://technet.oracle.com/tech/xml/htdocs/about oracle xml products.htm,
          <year>November 1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>M.</given-names>
            <surname>Rys</surname>
          </string-name>
          .
          <article-title>Bringing the Internet to Your Database: Using SQL Server 2000</article-title>
          and
          <article-title>XML to Build Loosely-Coupled Systems</article-title>
          . In VLDB, pages
          <fpage>465</fpage>
          -
          <lpage>472</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>R.</given-names>
            <surname>Vercammen</surname>
          </string-name>
          .
          <article-title>Updating xml views</article-title>
          .
          <source>VLDB PhD Workshop</source>
          , page 6ł10,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15. L.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>E. A.</given-names>
          </string-name>
          <string-name>
            <surname>Rundensteiner</surname>
            , and
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Mani. UFilter: A Lightweight XML View Update</surname>
          </string-name>
          <article-title>Checker</article-title>
          . In ICDE, poster paper,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>