<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Future Work</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Computer Science, FEE CTU Prague</institution>
          ,
          <addr-line>Karlovo n ́am. 13, 121 35 Prague</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2006</year>
      </pub-date>
      <abstract>
        <p>The aim of this paper is to outline an uniform functional approach useful both for constructing query languages for XML and also for formal description of their semantics. This framework offers an alternative way to today's mainstream query languages - XPath and XQuery. With respect to the goal of the VLDB PhD Workshop it is focused more on the concept than on the complete solution.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>2</p>
    </sec>
    <sec id="sec-2">
      <title>Motivation</title>
      <p>Evolution of a query language usually brings some
changes in its syntax or/and the semantics. For
example, the progress from XPath 1.0 to XPath 2.0 (i.e.
sub–specification of XQuery 1.0) brought a new data
model for XML. It was a logical and necessary step
forced by requirements given for the new language.
Our idea is to use such formalism (formal system) as a
base for the language that is universal enough to avoid
big changes when enriching the language with new
constructs. Therefore the idea of a functional approach
combined together with basic lambda calculi
operations (lambda abstractions and lambda applications)
seems to be very simple and not restrictive, rather it
allows to be easily extended in the future.</p>
      <p>Our work tries to link to the idea of the XML-λ
Framework (see Section 3), explore it further and
study its various aspects from different points of view.
Its author already identified some issues and
drawbacks of the framework. We see one of the biggest
issues the missing comparison of the proposed query
language with the XQuery language. The question of
expressive power of these two languages is very
challenging because these approaches represent different
formal systems and seems to be not solved yet.</p>
      <p>To lay out more specific goals for our
consecutive work we formulate the following aims: (a)
Utilize the XML-λ Framework for formal description of
XQuery semantics, (b) Construct a XQuery-to-XML-λ
query compiler (and vice–versa) and compare
expressive power of XQuery and XML-λ query languages,
(c) Evaluate functional approach for various aspects
related to querying XML.
3</p>
    </sec>
    <sec id="sec-3">
      <title>The XML-λ Framework</title>
      <p>
        XML-λ framework is an idea of using functional
approach together with lambda calculi [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and utilize it
for querying XML data. The proposal was originally
developed by Pokorny´ [16, 17].
      </p>
      <p>The following sections outline the theoretical base
of the XML-λ Framework and the query language built
upon it. In Section 3.3 we show an example of a query
evaluation.
3.1</p>
      <sec id="sec-3-1">
        <title>Concept</title>
        <p>The type system is built on base B – a set
containing a finite number of base types S1, . . . , Sk (k ≥ 1).
Type hierarchy is then created by following inductive
definition:</p>
        <p>Type System T . Let B is a set of primitive types
S1 . . . Sn, n ≥ 1. Type System T over base B is the
least set containing types given by 1.-4.</p>
        <p>1. base type: each member of B is type over B
2. functional type: if T1 and T2 are types over B,
then (T1 → T2) is also a type over B
3. n-tuple type: if T1, . . . , Tn (n ≥ 1) are types over</p>
        <p>B, then (T1, . . . , Tn) is type over B
4. union type: if T1, . . . , Tn (n ≥ 1) are types over B,
then (T1 + . . . + Tn) is type over B
Subsequently we define a regular type system Treg that
extends type system T with regular constructs:</p>
        <p>Type System Treg. Let B= {String, Bool}, let
N AM E be a set of names. Type System Treg is the
least set containing types given by 1.-6.</p>
        <p>1. Every member of the base B is an (primitive) type
over B.
2. named character data: Let tag ∈ N AM E. Then
tag : String is an (elementary ) type over B,
tag : is an (empty elementary ) type over B.
3. Let T be a group type or named character data.</p>
        <p>Then
• zero or more: T ∗ is a type over B.
• one or more: T + is a type over B.</p>
        <p>• zero or one: T ? is a type over B.
4. alternative: Let T1 and T2 be types. Then (T1|T2)
is a type over B.
5. sequence: Let T1, . . . , Tn be types.</p>
        <p>(T1, . . . , Tn) is a type over B.</p>
        <sec id="sec-3-1-1">
          <title>Then</title>
          <p>6. named type: Let T be a type given by a step from
3.-5. Let tag ∈ N AM E. Then tag : T is a type
of B.</p>
          <p>Note here that this type system Treg is still suitable
to describe types not only in a concrete data structure
(in our case tree structure) but for all data that might
be tagged with name - for example data accordant to
the relational model.</p>
          <p>Having the Treg type system we have to extend it
to be able to work with XML data. We build the
type system TE induced by Treg. Key idea is to define
abstract elements that are particular XML elements
with some content and also define a set containing all
abstract elements within an XML instance – E .</p>
          <p>Type System TE . Let Treg over base B be a type
system from definition 3.1 and E is the set of abstract
elements. Then type system TE induced by Treg is the
least set containing type given by this rule:
• Let tag : T ∈ Treg. Then T AG : T is a member of
TE . (Replacement of all tags in tag : T by upper
case version)</p>
          <p>Note one remark: we can see the type system TE
as function container that adds functions for
extracting data values from elements (via abstractions and
projections) with two ways
• simple element: if tag : String ∈ Treg, then (E →
tag : String) ∈ TE
• compound element: if tag : T ∈ Treg, then (E →</p>
          <p>T 0) ∈ TE
3.2</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>XML-λ Query Language</title>
        <p>A typical query has a main (body) part – an expression
to be evaluated over data – and a constructor part
that wraps query result and forms the XML output.
XML-λ’s Query Language (XLQL) is based on λ-terms
defined over the type system TE .</p>
        <p>Main constructs of the language are variables,
constants, tuples, use of projections and λ-calculus
operations – abstractions and applications. Tagged
terms might be used for declaring functions.
Syntax of this language is similar to λ-calculus
expression thus queries are structured as λ-expresions, i.e.
λ . . . (λ . . . (expression) . . .). In addition, there are
also typical constructs such as logical connectives,
constants or comparison predicates.</p>
        <p>XML-λ Query Language is inductively defined as
the least set containing all terms created by
application of following rules:</p>
        <p>Let T, T1, . . . , Tn, n ≥ 1 be members of T . Then
1. variable: each variable of type T is a term of type</p>
        <p>T
2. constant: each constant (member of F ) of type T
is a term of type T
3. application: if M is a term of type ((T1, . . . , Tn) →
T ) and N1, . . . , Nn are (in the same order) types
T1, . . . , Tn, then M (N1, . . . , Nn) is a term of type
T
4. λ-abstraction: if x1, . . . , xn are distinct
variables of types T1, . . . , Tn and M is a term of
type T , then λx1, . . . , xn(M ) is a term of type
((T1, . . . , Tn) → T )
5. n-tuple: if N1, . . . , Nn are terms of types
T1, . . . , Tn, then (N1, . . . , Nn) is a term of type
(T1, . . . , Tn)
6. projection: if (N1, . . . , Nn) is a term of type
(T1, . . . , Tn), then N1, . . . , Nn are terms of types
T1, . . . , Tn
7. tagged term: if N is a term of type N AM E and
M is a term of type T then N : T is a term of
type (E → T ).</p>
        <p>Simple implementation of this language is available
in [18].
3.3</p>
      </sec>
      <sec id="sec-3-3">
        <title>Example of a Query</title>
        <p>Following example should give more detailed view to
an evaluation of a XML-λ query. Let us consider the
DTD and a conforming XML instance as shown in
Figure 1 and Figure 2 and a query that will return
first names of all authors mentioned in the document.
&lt;!ELEMENT book (author+,</p>
        <p>title, price)&gt;
&lt;!ELEMENT author (first, last)&gt;
&lt;!ELEMENT first (#PCDATA)&gt;
&lt;!ELEMENT last (#PCDATA)&gt;
&lt;!ELEMENT title (#PCDATA)&gt;
&lt;!ELEMENT price (#PCDATA)&gt;
Solution First we specify a set E of all abstract
elements contained in XML instance as follows (note
that subscripts are here only to distinguish abstract
elements of the same type):
E = {book, author1, f irst1, last1, author2, f irst2,
last2, name, price}
&lt;book&gt;
&lt;author&gt;
&lt;first&gt; Jaroslav &lt;/first&gt;
&lt;last&gt; Pokorny &lt;/last&gt;
&lt;/author&gt;
&lt;author&gt;
&lt;first&gt; Karel &lt;/first&gt;
&lt;last&gt; Richta &lt;/last&gt;
&lt;/author&gt;
&lt;title&gt; XML Technology &lt;/name&gt;
&lt;price&gt; 155.00 &lt;/price&gt;
&lt;/book&gt;</p>
        <p>Upon this base we construct type system TE of
functions based by given XML schema (in our case DTD).
See the list of types available:
BOOK : (AU T HOR+, T IT LE, P RICE),
AU T HOR : (F IRST, LAST ),
F IRST : String,
LAST : String,
T IT LE : String,
P RICE : String</p>
        <p>Now, let us have a look at the following simple query
solving our example:</p>
        <p>λf (/book/author(a) and a/f irst = f )</p>
        <p>Evaluation of this query takes place in following
way: this query is composed of one high-level λ-term
and contains two variables called a and f .
Navigation over the structure of elements is implemented as
continuous λ-applications combined with projections
(performed by element name). When evaluating the
value of the query first the value of a is evaluated
(note that /book/author(a) is syntactically equivalent
to author(book(a))). This term is evaluated using
sequent application of book and author functions. Then
the variable a contains an abstract element of type
author. We apply to this element function author that
returns a n-tuple of elements from E. Last step is
projection from this n-tuple into the first component.
The result of this operation is a particular abstract
element.</p>
        <p>Hereby we receive the output (without enveloping
elements): Jaroslav, Karel.
3.4</p>
      </sec>
      <sec id="sec-3-4">
        <title>Utilizing the Functional Approach</title>
        <p>In the previous sections we describe a query language
based on the XML-λ Framework. It uses functions
as a data model for XML documents and lambda
calculi abstractions and applications for development of
a query language suitable for querying XML.</p>
        <p>Beside that we can also use this data model for
different purposes. Due to universality of the λ-calculus,
we can construct a language for modifying XML
data. Generally it means performing changes
(additions/removals) in the set E of abstract elements for
a particular XML instance and eventual changes in
the type system TE. With this approach we can also
be able to express integrity constraints for XML data.
This research direction however lays out of scope of
this paper.
4</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Relationship Between XQuery and XML-λ</title>
      <p>
        XQuery [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is a mainstream language for querying
XML. The language uses expressions as its
fundamental constructs. Basic expressions are literals and
variables, for navigating in XML documents it uses XPath
expressions. Related specification [15] contains an
extensive list of predefined numeric, arithmetic, string or
date-time functions.
      </p>
      <p>One important type of expression are the FLWOR
expressions. FLWOR is a shortcut for For, Let, Where,
Order-By, Return statements that are used in the same
way as in imperative languages. This type of
expression is sometimes compared to SQL SELECT
statement known from relational databases.</p>
      <p>The goal of our research is to explore the
relationship between XQuery and XML-λ. Thus, in XQuery
we can write a query returning all first names in our
bibliography database in following way:
&lt;result&gt; {
for $x in doc("bib.xml")/book/author
return (
&lt;fname&gt;</p>
      <p>{$x/first/text()}
&lt;/fname&gt;)
} &lt;/result&gt;</p>
      <p>A query with the same semantics but written in
XML-λ has (of course) different syntax as shown
below:
xmldata("bib.xml")
lambda fname f
(/book/author(a) and a/first = f)</p>
      <p>Both queries however return the same result:
&lt;result&gt;
&lt;fname&gt; Jaroslav &lt;/fname&gt;
&lt;fname&gt; Karel &lt;/fname&gt;
&lt;/result&gt;</p>
      <p>We want to show that all queries in XQuery can
be rewritten into XML-λ and vice–versa. This claim
should be backed by formal description of denotational
and operational semantics of both languages and by
finding a transformation between semantic subtrees of
all language constructs.
Future work should lead into a successful completion
of the dissertation thesis. Therefore we need to define
concrete tasks and formulate consecutive experiments.</p>
      <p>The proposed dissertation thesis investigates an
utilization of the λ-calculus based framework for
description of query language semantics. We plan to
formulate both denotational and structured operational
semantics of the XQuery language. Having this
formalism available we will compare the expressive power
of these languages and further study their properties.
Our primary aim is to define the existing semantics
of XQuery in a functional way as a base for ongoing
experiments and then compare their expressive power
by finding transformations between queries in XQuery
and XML-λ.</p>
      <p>There are two concrete ongoing tasks – specification
of XQuery semantics by using the XML-λ Framework
and consecutive construction of transformation engine
for converting XQuery and XML-λ queries.
5.1</p>
      <sec id="sec-4-1">
        <title>Specification of XQuery Semantics Using</title>
        <p>λ-calculus
The functional approach applied in XML-λ might be
used not only for constructing a query language but
also for a different purpose – description of
semantics of various query languages. Within our research
we want to describe the semantics of the mainstream
XML query language XQuery with help of the
functional approach based on λ-calculus – XML-λ.
Taking into account the type system used we expect the
semantics to be easier to express in comparison with
existing W3C semantics.</p>
        <p>With respect to the scope of the XQuery semantics
specification this topic represents a complex
theoretical research.
5.2</p>
      </sec>
      <sec id="sec-4-2">
        <title>Bi-directional Transformation of XQuery</title>
        <p>and XML-λ Queries
Expression of the XQuery semantics with help of
λ-calculus gives us an opportunity to compare
expressive power of XQuery and XML-λ. Because of the fact
that the semantics of XML-λ query language is closely
related to its syntax we can propose a method for
conversion of queries based on language’s semantics. The
result of this aim is to have a transformation engine
and its implementation for converting XQuery queries
in XML-λ and vice–versa available. The conversion
process is generally a transformation of an input
semantic tree in one language to an output semantic
tree in the other language.
6</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>This paper outlines a framework for querying XML
data based on a functional approach combined with
lambda calculi operations. The functional approach
is not common nowadays but offers a universal way
that might be used as a base for constructing various
query languages for XML. Beside that it can be also
used for specifying semantics of these languages.
Having the semantics formalized with the same tool we
can compare their expressive power and evaluate their
relationship.</p>
      <p>For consecutive work to lead into a dissertation
thesis we plan to describe semantics of XQuery using the
functional framework. With this formalism we can
construct a XQuery-to-XML-λ compiler (and vice–
versa) by transformation of respective semantic trees.
These transformations describe relationships between
various language constructs. With knowledge of these
relationships we can then compare XQuery’s
expressive power with a query language based on lambda
calculi.</p>
      <p>The main contribution of the thesis should be seen
in using functional approach for description of XQuery
semantics and for evaluation of properties of lambda
calculi based framework and its suitability for querying
XML.
7</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This research has been partially supported by MSˇMT
under research program MSM 6840770014.</p>
      <p>XML
November
[8] J. Cowan and R. Tobin. XML
information set (second edition), April 2004.
http://www.w3.org/TR/2004/REC-xml-infoset20040204/.
2004.
[15] A. Malhotra, J. Melton, and N.</p>
      <p>XQuery 1.0 and XPath 2.0
tions and Operators, September
http://www.w3.org/TR/xpath-functions/.</p>
      <sec id="sec-6-1">
        <title>Walsh.</title>
        <p>Func2005.
[16] J. Pokorny´. XML functionally. In B. C. Desai,
Y. Kioki, and M. Toyama, editors, Proceedings
of IDEAS2000, pages 266–274. IEEE Comp.
Society, 2000.
[17] J. Pokorny´. XML-λ: an extendible framework for
manipulating XML data. In Proceedings of BIS
2002, Poznan, 2002.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Abiteboul</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Quass</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>McHugh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Widom</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Wiener</surname>
          </string-name>
          .
          <article-title>The Lorel query language for semistructured data</article-title>
          .
          <source>International Journal on Digital Libraries</source>
          , pages
          <fpage>68</fpage>
          -
          <lpage>88</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>H.</given-names>
            <surname>Barendregt</surname>
          </string-name>
          .
          <article-title>Lambda calculi with types</article-title>
          .
          <source>In Handbook of Logic in Computer Science, Volumes</source>
          <volume>1</volume>
          (Background:
          <article-title>Mathematical Structures) and 2 (Background: Computational Structures)</article-title>
          ,
          <source>Abramsky &amp; Gabbay &amp; Maibaum (Eds.)</source>
          , Clarendon, volume
          <volume>2</volume>
          .
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Boag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chamberlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Fern</surname>
          </string-name>
          ´andez,
          <string-name>
            <given-names>D.</given-names>
            <surname>Florescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Robie</surname>
          </string-name>
          , and J.
          <source>Simeon. XQuery 1</source>
          .0:
          <string-name>
            <surname>An</surname>
            <given-names>XML</given-names>
          </string-name>
          <string-name>
            <surname>Query Language</surname>
          </string-name>
          ,
          <year>September 2005</year>
          . http://www.w3.org/TR/xquery/.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Bray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Yergeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cowan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Paoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Sperberg-McQueen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and E.</given-names>
            <surname>Maler</surname>
          </string-name>
          .
          <article-title>Extensible markup language (XML) 1</article-title>
          .1,
          <year>February 2004</year>
          . http://www.w3.org/TR/2004/REC-xml11-
          <volume>20040204</volume>
          /.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Chamberlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Berglund</surname>
          </string-name>
          , and
          <string-name>
            <given-names>e. a. Scott</given-names>
            <surname>Boag. XML Path</surname>
          </string-name>
          <article-title>Language (XPath) 2</article-title>
          .0,
          <year>September 2005</year>
          . http://www.w3.org/TR/xpath20/.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Chamberlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Robie</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Florescu</surname>
          </string-name>
          .
          <article-title>Quilt: An XML query language for heterogeneous data sources</article-title>
          .
          <source>WebDB, Lecture Notes in Computer Science</source>
          , volume
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Clark</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>DeRose.</surname>
          </string-name>
          <article-title>Language (XPath) 1</article-title>
          .0, http://www.w3.org/TR/xpath.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>