<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Denotational Semantics of the XML- λ Query Denotational Semantics of ?the XML- Query Language? Language</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pavel Loupal</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Karel Richta</string-name>
          <email>richta@ksi.mff.cuni.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pavel Loupal</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Karel Richta</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Query Language</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DepMaratlmosetnratnosfkeSonfatwma.r2e5E</institution>
          ,
          <addr-line>n1g1i8ne0e0riPnrgaMhaF1F</addr-line>
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department ofKCaorlmovpoutnaem ́r .S1c3ie</institution>
          ,
          <addr-line>n1c2e1a3n5d PErnaghiane2ering, FEL CVUT</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2010</year>
      </pub-date>
      <abstract>
        <p>In this paper, we define formally the XML- λ Query Language, a query language for XML, that employs the functional data model. We describe its fundamental principles including the abstract syntax and denotational semantics. The paper basically aims for outlining of the language scope and capabilities. In this paper, we define formally the XML- λ Query Language, a query language for XML, that employs the functional data model. The first idea for such an attitude was published in [4]. This research brought in the key idea of a functional query processing with a wide potential that was later proven by a simple prototype implementation [6]. We can imagine two scenarios for this language; firstly, the language plays a role of a full-featured query language for XML (it has both f ormal syntax and semantics and there is also an existing prototype that acts as a proof-ofthe-concept application). In the second scenario, the lang uage is utilized as an intermediate language for the description of XQuery semantics. In [3] we propose a novel method for XQuery evaluation based on the transformation of XQuery queries into their XML- λ equivalents and their subsequent evaluation. As an integral part of the work, we have designed and developed a prototype of an XML- λ query processor for validating the functional approach and experimenting with it.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        1 Introduction
In this section, we describe the query language XML- λ, that is based on the
simply typed lambda calculus. As a formal tool we use the approach published in
? This work has been supported by the Ministry of Education, Youth and Sports under
Research Program No. MSM 6840770014 and also by the grant project of the Czech
Grant Agency (GACˇR) No. GA201/09/0990.
Richtas’ overview of semantics [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. For listing of language syntax, we use the
Extended Backus-Naur Form (EBNF) and for meaning of queries th e denotational
semantics [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
2.1
      </p>
      <p>
        Language of Terms
Typical query expression has a query part — an expression to b e evaluated over
data — and a constructor part that wraps a query result and for ms the output.
The XML- λ Query Language is based on λ-terms defined over the type system
TE as shown later. Lambda calculus, written also as λ-calculus, is a formal
mathematical system for investigation of function definition and application. It
was introduced by Alonzo Church and has been utilized in many ways. In this
work, we use a variant of this formalism, the simply-typed λ-calculus, as a core
for the XML- λ Query Language. We have gathered the knowledge from [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Our realization is enriched by usage of tuples.
      </p>
      <p>The main constructs of the language are variables, constants, tuples,
projections, and λ-calculus operations — applications and abstractions. The syntax is
similar to λ-calculus expressions, thus the queries are structured as n ested
λexpressions, i.e., λ . . . (λ . . . (expression) . . .). In addition, there are also typical
constructs such as logical connectives, constants, comparison predicates, and a
set of built-in functions.</p>
      <p>Language of terms is defined inductively as the least set containing all terms
created by the application of the following rules. Let T, T1, . . . , Tn, n ≥ 1 be
members of TE. Let F be a set of typed constants. Then:
1. variable: each variable of type T is a term of type T
2. constant: each constant (member of F ) of type T is a term of type T
3. application: if M is a term of type ((T1, . . . , Tn) → T ) and N1, . . . , Nn are
terms of the types T1, . . . , Tn,then M (N1, . . . , Nn) is a term of the type T
4. λ-abstraction: if x1, . . . , xn are distinct variables of types T1, . . . , Tn and
M is a term of type T , then λx1 : T1, . . . , xn : T1.(M ) is a term of type
((T1, . . . , Tn) → T )
5. n-tuple: if N1, . . . , Nn are terms of types T1, . . . , Tn, then (N1, . . . , Nn) is a
term of type (T1, . . . , Tn)
6. projection: if (N1, . . . , Nn) is a term of type (T1, . . . , Tn), then N1, . . . , Nn
are terms of types T1, . . . , Tn
7. tagged term: if N is a term of type N AM E and M is a term of type T then</p>
      <p>N : M is a term of type (E → T ).</p>
      <p>Terms can be interpreted in a standard way by means of an interpretation
assigning to each constant from F an object of the same type, and by a semantic
mapping from the language of terms to all functions and Cartesian products
given by the type system TE. Speaking briefly, an application is evaluated as
an application of of the associated function to given arguments, an abstraction
c’onstructs’ a new function of the respective type. The tupl e is a member of
Cartesian product of sets of typed objects. A tagged term is interpreted as a
function defined only for one e ∈ E. It returns again a function.
As for evaluation of a query, we do not need its complete derivation tree; such
information is too complex and superfluous. Therefore, in order to diminish the
domain that needs to be described without any loss of precision, we employ
the abstract syntax. With the abstract syntax, we break up the query into
logical pieces that forming an abstract syntax tree carrying all original information
constitute an internal representation suitable for query evaluation. We introduce
syntactic domains for the language, i.e., logical blocks a query may consist of.
Subsequently, we list all production rules. These definitions are later utilized in
Section 4 within the denotational semantics.
3.1</p>
      <p>Syntactic Domains
By the term syntactic domain, we understand a logical part of a language. In
Table 1, we list all syntactic domains of the XML- λ Query Language with their
informal meaning. Notation Q : Query stands for the symbol Q representing a
member of the Query domain.</p>
      <p>
        Q : Query XML- λ queries,
O : Option XML- λ options – XML input attachements,
C : Constructor XML- λ constructors of output results,
E : Expression general expressions, yield a BaseT ype value,
T : T erm sort of expression, yield a BaseT ype value,
F : F ragment sub-parts of a T erm,
BinOp : BinOperator binary logical operators,
RelOp : RelOperator binary relational operators,
N : N umeral numbers,
S : String character strings,
Id : Identif ier strings conforming to the N ame syntactic rule in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ],
N F : N ullary identifiers of nullary functions (subset of Identif ier),
P roj : P rojection identifiers for projections (subset of Identif ier).
The abstract production rules listed in Table 2 (written using EBNF) connect
the terms of syntactic domains from the previous section into logical parts with
suitable level of details for further processing. On the basis of these rules, we
will construct the denotational semantics of the language.
4
      </p>
      <p>Denotational Semantics
For description the meaning of each XML- λ query, we use denotational
semantics. The approach is based on the idea that for each correct syntactic construct
of the language we can define a respective meaning of it as a formal expression in
Query ::= Options Constructor Expression
Constructor ::= ElemConstr + | Identifier+
ElemConstr ::= Name AttrConstr ∗ (Identifier | ElemConstr)
AttrConstr ::= Name Identifier
Expression ::= F ragment
F ragment ::= Nullary | Identifier | F ragment P rojection</p>
      <p>| SubQuery | F unctionCall | Numeral | String | Boolean
T erm ::= Boolean | F ilter | ’ not’ T erm | T erm BinOper T erm
F ilter ::= F ragment RelOper F ragment
SubQuery ::= Identifier + Expression
BinOper ::= ’ or’ | ’ and’
RelOper ::= ’ &lt;=’ | ’ &lt;’ | ’ =’ | ’ !=’ | ’ &gt;’ | ’ &gt;=’
Numeral ::= Digit+ | Numeral 0.0 Digit+
Digit ::= ’ 0’ | ’ 1’ | ’ 2’ | ’ 3’ | ’ 4’ | ’ 5’ | ’ 6’ | ’ 7’ | ’ 8’ | ’ 9’
Identifier ::= Name
P rojection ::= Identifier
Nullary ::= Identifier
another, well-known, notation. We can say that the program i s the denotation of
its meaning. The validity of the whole approach is based on structural induction;
i.e, that the meaning of more complex expressions is defined on the basis of their
simpler parts. As the notation we employ the simply typed lambda calculus. It is
a well-known and formally verified tool for such a purpose.
4.1</p>
      <p>Prerequisites
The denotational semantics utilizes a set of functions for the definition of the
language meaning. For this purpose, we formulate all necessary mathematical
definitions. We start with the data types and specification of the evaluation
context followed by the outline of bindings to the TE type system. Then, all
auxiliary and denotation functions are introduced.</p>
      <p>Data Types. Each value computed during the process of the query evaluation is
of a type from T ype. Let E be a type from the type system TE, we define T ype as:
T ype ::= BaseT ype | SeqT ype
SeqT ype ::= ⊥ | BaseT ype × SeqT ype
BaseT ype ::= E | P rimitiveT ype</p>
      <p>P rimitiveT ype ::= Boolean | String | N umber
Primitive types, Boolean, String, and N umber, are defined with their set of
allowed values as usual. The type SeqT ype is the type of all ordered sequences
of elements of base types3. We do not permit sequences of sequences. The symbol
⊥ stands for the empty sequence of types – represents an unknow n type. More
3 We suppose usual functions cons, append, null, head, and tail for sequences.
precisely, we interpret types as algebraic structures, where for each type τ ∈ T ype
there is exactly one carrier Vτ , whose elements are the values of the respective
type τ .</p>
      <p>Variables. An XML- λ query can use an arbitrary (countable) number of
variables. We model variables as pairs name : τ , where name refers to a variable
name and τ is the data type of the variable – any member of T ype. Syntactically,
variable name is always prepended by the dollar sign. Each expression in XML- λ
has a recognizable type, otherwise both the type and the value are undefined.
Query Evaluation Context. During the process of query evaluation we need to
store variables inside a working space known as a context. Formally, we denote
this context as the State. We usually understand a state as the set of all active
objects and their values at a given instance. We denote the semantic domain
State of all states as a set of all functions from the set of identifiers Identif ier
into their values of the type τ ∈ T ype. Obviously, one particular state σ : State
represents an immediate snapshot of the evaluation process; i.e., values of all
variables at a given time. We denote this particular value for the variable x as
σ[[x]]. Simply speaking, the state is the particular valuation of variables. We use
the functor f [x ← v] for the definition of a function change in one point x to
the value v.
4.2</p>
    </sec>
    <sec id="sec-2">
      <title>Auxiliary Functions</title>
      <p>For the sake of readability improvement, we propose few semantic functions,
denoted as auxiliary, that should make the denotations more legible. We introduce
functions: isIdent — returns true iff its argument denotes an variable identifier
in a given State, typeOf — returns a type of given argument (one type from the
T ype set), valueOf — returns a typed value of an expression, bool — evaluates
its argument as a Boolean value, num — converts its argument into a numeric
value, and str — converts its argument into a string value.</p>
      <p>Each expression e has a distinguished type in a state σ. The type can depend
on the state because an expression can contain variables. This type is available
by calling the typeOf semantic function defined in Table 3.</p>
      <p>typeOf : Expression ×</p>
      <p>State → T ype
4.3</p>
    </sec>
    <sec id="sec-3">
      <title>XML Schema-Specific Functions</title>
      <p>For utilization of the features offered by the XML- λ Framework we propose
a number of functions working with information available in the type system.
These functions help us to access an arbitrary data model instance. An
application is informally used for accessing child elements of a given one. More formally,
it is an evaluation of a T -object specified by its name. A projection is generally
used for selecting certain items from a sequence. A nullary function. A T -nullary
function returns all abstract elements from ET . Root Element Access is a
shortcut for a common activity in the XML world — accessing the root element of
an XML document. We type it as a constant of a given type from TE .
 ⊥ if e is a nullary fragment
 Boolean if e ∈ νBoolean
 (e is a constant of the type Boolean)
 N umeral if e ∈ νNumeral
 (e is a constant of the type N umeral)
typeOf [ e] (σ) =  String i(feeis∈a νcSotnrsitnagnt of the type String)
 τ i(feiissIadevnatr[iea]b(lσe)oafntdheσ[tey]p:eττ )
 Boolean if e is a relational fragment (filter)
 e1 RelOper e2
 Boolean if e is a logical expression
 e1 BinOper e2, or not e1
appXMLDoc : E → SeqT ype
projXMLDoc : SeqT ype × τ → SeqT ype
nullXMLDoc : E × T → SeqT ype
rootXMLDoc : E
Having defined all necessary prerequisites and auxiliary functions (recalling that
the SeqT ype represents any permitted type of value), we formalize semantic
functions over semantic domains as</p>
      <p>SemQuery : Query → (X M LDoc → SeqT ype)
SemOptions : Options → (State → State)
SemExpr : Expression → (State → SeqT ype)
SemT erm : T erm → (State → Boolean)
SemF rag : F ragment → (State → SeqT ype)
SemRelOper : F ragment × RelOper × F ragment → (State → Boolean)
SemBinOper : T erm × BinOper × T erm → (State → Boolean)
4.5</p>
      <p>Semantic Equations
We start with the semantic equations for the expressions. Each expression e has
a value SemExpr[[e]](σ) in a state σ. The state represents values of variables. The
result is a state, where all interesting values are bound into local variables.
Resulting values are created by constructors. A constructor is a list of items
which can be variable identifier or constructing expression. Resulting values can
be created by element constructors. Elements can have attributes assigned by
attribute constructors.</p>
      <p>Options and Queries. The only allowed option in the language is now the
specification of input XML documents. We explore a function Dom(X ) that converts
input XML document X into its internal representation accessible under
identification X #. A query consists of query options, where input XML documents
DenDoetnatoitoantaiolnSaelmSaenmtiacnstiocfstohfetXheMXLM-λLQ-ueQryueLryanLgaunaggueage
SemT erm[ B] = λσ : State.bool[ B] if B is a constant of the type Boolean
SemT erm[ F1 RelOp F2] = λσ : State.SemRelOper[ F1 RelOp F2] σ
SemT erm[ 0not0 T ] = λσ : State.not(SemT erm[ T ] σ)
SemBinOper[ T1 ’ or’ T2] = λσ : State.(SemT erm[ T1] σ or SemT erm[ T2] σ)
SemBinOper[ T1 ’ and’ T2] = λσ : State.(SemT erm[ T1] σ and SemT erm[ T2] σ)
SemT erm[ T1 BinOper T2] = λσ : State.SemBinOper[ T1 BinOper T2] σ
SemAttrConstr[ N I] σ = attribute(N, SemExpr[ I] σ)
SemElemConstr[ N A1...AnI] σ =
= element(N, σ[ I] , SemAttrCons[ A1] σ, ..., SemAttrCons[ An] σ)
SemElemConstr[ N A1...AnE] σ =
= element(N, SemExpr[ E] σ, SemAttrCons[ A1] σ, ..., SemAttrCons[ An] σ)
SemElemConstr[ N I] σ = element(N, σ[ I] , nil)
SemElemConstr[ N E] σ = element(N, SemExpr[ E] σ, nil)
SemCons[ E1E] σ = append(SemElemCons[ E1] σ, SemCons[ E] σ)
SemCons[ I1E] σ = cons(σ[ I1] , SemCons[ E] σ)
SemCons[ ] σ = nil
SemF rag[ N ull] = λσ : State.nullXMLDoc[ N ull]
SemF rag[ Id] = λσ : State.σ[ Id]
SemF rag[ f (E1, ..., En)] = λσ : State.f (SemExpr[ E1] σ, ..., SemExpr[ En] σ)
SemF rag[ F P ] = λσ : State.(SemF rag[ F ] ◦ SemF rag[ P ] )σ
SemF rag[ (subquery)(arg)] = λσ : State.(SemExpr[ subquery] (σ)(SemExpr[ arg] (σ)))
SemF rag[ I1I2...InE] = SemExpr[ I2...InE] (σ[SemExpr[ E] σ ← I1])
SemF rag[ N ] = λσ : State.num[ N ] if N is a constant of the type N umeral
SemF rag[ S] = λσ : State.str[ S] if S is a constant of the type String
SemF rag[ B] = λσ : State.bool[ B] if B is a constant of the type Boolean
SemExpr[ F ] σ = SemF rag[ F ] σ
= λδ : XM LDoc.(SemCons[ C] (SemExpr[ E] (SemOptions[ O] (λσ.⊥)(δ)))
SemQuery[ Q] (nil) = nil
SemQuery[ Q] (cons(H, T )) = append(SemQuery[ Q] (H), SemQuery[ Q] (T ))
SemOptions[ ] = λσ : State.⊥
SemOptions[ xmldata( X ) Y ] = λσ : State.SemOptions[ Y ] (σ[Dom(X) ← X#])
are bound to its formal names, the query expression to be evaluated, and the
output construction commands. First, input files are elaborated, than an initial
variable assignment takes place, followed by evaluation of expression. Finally,
the output is constructed. The whole meaning of a query can be modeled as a
mapping from the sequence of input XML documents into a sequence of output
values of the type of T ype.
5</p>
      <p>Conclusions
In this paper, we have presented syntax and denotational semantics of the
XML- λ Query Language, a query language for XML based on simply typed
lambda calculus. We use this language within the XML- λ Framework as an
intermediate form of XQuery expressions for description of its semantics.
Nevertheless the language in its current version does not support all XML features,
e.g. comments, processing instructions, or deals only with type information
available in DTD, it can be successfully utilized for fundamental scenarios both for
standalone query evaluation or as a tool for XQuery semantics description.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>H.</given-names>
            <surname>Barendregt</surname>
          </string-name>
          .
          <article-title>Lambda calculi with types</article-title>
          .
          <source>In Handbook of Logic in Computer Science, Volumes</source>
          <volume>1</volume>
          (Background:
          <article-title>Mathematical Structures) and 2 (Background: Computational Structures)</article-title>
          ,
          <source>Abramsky &amp; Gabbay &amp; Maibaum (Eds.)</source>
          , Clarendon, volume
          <volume>2</volume>
          . Oxford University Press,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>T.</given-names>
            <surname>Bray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Paoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Sperberg-McQueen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Maler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Yergeau</surname>
          </string-name>
          .
          <article-title>Extensible markup language (XML) 1.0 (fourth edition)</article-title>
          ,
          <year>August 2006</year>
          . http://www.w3.org/ TR/2006/REC-xml-
          <volume>20060816</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>P.</given-names>
            <surname>Loupal</surname>
          </string-name>
          .
          <article-title>XML- λ : A Functional Framework for XML</article-title>
          .
          <source>Ph.D. Thesis</source>
          , Department of Computer Science and Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague,
          <year>February 2010</year>
          . Submitted.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>J.</given-names>
            <surname>Pokorny</surname>
          </string-name>
          <article-title>´. XML functionally</article-title>
          . In B. C. Desai,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kioki</surname>
          </string-name>
          , an d M. Toyama, editors,
          <source>Proceedings of IDEAS2000</source>
          , pages
          <fpage>2662</fpage>
          -
          <lpage>74</lpage>
          . IEEE Computer Society,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>K.</given-names>
            <surname>Richta</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Velebil</surname>
          </string-name>
          .
          <article-title>Sem´antika programovaıcc´h jazyku</article-title>
          .
          <source>Univerzita Karlova</source>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>P.</surname>
          </string-name>
          <article-title>Sˇar´ek. Implementation of the XML lambda language</article-title>
          .
          <source>Master' s thesis</source>
          , Dept. of Software Engineering, Charles University, Prague,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>J.</given-names>
            <surname>Zlatuˇska.</surname>
          </string-name>
          Lambdak-alkul . Masarykova univerzita, Brno, Cˇ eska´ republika,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>