=Paper= {{Paper |id=Vol-70/paper-10 |storemode=property |title=ROSA: A Data Model and Query Language for e-Learning Objects |pdfUrl=https://ceur-ws.org/Vol-70/paper10.pdf |volume=Vol-70 |dblpUrl=https://dblp.org/rec/conf/pgldb/PortoMFFSCC03 }} ==ROSA: A Data Model and Query Language for e-Learning Objects== https://ceur-ws.org/Vol-70/paper10.pdf
    ROSA: A Data Model and Query Language for e-Learning Objects

 Fábio Porto1, Ana Maria de C. Moura1, Adriana P. Fernandez1, Abílio Fernandes1, Fábio
    José Coutinho da Silva1, Gilda Helena Bernardino de Campos2, Laura Coutinho2
                                                                         2
     1                                                                   CCEAD / PUC-Rio
     Military Institute of Engineering
                                                                     Rio de Janeiro, RJ, Brazil
       Rio de Janeiro, RJ, Brazil
                                                                  {gilda, laura}@ccead.puc-rio.br
   {fporto, anamoura}@de9.ime.eb.br

                                                      Abstract
         Learning Content Management Systems (LCMS) supports e-learning applications with storage
         and efficient access for e-learning objects (LO)s. ROSA is a LCMS built as a semantic layer on
         the top of an XML native DBMS, Tamino. Together, ROSA and Tamino, offer instructional
         designers a semantic view of e-learning content. In this paper, we present ROSA Data Model
         and Query Language, designed as an extension to RDF data model and RQL query language.
         The Data Model is structured around the LO modeling and their relationships, adapted to the e-
         learning domain. An algebra defines valid operations over LO data. Queries are formulated in
         ROSAQL that extends RQL with joins, graph navigation and recursion.


1. Introduction

Distance Education, also known as open, flexible or distributed learning, is a mode of
education whereby learners are physically separated from the institution, and where the
learning process takes place outside the education establishment. Students learn where and
when it suits them, at their own pace [1]. This education mode resorts to various
educational technologies to facilitate both the learning and the communication processes
between tutors and learners.
   LCMSs are being developped with the intention of abstracting e-learning material
storage and access from applications. The fundamental unit of reference on e-learning data
modeling has been specified as a Learning Object(LO). A LO is a collection of reusable
material used to support learning, i.e., an entity that can be digital or not, and can be used
for learning, education or training [2]. A LO is identified by a set of metadata descriptors
established by an international metadata standard, such as LOM (Learning Object
Metadata) [3] and SCORM (Sharable Content Object Reference Model) [4], an extension
of LOM. The data elements that constitute a metadata instance of a LO are organized into a
hierarchy, providing information about: its general characteristics, life cycle, meta-
metadata, technical requirements, educational characteristics, intellectual property,
relationships between other LOs, LOs annotations and LO classification system.
   In a LCMS, the process of designing e-learning courses may be supported by query
processing techniques that explore metadata and semantic relationships to aid in LOs
search. This project uses Semantic Web technology to enable semantic query capability.
   The Semantic Web is built upon two W3C standards, corresponding to the second and
third layers respectively of Berners-Lee’s architecture [5]: the XML language [6], which
provides standard syntax for data representation, ensuring data exchange; and the
framework Resource Description Framework – RDF [7], a standard data model for
expressing structure and representing metadata vocabularies, serving as a metadata

PGLDB’2003, pp. 91-104, 2003.
 PUC-Rio, Rio de Janeiro-RJ, Brazil
                  ROSA: A Data Model and Query Language for e-Learning Objects


language for the Semantic Web. This model allows the user to define relationships between
resources as semantic graphs on the Web, using XML as the syntax language to transport
data. In addition, RDF includes a schema, named RDF Schema (RDFS) that can be used to
define application specific vocabularies, defining taxonomies of resources and properties
used by specific RDF descriptions.
   In this paper, our attention is focused on proposing a data model and query language as a
formal basis for the development of ROSA LCMS, in the context of the Semantic Web. LO
is the main structure of the data model, which is complemented with semantic and
aggregation associations. The latter are modeled as an extension of RDF statements. ROSA
data model provides a powerful algebra that makes it possible to query over LOs metadata,
as well as over the semantic associations between LOs. ROSA is designed as a semantic
layer middleware on the top of a DBMS. The project is sponsored by Consist, which
provides the Tamino XML native DBMS as a storage layer [8]. As a result, some of the
propositions in this paper are influenced by the target XML data model, although mappings
to other data models may also be envisaged.
   The rest of the paper is organized as follows: section 2 presents a use case that serves as
a query benchmark to describe the data model in section 3. This section gives a
comprehensive description of its structure and algebra, with examples based on the queries
presented before. Section 4 presents the ROSAQL query language and section 5 presents
related work. Finally, section 6 concludes the paper with additional comments and future
work.


2. Use case

This section presents a use case for ROSA project. It comprehends a RDF style
representation of a LO conceptual map for a Master program in Computer Systems, a class
diagram and a list of representative queries that should be answered by this system.
   The use case data and queries reflect ROSA objective of supporting the design of e-
learning courses. The LOs metadata and their relationships offer a rich semantic model for
searching the repository during class preparation. On the other hand, it is not designed as a
support for running courses, which would require execution time primitives not found in
this use case.

2.1 The LO Conceptual Map
Figure 1 presents a simplified LO conceptual map for the Master Program in Computing
Systems and the corresponding schema describing and classifying LOs and their
relationships. The map is represented through a directed-labeled graph where nodes
represent LOs, identified by their names. Directed labeled edges model relationships
between LOs, like predicates in RDF. In this map, some LOs are associated to their classes
in the schema through a dashed line.
   The classes Program, Course and Topics represent domain specific types of LO. They
model the main composition structure of courses in a certain school program. Other
relationships, such as prerequisite and is basis for also appear in the map representing a
specific domain knowledge defined by the user. In addition, the classes e-Learning
Program, Master Program and PhD Program specialize the Program class. A LogicalLO
represent a collection of LOs and may contain many PhysicalLOs. A PhysicalLO contains


                         Proceedings of the PGL DB Research Conference                     92
                           ROSA: A Data Model and Query Language for e-Learning Objects


physical media used to prepare a dicdatic material, such as image, documents, ppt files, etc.
PhysicalLOs are not represented in the Conceptual map.
                                          comprehends                                   covers
                         Program                                Course                                               Topic

                                                                                                                                 covers
              Master        PHD
                                          E-learning
             Program       Program
                                           comprehends
                                     Data            Graphs Network Relational                                  OO          OR
                                     Struct comprehends
                                                                                        covers
                                      comprehends
                                               Trees                            DB
                       comprehends                                                                         r
                                     Algorithm                                 models                   fo           SQL
                                                                                                     is_
                                                                                    Tec.          bas Relat.
             Computing                           is-prereq_of                                si _              fundaments




                                                                    co
                                                                                   Armaz.              Algebra




                                                                      ve
              Systems     comprehends




                                                                         r
                                                                         s
                                                           Data




                                                                                                     covers
               IME                                                                  Query                              OR   QUEL
                                                           bases                    Lang.                      Relat.
                         comprehends                                                                          Calculus covers
                                 Network




                                                                      covers
                    comprehends                                                    Query                                        QBE
                                                                                  Optimiz.
             Computing                                                                                           is_basis_for
              Systems                                           is_basis_for
                                                                                                      Transaction
               PUC        ….. Computer                                            Conc..




                                                                                            covers
                                                           DB
                                Theory                  retrieval                 Control
                                                                                                      Protocols

                         Figure 10: Master Program in Computing Systems – LO conceptual map

Metadata descriptors of LO classes correspond to a subset of the LOM metadata standard1
[3].
In order to give a general idea of the semantic meaning expressed by the LO conceptual
map in Figure 10, the following assertions can be made:
      • the Database course covers the topics Database Models ;
      • the topic Relational Calculus fundaments the topic SQL;
      • the topic Storage Techniques is basis for the topic Optimization;

The verbs in these assertions (relationships) are taken from a controlled vocabulary
(thesaurus) specified within an application domain. The next subsection lists relationships
considered relevant to the use case domain.

2.2 Domain predicates
The relationships considered in the LO conceptual map in Figure 10 and their synonyms
include:
      a) Aggregation predicatescomprehends (has, covers, is part of , is made of,
                 characterizes, contains, includes, etc.);Domain specific predicates
             a. prerequisite
             b. fundaments (is basis for, is condition for, etc.)
             c. requires (opposite of prerequisite)
             d. implies (determines, leads to, derivates, influences, allows for, etc.)




1
    LO attributes (metadata descriptors) have been omitted in this paper due to space restrictions


93                                       Proceedings of the PGL DB Research Conference
                 ROSA: A Data Model and Query Language for e-Learning Objects



2.3 Queries
In this section, queries covering e-learning design activity complete the use case with data
access requirements. The use case supports queries exploring LO metadata and the
semantic relationship between LOs. Queries are classified and listed bellow:
       1. LO metadata projection
              a) Which are the difficulty levels of the registered Topics?
              b) What are the formats of PhysicalLOs of type File?
       2. LO projection
               c) Which are the Program LOs registered?
       3. Semantic relationships projection
               d) Which are the associations established with Program LOs ?
       4. Projection of semantically associated LOs
               e) Which are the prerequisite Courses for the Database Course?
       5. Projection of aggregated LOs
               f) Which Topics does the Database Course comprehend?
       6. Aggregation navigation
               g) Which courses are parts of each Program?
       7. Selection on metadata
               h) Get LOs created since ‘01/01/2003’?
               i) Which Topics are classified with level ‘b’?
       8. Selection on relationships
               j) Which LOs are associated to the Database Course?
               k) Which LO does comprehend the Database Course?
               l) Which LOs do fundament the Topic ‘SQL’?
               m) Which LOs classified with level ‘b’ does the Database Course
                   comprehend?
       9. Selection on relationship identification
               n) Which are the fundamental LOs?
       10. Recursive navigation
               o) Which Topics do fundament SQL?
               p) Which Topics do the Program MS in Computing Systems comprehend?
       11. Merge
               q) Which Topics are present at both IME and PUC institutions?
       12. Physical LO retrieval
               r) Which Physical LOs does the Topic Transaction Management cover?

This system does not intend to support a natural language query interface as these queries
may suggest. Rather, use case queries are useful to measure system capabilities and to
validate the algebra and query language proposed in this paper. Queries are directly
submitted by a user interface or by an ad hoc interface using the ROSAQL query language
(see Section 4.1).




                         Proceedings of the PGL DB Research Conference                   94
                  ROSA: A Data Model and Query Language for e-Learning Objects




3. Data Model

In this Section, a Data Model for the ROSA system is proposed. Initially, the structural
aspect of the Model is specified, followed by an algebra defining a set of valid operations.

3.1 Structure
A database comprehends a database schema and database instances for a certain user
domain. A database schema, or simply schema, identified by a name, includes definitions
and rules used to create objects and relationships between them in a certain domain.
   Figure 2 presents the LO data model expressed in a UML class diagram. The
ComplexResource class is the root class of the model. It makes it possible a common
representation for LOs, Relationships and Dynamically created objects, which represent
views in the database.
    LO class derives directly from ComplexResource and corresponds to the attributes
specification dictated by the LOM metadata standard. LOs are uniquely identified by a
resource identification attribute and are classified according to a type attribute.
    LOs instances represent concepts (Logical LO) and documents (PhysicalLO) that take
part in an e-learning repository. LogicalLOs correspond to different levels of aggregations
in the domain, such as: Topic, Course and Program in Figure 1. PhysicalLO corresponds to
files and their metadata. An instance of PhysicalLO is considered atomic, which means that
it can not be composed by other LOs, whereas LogicalLOs may contain a collection of
other LOs associated by aggregation and semantic relationships.
   Associations between LOs are specified through Relationships. There are two types of
relationships: aggregation and semantic. Aggregation relationship models composition of
LOs. Its semantic is fixed and known to the model. On the other hand, semantic
relationships establish specific domain associations between LOs. Relationships are also
identified by a unique complex resource identification corresponding to the verb it
represents, as specified in the domain thesaurus.
   The associations of LOs expressed through relationships conceptually define triples
t(subject, property, object), where subject and objects are of type LO and property is
classified as a relationship. The schema specifies valid associations through triples T(class,
relationship type, class). Thus, valid instances of t are in accordance to T.
   The cardinality of associations is instance dependent. Typical association cardinality is
one to many but restrictions may limit the number of elements playing the role of objects in
a association. For this reason, relationship instances (properties) are modeled as collections
with objects as their elements.
   Relationship types are specialized according to the type of collection they implement.
Aggregations may implement the semantics of bag, set, list and hierarchy. Set and list
represent the corresponding mathematical concepts, whereas bags may contain duplicates
in the collection. Hierarchy corresponds to LOs whose contents follow a tree structure.
   Note that the collection abstraction provides a modeling construct for a whole complex
LO structure. This is important as long as the subject LO requires a specific navigation over
aggregated LOs . This construct is essential; otherwise we cannot ensure that required LOs
will be followed during an instruction.



95                       Proceedings of the PGL DB Research Conference
                  ROSA: A Data Model and Query Language for e-Learning Objects


   The classes set and bag include attributes to identify maximum and minimum
cardinalities. Those attributes introduce restrictions over the collections that specify,
respectively, the maximum and minimum number of elements that may be extracted from
the collection. In Figure 10, for instance, the exclusive or aggregation relationship between
Relational Calculus and SQL, QUEL and QBE can be modeled by a set collection with
maximum cardinality 1 and minimum cardinality 1, meaning that at least one and at
maximum one of the LOs in the set should be exhibited when presenting Relational
Calculus. The representation of sets and bags with unlimited access to its elements is
represented by minimum cardinality=unbounded.
   Finally, relationships implement semantic similar to that of statement in RDF. The data
model, nevertheless, differ from RDF data model. Firstly, LOM metadata attributes are
specified as properties of the LO class, rather than statements about the LO resource.
Hence, attribute values are not identifiable and are not considered first class objects, as in
the case of literals in RDF [9]. Aggregation and semantic relationships are modeled as
collections of LOs. This is more general than in RDF where only aggregation type of
relationships are defined as collections. Moreover, relationship cardinalities in ROSA may
restrict the participation of members in collections, which is not available in the RDF
statement semantic.
                                                         Complex
                                                         Resource


                                                                                         1..n       LO
                                   Relationship                                          1..n
                                                                                                   LOM
                                                                                                 attributes
                                                                    0..n
                                                                     Semantic
                             Aggregation   Semantic                  collection
                                                                    0..n
                                                            0..n           1..1
                                                      Aggregation                  LogicalLO PhysicalLO
                                                       collection
                                                           0..1                   1..1



                                                                                            Ci                Cj

                     List       Bag          Set        Hierarchy
                              min card     min card
                              max card     max card
                                      Figure 11: The Rosa Data Model

3.2 Algebra
In order to be complete, a data model for LOs requires the specification of valid operators
for that model, which is considered the model algebra. The algebra definition makes it
possible to express a high level query language into a formal canonical representation and,
therein, the conception of optimization strategies.
   Many algebras have been proposed covering operations in different data models ([10],
[11], [12], [13], [14]). This work presents an algebra which, in some sense, borrows from
all these works. Firstly, the majority of the operators are adapted from their relational
algebra counterparts. Path expressions through LO relationships are similar to path
expressions in the OO data model [11].




                            Proceedings of the PGL DB Research Conference                                          96
                   ROSA: A Data Model and Query Language for e-Learning Objects


   In LO algebra, operators receive collections of LOs as input and equally produce
collections of LOs as output. The closure of the model enables the composition of operators
in an algebra expression.
   The algebra includes operators to explore relationships between LOs as one of the
semantic enrichments provided by the proposed model. Aggregation relationship receives
special treatment once its semantic is fixed and known to the systems.
   Semantic relationship operators provide the processing of domain specific associations.
This processing considers the classification of predicates as reflexive, symmetric and
transitive. As an example, reflexive predicates include, implicitly, the association between a
LO and itself through that predicate. Transitive predicates make it possible the evaluation
of transitive closure operators and a symmetric predicate enables answering queries
independently of the LO role in an association, as subject or object.
   A list of operators and their semantics are presented next:

Object functions
Object functions implement per object properties access. Three types of object functions
are proposed corresponding to access to LO attributes, aggregated members and
semantically associated members.
     • gettype() – obtains a literal corresponding to the type of the current LO.
     • getLOM_attributei() – invoked on a complex resource, returns a literal
         corresponding to the value of the attribute named LOM_attributei in the current
         object.
     • semantic() – invoked on a complex resource, returns a bag of LOs associated to
         the former through any type of semantic relationship.
     • semanticrelationshipi() – invoked over a complex resource, returns a collection
         of LOs associated to the current complex resource through the semantic
         relationship named semanticrelationship i.
     • comprehends() - invoked over a complex resource, returns a collection of LOs
         that take part in an aggregation relationship within the current complex resource.
     • property() – invoked on a complex resource, returns a literal corresponding to
         the ids of the associations found in the complex resource.

Relationship fuctions
      •   getId() – invoked on relationship objects, returns the literal corresponding to the
          value of its id attribute.

Collection Operations
Operations on collections are type preserving in respect to the input collection. The
collection resulting from a semantic relationship is always of type bag. LOs obtained from
navigating through an aggregation relationship form a collection with type and restrictions
inherited from the relationship. The union of collections of different types results in a
collection of type bag.
   The processing of a collection may require the invocation of object functions over
collection members. When these functions return collections, a set-collapse operator groups
all collections into a single one.


97                        Proceedings of the PGL DB Research Conference
                  ROSA: A Data Model and Query Language for e-Learning Objects




Source (c(expression)) – This operation produces a collection of LOs from an expression. The
expression syntax is left to be defined by an implementation. It is equivalent to its
homonym operation in [15].

getNext ([r1,..]c) – This function obtains the next LO from a collection c. The navigation
order is defined by the operation implementation according to the collection type. If the
collection has a maximum cardinality n specified, then a set of LOs {r1, r2,.., rk}, where k ≤
n, identifies the interesting LOs, otherwise, the first k LOs are retrieved.

Projection Π [p1[^p2[^..^[pn]]] ] (c) – This operation receives a collection c of type C of LOs
and produces a new collection with complex resources according to p, which defines the
projection behavior. The behavior associated to p modifies the resulting collection
structure. The specification of a list of pi defines a conjunction of each pi behavior. A list of
projections pi leading to collections with elements of incompatible types are invalid.
   The different projection behaviors are presented bellow, including the formulation of the
projection operation for the corresponding use case queries.
      a) Structure projection - Π(a1,a2,...,an) (c)
           if p is a list of LO attributes (i.e. (a1,a2,...,an) ) then the resulting complex
           resources structure follows the attribute list and the attribute values are obtained
           by invoking the getLOM_attributei() function on each object in c, for ≤ i ≤ n.
               o Which are the difficulty levels of the registered Topics?
                         ƒ (difficultylevel) (IME.Topic)
               o What are the formats of PhysicalLOs of type File?
                         ƒ (format) (IME.File)
      b) LO projection - Π (c)
           If p is not present, then the resulting collection is a copy of the input collection.
               o Which are the registered Programs?
                         ƒ (IME.Program)
      c) LO projection through generic association - Π(semantic()) (c)
           If p is the generic semantic() function the resulting collection is the result of the
           union of all the collections obtained by invoking the respective function on each
           element of c.
               o Which are the Topics associated to the Database Course?
                         ƒ (semantic()) (c) – where c is a collection composed of the Database
                              course LO.
      d) Semantic relationship projection - Π (property()) (c)
           if p is the property() function, then the resulting collection will include all
           complex resources of type Relationship with id attribute value obtained from
           invoking the getid() function on the semantic relationship objects associated to
           the objects in c.
               o Which are the associations established with Program LOs?
                         ƒ (Property())(IME.Program)
      e) Projection of semantically associated LOs - Π (semanticrelationshipi()) (c)




                          Proceedings of the PGL DB Research Conference                       98
                   ROSA: A Data Model and Query Language for e-Learning Objects


           if p is a semanticrelationshipi() function, the resulting collection is the union of
           the collections obtained by invoking the corresponding function on each element
           of c.
               o Which are the prerequisite Courses for the Database Course?
                         ƒ (prerequisite) (c) – where c is a collection composed of the Database
                             Course LO.
       f) Projection of aggregated LOs - Π (comprehends()) (c)
           if p is the comprehends() aggregation relationship function, then the resulting
           collection will contain LOs obtained by applying this function on each object in
           c.
               o Which Topics does the Database Course comprehend?
                         ƒ (comprehends()) (c) - where c is a collection composed of the
                             Database Course LO.
       g) Projection on a path expression - Π (semanticrelationshipi()..semanticrelationshipn ()) (c)
               o Which Topics are covered by the Database Course and are the basis for
                   other Topics?
                         ƒ (comprehend().isbasefor()) (c) – where c is a unitary collection
                             containing the DatabaseCourse LO.
Selection σ (δai) (c) – evaluates predicates over LO metadata attributes in a collection c. The
collection may be obtained by traversing semantic or aggregation types of relationships. In
each case, selected LOs are those that evaluate true in an existential quantification
predicate. Predicates are formulated as a combination of logical operators (and, or, not) and
comparative operators (<,>,≥,≤, = ,≠) on attribute values. Predicates over LOs taking part in
relationship collections require the invocation of collection functions. This algebra
considers an instance variable associated to the collection objects so that conjunctive
predicates are implicitly defined over instance variable attributes.
   Relationship properties provide alternative paths for the selection operations. Firstly,
reflexive relationships include the subject LO in the collection of LOs to be evaluated by
the selection predicate. Selection operation on collections associated through symmetric
relationships may opt for evaluating predicates without having to follow the relationship
association, as both ends will appear as subjects of that relationship.
       a) Selection on metadata attribute σ (δai) (c)
           For each object in c, evaluate the predicate δ on LOM attribute ai, including in
           the resulting collection LOs that evaluate as true.
               o Which Topics are classified with level ‘b’?
                         ƒ    (σ (difficultylevel= ”b”) LO)
               o Get LOs created since ‘01/01/2003’.
                         ƒ    (σ (creationDate>”01/01/2003”) LO)
       b) Selection on elements in relationship collections σ (δsemantic().attributei()) (c).
           For each LO, apply the semantic() function obtaining a collection of LOs
           semantically associated to the subject LOs. For each LO found obtain the value
           of attributei and evaluate the δ predicate over it. For those evaluated as true,
           return the subject LO.
                 o Which are the LOs associated to the Database Course?
                         ƒ (σ ((semantic().id=‘Database’) (IME.Course) )



99                         Proceedings of the PGL DB Research Conference
                   ROSA: A Data Model and Query Language for e-Learning Objects


      c) Selection through a specific relationship σ (δsemanticrelationshipi ().attributei)
         (c)
          For each LO in c, apply the semanticrelationship() function obtaining a collection
          of semantically associated LOs. For each LO found obtain the value of attributei
          and evaluate the δ predicate over it. For those evaluating true, return the subject
          LO.
               o Which LOs does comprehend the Topic “SQL”?
                      ƒ Π (σ ((comprehend().id=‘SQl’) (IME.Topic) )
               o Which are the LOs that fundament the Topic “SQL”?
                      ƒ Π (σ ((fundament().id=‘SQl’) (IME.Topic) )
      d) Selection on relationship identification σ (δproperty ()) (c)
          For each LO in c, evaluate the existence of an association with a specific id. If it
          evaluates as true, return the LO to the resulting collection.
               o Which are the fundamental LOs?
                      ƒ Π (σ ((property()=‘fundament’) (IME.LO) )
      e) Selection on a path expression σ (δ(semanticrelationshipi ()..semanticrelationshipn ().ai) (c)
          For each LO in c, obtain the collection by iteratively invoking the semantic
          relationships in the predicate list. The predicate is evaluated over attribute ai of
          LOs found at the end of the relationship path.
               o Which Topics created by “Ana Maria Moura” are covered by the
                   Database Course and are the basis for other Topics?
                      ƒ Π σ (δ(comprehend().isbasefor().author=’Ana Maria Moura’) (c) – where c is a
                          unitary collection containing the DatabaseCourse LO.
      f) Selection on all relationship components
               o Which LOs classified with level ‘b’ does the Database Course
                   comprehend?
                      ƒ Π (σ ((level=‘b’) (Π (comprehends()) (σ (id=‘Database’) (IME.Course)))) )

Transitive Closure ϕ (relationship [iteration max], up|down) (c) – obtain a collection by recursively
invoking the referred relationship over LOs. Relationships must qualify as transitive
property. Aggregation relationship is transitive by definition. Other semantic relationships
must have transitive property set. The transitive closure iteration ends when either there are
no more paths to follow or, no more nre LOs are produced, or an iteration limit, such as
iteration max, is reached. The path from a LO may follow a down direction, default, where
relationships are followed from LOs playing the subject role towards LOs playing object
roles, or the opposite direction, up, conversely.
    a) Transitive closure down
                o Which Topics do the Program MS in Computing Systems comprehend?
                        ƒ Π ϕ (comprehend())(c) – where c is a unitary collection containing the
                            LO MS in Computing Systems.
    b) Transitive closure up
                o Which Topics do fundament SQL?
                        ƒ Π ϕ (fundament(), up)(c) – where c is a unitary collection containing
                            the SQL LO.




                           Proceedings of the PGL DB Research Conference                           100
                  ROSA: A Data Model and Query Language for e-Learning Objects


Union (c1 ∪ c2) - The resulting collection contains the distinct LOs in collections c1 and
c2. The LO equality operation considers the identification attribute value for duplicate
elimination.
             o Which Topics and Courses were created by ‘Ana Maria Moura’?
                    ƒ Π (σ(autor=”Ana Maria Moura”) (Topic) ∪ σ(autor=”Ana Maria Moura”)
                       (Course))

Join     (c1)p(c2)- The resulting collection includes the LOs in collections c1 and c2 whose
attribute values match the predicate in p.
                o Produce LOs from collections Topic and Course that agree on their
                    authors?
                        ƒ Π ( (t Topic) (t.author=c.author) (c Course) )


4. Query Language

Ad-Hoc data base queries are directly submitted to the ROSA system through the ROSAQL
query language. The system also includes a QBE like environment for submitting queries
following a user-friendly interface. In the next subsection the ROSAQL query language is
briefly presented.

4.1 ROSAQL
       return [[p1[^p2[^..^[pn]]] ] [C] (c)]
       [from Ei(Ci) [v] [in δ ai] [, Ej(Cj)] [on δ]
       [where {(δai)Ci, δ (collection-function),
               δ (p1^p2^...^δpn)}]
       [start at [Ci with δpi] through [relationship, . ] {up,down} [until limit]]
       [order by {LOM attribute } {asc,desc]
       [union]

      a) return – specifies the complex resource structure to be produced. Its behavior is
         defined by the pi term according to the algebra. In the absence of a from clause, a
         collection c may be specified as a source for the database collection.
      b) from – specifies the input collections to be operated on by the query. A collection
         is specified by a pair (schema, class) where schema identifies a database and class
         is the collection class. A list of n terms in this clause should be followed by n-1
         join predicates between the collections. An instance variable v may be defined to
         iterate over a collection, in which case a selection may restrict the collection
         elements.
      c) where – specifies a selection predicate over collections involved in the query.
         Selections may be specified over LOs metadata attributes, or over relationship
         identifications. The user may also write path-expressions to navigate through a
         list of relationships and define a terminal collection over which a predicate would
         operate.
      d) Start – specifies the transitive closure operation. The input collection may be
         specified in the from clause in conjunction with the where clause or directly in the


101                      Proceedings of the PGL DB Research Conference
                ROSA: A Data Model and Query Language for e-Learning Objects


        start clause. The semantic relationship defined to guide the inference must have
        the transitive property set, unless a dot ‘.’ is specified, in which case an
        aggregation relationship is assumed. The inference stops when one of the events
        occur: no more relationship to follow; no new LOs are obtained; a limit of
        recursion is achieved. The resulting collection contains LOs obtained from each
        recursive navigation. The clauses, up and down, indicate the path navigation
        direction to be followed. A down clause processes relationships from a subject
        LO to an object LO, whereas an up clause works in the opposite direction.
     e) Order by – orders LOs according to a list of LOM attributes. If needed, the
        resulting collection is coerced to be of type list.

4.2 Query Examples
     a) Which are the difficulty levels of the registered Topics?
                   return difficulty-level (IME.Topic)
     b) Which are the registered Programs?
                   return (IME.Program)
     c) Which are the Topics associated to the Database Course?
                   return semantic() Topic (IME.Course) c
                             where c.id=’Database’
     d) Which are the prerequisite Courses for the Database Course?
                   return prerequisite() Course (IME.Course) c
                             where c.id=’Database
     e) Which Topics are covered by the Database Course and are the basis for other
        Topics?
                   return comprehend().isbasefor() Topic (IME.Course) c
                             where c.id=’Database’
     f) Which Topics are classified with level ‘b’?
                   return Topic (IME.Topic) t
                             where t.level=’b’
     g) Which LOs does comprehend the Topic “SQL”?
                   return (IME.LO) l
                             where l.comprehend().id = ‘SQL’
     h) Which Topics created by “Ana Moura” are covered by the Database Course and
        are the basis for other Topics?
                   return Topic
                   from IME.Course c in c.id=´database’
                             where       c.comprehends().author=’Ana Moura’    and
                          c.isbasisfor()
     i) Which Topics do the Program MS in Computing Systems comprehend?
                   return Topic
                   from IME.Program p in p.id=’ MS in Computing Systems’
                   start comprehend()
     j) Which Topics and Courses were created by ‘Ana Maria Moura’?
                   return Topic
                             where author=’Ana Maria Moura’
                   union


                       Proceedings of the PGL DB Research Conference                102
                  ROSA: A Data Model and Query Language for e-Learning Objects


                  return Course
                          where author=’Ana Maria Moura’
      k) Produce LOs from collections Topic and Course that agree on their authors?
                  return LO
                  from Topic t, Course c on t.author=c.author


5. Related Work

The research on algebra languages has been very intense since Codd [10] presented the
relational algebra. The majority of algebras proposed since then adapted the basic relational
operators set to new data models, in addition to proposing new operators to deal with
special aspects of the data model.
   Research on e-learning data models is still in its infancy and, to the best of our
knowledge, no previous work has proposed an algebra for query languages in this domain.
A reasonable comparison could be made with algebras proposed for the RDF data model.
RAL is an algebra for querying RDF [13]. RAL models RDF as a finite set of triples
composed of resources associated through properties, which form a directed labeled graph.
The model integrates RDF and RDFS definitions. Nodes in the graph are either resources or
literals, both labeled by unique identifiers. The algebra operators receive a collection of
nodes and process then with adapted relational operators, which include: projection,
selection, cartesian product, join, union, difference, intersection and loop operators that
iterate over collection nodes. The operations navigate through the RDF graph nodes using
uniformly the projection operation to access object values from resources playing the role
of subjects in triples.
   ROSA data model differs from RAL in many aspects. Firstly, the model accepts attribute
definition for LOM metadata attributes, whose literal values are not identifiable. In
addition, although conceptually forming triples, the data model represent relationships as
one-to-many associations that take part in LO structure. Moreover, associations in ROSA
may be restricted by maximum and minimum cardinality definitions and properties qualify
associations as reflexive, symmetric and transitive. These aspects extend RDF data model
towards a more semantic model. Regarding the algebra, ROSA includes path expressions
and transitive closure operations that provide inferences on the intentional data model.


6. Conclusion

Nowadays, Distance Education (DE) is a rapidly expanding mode of education. Many
courses traditionnally taught in a classroom environment by a teacher, are now being
transformed into an electronic sibling, in order to be adapted to this new education
methodology. Learning objects have been used as a promising solution to develop standard
infrastructures enabling tutors to create and organize their didactic material.
   This paper presented ROSA, a repository system to store and access LOs, based on their
semantic properties, expressed in RDF. The system includes a powerful data model whose
algebra is completed described. In order to test the algebra capability, a query language,
named ROSAQL is under development. We use the semi-structured DBMS Tamino 4.0 [8],




103                      Proceedings of the PGL DB Research Conference
                   ROSA: A Data Model and Query Language for e-Learning Objects


a native XML DBMS to store LOs. The use of the XML data model for representing RDF
sentences provides an easy data interchange and query capabilities.
   There is a vast spectrum of research issues to be addressed in the near future. The algebra
can be extended to support inferences and analysis on the database schema. Equivalence
rules and heuristics must be conceived in order to enable queries optimization. There is also
a study, already in course, for integrating a Thesaurus to the system helping users with
terms relationships.


References

[1]    The     British    Counsil,     Educação      a     Distância,     2001    available    at
      www.britishcouncilpt.org/education/distance.htm.
[2] Jacobsen, P. E-learning Magazine
      http://www.elearningmag.com/elearning/article/articleDetail.jsp?id=5043.
[3] IEEE: “Draft Standard for Learning Object Metadata”; 15 July 2002.
[4] Advanced Distributed Learning Sharable ontent Object Reference Model Version 1.2 – The
     Scorm Overview, http://www.adlnet.org.
[5] The Semantic Web - XML2000. http://www.w3.org/2000/Talks/1206-xml2k-tbl/Overview.html.
[6]   W3C (World Wide Web Consortium). XML                   -   Extensible   Markup   Language.
      http://www.w3.org/XML/ - Last access Jan., 2003.
[7] Resource Description Framework (RDF) Model and Syntax Specification. 1999.
    http://www.w3.org/TR/PR-rdf-syntax/, 1999. Last access Oct. 2001.
[10] Codd, E.F., A relational model of data for large shared data banks, Comm. Of the ACM,
     13(6), June 1970, pp. 377-387.
[11] Catell, R.G.G., Barry, D.K., Berler, M., Eastman, J., Jordan, D., Russell, C., Olaf, S.,
     Stanienda, T., and Velez, F., editors. Object Data Standard ODMG 3.0. Morgan Kaufman,
     January 2000.
[12] W3C (World Wide Web Consortium, The XML Query Algebra, Working Draft, May
     2000,http://www.w3c.org/TR/2000/WD-query-algebra-20001204
[13] Francincar,F., Houben G., Vdovjak R., RAL: an Algebra for Querying RDF, 3rd Conf. on Web
      Information Systems Engineering, Singapore,12-14 Dec. 2002
[14] Francincar,F., Houben G., Pau C., XAL: an Algebra for XML Query Optimization, in Database
      Technologies 2002, 13th Australasian Database Conference, Volume 5 of Conferences in
      Research and Practice in Information Technology, pp. 49-56. Austraolian Computer Society
      Inc., 2002
[15] Jeffrey Naughton, David DeWitt, David Maier et all, The Niagara Internet Query System,
      IEEE Data Engineering Bulletin 24(2): 27-33 (2001)
[16] M. Fernandez, J. Simeon, P.Wadler, A semi-monad for semi-structured data, Int’l Conference
     on Database Theory, London, January 2001




                          Proceedings of the PGL DB Research Conference                       104