<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Open Corpus Adaptation++ in GALE: Friend or Foe?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>David Smits</string-name>
          <email>d.smits@tue.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paul De Bra</string-name>
          <email>debra@win.tue.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Mathematics and Computer Science, Eindhoven University of Technology</institution>
          ,
          <addr-line>Postbus 513, 5600 MB Eindhoven</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>“Open” has quickly become the hottest topic in any field related to information, including open government data, open learning resources, open user models, … Open Corpus Adaptation has been defined as the ability to perform adaptation to resources located anywhere on the Web. This leaves the definition of and control over the adaptation in a central place. GALE adds the ability to have the adaptation (definition) distributed over the Web. In this paper we describe how GALE achieves this functionality and we raise the question whether this is actually a desired feature or potentially a dangerous addition with unintended consequences.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Using hypertext to open up and link all available information was first suggested by
Ted Nelson when introducing Xanadu (see http://www.xanadu.net/) and became a
reality soon after the introduction of the Web. The initial Web was a “safe”
environment, where all information was static. The browser could download any web page
and display it, and the user would be assured that this would not have any side effects.
Since then the (on-line) world has become much more dynamic. Our data resides “in
the cloud” and processing is done “in the cloud”. Even when we merely access websites,
they know (remember) who we are, and they may cause our browser to execute code
we have no control over.</p>
      <p>So far adaptive hypermedia applications have been “safe”: an adaptive application
is served by a single adaptive hypermedia system (AHS), providing adaptation to
local resources, and storing user-related information in a local database. These
applications are also “closed”. Initiatives to open up AHS have so far approached two
aspects: 1) the user model has become distributed [1], integrating information coming
from many different adaptive and non-adaptive applications, including social
networks, and 2) the resources have become distributed in open corpus adaptive
hypermedia [2]. Distributing user model and resources has also been a goal in the
GRAPPLE project (http://www.grapple-project.org/) in which GALE (the GRAPPLE
Adaptive Learning Environment) was developed. Within GRAPPLE it was foreseen
that the definition of the adaptation would be kept centralized, created through a
graphical authoring toolset GAT (GRAPPLE Authoring Tool) [3]. GALE has been
designed to perform adaptation to resources that can be loaded from anywhere on the
Web (retrieved through HTTP). This leads to a seemingly strange situation: an author
defines adaptation for a resource in GAT, and GALE performs adaptation to that
resource, which was created by someone else and may be located anywhere in the world,
without the author of the resource having any influence on the adaptation that will be
performed to that resource. Would it not be logical to enable authors of resources to also define
the adaptation (and user model updates) associated with that resource? This is exactly
what the Open Corpus Service in GALE allows. In Sect. 2 we describe this “open
corpus adaptation++” and then (Sect. 3) we discuss the feasibility of actual uptake
of this functionality and the potential dangers involved.</p>
    </sec>
    <sec id="sec-2">
      <title>Open Corpus Adaptation++ in GALE</title>
      <p>In “standard” use an application is defined by an author and added to GALE through
the “CAM update service”, resulting in a domain model (DM) which in GALE
contains the conceptual structure and the adaptation for the application. The GALE event
bus can connect different DM services with the common adaptation engine. It can
also connect different user model services, including an internal GALE user model
service and the external GRAPPLE User Model Framework (GUMF). In this paper we
concentrate on the Open Corpus Service, which is a DM service.</p>
      <p>A domain model in GALE is defined using the GAM language (GALE application
model). The authoring process normally results in a set of concepts, each with
associated GAM code that defines properties, user model attributes and
event code for the concept and its attributes. All requests to GALE normally specify a
concept (not a web page). When the concept specification refers to a concept on an
external server (that is, the concept must be requested from another server through HTTP), the Open
Corpus service retrieves that concept and scans the file for a &lt;meta&gt; element with a
‘name’ attribute with value ‘gale.dm’. When no information for the current concept is
found, the Open Corpus service searches for files called concept.gdom and
concept.gam (where concept stands for the actual concept name). It does so from the
current path in the URL up to the root of the server specified. The first description
found for the current concept is used.</p>
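      <p>The upward search described above can be illustrated with a minimal Java sketch. It only enumerates the candidate locations and does not fetch anything. The class and method names, the example.org URL, the choice to pass the concept name in separately, and the order in which .gdom and .gam are tried at each level are illustrative assumptions, not taken from the GALE implementation.</p>
      <preformat><![CDATA[
```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class DescriptorLookup {

    /**
     * Lists candidate description URLs for a concept, starting in the
     * concept's own directory and moving up to the server root.
     * Assumptions: conceptUri is an absolute http URL, conceptName is
     * supplied by the caller, and .gdom is tried before .gam per level.
     */
    static List<URI> candidates(URI conceptUri, String conceptName) {
        List<URI> result = new ArrayList<>();
        String path = conceptUri.getPath();
        // Directory part of the path, always ending in "/".
        String dir = path.substring(0, path.lastIndexOf('/') + 1);
        if (dir.isEmpty()) {
            dir = "/"; // URL without a path: only the root can be searched
        }
        while (true) {
            for (String ext : new String[] {".gdom", ".gam"}) {
                result.add(conceptUri.resolve(dir + conceptName + ext));
            }
            if (dir.length() <= 1) {
                break; // reached the server root
            }
            // Drop the last directory segment: "/a/b/" becomes "/a/".
            dir = dir.substring(0, dir.lastIndexOf('/', dir.length() - 2) + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical concept URL, used only for illustration.
        URI concept = URI.create("http://example.org/courses/intro/welcome.xhtml");
        for (URI candidate : candidates(concept, "welcome")) {
            System.out.println(candidate);
        }
    }
}
```
]]></preformat>
      <p>For the hypothetical concept URL in the sketch, six candidates are listed, starting in /courses/intro/ and ending at the server root. A real service would request them in this order and stop at the first one that describes the concept.</p>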
      <p>Below is an example, http://gale.win.tue.nl/elearning.xhtml (taken from [4]), with
the following content:</p>
      <preformat>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:gale="http://gale.tue.nl/adaptation"&gt;
&lt;head&gt;
&lt;meta name="gale.dm" content="
{ #[visited]:Integer `0` {
    event `if (${#suitability} &amp;&amp; ${#read} &lt; 100)
             #{#read, 100};
           else if (!${#suitability} &amp;&amp; ${#read} &lt; 35)
             #{#read, 35};`}
  #knowledge:Integer !`GaleUtil.avg(new Object[]
      {${&lt;=(parent)#knowledge},${#read}}).intValue()`
  #[read]:Integer `0`
  #suitability:Boolean `true`
  event `#{#visited, ${#visited}+1};` } " /&gt;
&lt;/head&gt;
&lt;body&gt;
&lt;p&gt;This page is a placeholder for the elearning
   concept.&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;</preformat>
      <p>We do not describe the details of the GAM syntax and semantics here, but only briefly
explain the example code:</p>
      <list list-type="bullet">
        <list-item><p>The code event `#{#visited, ${#visited}+1};` means that when
the concept is accessed the value of the “visited” attribute is increased by 1.</p></list-item>
        <list-item><p>The attribute “visited” is an integer, and when its value changes its event
code is executed, which updates the “read” attribute.</p></list-item>
        <list-item><p>The attribute “read” is also an integer.</p></list-item>
        <list-item><p>The attribute “knowledge” is an integer which is not stored but calculated
from the “read” value and the “knowledge” value of the children of the
“elearning” concept.</p></list-item>
        <list-item><p>The attribute “suitability” is a Boolean, which is “true” by default. This too is
not stored but calculated when needed. If there were prerequisites for the
“elearning” concept there would be an expression that defines the condition
for the concept to become suitable.</p></list-item>
      </list>
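      <p>To make the update cascade in this example concrete, the following Java sketch mimics its behaviour outside GALE. It is neither GAM nor GALE code: the ConceptState class and its members are hypothetical, and the way the children's “knowledge” values are averaged together with “read” is an assumption about GaleUtil.avg, whose exact behaviour is not specified here.</p>
      <preformat><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

class ConceptState {
    int visited = 0;            // stored attribute, default 0
    int read = 0;               // stored attribute, default 0
    boolean suitability = true; // default from the example
    final List<ConceptState> children = new ArrayList<>();

    /** Concept access event: increments "visited", which triggers its event code. */
    void access() {
        visited = visited + 1;
        onVisitedChanged();
    }

    /** Event code attached to "visited": raises "read" to 100 or 35, never lowers it. */
    private void onVisitedChanged() {
        if (suitability && read < 100) {
            read = 100;
        } else if (!suitability && read < 35) {
            read = 35;
        }
    }

    /**
     * "knowledge" is computed on demand rather than stored.
     * Assumption: a flat average over the children's knowledge values
     * and this concept's own "read" value, truncated to an integer.
     */
    int knowledge() {
        double sum = read;
        int count = 1;
        for (ConceptState child : children) {
            sum += child.knowledge();
            count++;
        }
        return (int) (sum / count);
    }

    public static void main(String[] args) {
        ConceptState elearning = new ConceptState();
        ConceptState child = new ConceptState();
        elearning.children.add(child);
        elearning.access();
        System.out.println("visited=" + elearning.visited
                + " read=" + elearning.read
                + " knowledge=" + elearning.knowledge());
    }
}
```
]]></preformat>
      <p>The sketch keeps the distinction made in the example between stored attributes (“visited”, “read”) and a computed one (“knowledge”), which is recalculated each time it is requested.</p>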
      <p>Another “page” can “inherit” this adaptation (GAM) code as follows:</p>
      <preformat>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:gale="http://gale.tue.nl/adaptation"&gt;
&lt;head&gt;
&lt;meta name="gale.dm" content="{-&gt;(extends)http://gale.win.tue.nl/elearning.xhtml}" /&gt;
&lt;/head&gt;
&lt;body&gt;
&lt;p&gt;This page uses the elearning template.&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;</preformat>
      <p>When a whole application domain is stored in a single file the “meta” element for the
concepts/pages would look like:</p>
      <preformat>&lt;meta name='gale.dm' content='redirect:course.gam' /&gt;</preformat>
      <p>and the file “course.gam” might have contents like:</p>
      <preformat>welcome.xhtml {
  -&gt;(extends)http://gale.win.tue.nl/elearning.xhtml
  -&gt;(extends)layout.xhtml
  &lt;-(parent)gale.xhtml
  &lt;-(parent)gat.xhtml
}
gale.xhtml {
  -&gt;(extends)welcome.xhtml
  -&gt;(parent)welcome.xhtml
}
gat.xhtml {
  -&gt;(extends)welcome.xhtml
  -&gt;(parent)welcome.xhtml
}
layout.xhtml {
  #layout:String `
    &lt;struct cols="250px;*"&gt;
      &lt;view name="static-tree-view" /&gt;
      &lt;struct rows="60px;*;40px"&gt;
        &lt;view name="file-view" file="gale:/header.xhtml" /&gt;
        &lt;content /&gt;
        &lt;p&gt;&lt;hr /&gt;Next suggested concept to study:
          &lt;view name="next-view" /&gt;&lt;/p&gt;
      &lt;/struct&gt;
    &lt;/struct&gt; `
}</preformat>
      <p>Again we do not explain this code but just illustrate that code can be shared between
different concepts/pages, and can be placed in individual files or combined into a
single GAM file.</p>
      <p>When GALE retrieves “open corpus GAM definitions” it treats them just like a
locally stored definition: the concepts are created, user model information is stored and
updated, and the adaptation of other concepts (and the retrieved concepts themselves)
can depend on user model values for both these external and internal concepts. The
event code in GAM is essentially arbitrary Java code. This has potentially serious
implications which we discuss in the next section.</p>
      <p>When “dynamic” content was first introduced on the Web it came with significant
security concerns. To illustrate:</p>
      <list list-type="bullet">
        <list-item><p>Browser plug-ins consist of executable code that can potentially harm the
end-user’s computer. This code has full access to all resources to which the browser
has access. A harmful plug-in can not only crash the browser but also wipe
the user’s hard drive, send spam messages, search the hard drive for critical personal
data such as credit card numbers and transfer them to a criminal
organization, etc.</p></list-item>
        <list-item><p>Scripting code can be made somewhat less dangerous depending on what the
scripting language allows.</p></list-item>
        <list-item><p>Java applets run within a Sandbox environment: they cannot read or
write any information on the hard drive and they can only make network
connections to the site from which they are downloaded. The end-user can
make an exception (for signed applets) to allow access to the hard drive and
network.</p></list-item>
      </list>
      <p>The Open Corpus Service in GALE allows arbitrary GAM code to be stored in the
domain model, after which it is executed by the GALE Adaptation Engine (AE). This
AE executes GAM event code, which is arbitrary Java code that stores, retrieves and
updates user model information, but that in principle can also try to do anything else.
The security measures within GALE are:</p>
      <list list-type="bullet">
        <list-item><p>The AE runs in a Sandbox environment just like browser applets. The code
has no direct access to the hard drive or the network. Its only “way out” of
the Sandbox is through the methods the Sandbox provides. These methods must
allow the service to store and retrieve user model data.</p></list-item>
        <list-item><p>The only user model access that is allowed is to the user model of the user
for whom the AE is executing code. This currently prevents GALE from
providing “group adaptation” but it is at least “secure”.</p></list-item>
      </list>
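      <p>The second restriction can be pictured as handing event code a user model handle that is already bound to a single user. The Java sketch below is illustrative only: UserModelStore and ScopedUserModel are hypothetical names, the in-memory maps stand in for the real user model service, and the sketch does not attempt to reproduce the file and network isolation of the Sandbox.</p>
      <preformat><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

class UserModelStore {
    // userId -> (attribute name -> value); stands in for a real user model service.
    private final Map<String, Map<String, Object>> data = new HashMap<>();

    /** Returns a view that can only read and write the given user's model. */
    ScopedUserModel viewFor(String userId) {
        Map<String, Object> own = data.computeIfAbsent(userId, id -> new HashMap<>());
        return new ScopedUserModel(own);
    }

    public static void main(String[] args) {
        UserModelStore store = new UserModelStore();
        ScopedUserModel alice = store.viewFor("alice");
        ScopedUserModel bob = store.viewFor("bob");

        // Event code running for alice can update her own attributes ...
        alice.set("elearning#visited", 1);
        System.out.println("alice sees: " + alice.get("elearning#visited"));
        // ... but the handle given to bob's event code cannot reach them.
        System.out.println("bob sees: " + bob.get("elearning#visited"));
    }
}

/** The only object that event code would receive: it exposes no other user's data. */
final class ScopedUserModel {
    private final Map<String, Object> attributes;

    ScopedUserModel(Map<String, Object> attributes) {
        this.attributes = attributes;
    }

    Object get(String attribute) {
        return attributes.get(attribute);
    }

    void set(String attribute, Object value) {
        attributes.put(attribute, value);
    }
}
```
]]></preformat>
      <p>Because the handle carries no user identifier that the event code could change, there is no method through which one user's code could address another user's model. This is also why such a design rules out “group adaptation” unless further methods are added.</p>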
      <p>Although the adaptation engine cannot do anything “truly harmful”, it does perform
user model updates, and with open corpus adaptation++ the AE performs user model
updates defined by possibly unknown authors. When the end-user types the URL to
access a remote concept on any server through the local AE, that local AE will execute
whatever GAM code the unknown author has written. This code may potentially
retrieve “private” user model information, and it may also destroy valuable
information in the user model. This is currently a concern that is specific to GALE, as
GALE is the only “open corpus adaptation++ engine” we know of. GALE provides
basic safety of user model information by limiting user model updates to concepts
with a URI relative to the URI of the concept where the code resides. But the issue of
what should be allowed (and what not) in open corpus adaptation++ is still open in
general.</p>
      <p>This paper presented the concept of Open Corpus Adaptation++, where not only the
corpus is distributed over the Web but also the adaptation model. This is
currently just a novel feature offered by GRAPPLE’s Adaptive Learning Environment
GALE, and it is not yet widely used because the current authoring tool set GAT does
not yet support specifying open corpus adaptation++. The code shown in Sect. 2 is clearly
not intended to be hand-written by human authors, so authoring tools will be needed.</p>
      <p>Most importantly, the paper has raised the concern that open corpus adaptation++
is potentially harmful, so we should discuss what is permissible and what should
be blocked for arbitrary adaptation models loaded from the Web.</p>
    </sec>
    <sec id="sec-3">
      <title>Acknowledgement</title>
      <p>We wish to thank the European Commission, project 215434 (GRAPPLE) for their
financial support for this research.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Abel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henze</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herder</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krause</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <article-title>Interweaving Public User Profiles on the Web</article-title>
          ,
          <source>In Proceedings of UMAP 2010, User Modeling, Adaptation, and Personalization</source>
          , LNCS
          <volume>6075</volume>
          , pp.
          <fpage>16</fpage>
          -
          <lpage>27</lpage>
          , Springer,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Brusilovsky</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henze</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <article-title>Open Corpus Adaptive Educational Hypermedia</article-title>
          , in:
          <source>The Adaptive Web</source>
          , pp.
          <fpage>671</fpage>
          -
          <lpage>696</lpage>
          , Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Hendrix</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cristea</surname>
            ,
            <given-names>A.I.</given-names>
          </string-name>
          ,
          <article-title>Design of the CAM model and authoring tool</article-title>
          .
          <source>A3H: 7th International Workshop on Authoring of Adaptive and Adaptable Hypermedia Workshop, 4th European Conference on Technology-Enhanced Learning</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Smits</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Bra</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <article-title>GALE: A Highly Extensible Adaptive Hypermedia Engine</article-title>
          ,
          <source>Proc. of the ACM Conference on Hypertext and Hypermedia</source>
          , Eindhoven,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>