<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Neural-Symbolic System for Automated Assessment in Training Simulators: A Position Paper</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Leo de Penning</string-name>
          <email>leo.depenning@tno.nl</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bart Kappé</string-name>
          <email>bart.kappe@tno.nl</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Karel van den Bosch</string-name>
          <email>karel.vandenbosch@tno.nl</email>
        </contrib>
        <aff>TNO Defense, Security and Safety, Soesterberg, The Netherlands</aff>
      </contrib-group>
      <fpage>35</fpage>
      <lpage>38</lpage>
      <abstract>
        <p>Performance assessment in training simulators is a complex task. It requires monitoring and interpreting the student's behaviour in the simulator using knowledge of the training task and the environment, together with considerable experience. Assessment in simulators is therefore generally done by human observers. Capturing this process in an automated system is challenging and requires innovative solutions. This paper proposes a new module for automated assessment in simulators that is based on Neural-Symbolic Learning and Reasoning and the Recurrent Temporal Restricted Boltzmann Machine (RTRBM). The module is capable of using existing rules and learning new rules for performance assessment by observing experts and students performing the training tasks. These rules are used to validate and support the assessment process and to automatically assess student performance in a training simulator. The module will be developed in a three-year research project on assessment in driving simulators for testing and examination.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Performance assessment in training simulators has always
been a complex task that is generally performed by human
observers. Performance assessment by automated systems is
often limited to simple training tasks, because assessing
complex tasks requires the modelling of all interrelations
between the information present in the simulation, the
training tasks, and the constructs being assessed (e.g.
competences). Also, when it comes to more subjective assessments
(e.g., how ‘safe’ is the student driving), conventional
modelling techniques fall short, as the applied assessment rules are
often implicit and difficult to elicit from the simulation or
domain experts.</p>
      <p>We propose a new module for automated assessment as
part of the Virtual Instruction platform SimSCORM
[Penning et al., 2008]. This assessment module will be able to
learn new rules from the task description, (real-time)
simulation data, related assessment data of domain experts or
students and already existing rules (also called background
knowledge). These rules can be presented in a
human-readable (‘symbolic’) form, facilitating the validation of the
assessment rules and supporting the assessment process.</p>
    </sec>
    <sec id="sec-2">
      <title>Global Architecture</title>
      <p>The automated assessment module requires real-time
interaction with the simulator(s), the student and human
assessors, and a description of the training task, a student profile
and the simulated environment. SimSCORM provides a
generic platform for the definition and presentation of
simulation-based training content and for interaction between the
content, its users and the simulation, based on international
standards (e.g. SCORM, HLA and XML). Via this
platform the automated assessment module can easily access the
objects and attributes in the simulation and get information
on the student profile and progress.</p>
      <p>
        Figure 1 depicts the automated assessment module
(named CogAgent) in the SimSCORM context.
SimSCORM provides a player that presents a SCORM-based
training task to the students and possibly one or more
assessors (e.g. teachers, examiners or students) via a (web-based)
Learning Management System. This player uses SimAgent
to interact with the simulator(s) and CogAgent to do
automated performance assessment and learn new assessment
rules from observation. To this end, the player configures
CogAgent with information on the training task, measured
variables, student profile, assessed constructs and existing
symbolic rules. During execution of the training task,
assessors can provide feedback on the assessed constructs, which
will be presented to CogAgent as short-term evaluations
(depicted as assessment data). SimAgent will act as a
generic interface between the simulator(s) and CogAgent, and
will pre-process the data received from the simulator(s) based on
measured variable descriptions. Based on the information
from the player and SimAgent, CogAgent determines an
overall (or long-term) evaluation for the assessed constructs,
which will be presented to the students (and assessors) as the
assessment result. In parallel, it uses the measured data
and assessment data to adapt its internal knowledge of
assessment rules, resulting in new rules that can be validated
afterwards. All information, including the symbolic rules,
will be encoded in XML as part of the working memory of
the agents and will be distributed via SOAP (either locally
or via a web service).
      </p>
      <p>
        CogAgent must be able to learn new rules from
observation and existing rules, infer conclusions from these rules
and present them in a human-readable form. Research on
Neural-Symbolic Learning and Reasoning focuses on the
integration of learning techniques and architectures from
Neural Networks with the symbolic presentation and
reasoning techniques in (Fuzzy) Logic Programs
        <xref ref-type="bibr" rid="ref1">(see [Bader and
Hitzler, 2005])</xref>
        .
      </p>
      <p>
        The Neural-Symbolic model proposed for CogAgent is
based on the Recurrent Temporal Restricted Boltzmann
Machine (RTRBM) [Sutskever et al., 2009] and is depicted
in Figure 2. This partially connected symmetric neural
network implements an auto-associative memory of its input
layers (called visible layers). CogAgent contains three
visible layers that represent its beliefs, desires and intentions
        <xref ref-type="bibr" rid="ref2">(introduced by [Bratman, 1999])</xref>
        . Beliefs are variables
related to the training task (initial conditions, dynamic
behaviour and measured variables) and the student profile.
Intentions are variables related to actions or instructions, and
desires are variables related to performance assessments
(e.g. evaluations or rewards). Beliefs and intentions are
directly related to the current state of the context, whereas
desires are also related to future states through Temporal
Difference learning [Sutton, 1988]. This technique trains
the model to predict a maximum obtainable value for its
desires (e.g. overall evaluation scores) based on the current
and previous states. Otherwise, the model would only learn
to map short-term evaluations, which is not desired in this
case.
      </p>
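      <p>To illustrate the Temporal Difference idea referred to above, the following minimal sketch shows a tabular TD(0) update that propagates a short-term evaluation back to earlier states. It is not the proposed integration with the RTRBM, which still has to be investigated. The state names, the reward values, the learning rate alpha and the discount factor gamma are assumptions made for this example only.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict


def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """One TD(0) step: move V(state) toward reward + gamma * V(next_state)."""
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical episode: (state, short-term evaluation, next state).
# Only the last step receives an evaluation from the assessor.
EPISODE = [
    ("approaching", 0.0, "at_intersection"),
    ("at_intersection", 1.0, "done"),
]


def train(episodes=200, alpha=0.1, gamma=0.9):
    """Replay the same episode and return the learned long-term values."""
    values = defaultdict(float)  # unseen states start at 0.0
    for _ in range(episodes):
        for state, reward, next_state in EPISODE:
            td0_update(values, state, reward, next_state, alpha, gamma)
    return values


if __name__ == "__main__":
    learned = train()
    for name in ("approaching", "at_intersection"):
        print(name, round(learned[name], 3))
```
]]></preformat>
      <p>In this toy episode the earlier state receives no evaluation of its own, yet its estimated value converges towards the discounted value of the later evaluation. This is the behaviour we want for desires: a long-term evaluation rather than a copy of the short-term ones.</p>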
      <p>The hidden layer of the RTRBM is connected to the
visible layers with symmetric connections. Each hidden unit
represents a rule or relation between one or more visible
units. The hidden layer also contains recurrent hidden-to-hidden
connections that enable the RTRBM to learn the temporal
dynamics in the visible layers using an algorithm based on
contrastive divergence and backpropagation through time. Using
this layer, we can infer the posterior probability of beliefs,
intentions and desires given the state of current and
previous beliefs, intentions and desires.</p>
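      <p>The inference step described above can be sketched as follows, assuming binary sigmoid units and the usual RBM conditional probabilities. The function names, the weight layout and the choice to start from an all-zero hidden state are assumptions of this sketch, and training (contrastive divergence and backpropagation through time) is not shown.</p>
      <preformat><![CDATA[
```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probabilities(visible, prev_hidden, w_vh, w_hh, b_h):
    """P(h_j = 1 | v_t, h_{t-1}) for every hidden unit j.

    w_vh[i][j] is the symmetric weight between visible unit i and hidden
    unit j; w_hh[k][j] is the recurrent weight from previous hidden unit k.
    """
    probs = []
    for j in range(len(b_h)):
        total = b_h[j]
        total += sum(visible[i] * w_vh[i][j] for i in range(len(visible)))
        total += sum(prev_hidden[k] * w_hh[k][j] for k in range(len(prev_hidden)))
        probs.append(sigmoid(total))
    return probs


def visible_probabilities(hidden, w_vh, b_v):
    """P(v_i = 1 | h): reconstruct the visible units through the same weights."""
    return [
        sigmoid(b_v[i] + sum(hidden[j] * w_vh[i][j] for j in range(len(hidden))))
        for i in range(len(b_v))
    ]


def filter_sequence(sequence, w_vh, w_hh, b_h):
    """Propagate hidden activations through time over a sequence of visible states."""
    hidden = [0.0] * len(b_h)  # assumed initial hidden state
    trace = []
    for visible in sequence:
        hidden = hidden_probabilities(visible, hidden, w_vh, w_hh, b_h)
        trace.append(hidden)
    return trace
```
]]></preformat>
      <p>Because the visible-to-hidden weights are shared in both directions, clamping some visible units (e.g. beliefs and intentions) and reconstructing the others (e.g. desires) uses the same parameters as inferring the hidden rules.</p>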
    </sec>
    <sec id="sec-3">
      <title>Symbolic Rules and Fuzzy Atoms</title>
      <p>As described in section 2, the rules CogAgent needs to
encode, learn and reason about are relations (or causalities)
between XML encoded constructs, which will be called
atoms hereafter. An XML based atom describes a belief,
intention or desire as a function of measured data from the
simulator and/or assessment data from the assessors (or
students). In the case of training simulators, these data are often
expressed in both continuous and binary values. Therefore, we
need to use functions in the visible units that can express
both. In [Chen and Murray, 2003] sigmoid functions are
introduced that contain a ‘noise-control’ parameter to allow
a smooth transition from noise-free deterministic behaviour
to binary-stochastic behaviour. These continuous stochastic
functions can express both binary and continuous variables.
The ‘noise-control’ parameter controls the steepness of the
sigmoid function and can be trained, such that the behaviour
of a function dynamically changes according to the
distribution of its input values. We will extend our model with such
functions to create a Recurrent Temporal Continuous
Restricted Boltzmann Machine (RTCRBM).</p>
      <p>To express relations between atoms in symbolic rules, we
propose to use the temporal propositional logic described in
[Lamb et al., 2007]. This logic contains several modal
operators that extend classical modal logic with a notion of
past and future. All these operators can be translated to a
form that relates only to the immediate previous timestep
(denoted by the temporal operator ●). This allows us to
encode any rule from this language in the RTCRBM as a
combination of visible units (or atoms) and recurrent hidden
units that represent applied rules in the previous timestep.
For example, the proposition α“β (read as ‘α since β’) denotes that a proposition
α has been true since the occurrence of proposition β. This
can be translated to: β → α“β and α ∧ ●(α“β) → α“β,
where α and β are modelled by visible units and ●(α“β) is
modelled by a recurrent hidden unit.</p>
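      <p>The translation of the ‘since’ operator to the previous timestep can be illustrated with a small sketch that evaluates the two rules over a recorded trace. The function name and the trace representation are assumptions; the sketch also assumes that the proposition is false unless one of the two rules makes it true, and that nothing holds before the first timestep.</p>
      <preformat><![CDATA[
```python
def since(alpha_trace, beta_trace):
    """Truth value of 'alpha since beta' at every timestep of a trace.

    Applies the two rules from the text at each step:
      beta -> (alpha since beta)
      alpha AND previous(alpha since beta) -> (alpha since beta)
    """
    result = []
    previous = False  # assumed: the proposition does not hold before t = 0
    for alpha, beta in zip(alpha_trace, beta_trace):
        current = beta or (alpha and previous)
        result.append(current)
        previous = current  # plays the role of the recurrent hidden unit
    return result


if __name__ == "__main__":
    alpha = [True, True, True, False, True]
    beta = [False, True, False, False, False]
    print(since(alpha, beta))  # [False, True, True, False, False]
```
]]></preformat>
      <p>Only the value from the immediately preceding timestep is needed, which is why a single recurrent hidden unit suffices to carry the history of the rule.</p>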
      <p>We extend this logic with the use of equality and
inequality formulas to represent the atoms for continuous variables
(e.g. A=x, A&lt;x, etc.). Note that the atoms for binary
variables can also be represented as A=true or A=false, which
allows us to handle the outcome of these atoms in the same
way as with the continuous atoms. For readability, however, we
will use the classical notation A and ¬A.</p>
      <p>Due to the stochastic nature of the sigmoid functions used
in our model, the atoms can be regarded as fuzzy sets with a
Gaussian membership function. This allows us to represent
fuzzy concepts, like good and bad or fast and slow or
approximations of learned values, which is especially useful
when reasoning with implicit and subjective rules. In fact
our model can be regarded as a neural-fuzzy system similar
to the fuzzy systems described in [Kosko, 1992] and [Sun,
1994].</p>
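      <p>The behaviour of the continuous stochastic units discussed in this section can be sketched as a sigmoid applied to a noisy input, in the spirit of [Chen and Murray, 2003]. The function name, the default noise level and the output bounds are assumptions of this sketch, and the training of the steepness parameter is not shown.</p>
      <preformat><![CDATA[
```python
import math
import random


def continuous_stochastic_unit(net_input, steepness, noise_sigma=0.2,
                               low=0.0, high=1.0, rng=random):
    """Sample the state of one continuous stochastic unit.

    Gaussian noise is added to the net input, and the result is squashed by
    a sigmoid whose slope is set by the 'noise-control' parameter
    (steepness). The output always lies between low and high.
    """
    noisy = net_input + noise_sigma * rng.gauss(0.0, 1.0)
    # Clip the exponent so that math.exp cannot overflow.
    z = max(-700.0, min(700.0, steepness * noisy))
    return low + (high - low) / (1.0 + math.exp(-z))


if __name__ == "__main__":
    rng = random.Random(0)
    for a in (0.1, 1.0, 50.0):
        samples = [continuous_stochastic_unit(0.5, a, rng=rng) for _ in range(5)]
        print(a, [round(s, 2) for s in samples])
```
]]></preformat>
      <p>With a small steepness the unit responds almost linearly and stays close to the middle of its range, so it can represent a continuous variable. With a large steepness the same noise pushes the output to one of the two bounds, so the unit behaves like a binary-stochastic unit.</p>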
      <p>Now let’s take the training task depicted in Figure 3.
Using our extended temporal propositional logic, we can
describe rules about the conditions, scenario and performance
assessment related to this task.</p>
      <p>Example rules for a driver training task:</p>
      <sec id="sec-3-1">
        <title>Conditions:</title>
        <p>(Area = urban)
(Weather ≥ good)
(Time ≥ 6) ∧ (Time ≤ 18)</p>
      </sec>
      <sec id="sec-3-2">
        <title>Scenario:</title>
        <p>(Speed &gt; 0) ∧ ApproachingIntersection → CrossIntersection
ApproachingIntersection ∧ ◊(ApproachingTraffic = right)
((Speed &gt; 0) ∧ (HeadingIntersection)) “ (DistanceIntersection &lt; x) →
ApproachingIntersection</p>
      </sec>
      <sec id="sec-3-3">
        <title>Assessment:</title>
        <p>ApproachingIntersection ∧ (DistanceIntersection = 0) ∧
(ApproachingTraffic = right) ∧ □(Speed = 0) → (Evaluation = good)
ApproachingIntersection ∧ (DistanceIntersection = 0) ∧
(ApproachingTraffic = right) ∧ ◊(Speed &gt; 0) → (Evaluation = bad)
</p>
        <p>The rule with the temporal operator “ denotes that
ApproachingIntersection is true when the driver has been driving
towards an intersection since a certain distance x to an
intersection was passed. This rule and the actual value for x can
be learned from observation by clamping the actual speed,
heading and distance to the visible units and the value true
to the unit for ApproachingIntersection when the trainee is
approaching the intersection. This can be done by an
assessor or the student, but could also be automatically inferred
by the model, as explained in the next section.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Rule Encoding and Extraction</title>
      <p>To encode and extract symbolic rules in symmetric
connectionist networks, like the RBM, Pinkas [1995] describes a
generic method that directly maps these rules to the energy
function of such networks. To this end, he describes an
extension to propositional logic, called penalty logic, that applies
a penalty to each rule. This penalty can be regarded as the
“certainty” or “reliability” of a rule and is directly related to
the weights of the connections between the units that form
this rule. To apply the encoding and extraction algorithms of
Pinkas successfully to our model, we need to extend our
temporal propositional logic with the use of penalties. [Sun, 1994]
describes a method to map atoms with classical modal
operators to real values. We propose to extend this method to
create a mapping of atoms and rules with the modal
operators used in our model to penalties. Furthermore, we need to
investigate what changes are required to the algorithms to
handle the use of equality formulas and continuous
variables. For example, we need to prove that it is possible to
infer the correct value for unknown continuous variables in
a rule via pattern reconstruction based on known values and
(previously) applied rules. And to encode and extract rules
with inequality formulas we need to be able to transform
these to and from rules that contain only equality formulas.</p>
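      <p>The role of penalties can be illustrated with a small sketch in which the energy of a truth assignment is the sum of the penalties of the rules it violates. The two rules, their penalties and all function names are hypothetical, and an exhaustive search over the unknown atoms stands in for the energy minimisation that the network itself would perform.</p>
      <preformat><![CDATA[
```python
from itertools import product

# Hypothetical rules: (penalty, name, predicate over a truth assignment).
RULES = [
    (5.0, "stop for traffic from the right",
     lambda s: (not s["traffic_right"]) or s["stopped"]),
    (1.0, "keep moving", lambda s: not s["stopped"]),
]


def violated_rules(state, rules):
    """Rules violated by the assignment, as (penalty, name), highest first."""
    broken = [(penalty, name) for penalty, name, holds in rules if not holds(state)]
    return sorted(broken, reverse=True)


def energy(state, rules):
    """Sum of the penalties of all rules violated by this assignment."""
    return sum(penalty for penalty, _ in violated_rules(state, rules))


def best_completion(known, unknown_names, rules):
    """Fill in the unknown atoms so that the total penalty is minimal."""
    best_state, best_energy = None, None
    for values in product([False, True], repeat=len(unknown_names)):
        state = dict(known)
        state.update(zip(unknown_names, values))
        candidate = energy(state, rules)
        if best_energy is None or candidate < best_energy:
            best_state, best_energy = state, candidate
    return best_state, best_energy


if __name__ == "__main__":
    print(best_completion({"traffic_right": True}, ["stopped"], RULES))
```
]]></preformat>
      <p>When traffic approaches from the right, the two example rules conflict. The assignment that violates only the low-penalty rule has the lowest energy, so the more reliable rule decides the outcome.</p>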
      <p>The penalties that are encoded or learned by our model
can be used to rank the rules according to their applicability
in a certain context or scenario, giving the students and
assessors a clear overview of the applied rules. They also allow
us to resolve ambiguities in the application of rules, by using
such a ranking to select the most applicable (or reliable) rule
in each case.</p>
    </sec>
    <sec id="sec-5">
      <title>Further Research and Experiments</title>
      <p>The model described here is still conceptual and requires
further research. To summarize the previous sections, we
need to investigate the following topics:</p>
      <list list-type="bullet">
        <list-item><p>Is the proposed language for symbolic rules adequate to represent the subjective and fuzzy rules applied in performance assessment?</p></list-item>
        <list-item><p>How to determine the penalties of atoms and rules based on their modalities? And how to map penalties to temporal modalities of rules and atoms?</p></list-item>
        <list-item><p>How to transform rules with inequality formulas to and from rules with only equality formulas?</p></list-item>
        <list-item><p>If and how to adapt the rule encoding and extraction methods of Pinkas [1995] to make them applicable to the RTCRBM?</p></list-item>
        <list-item><p>How to integrate temporal difference learning in the RTCRBM for long-term evaluation of desires?</p></list-item>
      </list>
      <p>These and many other topics will be investigated in a
three-year research project on assessment in driving
simulators, carried out by TNO in cooperation with the Dutch
licensing authority (CBR), Research Center for Examination
and Certification (RCEC), Rozendom Technologies and
ANWB driving schools. The resulting automated
assessment module will be validated in several experiments on a
large student population using multiple commercial driving
simulators. If successful, the module will be used to support
the Dutch driver training and examination program.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Bader and Hitzler, 2005]
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Bader</surname>
          </string-name>
          and
          <string-name>
            <given-names>Pascal</given-names>
            <surname>Hitzler</surname>
          </string-name>
          .
          <article-title>Dimensions of neural-symbolic integration - a structured survey</article-title>
          . In
          <source>We Will Show Them: Essays in Honour of Dov Gabbay</source>
          , Volume
          <volume>1</volume>
          . International Federation for Computational Logic, pages
          <fpage>167</fpage>
          -
          <lpage>194</lpage>
          ,
          <publisher-name>College Publications</publisher-name>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Bratman, 1999]
          <string-name>
            <given-names>Michael E.</given-names>
            <surname>Bratman</surname>
          </string-name>
          .
          <source>Intention, Plans, and Practical Reason</source>
          .
          <publisher-name>Cambridge University Press</publisher-name>
          , June
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Chen and Murray, 2003]
          <string-name>
            <given-names>Hsin</given-names>
            <surname>Chen</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alan F.</given-names>
            <surname>Murray</surname>
          </string-name>
          .
          <article-title>Continuous restricted Boltzmann machine with an implementable training algorithm</article-title>
          . In
          <source>Vision, Image and Signal Processing, IEE Proceedings</source>
          , pages
          <fpage>153</fpage>
          -
          <lpage>158</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Kosko, 1992]
          <string-name>
            <given-names>Bart</given-names>
            <surname>Kosko</surname>
          </string-name>
          .
          <source>Neural Networks and Fuzzy Systems: A Dynamical Systems Approach to Machine Intelligence</source>
          ,
          <publisher-name>Prentice Hall</publisher-name>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [Lamb et al., 2007]
          <string-name>
            <given-names>Luís C.</given-names>
            <surname>Lamb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Rafael V.</given-names>
            <surname>Borges</surname>
          </string-name>
          and
          <string-name>
            <given-names>Artur S.</given-names>
            <surname>d'Avila Garcez</surname>
          </string-name>
          .
          <article-title>A Connectionist Cognitive Model for Temporal Synchronisation and Learning</article-title>
          . In
          <source>Proceedings of the Conference on Association for the Advancement of Artificial Intelligence (AAAI)</source>
          , pages
          <fpage>827</fpage>
          -
          <lpage>832</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Penning et al., 2008]
          <string-name>
            <given-names>Leo</given-names>
            <surname>de Penning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Eddy</given-names>
            <surname>Boot</surname>
          </string-name>
          and
          <string-name>
            <given-names>Bart</given-names>
            <surname>Kappé</surname>
          </string-name>
          .
          <article-title>Integrating Training Simulations and e-Learning Systems: The SimSCORM platform</article-title>
          . In
          <source>Proceedings of the Conference on Interservice/Industry Training, Simulation, and Education Conference (I/ITSEC)</source>
          , Orlando, USA, December
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [Pinkas, 1995]
          <string-name>
            <given-names>Gadi</given-names>
            <surname>Pinkas</surname>
          </string-name>
          .
          <article-title>Reasoning, nonmonotonicity and learning in connectionist networks that capture propositional knowledge</article-title>
          . In
          <source>Artificial Intelligence</source>
          , Vol.
          <volume>77</volume>
          , No.
          <issue>2</issue>
          , pages
          <fpage>203</fpage>
          -
          <lpage>247</lpage>
          , September
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [Sutskever et al., 2009]
          <string-name>
            <given-names>Ilya</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Geoffrey E.</given-names>
            <surname>Hinton</surname>
          </string-name>
          and
          <string-name>
            <given-names>Graham W.</given-names>
            <surname>Taylor</surname>
          </string-name>
          .
          <article-title>The Recurrent Temporal Restricted Boltzmann Machine</article-title>
          . In
          <source>Advances in Neural Information Processing Systems</source>
          <volume>21</volume>
          ,
          <publisher-name>MIT Press</publisher-name>
          , Cambridge, MA,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [Sun, 1994]
          <string-name>
            <given-names>Ron</given-names>
            <surname>Sun</surname>
          </string-name>
          .
          <article-title>A neural network model of causality</article-title>
          . In
          <source>IEEE Transactions on Neural Networks</source>
          , Vol.
          <volume>5</volume>
          , No.
          <issue>4</issue>
          , pages
          <fpage>604</fpage>
          -
          <lpage>611</lpage>
          , July
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [Sutton, 1988]
          <string-name>
            <given-names>Richard S.</given-names>
            <surname>Sutton</surname>
          </string-name>
          .
          <article-title>Learning to predict by the methods of temporal differences</article-title>
          . In
          <source>Machine Learning</source>
          <volume>3</volume>
          : pages
          <fpage>9</fpage>
          -
          <lpage>44</lpage>
          , erratum page 377,
          <year>1988</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>