=Paper=
{{Paper
|id=Vol-2600/short3
|storemode=property
|title=Combining Symbolic and Sub-symbolic AI in the Context of Education and Learning
|pdfUrl=https://ceur-ws.org/Vol-2600/short3.pdf
|volume=Vol-2600
|authors=Rainer Telesko,Stephan Jüngling,Phillip Gachnang
|dblpUrl=https://dblp.org/rec/conf/aaaiss/TeleskoJG20
}}
==Combining Symbolic and Sub-symbolic AI in the Context of Education and Learning==
<pdf width="1500px">https://ceur-ws.org/Vol-2600/short3.pdf</pdf>
<pre>
   Combining Symbolic and Sub-symbolic AI in the Context of Education
                            and Learning
                                     Rainer Telesko, Stephan Jüngling, Phillip Gachnang
                         FHNW University of Applied Sciences and Arts Northwestern Switzerland,
                                 School of Business, Riggenbachstrasse 16, CH-4600 Olten, Switzerland
                                      {rainer.telesko|stephan.juengling|phillip.gachnang}@fhnw.ch


                            Abstract

   Abstraction abilities are key to successfully mastering the Business Information
   Technology (BIT) programme at the FHNW (Fachhochschule Nordwestschweiz).
   Object-Orientation (OO) is one example of a topic that makes extensive demands on
   analytical capabilities. To test OO-related capabilities, a questionnaire (OO SET)
   for prospective and first-year students was developed, based on a Blackjack
   scenario. The main target of the OO SET is to identify clusters of students who are
   likely to fail the OO-related modules without a substantial amount of training. The
   data are interpreted with the Kohonen Feature Map (KFM), which is widely used for
   data mining and exploratory data analysis. However, like all sub-symbolic
   approaches, the KFM cannot interpret or explain its results. We therefore plan to
   add, based on existing algorithms, a “postprocessing” component which generates
   propositional rules for the clusters and helps to improve quality management in the
   admission and teaching processes. With such an approach we integrate symbolic and
   sub-symbolic artificial intelligence by building a bridge between machine learning
   and knowledge engineering.

                            Introduction

OO-related content exists in a considerable number of BIT modules, mainly related to
Business Analysis, Software Engineering and IT Architecture. Researchers generally agree
that abstraction ability is a necessary skill for OO design and OO programming (Alphonce
& Ventura, 2002; Bennedsen & Caspersen, 2008; Nguyen & Wong, 2001; Or-Bach & Lavy,
2004); however, a reliable instrument to test a person’s level of abstraction ability in
the context of OO has not yet been developed. In our research we focus on the abstraction
ability that is needed to build OO-related abstractions based on the understanding of a
predefined domain.

                            Motivation and Background

The FHNW is striving to obtain accreditation from the AACSB (Association to Advance
Collegiate Schools of Business) (AACSB, 2019). Earning an AACSB accreditation implies
that a concrete framework to continually measure the quality of the school’s programs
has to be in place. Two core processes which ensure the quality of the students’ target
skills are the admission and teaching processes. The admission process verifies that
students meet the entry requirements and have the skills necessary for mastering the
programme, and the teaching process should contain a monitoring component to measure the
students’ performance. The AI-based data analysis based on the OO SET is a valuable
contribution to ensuring the quality of these processes. The evaluation aims to identify
clusters of students and to predict their potential performance in terms of passing the
assessment stage.

AACSB requires comprehensible adjustments (system or content improvements) if the
students’ performance does not meet the requirements (Assurance of Learning process,
AoL). For this reason, we opted for a two-step data analysis method that adds an
explanatory component to the sub-symbolic AI. All in all, the OO SET can be integrated
into AoL to handle OO topics across modules. As the composition of the student body, and
thus the existing entry skills, changes over time, on-line monitoring together with a
moving time window is necessary.

                            Current State of Work

For setting up the OO SET we focus on the abstraction ability, which is relevant not only
for the beginning of programming in the small (classes, attributes, relationships,
hierarchies) but also for programming in the large (libraries, frameworks, design
patterns, software architectures). OOA and OOD are still predominant in software
engineering and

Copyright © 2020 held by the author(s). In A. Martin, K. Hinkelmann, H.-G. Fill,
A. Gerber, D. Lenat, R. Stolle, F. van Harmelen (Eds.), Proceedings of the AAAI 2020
Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice
(AAAI-MAKE 2020). Stanford University, Palo Alto, California, USA, March 23-25, 2020.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


working with models as abstractions from code (e.g. UML class diagrams) is vital not only
for software engineering but also for database engineering (ERD).

The OO SET was implemented with Google Forms (https://forms.gle/rj5NSqmgTth1dm2f7). As a
“test” domain, a scenario related to the card game Blackjack was selected, as this game
is widely known and can be explained relatively easily for a short assignment like the
questionnaire. Furthermore, the different cards and conditions (e.g. of a particular
casino) offer various possibilities for tasks in connection with OO concepts. The first
prototype contains 30 multiple choice questions, and in the first round of testing we had
27 participants. In order to get a clearer picture of the students’ aptitude and to
support a more sophisticated evaluation, every question is assigned both to an OO concept
category and to a level according to Bloom (1956). In the BIT program both students with
and without pre-knowledge in programming are enrolled. In order to “simulate” such a
situation for the OO SET, two test groups were considered: total beginners, and BIT
first-term students with some knowledge from the running Programming module.

The first part of the questionnaire asks for information about participants, such as age,
prior OO knowledge, gender, etc. and covers a basic overview of relevant OO principles by
using text, graphics and videos (see fig. 1).

            Figure 1 – OO SET tutorial for methods

The second part of the questionnaire (see fig. 2) deals with questions related to core OO
concepts. The selection of the OO concepts discussed and tested in the questionnaire was
made in consultation with BIT lecturers and based on similar field research (Bennedsen &
Schulte, 2006; Okur, 2007). The list of the OO concepts used and tested in the
questionnaire includes: classes, objects, classes vs. objects, attributes, classes vs.
attributes, methods in classes, parameters of methods, inheritance, multiplicity,
encapsulation and relationships between classes (association, aggregation, composition).
While classes and objects are regarded as rather simple, encapsulation is seen as a more
advanced concept; thus, the concepts vary in complexity. However, elements such as
polymorphism, abstract classes, interfaces, and Design Patterns that are classified as
more complex (Bennedsen & Schulte, 2006; Okur, 2007) were not part of the questionnaire.

            Figure 2 – OO SET example questions

One key criterion for selecting the elements above was to have a high degree of overlap
with the introductory programming module in BIT, which is the major challenge for students
in successfully mastering the assessment stage. Currently this programming module follows
an OO-first approach using Eclipse as Integrated Development Environment and JavaFX as
framework for programming Graphical User Interfaces. After deducting points for incorrect
answers to questions where multiple selections were possible, the average questionnaire
score across all student groups (i.e. grouped by their self-indicated level of OO
knowledge) was 55%. As expected, students who indicated they had prior OO knowledge
(i.e. identifying as “intermediate” or “advanced”) performed better than those with
little or no prior OO knowledge, as can be seen in figure 3.

            Figure 3 – Results from Bloom

Since the overall score across all Bloom levels is above 50%, and given that the scores
increase with the students’ level of OO knowledge, the questionnaire can be considered as
successfully testing the abstraction ability.

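The score computation mentioned above (points deducted for incorrect selections in
multiple-selection questions, then averaged per self-indicated knowledge level) could be
sketched as follows. The paper does not give the exact deduction scheme, so the sketch
assumes that each wrong selection cancels one correct selection and that a question score
cannot fall below zero. The function names and the toy data are hypothetical.

```python
from collections import defaultdict

def question_score(selected, correct):
    """Score one multiple-selection question in the range [0, 1].

    Assumption (not from the paper): every correct selection earns
    1/len(correct), every incorrect selection costs the same amount,
    and the result is floored at zero.
    """
    selected, correct = set(selected), set(correct)
    hits = len(selected & correct)
    misses = len(selected - correct)
    return max(0.0, (hits - misses) / len(correct))

def mean_score_by_group(responses, answer_key):
    """Average percentage score per self-indicated OO knowledge level.

    responses:  list of (group, {question_id: selected options})
    answer_key: {question_id: correct options}
    """
    per_group = defaultdict(list)
    for group, answers in responses:
        total = sum(question_score(answers.get(q, ()), key)
                    for q, key in answer_key.items())
        per_group[group].append(100.0 * total / len(answer_key))
    return {g: sum(v) / len(v) for g, v in per_group.items()}

# Made-up toy data, for illustration only.
answer_key = {"q1": {"a", "c"}, "q2": {"b"}}
responses = [
    ("beginner", {"q1": {"a", "b"}, "q2": {"b"}}),
    ("advanced", {"q1": {"a", "c"}, "q2": {"b"}}),
]
print(mean_score_by_group(responses, answer_key))
# {'beginner': 50.0, 'advanced': 100.0}
```

Other deduction schemes (for example, partial penalties) would only change
`question_score`; the grouping step stays the same.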

Test validity was checked by comparing the results of the OO SET with exam results from a
module covering abstraction abilities, namely “Introduction into BIT”. This module also
belongs to the assessment stage.

                            Data Mining using the KFM

Data from the OO SET can be used to optimize the admission process and to identify
clusters of students with similar performance. Especially interesting are the students
who fail the aptitude test, because they might share common characteristics.

For our first experiments, we used the Kohonen Feature Map (KFM) (Kohonen, 1998; Oja &
Kaski, 1999). The KFM is especially interesting when the clusters are not known in
advance, as is the case for the data related to the OO SET. The KFM is a two-layer, fully
connected, feedforward network where a multidimensional input vector is mapped to a grid
of output neurons. After training, the KFM preserves the topology of the input vectors on
the output layer, i.e. input vectors with a high degree of similarity in terms of
Euclidean distance are mapped to neighboring neurons on the output layer. In our case the
student metadata (gathered from part 1 of the OO SET questionnaire, like age, origin,
entry qualifications etc.) is coded in the input vector. The number of neurons in the
competitive output layer is chosen arbitrarily and builds the grid for student
neighborhoods sharing similar characteristics.

Figure 4 shows the U-matrix (unified distance matrix) of a trained KFM of the OO SET test
results. Dark neurons indicate dissimilarity to the neighboring neurons, and the line
highlights the borders of similarity across all dimensions of the input vectors, and thus
the similarity clusters of the trained neurons.

            Figure 4 – U-matrix

A hierarchical clustering on the distances of the neurons to their neighbors has been
used, as shown in figure 5. The result is the three clusters colored in red, green and
blue. These clusters reflect the similarity in all dimensions of the neurons.

            Figure 5 – Clusters

Certain dimensions are especially interesting for deriving conclusive rules for the OO
SET. Fuzzy rules are especially interesting because they are understandable by humans and
can easily be processed. One of the most interesting dimensions in the OO SET is whether
the student passed the test or not. The heatmap in figure 6 shows the three clusters
reduced to this dimension. The darker a neuron is, the more pronounced is its pass
characteristic, and the more a neuron contrasts with its neighbors, the more dissimilar
it is from them on this characteristic.

            Figure 6 – Pass Clusters

The first cluster, with the light colored neurons, represents students who clearly did
not pass the test. It is surrounded by the second cluster, which groups students who are
close to the pass/fail borderline. The third cluster, with the dark colored neurons,
groups the students who clearly passed the test. These clusters, combined with the
dimension weights shown in figure 7, allow meaningful fuzzy rules to be derived from the
values in the different dimensions.

            Figure 7 – Dimension Weights

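The KFM training and the U-matrix described in this section can be illustrated with a
minimal, self-contained sketch. It is not the authors' implementation: the grid size, the
learning-rate and neighborhood schedules, the Gaussian neighborhood function and the toy
input vectors are all assumptions chosen for illustration, and all function names are
hypothetical.

```python
import math
import random

def best_matching_unit(weights, x):
    """Grid position of the neuron whose weights are closest to x (Euclidean)."""
    best, pos = float("inf"), (0, 0)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = math.dist(w, x)
            if d < best:
                best, pos = d, (r, c)
    return pos

def train_som(data, rows, cols, epochs=200, lr0=0.5, sigma0=None, seed=0):
    """Train a small Kohonen map; weights[r][c] is that neuron's weight vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    steps = epochs * len(data)
    for t in range(steps):
        x = rng.choice(data)                      # random (stochastic) sample selection
        frac = t / steps
        lr = lr0 * (1.0 - frac)                   # assumed linear learning-rate decay
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # assumed shrinking neighborhood radius
        br, bc = best_matching_unit(weights, x)
        for r in range(rows):
            for c in range(cols):
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))  # Gaussian neighborhood
                w = weights[r][c]
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights

def u_matrix(weights):
    """Mean weight distance of each neuron to its 4-connected grid neighbors."""
    rows, cols = len(weights), len(weights[0])
    um = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nbrs = [(r + dr, c + dc)
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= r + dr < rows and 0 <= c + dc < cols]
            um[r][c] = sum(math.dist(weights[r][c], weights[nr][nc])
                           for nr, nc in nbrs) / len(nbrs)
    return um

# Made-up toy input: two well-separated groups of normalized feature vectors.
data = [[0.1, 0.1], [0.15, 0.05], [0.05, 0.12],
        [0.9, 0.95], [0.85, 0.9], [0.95, 0.88]]
som = train_som(data, rows=4, cols=4)
um = u_matrix(som)
```

High values in `um` mark neurons that differ strongly from their grid neighbors, which is
where a cluster border would be drawn. The hierarchical clustering of the neurons is not
part of this sketch.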

The weights of each neuron are representative of, or similar to, the characteristics of
the students mapped to that neuron. For age and education, the sections with the two
darkest greens are strongly pronounced, and they correlate with the white colored pass
dimension. This leads to a conclusive fuzzy rule: students with a high education level
and a high age will most likely pass the test.

The KFM uses a stochastic learning algorithm because input vectors are selected randomly
in order to avoid a bias. This implies that with every experiment the feature map will
look different while preserving the major topological relationships. Another important
issue is that the KFM does not provide clusters at the end of the learning algorithm.
Every neuron on the output layer stands for a best-matching unit, in our case for the
student whose input vector it matches best.

Neural networks only work well if a sufficient amount of representative data is
available, which is not yet the case. Because we are currently spreading the OO SET among
all beginning and prospective students, this situation will clearly improve.

                            KFM-based knowledge base

In summary, our approach is divided into four steps. First, we set up a KFM based on the
data gathered with the OO SET. The KFM itself does not provide any clusters. In order to
get clusters, a post-processing is necessary in the second step. Our approach is based on
“coloring” output neurons based on their distances on the map. Such a colored KFM
(U-matrix) shows light neurons belonging to the same cluster with dark neurons at the
cluster border (Ultsch & Korus, 1995). In the third step, these clusters can be
additionally represented as fuzzy rules, which enables us to build up a KFM-based
knowledge base (Ultsch & Korus, 1995; Malone, 2006). The administrative staff as well as
the lecturers will use this knowledge base within the admission and teaching processes.
However, these rules have to be continuously updated in the fourth and last step, because
the student distribution and the performance results in the OO SET are likely to change
over time.

With such an approach (starting with the KFM and ending up with rules), the main drawback
of sub-symbolic AI, namely that it cannot explain its results, can also be removed. The
FHNW quality management gets a DSS (decision support system) which makes it possible to
pay attention to specific student groups during the enrollment and admission process.

                            Conclusion

The main target of our research is to identify clusters of students who are likely to
fail in the OO-related modules, in order to optimize the admission and teaching
processes. The respective performance data concerning basic OO concepts is generated via
a questionnaire (OO SET). For learning the “similarity” of students a neural network was
used. As reported in the previous section, the use of the KFM makes it possible to
cluster students with similar levels of understanding of OO concepts. This allows
deriving rules as well as subsequent learning units, which can be mapped to the
particular needs of the different student clusters and be based on the taxonomy of Bloom.
This process can also be generalized to other disciplines throughout the course of
studies. KFMs can be used to extract more general rules, not only rules related to
abstraction capabilities and OO thinking. All our modules are described with learning
goals, which are mapped to the different taxonomy levels of Bloom. Given that many
lecturers already work with questionnaires, which are published in the learning
management system Moodle, these questionnaires could easily be reused for analysis with
KFMs and provide general possibilities to analyze and individualize the students’
learning paths. Educational guidance can be provided based on more detailed skill maps
and support the overall process of AoL.

                            References

AACSB (2019, November 14). Association to Advance Collegiate Schools of Business.
Retrieved from https://www.aacsb.edu/.
Alphonce, C. & Ventura, P. (2002). Object Orientation in CS1-CS2 by Design. Proceedings
of the 7th Annual Conference on Innovation and Technology in Computer Science Education,
Aarhus, Denmark, pp. 70-74.
Bennedsen, J. & Schulte, C. (2006). A Competence Model for Object Interaction in
Introductory Programming. In Proceedings of the 18th Workshop of the Psychology of
Programming Interest Group, pp. 215-229.
Bennedsen, J. & Caspersen, M. E. (2008). Abstraction Ability as an Indicator of Success
for Learning Computer Science? Sydney, Australia: ICER’08.
Bloom, B. S. (Ed.). (1956). Taxonomy of educational objectives: The classification of
educational goals, by a committee of college and university examiners. New York:
D. McKay.
Kohonen, T. (1998). The self-organizing map. Neurocomputing, 21(1–3), pp. 1–6.
Malone, J., McGarry, K., Wermter, S. and Bowerman, C. (2006). Data mining using rule
extraction from Kohonen self-organising maps. Neural Computing & Applications, 15 (1),
pp. 9-17.
Nguyen, D. & Wong, S. (2001). OOP in Introductory CS: Better Students Through
Abstraction, Proceedings of the fifth Workshop on Pedagogies and Tools for Assimilating
Object-Oriented Concepts, OOPSLA.
Oja, E. and Kaski, S. (1999). Kohonen Maps. Elsevier, Amsterdam.
Okur, M. (2007). Teaching Object Oriented Programming At The Introductory Level. In
Journal of Yasar University. 1 (2), pp. 149-157.


Or-Bach, R. & Lavy, I. (2004). Cognitive Activities of Abstraction
in Object Orientation: An Empirical Study. Inroads - the SIGCSE
Bulletin. 36 (2), pp. 82-86.
Ultsch, A. & Korus, D. (1995). Self-organizing Neural Networks
for Acquisition of Fuzzy-Knowledge, Proceedings of the 3rd GI-
Workshop Fuzzy-Neuro-Systeme in Darmstadt, pp. 326-332.



</pre>