<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Wu</surname>
            <given-names>Wenjun</given-names>
          </name>
        </contrib>
        <aff id="aff0">
          <institution>NLSDE, Department of Computer Science and Engineering</institution>
          ,
          <addr-line>Beihang University</addr-line>
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <fpage>55</fpage>
      <lpage>62</lpage>
      <abstract>
        <p>With the popularization of higher education, honors education has become an important means for research universities to cultivate excellent students. To evaluate the achievements of honors education and to guide honors educators, it is necessary to predict the performance of honors students effectively. This paper proposes a data-driven model that predicts students' performance based on an adjusted Elman Neural Network (Elman NN). We also compare the Elman NN against several other methods, and the results show that our model performs considerably better. The performance predictor can serve as a reference for honors educators when advising students on their choice of major, and enables them to provide timely suggestions or motivation both for honors students at risk in an early stage of learning and for those with the potential of outstanding talent.</p>
      </abstract>
      <kwd-group>
        <kwd>Elman Neural Network</kwd>
        <kwd>Data Mining</kwd>
        <kwd>Predictive Model</kwd>
        <kwd>Regression</kwd>
        <kwd>Classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>In recent years, honors programs have become common among world-famous universities and are widely recognized in America. Among the top 100 universities in the world, which have abundant educational resources and small student bodies, more than 40% run honors programs that provide honors students with challenging courses and high-level scientific research training opportunities. Following the success of these programs, top universities in China have also adopted honors programs to cultivate elite students.</p>
      <p>Our experience of running the honors program at Beihang University for more than a decade reveals two challenges for student advisors. First, it is necessary to find out how to help honors students choose the majors that suit them best. Usually, students must choose their majors according to their own preference and ability after the first school year. Ideally, students would receive essential guidance in the first year for developing their interest and talent in the majors most suitable for them. But given the limited manpower of our honors program administrators, we need a predictive tool to efficiently assess student ability and predict future performance. Second, every year we have to manually identify honors students at risk and give them counselling to overcome their academic difficulties, and even to adjust poor time-management habits in their daily lives. A powerful predictive model is also very important for fulfilling this responsibility through necessary learning suggestions and teaching interventions.</p>
      <p>In this paper, we adopt an Elman Neural Network, which has been widely used for prediction problems in various fields, as the modeling framework for our predictive model. We redesign the weighted context units in the hidden layer of the Elman NN to reflect the latent interaction among honors students of the same year. Based on this improvement over the original Elman NN, we establish a predictive model that fits the performance of honors students, and we verify the effectiveness of the model on actual datasets. Experiments show that our model outperforms several standard models.</p>
      <p>The rest of the paper is organized as follows. Section 2 presents a summary of related work and a brief comparison with our model. Section 3 presents the dataset description and processing details. Section 4 introduces our predictive model and the related experimental results. Finally, Section 5 concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>RELATED WORK</title>
      <p>Prediction of student scores is an important research topic in the field of educational data mining, and many researchers have proposed predictive models based on a variety of machine learning techniques. Jie Xu et al. (2017) [1] developed a novel algorithm that enables progressive prediction of students' performance by adapting ensemble learning techniques and exploiting education-specific domain knowledge; its predictions proved accurate compared with several other methods. Elbadrawy et al. (2016) [2] proposed regression-based and matrix factorization-based methods to predict student performance. Dekker et al. (2009) [3] presented a case study evaluating multiple drop-out prediction models.</p>
      <p>All these previous efforts focus on predicting future performance from students' current status and past academic performance, without considering behavioral features that are not directly related to course study. They often rely on campus Learning Management Systems to collect study records as training datasets. Such an approach has an inherent limitation: it cannot capture students' daily activities, which may have a great impact on their study. Especially for honors students in their first campus year, lifestyle can negatively influence their study. To incorporate these factors into our predictive model, we enrich its feature space with daily-activity features, including consumption in campus cafeterias, Internet access at different time frames, and library book-lending transactions. These data are collected from multiple e-campus service systems and assimilated into our training dataset. Given the temporal nature of honors students' development and of their daily activity data, we adopt a simplified recurrent neural network, known as the Elman Neural Network [5], to build our predictive model.</p>
    </sec>
    <sec id="sec-3">
      <title>DATA DESCRIPTION AND PROCESSING</title>
      <sec id="sec-3-1">
        <title>Dataset Description</title>
        <p>For honors students in the Honors College, the design of the honors program follows the principle of a solid foundation and gradual improvement. In the first year, students learn basic subjects as the basis and preparation for further professional learning. In the following years, students receive major-oriented education and take more professional courses. As the knowledge foundation built in the first academic year is very important, we aim to predict students' performance over the first school year using data from the first semester.</p>
        <p>In this paper, we establish a data-driven predictive model based on data from students of grade 2015 and grade 2016, including their initial grades and their learning and daily behaviors in the first semester, and we predict the performance of their core subjects and their comprehensive scores. The dataset contains 501 samples, of which 205 are from students in grade 2015 and 296 from students in grade 2016. Each sample consists of a 54-dimensional input vector and a 9-dimensional output vector.</p>
        <p>After entering university, many students indulge in computer games, which reduces their learning time. As students' Internet access is a major factor affecting academic performance, we collected students' Internet access details, including total online time, active periods, and traffic. We organize online time and traffic data by month (X25-X54) and active periods by 6-hour windows (X21-X24). The college entrance examination (CEE) scores represent students' initial knowledge level and learning ability (X3-X7). The first midterm examination in college comes two months after enrolment and indicates, to some extent, students' adaptability to university study. Moreover, we assume that students' monthly consumption, the number of books they borrow, and their birth dates also influence their final results. The initial CEE data, book-borrowing data, consumption data, and Internet-access data can all be collected easily through multiple e-campus service systems.</p>
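        <p>Because the features above live on very different scales (examination scores, monthly traffic, consumption, borrowing counts), they are best normalized before training. The following is a minimal sketch of per-feature min-max scaling to [0, 1]; the exact preprocessing used here is not spelled out above, and the synthetic matrix is only illustrative:</p>
        <preformat>
```python
import numpy as np

# Hypothetical raw feature matrix: 501 students x 54 features (X1-X54),
# mixing CEE scores, monthly traffic, consumption, borrowing counts, etc.
rng = np.random.default_rng(1)
X = rng.uniform(low=0.0, high=750.0, size=(501, 54))

def min_max_scale(X):
    """Scale each feature column to [0, 1] independently."""
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (X - lo) / span

X_scaled = min_max_scale(X)
print(X_scaled.min(), X_scaled.max())  # 0.0 1.0
```
        </preformat>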
        <p>The 9-dimensional output data comprise a 3-dimensional part for comprehensive performance and a 6-dimensional part for performance in core courses. The comprehensive part includes the consolidated performance, the average grade of main courses, and the credit score. For honors students, the consolidated performance is related to the latter two parameters by Eq. (1), which normalizes each of them by its maximum over the grade.</p>
        <p>Y_2max denotes the maximum average grade of main courses among all students in the same grade, and Y_3max the maximum credit score among all students in the same grade. The equation reflects the importance of core courses for honors students. The core-course section contains 6 elements, corresponding to performance in the six core courses. These subjects are designed especially for honors students in the honors program, so they properly measure students' mathematical, experimental, programming, and language abilities, which is representative enough to characterize the distribution of students' abilities.</p>
        <p>The details of the input data and output data are shown in Table 1, where Xi denotes input data and Yi denotes output data.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>MODEL AND RESULTS</title>
      <sec id="sec-4-1">
        <title>Principles of Elman NN and Our Adjustment</title>
        <p>An Elman neural network is a three-layer network augmented with a set of "context units" that remember the output values of the hidden-layer units; they can be considered a delay operator. The hidden layer is connected to these context units, initially with fixed weights of one. At each training step, the input propagates over the feed-forward part, during which a learning rule is applied. The back-connections are fixed and save a copy of the hidden-unit values in the context units. These saved values propagate over the connections before the learning rule is applied next, so the network can take into account the internal relationships among the input data.</p>
        <p>For honors students, each student's performance may be influenced not only by his or her initial scores and daily behaviors, but also by the behaviors of other students. That is why we choose an Elman neural network, which can take such mutual-influence factors into consideration. The order of the input sequences in all training epochs is randomized, which makes it more reasonable to account for mutual influence. Suppose there are m nodes in the input layer, n nodes in the output layer, and r nodes in the hidden layer; there will then be r context units. In our model, m=54, n=9, r=30. The structures of the original Elman NN and of our adjusted model are shown in Fig. 1 (the adjusted model on the right).</p>
        <p>In the traditional Elman neural network, the weights from the context units to the hidden layer are set to one. In practice, however, the recurrent part should not play as important a role as the connections from the input layer to the hidden layer, so these weights must be adjusted to suit the task of performance prediction. Moreover, we assume that the influence from the previous two steps should not be neglected. The equations are as follows:
H(k) = f(W1 X(k) + W2 C(k)) (3)
C(k) = α H(k-1) + β H(k-2) (4)
Y(k) = W3 H(k) (5)</p>
        <p>X(k) is the input vector from the input layer, H(k) is the output of the hidden layer, and Y(k) is the output of the output layer. f(x) is the activation function, usually taken to be the sigmoid function. W1 is the weight matrix between the input layer and the hidden layer, W2 is the weight matrix between the context units and the hidden layer, and W3 is the weight matrix between the hidden layer and the output layer. α is the feedback gain parameter for the self-connection, and β is the feedback gain parameter for the self-connection one step further back.</p>
        <p>In our model, α and β should be set to small values. Based on extensive experiments, we found that the model performs well with α=β=0.05. When the values vary within a small range (0.02-0.2), the final results change little, which means the model is stable with respect to these parameters.</p>
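        <p>The adjusted forward pass of Eqs. (3)-(5) can be sketched as follows. This is a minimal illustration with random placeholder weights (an untrained network), not the trained model; only the dimensions m=54, r=30, n=9 and the gains α=β=0.05 come from the text above:</p>
        <preformat>
```python
import numpy as np

# Minimal sketch of the adjusted Elman forward pass.
# Dimensions follow the paper: m=54 inputs, r=30 hidden units, n=9 outputs.
rng = np.random.default_rng(0)
m, r, n = 54, 30, 9
W1 = rng.normal(scale=0.1, size=(r, m))   # input   -> hidden
W2 = rng.normal(scale=0.1, size=(r, r))   # context -> hidden
W3 = rng.normal(scale=0.1, size=(n, r))   # hidden  -> output
alpha, beta = 0.05, 0.05                  # feedback gains of Eq. (4)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(X_seq):
    """Run a sequence of input vectors through the adjusted Elman network."""
    h_prev = np.zeros(r)   # H(k-1)
    h_prev2 = np.zeros(r)  # H(k-2)
    outputs = []
    for x in X_seq:
        c = alpha * h_prev + beta * h_prev2   # Eq. (4): two-step context
        h = sigmoid(W1 @ x + W2 @ c)          # Eq. (3)
        y = W3 @ h                            # Eq. (5)
        h_prev2, h_prev = h_prev, h
        outputs.append(y)
    return np.array(outputs)

X_seq = rng.random((5, m))   # five synthetic 54-dimensional samples
Y = forward(X_seq)
print(Y.shape)  # (5, 9)
```
        </preformat>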
      </sec>
      <sec id="sec-4-2">
        <title>Training and Tests</title>
        <p>In the training part, we use 401 samples for training and validation and 100 samples for testing. As the predictive model is essentially a regression problem, we set the loss function R(y, y*) to the mean squared error (MSE), which generally performs well in regression problems:
R(y, y*) = (1/N) Σ_i (y_i* - y_i)^2 (6)</p>
        <p>The process of model training and the results are recorded and shown in Fig. 3. The training function is gradient descent with momentum and an adaptive learning rate (backpropagation), which works well with Elman neural networks according to many experiments. We set the learning rate to 0.05, with 1000 training steps in each epoch. After 212 epochs, the model reached a stable point.</p>
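        <p>The training rule named above, gradient descent with momentum and an adaptive learning rate, can be sketched in isolation. The sketch below applies it to a toy quadratic loss rather than the full network; all hyperparameters except the 0.05 learning rate are illustrative:</p>
        <preformat>
```python
import numpy as np

# Gradient descent with momentum and a simple adaptive learning rate:
# grow the rate when the loss falls, shrink it when the loss rises.
def train(grad, loss, w, lr=0.05, momentum=0.9,
          lr_inc=1.05, lr_dec=0.7, steps=200):
    v = np.zeros_like(w)
    prev = loss(w)
    for _ in range(steps):
        v = momentum * v - lr * grad(w)
        w_new = w + v
        cur = loss(w_new)
        if cur <= prev:          # accept step, speed up
            w, prev, lr = w_new, cur, lr * lr_inc
        else:                    # reject step, slow down
            v[:] = 0.0
            lr *= lr_dec
    return w, prev

# Toy problem: minimize ||w - t||^2 for a fixed target t.
t = np.array([0.3, -1.2, 0.8])
loss = lambda w: float(np.sum((w - t) ** 2))
grad = lambda w: 2.0 * (w - t)
w, final = train(grad, loss, np.zeros(3))
print(final)
```
        </preformat>
        <p>Rejecting any step that increases the loss keeps the loss monotone, so the adaptive rate can be grown aggressively without risking divergence.</p>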
        <p>The output values are decimals ranging from 0 to 1 (representing scores from 0 to 100). We calculated the errors on the test data and present the errors of the consolidated performance above. According to Fig. 4, most errors of the predicted results are no more than 0.1, which means our results are credible enough.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Comparisons</title>
        <p>For a more meaningful evaluation, we compared the Elman NN with a BPNN (backpropagation neural network), the most frequently used neural network model, and with a linear model (LM). To ensure the compared networks have the same size and scale, the BPNN is also arranged in three layers: a 54-node input layer, a 30-node hidden layer, and a 9-node output layer. The training methods are set similarly.</p>
        <p>To evaluate the methods properly, we calculated the confidence rate (CR) of the outputs based on a confidence interval of 10%:
CR = m_i / n × 100% (7)</p>
        <p>In this formula, n is the number of items in the test data and m_i is the number of credible items among them. A tested item is treated as credible if:
|Y_i* - Y_i| ≤ 0.1 (8)</p>
        <p>Y_i* denotes the predicted result and Y_i the actual (labeled) value. The confidence rates of the compared methods' outputs are shown in Table 2. The confidence rate of our model is clearly higher, which means student performance can be predicted credibly with our model. The prediction of the average grade of core courses, which is also the most valuable parameter for measuring learning ability, is the most accurate.</p>
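        <p>Eqs. (7) and (8) amount to counting the predictions that fall within 0.1 of the true (0-1 scaled) score. A minimal sketch with synthetic values:</p>
        <preformat>
```python
import numpy as np

# Confidence rate from Eqs. (7)-(8): the share of test items whose
# prediction falls within 0.1 of the true score on the 0-1 scale.
def confidence_rate(y_pred, y_true, tol=0.1):
    credible = np.abs(y_pred - y_true) <= tol   # Eq. (8)
    return credible.mean() * 100.0              # Eq. (7), in percent

# Synthetic example: 4 of 5 predictions are within 0.1 of the truth.
y_true = np.array([0.80, 0.65, 0.90, 0.55, 0.70])
y_pred = np.array([0.85, 0.60, 0.75, 0.57, 0.69])
cr = confidence_rate(y_pred, y_true)
print(cr)
```
        </preformat>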
      </sec>
      <sec id="sec-4-4">
        <title>Confidence Rate Comparison</title>
        <table-wrap id="tab2">
          <label>Table 2</label>
          <caption>
            <p>Confidence rates (CR) of the Elman NN (ENN), BPNN, and linear model (LM).</p>
          </caption>
          <table>
            <thead>
              <tr><th>Symbol</th><th>Meaning</th><th>CR of ENN</th><th>CR of BPNN</th><th>CR of LM</th></tr>
            </thead>
            <tbody>
              <tr><td>Y1</td><td>Consolidated performance</td><td>89%</td><td>86%</td><td>69%</td></tr>
              <tr><td>Y2</td><td>Average grade of core courses</td><td>91%</td><td>88%</td><td>81%</td></tr>
              <tr><td>Y3</td><td>Credit scores</td><td>82%</td><td>79%</td><td>73%</td></tr>
              <tr><td>Y4</td><td>Mathematics performance</td><td>86%</td><td>84%</td><td>62%</td></tr>
              <tr><td>Y5</td><td>Basic Physics performance</td><td>79%</td><td>72%</td><td>58%</td></tr>
              <tr><td>Y6</td><td>General Chemistry performance</td><td>83%</td><td>85%</td><td>71%</td></tr>
              <tr><td>Y7</td><td>Basic Life Sciences performance</td><td>88%</td><td>79%</td><td>65%</td></tr>
              <tr><td>Y8</td><td>Advanced Programming performance</td><td>74%</td><td>66%</td><td>54%</td></tr>
              <tr><td>Y9</td><td>College English performance</td><td>85%</td><td>69%</td><td>68%</td></tr>
              <tr><td></td><td>Average</td><td>84%</td><td>79%</td><td>67%</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>The prediction confidence rate for advanced programming is clearly lower than the others, owing to the greater variability of that course. On the whole, most output items can be predicted accurately, and we can trust the results with a low risk of error.</p>
      </sec>
      <sec id="sec-4-6">
        <title>Classifications</title>
        <p>In the classification model, we divide the honors students into three categories according to their consolidated performance, representing excellent, good, and general levels respectively. The number of students in each category and the results are presented in Table 3. The structure of the model and the receiver operating characteristic (ROC) curve on the test data are shown in Fig. 3.</p>
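        <p>The three-way labeling can be expressed as a simple thresholding of the consolidated performance. The cut-offs below are hypothetical, since the exact category boundaries are not stated here:</p>
        <preformat>
```python
import numpy as np

# Map 0-1 consolidated performance to excellent / good / general.
# The thresholds 0.85 and 0.70 are illustrative examples only.
def categorize(y, hi=0.85, lo=0.70):
    return np.where(y >= hi, "excellent",
                    np.where(y >= lo, "good", "general"))

scores = np.array([0.91, 0.74, 0.62, 0.88])
print(categorize(scores).tolist())
# ['excellent', 'good', 'general', 'excellent']
```
        </preformat>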
        <p>As the model can predict students' grades and categories at an early stage, at-risk students can be identified half a year in advance. In practice, counselors combine the results of the classification and regression models. First, they collect the required data and feed it into the model, an automated process that takes little time. They can then easily identify at-risk students from the classification results. For more detail, they can consult the regression model's 9-dimensional output to see the ability distribution of those at-risk students. Finally, they can offer the necessary suggestions.</p>
        <p>In the early stage, counselors cannot obtain sufficient information about students' learning status, so it is difficult to assess students' performance manually. The models, however, offer predictions based on data that are easy to collect. With an accuracy of 77.4%, the results are valuable enough for honors educators to assess the learning level of every student, making this a convenient early-stage performance prediction tool on campus.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>CONCLUSIONS</title>
      <p>In summary, we have presented the establishment of our predictive model and the results of predicting students' final performance and ability distribution based on data from 501 honors students. By adjusting the feedback gain parameters of the self-connections in the Elman neural network and training the network appropriately, the predictive model outperforms the BPNN and the linear regression method. According to our experiments, the consolidated performance and the average grade of core courses are the most accurately predicted outputs. The model makes it convenient for honors program counselors to predict students' performance and to provide appropriate suggestions or motivation for different categories of students.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] Xu, J., Han, Y., Marcu, D., van der Schaar, M.: Progressive Prediction of Student Performance in College Programs. In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, pp. 4-10 (2017).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] Elbadrawy, A., Polyzou, A., Ren, Z., Sweeney, M., Karypis, G., Rangwala, H.: Predicting Student Performance Using Personalized Analytics. Computer, 49(4), 61-69 (2016).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] Dekker, G. W., Pechenizkiy, M., Vleeshouwers, J. M.: Predicting students drop out: A case study. In: Proceedings of the International Conference on Educational Data Mining, pp. 41-50 (2009).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] Al-Radaideh, Q., Al-Shawakfa, E., Al-Najjar, M.: Mining student data using decision trees. In: International Arab Conference on Information Technology, Yarmouk University, Jordan (2006).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] Elman, J. L.: Finding Structure in Time. Cognitive Science, 14, 179-211. doi:10.1207/s15516709cog1402_1 (1990).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] Liu, H., Tian, H., Liang, X., Li, Y.: Wind speed forecasting approach using secondary decomposition algorithm and Elman neural networks. https://doi.org/10.1016/j.apenergy.2015.08.014 (2015).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] Xing, W., Guo, R., Petakovic, E., Goggins, S.: Participation-based student final performance prediction model through interpretable Genetic Programming: Integrating learning analytics, educational data mining and theory (2014).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] Ahmed, A. B. E. D., Elaraby, I. S.: Data Mining: A prediction for Student's Performance Using Classification Method. World Journal of Computer Application and Technology, 2(2), 43-47 (2014).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] Kukk, V.: Course Implementation: Value-Added Mode. In: Uden, L., Liberona, D., Feldmann, B. (eds) Learning Technology for Education in Cloud - The Changing Face of Education. LTEC 2016. Communications in Computer and Information Science, vol. 620. Springer, Cham (2016).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10] Okubo, F., Yamashita, T., Shimada, A., Ogata, H.: A neural network approach for students' performance prediction. In: Proceedings of the Seventh International Learning Analytics &amp; Knowledge Conference (LAK '17), pp. 598-599 (2017).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11] Alkhasawneh, R., Hargraves, R. H.: Developing a hybrid model to predict student first year retention in STEM disciplines using machine learning techniques. Journal of STEM Education: Innovations and Research, 15(3), 35-42 (2014).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>