<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Data Mining Framework for Analyzing Students' Feedback of Assessment</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zainab Mutlaq Ibrahim</string-name>
          <email>zainab.mutlaq-ibrahim@port.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohamed Bader-El-Den</string-name>
          <email>mohamed.bader@port.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mihaela Cocea</string-name>
          <email>mihaela.cocea@port.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Portsmouth</institution>
          ,
          <addr-line>Lion Terrace, PO1 3HE</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Assessment constitutes a fundamental part of the academic learning process because of its importance in testing the knowledge students gain and finalizing their grades. This study aims to develop a data mining based framework for analyzing students' assessment feedback, obtained from social media sites and/or text feedback. The study consists of three stages. The first stage is to build a model that automatically detects the polarity of student feedback using sentiment analysis methods. The second stage is to build a model that automatically classifies issues of assessment. The final stage is to test the correlation between issue(s) and students' performance. The research uses different popular text classification algorithms to analyze students' feedback on assessment in order to enhance the learning process.</p>
      </abstract>
      <kwd-group>
        <kwd>Assessment</kwd>
        <kwd>Decision Tree</kwd>
        <kwd>Machine learning algorithms</kwd>
        <kwd>Naive Bayes</kwd>
        <kwd>Random Forest</kwd>
        <kwd>Sentiment analysis</kwd>
        <kwd>Support Vector Machines</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Analyzing students' feedback on assessment can point out issues they may have
in accomplishing its components. Many students squander time while completing an
assessment trying to figure out what the thought processes of the marker might
be and what s/he wants to hear or read, instead of developing their own skills and
understanding in evaluating what the assessment component asked for [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This
should remind the assessment author to clarify exactly what the assessment
asks for. Other examples of assessment issues include length, format, validity,
reliability, late rectification, no remedy, issues with teaching and curriculum, and
disruptiveness.
      </p>
      <p>Massive open online courses (MOOCs) bring new challenges to the learning
process in general and to assessment in particular, as a MOOC unit includes a huge
number of candidates from different cultures and backgrounds who use different
languages, and possibly different accents and jargon within the same language. This
will require extra effort and different techniques to design and simplify the
assessment so that it tests the right knowledge and skills.</p>
      <p>To give more background on assessment, the following section highlights its
main aspects.</p>
    </sec>
    <sec id="sec-2">
      <title>Assessment</title>
      <p>
        Assessment is defined as all procedures that evaluate students' knowledge,
understanding, abilities, or skills, according to the Quality Assurance Agency for
Higher Education [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Students receive feedback from their tutors during the semester and at the end
of it. The latter is called summative assessment; it sums up all student
achievements and leads towards awarding the final grade. The former, formative assessment, is given
to students during the semester with the main aim of recognizing their strengths
and weaknesses and guiding them accordingly [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Assessment types include written and oral exams, essays, reports, portfolios,
presentations, projects, posters, theses, and many more. All forms of assessment have
general advantages and disadvantages [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. However, an advantage for student A can
be a disadvantage for student B. For example, essays show depth of learning, which
suits students with strong writing skills but not those without them.
Struyven et al. and Black and Wiliam [
        <xref ref-type="bibr" rid="ref4 ref5">4,5</xref>
        ] revealed that students'
perceptions of assessment remarkably affected their approaches to learning and
studying. This means that different designs and styles of assessment would guide
them in choosing the right method of studying to achieve better results.
Black and Wiliam [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] added that assessment influences learning in three ways:
it provides motivation to students, highlights the important parts of the curriculum,
and helps students to evaluate and judge the effectiveness of their learning.
Despite the importance of students' feedback on assessment as a constructive
act of learners in higher education, relatively few studies and
reviews consider students' perspectives [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>Institutions seek students' feedback using question-based surveys in which choices
are provided to choose from. This is suitable for evaluating the impact of issues that
are defined by the survey, but not issues that are defined by the students themselves.
It may also bias a study towards the terms of the survey questions and
choices, limit it to the number of questions asked, and make it not specific to a particular
component of learning such as assessment.</p>
      <p>This research aims to study students' feedback on assessment in higher education.
The proposed approach has three stages: first, to take students'
feedback on assessment in written text format and assign a sentiment (positive or
negative) to each entry, covering all instances that discuss
advantages and disadvantages of assessment;
second, to detect issues of assessment from students' negative instances; finally,
to integrate the results of the second stage with students' actual marks and
attendance.</p>
      <p>The first stage aims to gauge the extent of students' satisfaction with assessment; to do
that, we intend to use sentiment analysis methods. The second stage aims to
identify and develop a set of labels and to classify issues accordingly; in this stage we
use classification methods. The final stage aims to test the correlation among the
following variables: mark, attendance, and the issues that a student suffers from. In
this stage, classification and statistical methods will be used.</p>
      <p>The following section addresses the research questions.</p>
    </sec>
    <sec id="sec-3">
      <title>Research Importance and Questions</title>
      <p>Analyzing students' feedback on assessment can lead to identifying issues that
students struggle with and allow decision-makers to propose suitable solutions
to tackle them and so enhance the learning process.
The research is important to the educational data mining field, as the literature did
not reveal any study that applied data mining models to assessment practice
data sets.</p>
      <p>This research is also important because it will produce optimized data mining models
ready to apply to big data sets, such as feedback from online courses and massive open online courses
(MOOCs).</p>
      <p>In this research, we aim to answer the following questions:</p>
      <p>How can the polarity of students' feedback on assessment be detected automatically?</p>
      <p>How can issues of assessment be detected automatically?</p>
      <p>What is the best method to visualize the correlation between performance and
detected issues?</p>
    </sec>
    <sec id="sec-4">
      <title>Research Contribution</title>
      <p>This research will contribute the following elements to the field of knowledge:</p>
      <p>It proposes an effective data mining framework that detects the polarity of
students' feedback.</p>
      <p>It proposes an effective data mining framework that detects issues in students'
feedback on assessment.</p>
      <p>It proposes an effective data mining framework that tests the correlation between
detected issues and performance.</p>
      <p>As it is difficult to find an assessment-related data set, this research generates
one from general student feedback.</p>
      <p>The rest of this paper is structured as follows: section two is the literature review,
section three is the methodology, and section four covers the early results and future
work.</p>
      <sec id="sec-4-1">
        <title>Literature Review</title>
        <p>
          Researchers have studied students' experiences that affected their performance [
          <xref ref-type="bibr" rid="ref6 ref7">6,7</xref>
          ] using
social media data. Their intention was to identify engineering students' issues
regarding learning in general, without concentrating on a specific subject
such as assessment. They [
          <xref ref-type="bibr" rid="ref6 ref7">6,7</xref>
          ] used data mining and natural language processing
methods. Other studies [
          <xref ref-type="bibr" rid="ref10 ref4 ref8 ref9">4,8,9,10</xref>
          ] analyzed assessment. Struyven, Janssens and Dochy
[
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] reviewed articles and documents (published between 1980 and 2002) related to
assessment and evaluation from the students' point of view. They found that students'
perceptions of assessment and their approaches to learning are strongly related.
Students perceived the multiple choice format as more favourable than the essay
format, with the exception of female students and students who have strong learning
skills [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>
          Cruickshank studied the use of exams at postgraduate level, the language and
cultural issues faced by international students, and the impact of
internationalisation on the United Kingdom higher education sector [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. The resulting
recommendations were that institutions should increase exam time by 15 minutes to allow
for reading and reduce the weighting of exam marks, and that assessment should test students'
knowledge and not their language skills [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
        <p>
          Trotter studied students' perception of continuous summative assessment and its
impact on their motivation, approaches to learning, and changes to their learning
environment [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. The study concluded that although the process is time consuming and
hard work, it significantly enhanced the learning environment [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
Alquraan analysed two universities' methods of assessing students; the study
recommended that institutions should encourage and support different methods
of assessment [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
        <p>
          Our research approach is similar to [
          <xref ref-type="bibr" rid="ref11 ref6 ref7">6,7,11</xref>
          ] in using text data from surveys and
social media sites, but it goes further by testing the correlation between the findings
of our research and actual outcomes (mark and attendance) from students' files.
We will identify issue(s) of assessment from the students' feedback, unlike
[
          <xref ref-type="bibr" rid="ref10 ref4 ref8 ref9">4,8,9,10</xref>
          ], which defined specific issues of assessment in advance and studied them.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>Research Methodology</title>
        <p>As mentioned in the introduction and shown in Figure 1, the
research approach is divided into three stages: sentiment analysis, issue detection,
and performance-issue correlation.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Data Collections</title>
      <p>The first two phases will be based on feedback obtained
from the students in text format. For the third phase, numerical data (grades
and attendance, taken from students' actual records) will be combined
with the results of the sentiment analysis and issue detection phases. The
feedback will be collected shortly after the students complete their assessment to
ensure the integrity of the collected feedback. We will also investigate the use of
previous feedback, which is normally collected by universities at the end of each
teaching block.</p>
    </sec>
    <sec id="sec-6">
      <title>Sentiment Analysis and Issue Detection</title>
      <p>In this section, text classification techniques and methods are used. The general
framework is presented in Figure 2.</p>
      <p>[Figure 2: general framework. Raw Text Data, Labeling, Pre-Processing, Classification model, Evaluation, Adoption.]</p>
      <p>
Labeling. Text feedback generally needs in-depth analysis, as it normally
contains a large amount of informal words, jargon, abbreviations, local slang,
misspelled words, and sarcasm, which make the meaning extraction process
difficult. Chen et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] tried a popular topic modelling algorithm called Latent Dirichlet
Allocation (LDA), but it produced senseless word groups with many overlapping
words across different topics. They [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] therefore decided to carry out an in-depth qualitative
analysis of the data to assess its quality and
minimize the margin of error when categorizing the entries.
      </p>
      <p>
        Pre-processing. Cleaning the data enhances output accuracy and reduces
data dimensionality. The cleaning required depends on the source of the data, which can be handwritten
text or social media and blog sites that use special characters. Researchers
[
        <xref ref-type="bibr" rid="ref12 ref13 ref14 ref15 ref16">12,13,14,15,16</xref>
        ] used one or more of the following techniques: tokenization;
removal of stop words, needless punctuation, exclamation marks, question marks, and any
other unnecessary symbols and special marks; and normalization of words that contain
uppercase letters or special marks.
      </p>
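      <p>As a minimal sketch of the cleaning steps listed above (lowercasing, removal of punctuation and special marks, tokenization, and stop-word removal), the following Python fragment may be helpful. The function name <monospace>preprocess</monospace>, the tiny stop-word list, and the sample sentence are illustrative assumptions and are not part of the proposed framework.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re

# Illustrative stop-word list (an assumption; a real study would use a fuller list).
STOP_WORDS = {"the", "a", "an", "is", "was", "to", "of", "and", "it", "in"}


def preprocess(text):
    """Lowercase, strip URLs and special marks, tokenize, and drop stop words."""
    text = text.lower()                          # normalize uppercase letters
    text = re.sub(r"https?://\S+", " ", text)    # drop URLs, common in social media posts
    text = re.sub(r"[^a-z0-9\s']", " ", text)    # drop punctuation and special marks
    tokens = text.split()                        # simple whitespace tokenization
    tokens = [t.strip("'") for t in tokens]      # trim stray apostrophes
    return [t for t in tokens if t and t not in STOP_WORDS]


# Hypothetical feedback entry, invented for illustration only.
tokens = preprocess("The exam was TOO long!!! #unfair http://example.com")
print(tokens)
```
]]></preformat>
      <p>Which steps are appropriate depends on the data source; for instance, hashtags or repeated exclamation marks may carry sentiment and could be kept as features rather than removed.</p>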
      <p>
        Feature Selection. Feature selection, also called variable selection or attribute
selection, is used to improve classification effectiveness and computational
performance [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. The most popular features are N-grams and Part of Speech
(PoS) features [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
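      <p>To illustrate what N-gram features look like in practice, the sketch below counts unigrams and bigrams in a tokenized feedback entry. The helper names <monospace>ngrams</monospace> and <monospace>ngram_features</monospace> and the sample tokens are assumptions made for this example; Part of Speech features would additionally require a tagger and are not shown.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) found in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(tokens, max_n=2):
    """Count all 1..max_n grams; the Counter acts as a sparse feature vector."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


# Hypothetical tokenized entry, invented for illustration only.
features = ngram_features(["exam", "too", "long"])
for gram, count in features.items():
    print(" ".join(gram), count)
```
]]></preformat>
      <p>Feature selection would then keep only the most informative of these counts, for example by discarding N-grams that occur in very few entries.</p>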
      <p>
        Classification model. In this stage, we will investigate a wide range of
state-of-the-art classification algorithms [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], such as, but not limited to, Naive Bayes (NB),
Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF).
      </p>
      <p>
        Evaluation and Adoption. Evaluation is the process of using specific metrics
to assess how good the developed classification model is. The most popular
metrics are accuracy, precision, recall, and the F-measure.
      </p>
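      <p>The four metrics named above can be computed directly from true and predicted labels. The following sketch assumes a binary polarity task with the labels "positive" and "negative"; the function name <monospace>evaluate</monospace> and the label lists are hypothetical and do not report any result of this research.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def evaluate(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F-measure for one target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Hypothetical labels, invented for illustration only.
y_true = ["positive", "positive", "negative", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "positive", "positive"]
scores = evaluate(y_true, y_pred)
print(scores)
```
]]></preformat>
      <p>For imbalanced feedback data, precision, recall, and the F-measure are usually more informative than accuracy alone, which is one reason to report all four.</p>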
      <p>Early Results. Early results showed that about 25% of the first data set
commented on the assessment procedures in particular, in spite of the fact that the
data set consisted of general feedback.</p>
      <p>In addition, for the first stage, detecting the polarity of the assessment
feedback, Support Vector Machine models showed strong performance.</p>
    </sec>
    <sec id="sec-7">
      <title>Integrating Students' Grades and Attendance with Detected Issues</title>
      <p>This section presents the final stage of the research, which is to relate students'
grades and attendance to the detected issue(s) that students found to be
obstacle(s) to achieving better results. In this stage we aim to test the correlation among
detected issues, grade, and attendance; to achieve that, data mining algorithms
and statistical methods will be used.</p>
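      <p>One simple statistical method that could serve this stage is the Pearson correlation coefficient between a binary indicator (whether a given issue was detected in a student's feedback) and that student's mark. The sketch below is only one possible choice, not the method fixed by this research; the function name <monospace>pearson</monospace> and all values are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (spread_x * spread_y)


# Hypothetical records, invented for illustration only:
# 1 means the issue was detected in the student's feedback, 0 means it was not.
has_issue = [1, 1, 0, 0, 1, 0]
marks = [52, 48, 70, 65, 55, 72]
r = pearson(has_issue, marks)
print(round(r, 3))
```
]]></preformat>
      <p>A negative coefficient would suggest that students who report the issue tend to receive lower marks; attendance could be analysed in the same way, and a significance test would be needed before drawing conclusions.</p>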
      <sec id="sec-7-1">
        <title>Future Work</title>
        <p>In particular, this research proposes an effective data mining framework to
study students' feedback on assessment, detect issues with its procedures, and inform
decision-makers of these issues so that they can update and modify assessment accordingly. The main
aim is to enhance the learning process.</p>
        <p>The research is still at an early stage. The next step is to complete data collection
and use the data to explore issues of assessment, and then to integrate the output of the issue
detection stage with students' performance data.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Murphy</surname>
            <given-names>F.</given-names>
          </string-name>
          <article-title>Module Design and Enhancement, Assessment Types</article-title>
          .
          <year>2009</year>
          . Accessed: 2017-09-07.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <article-title>Understanding assessment: its role in safeguarding academic standards and quality in higher education</article-title>
          .
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Strakova</surname>
            <given-names>Zuzana</given-names>
          </string-name>
          .
          <article-title>Promoting learning through assessment</article-title>
          .
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Struyven</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dochy</surname>
            <given-names>F</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Janssens</surname>
            <given-names>S</given-names>
          </string-name>
          .
          <article-title>Students' perceptions about evaluation and assessment in higher education: a review</article-title>
          .
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Black</surname>
            <given-names>P</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiliam</surname>
            <given-names>D</given-names>
          </string-name>
          .
          <article-title>Assessment and classroom learning</article-title>
          .
          <source>Assessment in Education: Principles, Policy and Practice</source>
          ,
          <volume>5</volume>
          (
          <issue>1</issue>
          ):
          <fpage>68</fpage>
          -
          <lpage>75</lpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Chen</surname>
            <given-names>X</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vorvoreanu</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Madhavan</surname>
            <given-names>K</given-names>
          </string-name>
          .
          <article-title>Mining social media data for understanding students' learning experiences</article-title>
          .
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Pagare</surname>
            <given-names>P.</given-names>
          </string-name>
          <article-title>Recognizing Students' Problem Using Social Media Data</article-title>
          .
          <source>International Journal of Computer Science and Mobile Computing</source>
          ,
          <volume>4</volume>
          (
          <issue>6</issue>
          ):
          <fpage>440</fpage>
          -
          <lpage>446</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Cruickshank</surname>
            <given-names>P.</given-names>
          </string-name>
          <article-title>The Perceptions of Postgraduate International Students of Examinations</article-title>
          .
          <source>Journal of Perspectives in Applied Academic Practice</source>
          ,
          <volume>4</volume>
          (
          <issue>3</issue>
          ),
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>E</given-names>
            <surname>Trotter</surname>
          </string-name>
          .
          <article-title>Student perception of continuous summative assessment</article-title>
          .
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Alquraan</surname>
            <given-names>Mahmoud R.</given-names>
          </string-name>
          <article-title>A cross-cultural study of students' perceptions of assessment practices in higher education</article-title>
          .
          <source>Education, Business and Society: Contemporary Middle Eastern Issues</source>
          ,
          <volume>7</volume>
          (
          <issue>4</issue>
          ):
          <fpage>293</fpage>
          -
          <lpage>315</lpage>
          , August
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Nabeela</given-names>
            <surname>Altrabsheh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mihaela</given-names>
            <surname>Cocea</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Sanaz</given-names>
            <surname>Fallahkhair</surname>
          </string-name>
          .
          <article-title>Sentiment analysis: towards a tool for analysing real-time students feedback</article-title>
          .
          <source>In Tools with Artificial Intelligence (ICTAI), 2014 IEEE 26th International Conference on</source>
          , pages
          <fpage>419</fpage>
          -
          <lpage>423</lpage>
          . IEEE,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Wilas</given-names>
            <surname>Chamlertwat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Pattarasinee</given-names>
            <surname>Bhattarakosol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Tippakorn</given-names>
            <surname>Rungkasiri</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Choochart</given-names>
            <surname>Haruechaiyasak</surname>
          </string-name>
          .
          <article-title>Discovering consumer insight from twitter via sentiment analysis</article-title>
          .
          <source>Journal of Universal Computer Science</source>
          ,
          <volume>18</volume>
          (
          <issue>8</issue>
          ):
          <fpage>973</fpage>
          -
          <lpage>992</lpage>
          , August
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>W.B.</given-names>
            <surname>Claster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cooper</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Sallis</surname>
          </string-name>
          .
          <article-title>Thailand - tourism and conflict: Modeling sentiment from twitter tweets using naive bayes and unsupervised artificial neural nets</article-title>
          . Pages
          <fpage>89</fpage>
          -
          <lpage>94</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>K.</given-names>
            <surname>Mouthami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. N.</given-names>
            <surname>Devi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V. M.</given-names>
            <surname>Bhaskaran</surname>
          </string-name>
          .
          <article-title>Sentiment analysis and classification based on textual reviews</article-title>
          .
          <source>In 2013 International Conference on Information Communication and Embedded Systems (ICICES)</source>
          , pages
          <fpage>271</fpage>
          -
          <lpage>276</lpage>
          , Feb
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Pak</surname>
          </string-name>
          and
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Paroubek</surname>
          </string-name>
          .
          <article-title>Twitter based system: Using twitter for disambiguating sentiment ambiguous adjectives</article-title>
          .
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Pak</surname>
          </string-name>
          and
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Paroubek</surname>
          </string-name>
          .
          <article-title>Twitter as a corpus for sentiment analysis and opinion mining</article-title>
          .
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Rogati</surname>
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            <given-names>Y.</given-names>
          </string-name>
          <article-title>High-Performing Feature Selection for Text Classification</article-title>
          .
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Altrabsheh</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cocea</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fallahkhair</surname>
            <given-names>S</given-names>
          </string-name>
          .
          <article-title>Predicting learning-related emotions from students' textual classroom feedback via Twitter</article-title>
          .
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
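          <string-name>
            <given-names>Filipe N</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Matheus</given-names>
            <surname>Araujo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Pollyanna</given-names>
            <surname>Goncalves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Marcos Andre</given-names>
            <surname>Goncalves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Fabricio</given-names>
            <surname>Benevenuto</surname>
          </string-name>
          .
          <article-title>SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods</article-title>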
          .
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>