=Paper=
{{Paper
|id=Vol-3077/paper15
|storemode=property
|title=Learning analytics of MOOCs based on natural language processing
|pdfUrl=https://ceur-ws.org/Vol-3077/paper15.pdf
|volume=Vol-3077
|authors=Yulia Yu. Dyulicheva,Elizaveta A. Bilashova
|dblpUrl=https://dblp.org/rec/conf/cs-se-sw/DyulichevaB21
}}
==Learning analytics of MOOCs based on natural language processing==
Yulia Yu. Dyulicheva, Elizaveta A. Bilashova
V.I. Vernadsky Crimean Federal University, 4 Vernadsky Ave., Simferopol, 295007, Crimea
Abstract
The paper considers the prospects of applying machine learning, in particular decision trees,
random forests, and deep learning, to educational data mining problems and to the development of
learning analytics tools. We investigate sentiment analysis with the BERT deep model and kMeans
clustering with different approaches to text vectorization as a basis for learning analytics
tools, using several programming MOOCs from Udemy as an example. We analyze 300 MOOC titles and
propose their clustering for a better understanding of the directions of learning and the skills
taught. We also analyze 1150 sentences that contain the word “teacher” or its synonyms and 2365
sentences about the course in order to detect students’ sentiments, the top words that describe
opinions with positive and negative polarities, and the issues that arise during learning.
Keywords
MOOC, sentiment analysis, BERT deep model, learning analytics
1. Introduction
Recently, MOOCs and various e-learning environments have been generating Big Data on the
activities of students and tutors in the form of clickstreams, video-watching sequences,
interactions with different types of learning content, camera video streams recorded during
learning for recognition of students’ facial expressions, and students’ comments on various
social media platforms. The diversity of data in education requires special approaches to data
handling and recognition. Machine learning algorithms are widely used in learning analytics (LA)
and educational data mining (EDM) [1]. EDM is considered a methodology for mining regularities
from the big educational data gathered in educational environments [2]. LA is aimed at developing
tools for analyzing and optimizing learning [3, 4].
During the COVID-19 pandemic, when distance learning became the only possible form of
education, the development of tools for assessing the quality of education and for analyzing
feedback from students in the form of answers to Google Forms questions, comments, and reviews
on various media platforms attracted the attention of many researchers as one of the most
pressing tasks of learning analytics [5].
CS&SE@SW 2021: 4th Workshop for Young Scientists in Computer Science & Software Engineering, December 18, 2021,
Kryvyi Rih, Ukraine
Email: dyulicheva@gmail.com (Y. Yu. Dyulicheva); lizatkchk@mail.ru (E. A. Bilashova)
URL: https://researchgate.net/Yulia-Dyulicheva (Y. Yu. Dyulicheva)
ORCID: 0000-0003-1314-5367 (Y. Yu. Dyulicheva)
© 2022 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073
LA and EDM pose various problems that can be formulated as supervised, unsupervised, or
reinforcement learning problems. They concern the extraction of qualitative regularities from
educational data for understanding the behavior and activities of students and tutors.
The paper has the following purposes:
• to study the effectiveness of applying machine learning algorithms to learning analytics
problems;
• to develop a learning analytics tool for extracting regularities from several programming
language MOOCs using Python libraries.
2. Literature review
This section considers several machine learning algorithms and their use in educational data
mining and in the development of learning analytics tools.
2.1. The decision trees and random forest for EDM and LA
Social influences, math achievement, and student engagement (interests and beliefs) affect
whether students choose or, conversely, reject engineering specialities. It is important
to develop tools for monitoring the causes of negative attitudes towards engineering specialities
and specialities that require high-quality math knowledge. Tan et al. [6] noted the promise
of a linear regression model and a random forest for investigating how demographic and
family background factors, high school factors, and students’ academic achievements together with
their non-cognitive behaviour influence their choice of an engineering major.
The task of predicting student performance is one of the primary tasks in e-learning systems.
It is aimed at identifying factors that affect student performance, assessing the quality of
learning, identifying groups of lagging and successful students, etc. Ahmed and Elaraby [7]
considered academic performance prediction as a supervised learning problem with a
discrete-valued target function. The authors constructed a decision tree on a feature space that
included the faculty; the average score split into intervals (excellent rating with score >=85%,
very good rating with score >=75% & <85%, good rating with score >=65% & <75%, bad rating with
score >=50% & <65%, very bad rating with score <50%); test results; assessment of learning in
seminars; students’ activity as a binary value (yes or no); and homework availability as a binary
value (yes or no). From the tree they obtained decision rules for predicting students’ final
scores. A hybrid approach based on a decision tree together with a genetic algorithm was
used for identifying groups of successful students with a description of the professional skills
that will be useful for employers [8].
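The score intervals above amount to a discretization step applied before a decision tree is trained. The following is a minimal sketch of that step, assuming the percentage thresholds quoted in the text; the function name `rating_from_score` and the label used for scores below 50% are illustrative assumptions and are not taken from [7].

```python
def rating_from_score(score: float) -> str:
    """Map an average score in percent to a discrete rating class."""
    if not 0 <= score <= 100:
        raise ValueError("score must be a percentage between 0 and 100")
    # Lower bounds follow the intervals quoted in the text, checked from highest to lowest.
    thresholds = [
        (85, "excellent"),
        (75, "very good"),
        (65, "good"),
        (50, "bad"),
    ]
    for lower_bound, label in thresholds:
        if score >= lower_bound:
            return label
    return "very bad"  # assumed label for scores below 50%


ratings = [rating_from_score(s) for s in (91, 78.5, 65, 50, 42)]
```

Turning the continuous score into a handful of classes is what lets the tree treat the rating as a categorical feature or as a discrete target.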
Decision trees can be used to describe the key characteristics of certain groups of students who
are listeners of MOOCs. Identifying such target groups can help the tutor better understand the
target audience of online courses. Topîrceanu and Grosseck [9] identified the following target
groups of students according to their archetypes:
1) egocentric learners (interested in getting the skills they need on a regular basis);
2) short-term learners (focused on short quick courses);
3) comfort-oriented learners (focused not only on short but also easy-to-learn online courses);
4) interactive learners (focused on short courses and the possibility of active interaction in
communities, and among these learners there are those who seek to obtain the certain
competencies and those who use online courses for fun);
5) distraction-seeking learners (the duration of the course is not important for them);
6) competency-seeking learners (focused on the rapid acquisition of professional
skills, but unable to complete the course);
7) curious initiators (learners who attend courses out of curiosity, with a probability of
completing them of approximately 0.5);
8) limited learners (focused on the shortcomings of online courses and frustrated by the lack
of a personalized approach);
9) optimistic learners (learners who are always set on success);
10) learners with the Pygmalion effect (justifying their personal failures on the course by the
fact that most learners do not complete online courses).
2.2. Deep learning for EDM and LA
2.2.1. Sentiment analysis in education
The challenges of this area are closely related to identifying the mood of students towards
a course, teacher, university, etc., based on the analysis of text content in the form of
reviews, comments, and posts on various media platforms, including social networks. Analysis of
student communities and of the content of students’ profiles is also an important challenge for
identifying aggression [10].
Sentiment analysis and analysis of opinions allow us to identify the emotional component
(boredom, aggression, fear, antipathy, frustration, inspiration, joy, lively interest, etc.) associated
with learning, to understand the reasons for student behavior (loss of interest in the learning
process, involvement in the educational process, etc.), to detect their interests and needs, as
well as the expectations of students and their perception of reality. Another application of
sentiment analysis in education is the development of systems for the qualitative assessment
of the work of teachers and universities, teaching methods, and the quality of the educational
process based on the analysis of feedback from students. Sutoyo et al. [11] noted the similarities
between understanding behavior and developing retention strategies for customers and for learners
who use e-learning environments. Strategies for retaining learners are closely related to
improving learning content, teaching styles, curriculum personalization, the development of
individual learning trajectories, and the development of tools for monitoring and assessing the
activity of teachers and students. For constructing individual learning trajectories, it is
important to understand students’ personality traits, which can be extracted by mining students’
opinions from social media data [12].
The main stages of studying text content in educational analytics include text understanding
based on natural language processing (NLP) methods with subsequent vectorization (BoW,
one-hot encoding, CoVe embedding, word2vec, etc.) and the use of machine learning methods
(SVM, neural networks, Random Forest, etc.) for classifying or clustering educational text data.
Dsouza et al. [13] investigated the effectiveness of various machine learning methods for the
sentiment analysis of student feedback and noted that the Multinomial Naive Bayes classifier
was more accurate than SVM and Random Forest.
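To make the vectorize-then-classify pipeline concrete, the following is a minimal sketch of a bag-of-words Multinomial Naive Bayes classifier with Laplace smoothing, written with the Python standard library only. The toy feedback sentences, the labels, and all function names are hypothetical; this is not the experimental setup of [13].

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a text and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_nb(samples):
    """Collect per-class document counts and word counts from (text, label) pairs."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        class_docs[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return class_docs, word_counts, vocabulary


def predict_nb(model, text):
    """Return the class with the highest log-posterior under Laplace smoothing."""
    class_docs, word_counts, vocabulary = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # ignore words never seen during training
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy feedback, for illustration only.
feedback = [
    ("clear and helpful lectures", "positive"),
    ("great teacher and great examples", "positive"),
    ("boring and confusing lectures", "negative"),
    ("bad audio and boring slides", "negative"),
]
model = train_nb(feedback)
prediction = predict_nb(model, "great and clear examples")
```

In practice a library implementation and a much larger labeled corpus would be used; the sketch only shows how word counts per class drive the decision.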
Of particular interest is the development of text mining together with deep learning for
extracting regularities from students’ feedback. Sutoyo et al. [11] proposed convolutional neural
networks with three types of layers for classifying students’ comments by sentiment. For
feature extraction from texts, the authors used a convolution layer with different kernel sizes,
a pooling layer, and a fully connected layer with a softmax activation function that interprets
the result as a sentiment class. Kandhro et al. [14] used sentiment analysis for teacher
performance assessment based on an LSTM architecture with an embedding layer (word2vec and a
pre-trained model based on it), an LSTM layer, a dense layer, and a softmax activation function
for final classification, and achieved about 90% accuracy.
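Both architectures end with a softmax layer that turns the raw outputs of the last dense layer into class probabilities. A minimal sketch of that final step follows; the class names and the raw output values are hypothetical and do not come from [11] or [14].

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of a final dense layer for three sentiment classes.
classes = ["negative", "neutral", "positive"]
logits = [0.3, -1.2, 2.1]
probabilities = softmax(logits)
predicted = classes[probabilities.index(max(probabilities))]
```

The predicted sentiment class is simply the one with the largest probability.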
The results of learning analytics can be used to develop intelligent systems for monitoring
activity, involvement, and changes in the emotional states of students, for example, math anxiety
[15]. Barron-Estrada et al. [16] investigated CNN together with LSTM and achieved 88.26%
accuracy. Besides, the authors noted the effectiveness of CNN for secondary emotions detection.
2.2.2. Academic performance and students’ dropout prediction
Academic performance research is crucial for universities, MOOCs, and students themselves
because academic performance determines the prestige of universities or MOOCs, shapes their
reputation, and influences the future careers of students. Besides, the development of learning
analytics tools for detecting students in the “at dropout risk” category is very important for
teachers, tutors, and instructors and for the creators of courses and curricula. Mondal and
Mukherjee [17] proposed a recurrent neural network with a single hidden layer of 40 neurons and
the ReLU activation function to analyze data on 480 students with 17 features in order to predict
students’ performance, and achieved an accuracy of around 85%. The same xAPI-Edu-Data dataset
from Kaggle was used by Bendangnuksung and Prabu [18]. The authors considered a deep neural
network (DNN) for students’ performance prediction based on data about 500 students described
by 16 features (gender, nationality, grades, topic, and different types of student and parent
activities). They used a DNN with two hidden layers, with ReLU and softmax activation functions
for the layers respectively, and achieved up to 84.3% accuracy. Yousafzai et al. [19] investigated
students’ grade prediction with a hybrid deep neural network, a bidirectional long short-term
memory (BiLSTM) network with the following types of layers: an embedding layer, a dropout layer,
a bidirectional LSTM layer, an attention layer with tanh and softmax activation functions, and an
output layer with a sigmoid activation function, together with a special procedure for selecting
the significant features that most influence students’ grades. This BiLSTM architecture with
significant feature extraction achieved an accuracy of 90.16%.
The analysis of students’ click sequences and video-watching sequences is usually based on an
RNN architecture. Jeon and Park [20] investigated the influence of clicks made while watching
learning videos on students’ dropout. To describe students’ activity, the authors proposed
analyzing n-grams of clicks and video embeddings together with a gated recurrent unit (GRU) and
achieved 78.3% accuracy in dropout prediction. Whitehill et al. [21] demonstrated the
effectiveness of a deep neural network with 5 hidden fully connected layers, each followed by the
ReLU activation function, and a softmax activation function in the output layer, which achieved
90.2% accuracy. Xiong et al. [22] used an RNN-LSTM to predict the next behavior of learners from
their previous behavior, that is, from sequences of previously observed learner actions in the
form of responses, comments, and posts, and achieved about 90% accuracy.
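The n-grams of clicks mentioned above are contiguous runs of n events in a learner's click log. A minimal sketch of their extraction follows; the event names and the session are hypothetical, and the sketch does not reproduce the feature pipeline of [20].

```python
from collections import Counter


def click_ngrams(clicks, n):
    """Return all contiguous n-grams of a click sequence as tuples."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [tuple(clicks[i:i + n]) for i in range(len(clicks) - n + 1)]


# Hypothetical click log of one learner watching a lecture video.
session = ["play", "pause", "play", "seek_back", "play", "pause", "play", "stop"]
bigram_counts = Counter(click_ngrams(session, 2))
```

Counts such as `bigram_counts[("play", "pause")]` can then serve as input features for a sequence model or a simpler classifier.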
2.2.3. Recognition of students’ facial expressions
The development of approaches to assessing the emotional state of students and teachers during
classes can be used to improve the educational process, which is especially important for
distance learning, the quality of interaction between students and teachers, the atmosphere
within the student team, and identifying content that causes difficulties. D’Errico et al. [23]
note the importance of recognizing the cognitive emotions of students for the formation of a
positive perception of the educational process, creativity, and success of students.
The development of learning analytics tools based on the recognition of students’ cognitive
emotions contributes to the creation of systems for tracking student engagement and for assessing
the complexity of tasks and their impact on academic performance. Lee and Lee [24] considered a
deep neural network with three convolutional layers, max-pooling between them, and a final
fully connected layer with a softmax activation function for classifying students’ expressions
according to the difficulty level (easy, neutral, hard) during exams. Sharma and Mansotra [25]
proposed a convolutional neural network for recognizing students’ facial emotions (sadness,
happiness, neutrality, anger, disgust, surprise, and fear) in order to detect moods and the
psychological atmosphere.
Many researchers note the importance of studying the emotional state not only of students
but also of teachers, since the perception of a teacher as a person who confidently demonstrates
their knowledge has a significant impact on the perception of the discipline and can create a
stable positive or negative attitude towards the discipline throughout life. In particular,
Bhatti et al. [26] described a deep neural network with a convolutional layer, a pooling layer
with dense blocks, and an RELM classifier that assigns instructors’ facial expressions to one of
5 classes: awe, amusement, confidence, disappointment, and neutral.
3. Dataset and methodology
We investigate data on 300 MOOCs devoted to the most popular programming languages
for machine learning, such as Python and R. We scraped 6000 reviews of these courses and
extracted 1150 sentences that describe the teacher, tutor, or instructor and 2365 sentences that
describe the course or tutorial.
The main steps of the research are as follows.
1. Clustering of MOOC titles. The titles were vectorized with a bag of words, and the similarity
of titles was detected with the simple cosine measure, which has the dot product of two vectors
in the numerator and the product of their Euclidean norms in the denominator. Additionally,
the kMeans clustering method was applied.
2. Sentiment analysis of the 1150 sentences that contain the word “teacher” or its synonyms
and the 2365 sentences about the course, tutorial, or their synonyms. We used the BERT
model for the sentiment analysis of each group. The BERT model (Bidirectional Encoder
Representations from Transformers) is based on transfer learning and has a complex architecture
with 12 or 24 stacked encoder layers, 12 or 16 bidirectional self-attention heads, and
768 or 1024 hidden units [27]. We used the Python library PyTorch and the encode method for
text vectorization based on the pre-trained BERT model. We transformed the resulting tensor of
five scores into convenient polarities from 1 to 5: scores 1 and 2 for negative
polarity, 3 for neutral, and 4 and 5 for positive polarity.
3. Frequency analysis of reviews based on polarities. We extracted the most frequent words
for each group of sentences with different polarities.
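The cosine measure from step 1 can be sketched as follows, using a bag of words built with the Python standard library. The course titles are hypothetical and are not taken from the analyzed dataset, and the whitespace tokenizer is a simplifying assumption.

```python
import math
from collections import Counter


def bag_of_words(title):
    """Count the lowercase whitespace-separated tokens of a title."""
    return Counter(title.lower().split())


def cosine_similarity(a, b):
    """Dot product of two count vectors divided by the product of their Euclidean norms."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty title is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical course titles, for illustration only.
t1 = bag_of_words("Python for machine learning")
t2 = bag_of_words("Machine learning with Python")
t3 = bag_of_words("R programming basics")
sim_close = cosine_similarity(t1, t2)  # three shared words out of four: 0.75
sim_far = cosine_similarity(t1, t3)    # no shared words: 0.0
```

Titles that share most of their words score close to 1, while titles with no words in common score 0.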
4. Results
Following the main steps of our methodology, we obtained the following results.
4.1. The titles clustering of MOOCs
The results of clustering of 300 MOOCs from Udemy based on kMeans and cosine similarity
measure are presented in figure 1.
Figure 1: The results of 14 clusters detection based on the titles of MOOCs.
The top words of the 14 clusters allow listeners to understand which MOOCs are available
and what the main directions of learning are. These words agree well with the key skills
and basics that are learned during the online courses and can be used to match popular job
vacancies in the labor market. The most widely represented courses on Udemy are aimed at
beginner and advanced levels of machine learning, in particular deep learning, web scraping,
forecasting, computer vision, and natural language processing.
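The clustering step can be illustrated with a small, self-contained k-means sketch over bag-of-words vectors. It is not the authors' implementation: the four titles are hypothetical, only two clusters are formed instead of 14, the initial centroids are fixed rather than random so that the result is reproducible, and the vectors are scaled to unit length on the assumption that Euclidean distance should then reflect cosine similarity.

```python
import math
from collections import Counter


def vectorize(titles):
    """Build bag-of-words count vectors over a shared, sorted vocabulary."""
    counts = [Counter(t.lower().split()) for t in titles]
    vocabulary = sorted(set().union(*counts))
    return vocabulary, [[c[w] for w in vocabulary] for c in counts]


def normalize(vector):
    """Scale a vector to unit length so that Euclidean distance tracks cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def kmeans(points, initial_centroids, iterations=10):
    """Plain Lloyd's algorithm; returns the cluster index assigned to every point."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical titles; the first and third serve as initial centroids.
titles = [
    "deep learning with python",
    "python deep learning bootcamp",
    "web scraping in python",
    "web scraping for beginners",
]
vocabulary, vectors = vectorize(titles)
points = [normalize(v) for v in vectors]
labels = kmeans(points, initial_centroids=[points[0], points[2]])
```

Here the two deep learning titles end up in one cluster and the two web scraping titles in the other; the most frequent words within each cluster would then play the role of the top words discussed above.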
4.2. Sentiment analysis of reviews of MOOCs based on BERT
We investigated three aspects of MOOCs: the attitude to the teacher, the attitude to the course,
and the description of issues during learning. We used the Python library transformers and
pre-trained text pre-processing for text vectorization and sentiment detection. The results of
sentiment analysis with the BERT deep model are presented in figure 2.
Figure 2: The Sentiments Detection about Relationship to Teacher (right) and Course (left) based on
BERT.
The distribution of opinions by groups, taking into account the polarity, shows that students
of MOOCs in programming are more demanding on the course than on the teacher, but in
general, they show a positive attitude.
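The mapping from the five model scores to polarities described in the methodology can be sketched as follows. The sketch assumes that the rating is the index of the largest score plus one, which the text does not state explicitly; the score vectors are hypothetical stand-ins for model outputs, and the function name is illustrative.

```python
def polarity_from_scores(scores):
    """Map five class scores (for ratings 1..5) to a rating and a coarse polarity."""
    if len(scores) != 5:
        raise ValueError("expected exactly five scores, one per rating from 1 to 5")
    rating = max(range(5), key=lambda i: scores[i]) + 1  # argmax, shifted to 1..5
    if rating <= 2:
        polarity = "negative"
    elif rating == 3:
        polarity = "neutral"
    else:
        polarity = "positive"
    return rating, polarity


# Hypothetical score vectors standing in for model outputs.
examples = [
    [2.9, 1.1, -0.3, -1.5, -2.0],
    [-1.0, 0.2, 1.4, 0.3, -0.9],
    [-2.2, -1.7, -0.1, 1.8, 2.6],
]
results = [polarity_from_scores(s) for s in examples]
```

Counting the resulting polarities separately for the teacher sentences and the course sentences gives distributions of the kind shown in figure 2.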
4.3. Frequency analysis of reviews based on polarities and aspects
We created word clouds based on frequency analysis with the wordcloud Python library, taking
into account sentiments and the aspects under study. The results of the frequency analysis are
shown in table 1.
Frequency analysis can be used as an additional tool for understanding the causes of students’
issues. For example, for programming MOOCs we extracted such troubles as installation of
libraries, settings, development of apps with a GUI, the difficulty of learning two programming
languages at the same time, etc. We have highlighted words that describe well the positive and
negative emotions of MOOC listeners. The words that MOOC listeners most frequently used to
describe positive emotions towards instructors were knowledgeable, clear, talented, great,
patient, etc., and to describe negative emotions they used the words boring, bad, hard,
unprofessional, etc. Learners expressed positive emotions towards the course in the words
great, good, complete, amazing, friendly, etc., and for negative emotions they used such words
as difficult, bad, confused, expected deep, uninformative, lacked, etc.
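The frequency analysis itself reduces to counting words within each polarity group. A minimal sketch, assuming a small illustrative stop-word list and hypothetical review sentences that have already been grouped by detected polarity:

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "is", "and", "was", "very", "but", "to", "of"}


def top_words(sentences, n=3):
    """Return the n most frequent non-stop-words across the given sentences."""
    counts = Counter()
    for sentence in sentences:
        for word in re.findall(r"[a-z']+", sentence.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(n)


# Hypothetical review sentences already grouped by detected polarity.
by_polarity = {
    "positive": ["The instructor is clear and patient", "Great course, clear examples"],
    "negative": ["The course was boring", "Boring slides and bad audio"],
}
top = {polarity: top_words(group, n=1) for polarity, group in by_polarity.items()}
```

The same counts could be passed to a word cloud renderer to produce images like those summarized in table 1.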
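Table 1: The clouds of words with polarities and aspects (the word cloud images are not reproduced here)
Aspect | Positive polarity | Negative polarity
teacher | [word cloud] | [word cloud]
course | [word cloud] | [word cloud]
issue | - | [word cloud]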
We demonstrate that even a simple tool built on Python natural language processing libraries
gives an understanding of the advantages and disadvantages of MOOCs. Developers, experts, and
instructors can use this understanding to create better-quality MOOCs oriented to the
preferences of each listener, based on listener feedback analytics.
5. Conclusion
We demonstrated that text mining approaches based on the deep BERT model and clustering
can be considered instruments for learning analytics. Such instruments are aimed at the
creation of personalized MOOCs and their contents and at understanding the issues, preferences,
and needs of students. Feedback from students in the form of comments, reviews, and posts can
be used for assessing the quality of education and detecting directions for its improvement.
We demonstrated that learners of MOOCs are more demanding of the course and its content
than of the teacher, and that their opinions, in general, have a positive sentiment.
References
[1] M. S. Mazorchuk, T. S. Vakulenko, A. O. Bychko, O. H. Kuzminska, O. V. Prokhorov,
Cloud technologies and learning analytics: Web application for PISA results analysis and
visualization, CEUR Workshop Proceedings 2879 (2020) 484–494.
[2] P. Bachhal, S. Ahuja, S. Gargrish, Educational data mining: A review, Journal of Physics:
Conference Series 1950 (2021) 012022. doi:10.1088/1742-6596/1950/1/012022.
[3] E. İnan, M. Ebner, Learning Analytics and MOOCs, in: P. Zaphiris, A. Ioannou (Eds.),
Learning and Collaboration Technologies. Designing, Developing and Deploying Learning
Experiences, Springer International Publishing, Cham, 2020, pp. 241–254.
doi:10.1007/978-3-030-50513-4_18.
[4] S. Nunn, J. T. Avella, T. Kanai, M. Kebritchi, Learning analytics methods, benefits, and
challenges in higher education: A systematic literature review, Online Learning 20 (2016).
doi:10.24059/olj.v20i2.790.
[5] M. Umair, A. Hakim, A. Hussain, S. Naseem, Sentiment analysis of students’ feedback
before and after COVID-19 pandemic, International Journal on Emerging Technologies 12
(2021) 177–182. URL:
https://www.researchgate.net/publication/353305417_Sentiment_Analysis_of_Students%27_Feedback_before_and_after_COVID-19_Pandemic.
[6] L. Tan, J. B. Main, R. Darolia, Using random forest analysis to identify student demographic
and high school-level factors that predict college engineering major choice, Journal of
Engineering Education 110 (2021) 572–593. doi:10.1002/jee.20393.
[7] A. B. E. D. Ahmed, I. S. Elaraby, Data mining: A prediction for student’s performance,
World Journal of Computer Application and Technology 2 (2014) 43–47.
doi:10.13189/wjcat.2014.020203.
[8] H. Hamsa, S. Indiradevi, J. J. Kizhakkethottam, Student academic performance prediction
model using decision tree and fuzzy genetic algorithm, Procedia Technology 25 (2016)
326–332. doi:10.1016/j.protcy.2016.08.114.
[9] A. Topîrceanu, G. Grosseck, Decision tree learning used for the classification of student
archetypes in online courses, Procedia Computer Science 112 (2017) 51–60.
doi:10.1016/j.procs.2017.08.021.
[10] F. K. Ventirozos, I. Varlamis, G. Tsatsaronis, Detecting aggressive behavior in discussion
threads using text mining, in: A. Gelbukh (Ed.), Computational Linguistics and Intelligent
Text Processing, Springer International Publishing, Cham, 2018, pp. 420–431.
doi:10.1007/978-3-319-77116-8_31.
[11] E. Sutoyo, A. Almaarif, I. T. R. Yanto, Sentiment analysis of student evaluations of teaching
using deep learning approach, in: J. H. Abawajy, K.-K. R. Choo, H. Chiroma (Eds.),
International Conference on Emerging Applications and Technologies for Industry 4.0
(EATI’2020), Springer International Publishing, Cham, 2021, pp. 272–281.
doi:10.1007/978-3-030-80216-5_20.
[12] A. Khowaja, M. H. Mahar, H. Nawaz, S. Wasi, S. ur Rehman, Personality evaluation of
student community using sentiment analysis, International Journal of Computer Science
and Network Security 19 (2019) 167–180.
[13] D. D. Dsouza, Deepika, D. P. Nayak, E. J. Machado, N. D. Adesh, Sentimental analysis of
students feedback using machine learning techniques, International Journal of Recent
Technology and Engineering 8 (2019) 986–991. URL:
https://www.ijrte.org/wp-content/uploads/papers/v8i1s4/A11810681S419.pdf.
[14] I. A. Kandhro, S. Wasi, K. Kumar, M. Rind, M. Ameen, Sentiment analysis of students’
comment using long-short term model, Indian Journal of Science and Technology 12
(2019). doi:10.17485/ijst/2019/v12i8/141741.
[15] Y. Dyulicheva, Learning Analytics in MOOCs as an Instrument for Measuring Math
Anxiety, Voprosy obrazovaniya / Educational Studies Moscow (2021).
doi:10.17323/1814-9545-2021-4-243-265.
[16] M. L. Barron-Estrada, R. Zatarain-Cabada, R. Oramas-Bustillos, Emotion recognition for
education using sentiment analysis, Research in Computing Science 148 (2019) 71–80.
doi:10.13053/rcs-148-5-8.
[17] A. Mondal, J. Mukherjee, An approach to predict a student’s academic performance using
Recurrent Neural Network (RNN), International Journal of Computer Applications 181
(2019) 1–5. doi:10.5120/ijca2018917352.
[18] Bendangnuksung, P. Prabu, Students’ performance prediction using deep neural network,
International Journal of Applied Engineering Research 13 (2018) 1171–1176.
[19] B. K. Yousafzai, S. A. Khan, T. Rahman, I. Khan, I. Ullah, A. Ur Rehman, M. Baz, H. Hamam,
O. Cheikhrouhou, Student-performulator: Student academic performance using hybrid
deep neural network, Sustainability 13 (2021). URL: https://www.mdpi.com/2071-1050/13/17/9775.
doi:10.3390/su13179775.
[20] B. Jeon, N. Park, Dropout prediction over weeks in MOOCs by learning representations of
clicks and videos (2020). arXiv:2002.01955.
[21] J. Whitehill, K. Mohan, D. Seaton, Y. Rosen, D. Tingley, Delving deeper into MOOC student
dropout prediction, 2017. arXiv:1702.06404.
[22] F. Xiong, K. Zou, Z. Liu, H. Wang, Predicting learning status in MOOCs using LSTM,
in: Proceedings of the ACM Turing Celebration Conference - China, ACM TURC ’19,
Association for Computing Machinery, New York, NY, USA, 2019, p. 74.
doi:10.1145/3321408.3322855.
[23] F. D’Errico, M. Paciello, B. D. Carolis, A. Vattanid, G. Palestra, G. Anzivino, Cognitive
emotions in e-learning processes and their potential relationship with students’ academic
adjustment, International Journal of Emotional Education 10 (2018) 89–111. URL:
https://files.eric.ed.gov/fulltext/EJ1177644.pdf.
[24] H.-J. Lee, D. Lee, Study of process-focused assessment using an algorithm for facial
expression recognition based on a deep neural network model, Electronics 10 (2021) 54.
URL: https://www.mdpi.com/2079-9292/10/1/54. doi:10.3390/electronics10010054.
[25] A. Sharma, V. Mansotra, Deep learning based student emotion recognition from facial
expressions in classrooms, International Journal of Engineering and Advanced Technology
8 (2019) 4691–4699. doi:10.35940/ijeat.F9170.088619.
[26] Y. K. Bhatti, A. Jamil, N. Nida, M. H. Yousaf, S. Viriri, S. A. Velastin, Facial expression
recognition of instructor using deep features and extreme learning machine, Computational
Intelligence and Neuroscience 2021 (2021) 5570870. doi:10.1155/2021/5570870.
[27] S. A. Rauf, Y. Qiang, S. B. Ali, W. Ahmad, Using BERT for checking the polarity of movie
reviews, International Journal of Computer Applications 177 (2019) 37–41.
doi:10.5120/ijca2019919675.