=Paper=
{{Paper
|id=Vol-2869/PAPER_02
|storemode=property
|title=Comparative Analysis of Various Techniques used for Predicting Student's Performance
|pdfUrl=https://ceur-ws.org/Vol-2869/PAPER_02.pdf
|volume=Vol-2869
|authors=Amita Dhankhar,Kamna Solanki
}}
==Comparative Analysis of Various Techniques used for Predicting Student's Performance==
Amita Dhankhar (a), Kamna Solanki (b)

(a) University Institute of Engineering and Technology, Maharshi Dayanand University, Rohtak, India
(b) University Institute of Engineering and Technology, Maharshi Dayanand University, Rohtak, India

Abstract
Digitization is transforming all aspects of education. Learners' interactions with their online and offline learning environments leave a trail of data that can be used for analysis. Learning analytics (LA) and educational data mining (EDM) are emerging fields that develop methods to handle this abundance of data from the educational domain in order to optimize learning and to support decisions related to learning, teaching, and educational management. EDM/LA techniques interpret such enormous data and turn it into useful action. They give teachers insight to improve teaching, to understand learners, to identify difficulties faced by learners, and to provide meaningful feedback to learners, thereby improving learner performance. This paper compares different EDM/LA techniques applied in the educational domain to predict student performance and identifies their potential strengths and weaknesses.

Keywords
Educational data mining, learning analytics, machine learning, supervised learning, unsupervised learning.

WTEK-2021: Workshop on Technological Innovations in Education and Knowledge Dissemination, May 01, 2021, Chennai, India.
EMAIL: amita.infotech@gmail.com (Amita Dhankhar)
ORCID: 0000-0002-9305-4088 (Amita Dhankhar)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction
Technology is evolving rapidly [1]. This technological advancement generates tremendous amounts of data, which has become an integral part of all sectors [2], and the educational sector is no exception. Big data in the education sector provides unprecedented opportunities for teachers and educational institutes. The exploration and analysis of enormous amounts of data to discover significant patterns is called data mining (DM). It can also be defined as "a non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns from data" [3]. When DM techniques are applied to data gathered from the educational domain to extract knowledge, the field is called educational data mining [4]. One of the significant areas of interest for researchers in EDM is the prediction of student performance. Predicting student performance in a timely manner helps identify poorly performing students, thereby allowing teachers to intervene early. EDM/LA techniques such as classification, clustering, association analysis, and prediction are used to transform raw data into significant information, and computational advancements in data mining and learning analytics have helped this effort significantly [5]. Considering the importance of the various techniques for predicting student performance, a detailed comparative analysis of these techniques is valuable. The rest of the paper is organized as follows: the methodology is described in Section 2, the results are summarized in Section 3, and the critical analysis and conclusion are given in Sections 4 and 5.

2. Methodology
This paper performs a comparative analysis of various techniques used for predicting student performance.
For this purpose, relevant articles were identified, selected, evaluated critically using several criteria, and the findings were then integrated. A few research questions were formulated to streamline our contribution:
RQ-1: What EDM/LA techniques are used for predicting student performance?
RQ-2: How do the various techniques compare on different facets, including their strengths, weaknesses, and accuracy?
To address these research questions, we adopted the PICO model [6], which consists of four key components, namely population, intervention, comparison, and outcomes. The PICO components of this paper are listed below. We searched three databases, namely Scopus, IEEE, and Science Direct, for articles published from 2016 to 2020.
Population: Articles predicting student's performance
Intervention: EDM/LA techniques
Comparison: Comparative analysis of EDM/LA techniques
Outcomes: Effectiveness and accuracy of the techniques
The search string used was: (Prediction OR forecast OR predict) AND (techniques OR methods OR framework) AND (student's performance OR retention OR at-risk) AND (Engineering OR Higher education) AND (data mining OR machine learning OR Learning analytics). To obtain relevant results, the syntax of the string was modified slightly for each database. The articles identified through database searching were evaluated using inclusion and exclusion criteria. The inclusion criteria required that an article explicitly predict student performance (predictive models, techniques, or methods), be a journal article whose full text is available for analysis, focus on an empirical study, and belong to the higher education domain. Articles not written in English, conference articles, and articles whose full text was not available were excluded.
Figure 1: PRISMA flow chart of methodology [7].

3. Results
In this section, we describe the details of the reviewed articles, the EDM/LA techniques used for predicting student performance, and a comparative analysis of these techniques on different facets, including their strengths, weaknesses, and accuracy. Regression and classification are the techniques most commonly used in educational data mining and learning analytics. Classification is a supervised learning method that analyzes a set of data and assigns each record to one of a predefined set of classes. In the context of higher education, this approach has been used to determine or predict student success or failure by identifying patterns in students' learning activities with online learning resources. Classification techniques can be used to predict student performance, students at risk or retention [8-10], student dropout [11,12], student achievement [13], which students are likely to submit their assignments [14], and student engagement during a course [15]. The following subsections discuss the individual techniques used for predicting student performance.
Figure 2: Distribution of techniques used for predicting students' performance.
Figure 3: Distribution of datasets used for predicting student's performance.

3.1. k-NN
k-nearest neighbor (k-NN) is a supervised machine learning algorithm. It is a simple yet powerful technique that can be used for both classification and regression problems. The basic concept of k-NN is to classify a test record by using feature similarity: it calculates the distance (closeness or proximity) between the test record and each training record in the dataset, then performs majority voting and assigns the test record to the class held by the majority of its nearest neighbors. The distance can be calculated using various distance functions such as Euclidean, Cosine, Chi-square, Minkowski, etc. [38-42].
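To make the distance-and-voting idea concrete, the following minimal Python sketch (not taken from any of the reviewed studies) classifies a test record with Euclidean distance and majority voting; the two-feature assessment-score vectors and pass/fail labels are hypothetical, invented only for illustration.

```python
from collections import Counter
import math

def euclidean(a, b):
    # Distance between two equal-length feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k=3):
    # Rank training records by closeness to the query, then take a
    # majority vote over the labels of the k nearest neighbours.
    neighbours = sorted(zip(train_X, train_y),
                        key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical example: two assessment scores -> pass/fail label.
train_X = [(35, 40), (80, 85), (30, 45), (90, 75), (60, 65)]
train_y = ["fail", "pass", "fail", "pass", "pass"]
print(knn_predict(train_X, train_y, query=(55, 50), k=3))
```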
3.2. Naive Bayes
Naive Bayes is a classification algorithm that assumes the predictor variables are independent of each other. It is based on Bayes' theorem, which is derived from conditional probability. Bayes' theorem gives an equation for computing the posterior probability P(c|x) from P(c), P(x), and P(x|c):

P(c|x) = P(x|c) P(c) / P(x)

where P(c|x) is the posterior probability of the class (c, target) given the predictor (x, attributes), P(c) is the prior probability of the class, P(x|c) is the likelihood, i.e., the probability of the predictor given the class, and P(x) is the prior probability of the predictor. The algorithm classifies a test record by computing the conditional probability of each class Ci given the feature vector x1, x2, ..., xn. Naive Bayes can be applied in recommendation systems, spam filtering, and sentiment analysis [43-48].
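A minimal sketch of this computation for categorical features is given below; it works in log space (the evidence P(x) is the same for every class and can be dropped when comparing classes) and uses Laplace smoothing. The attendance/assignment features and pass/fail labels are hypothetical and only illustrate the mechanics.

```python
import math
from collections import Counter, defaultdict

def train_nb(X, y):
    # Count class frequencies and, per class, how often each feature value occurs.
    class_counts = Counter(y)
    value_counts = defaultdict(Counter)   # (feature_idx, class) -> Counter of values
    vocab = defaultdict(set)              # feature_idx -> set of observed values
    for row, label in zip(X, y):
        for i, v in enumerate(row):
            value_counts[(i, label)][v] += 1
            vocab[i].add(v)
    return class_counts, value_counts, vocab

def predict_nb(model, row, alpha=1.0):
    # argmax_c P(c) * prod_i P(x_i | c), computed with log-probabilities;
    # the constant evidence P(x) is omitted.
    class_counts, value_counts, vocab = model
    n = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for c, cc in class_counts.items():
        score = math.log(cc / n)          # log prior P(c)
        for i, v in enumerate(row):
            counts = value_counts[(i, c)]
            # Laplace-smoothed likelihood P(x_i | c).
            score += math.log((counts[v] + alpha) / (cc + alpha * len(vocab[i])))
        if score > best_score:
            best, best_score = c, score
    return best

# Hypothetical categorical features: (attendance level, assignment submission).
X = [("high", "on_time"), ("low", "late"), ("high", "late"), ("low", "missing")]
y = ["pass", "fail", "pass", "fail"]
print(predict_nb(train_nb(X, y), ("high", "missing")))
```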
3.3. Logistic Regression
Logistic regression (LR) is a statistical method that can be used for binary classification problems. It assumes that the classes are almost linearly separable. It uses a logistic function, also called the sigmoid function, to map predicted values to probabilities, and it utilizes the logit function for predicting the probability of occurrence of a binary event [49-53].

3.4. Linear Regression
Linear regression is a supervised learning method. It finds a function F(X) → Y that, for a given X, predicts Y, where Y is continuous. Many types of functions can be used; the simplest is a linear function. X can comprise a single feature or multiple features. The basic concept of linear regression is to find the line that best fits the data, meaning the line for which the total prediction error over all data points is as small as possible, where the error is the distance between a point and the regression line [54-58].
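The sketch below illustrates both ideas under simple assumptions: a one-feature logistic model fitted by gradient descent on the log-loss (Section 3.3), and the closed-form least-squares line (Section 3.4). The hours/pass/score data are hypothetical.

```python
import math

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    # Fit P(y=1 | x) = sigmoid(w*x + b) by gradient descent on the log-loss;
    # xs are single features, ys are 0/1 labels.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

def fit_linear(xs, ys):
    # Closed-form least-squares line y = w*x + b: the "best fit" line that
    # minimises the total squared prediction error over all points.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return w, my - w * mx

# Hypothetical data: hours of LMS activity vs. pass (1) / fail (0) and exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
passed = [0, 0, 1, 1, 1]
scores = [35.0, 48.0, 55.0, 70.0, 78.0]
w, b = fit_logistic(hours, passed)
print("P(pass | 3.5 h):", round(1 / (1 + math.exp(-(w * 3.5 + b))), 2))
print("linear fit (slope, intercept):", fit_linear(hours, scores))
```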
3.5. Support Vector Machine
The support vector machine (SVM) is a very popular machine learning technique that can be used for both classification and regression. The core idea of SVM is to find a hyperplane that separates the two classes as widely as possible, in other words, the hyperplane that maximizes the margin; as the margin increases, the generalization accuracy increases. The training points that lie closest to the hyperplane and define the margin are called support vectors. Variations of SVM include linear SVM, polynomial kernel SVM, and radial basis function (RBF) kernel SVM [24][25][38][58][59].
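A minimal linear-SVM sketch is shown below, assuming a hinge-loss sub-gradient training loop (kernels are omitted); the two-feature data and labels are hypothetical. The L2 shrinkage keeps the norm of w small, which is what maximizing the margin corresponds to.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=500):
    # Minimise hinge loss + L2 penalty with sub-gradient descent.
    # Labels y must be -1/+1; (w, b) define the separating hyperplane.
    dim = len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Regularisation shrinks w every step; the hinge term is active only
            # for points inside the margin (these are the support vectors).
            w = [wj - lr * lam * wj for wj in w]
            if margin < 1:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

# Hypothetical two-feature data (e.g., quiz average, forum activity), labels -1/+1.
X = [(0.2, 0.1), (0.4, 0.3), (0.8, 0.9), (0.9, 0.7)]
y = [-1, -1, 1, 1]
w, b = train_linear_svm(X, y)
print("decision value for (0.7, 0.8):",
      sum(wj * xj for wj, xj in zip(w, (0.7, 0.8))) + b)
```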
3.6. Decision Trees
A decision tree (DT) is not a distance-based method. It can be used for both regression and classification, though it is mostly used for classification, and it extends naturally to multi-class classification. A DT is structured as a tree with two types of nodes: decision nodes and leaf nodes. Starting at the root node, the algorithm checks the condition at each decision node, follows the matching branch, and continues until it reaches a leaf node; the predicted value is stored at the leaf node [60-69].

3.7. Random Forest
Random forest (RF) is basically a bagging technique. Samples of the rows and of the features are taken and given to each of many base learners; in a random forest, the base learners are decision trees. This sampling step is the bootstrap. After this, aggregation is done by majority voting [70-73].
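The sketch below ties the two sections together under simplifying assumptions: a small Gini-based decision tree and a bagged ensemble of such trees aggregated by majority vote. A full random forest would also subsample features at each split; that step is omitted here for brevity, and the quiz/attendance data are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    # Gini impurity of a set of class labels.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    # Try every (feature, threshold) pair; keep the split with the lowest
    # weighted impurity of the two resulting branches.
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left]) +
                     len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    return best

def build_tree(rows, labels, depth=3):
    # Leaf when the node is pure, the depth limit is reached, or no split helps.
    if depth == 0 or len(set(labels)) == 1:
        return Counter(labels).most_common(1)[0][0]
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = split
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left], depth - 1),
            build_tree([rows[i] for i in right], [labels[i] for i in right], depth - 1))

def predict_tree(node, row):
    while isinstance(node, tuple):        # walk decision nodes until a leaf label
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def random_forest(rows, labels, n_trees=25, depth=3):
    # Bagging: each tree is trained on a bootstrap sample of the rows.
    random.seed(0)
    forest = []
    for _ in range(n_trees):
        idx = [random.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], depth))
    return forest

def predict_forest(forest, row):
    # Aggregate the individual trees by majority vote.
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

# Hypothetical data: (quiz average, attendance %) -> pass/fail.
rows = [(40, 50), (45, 60), (70, 80), (85, 90), (60, 55), (90, 95)]
labels = ["fail", "fail", "pass", "pass", "fail", "pass"]
print(predict_forest(random_forest(rows, labels), (75, 85)))
```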
Table 1: Papers on prediction of student's performance

[9] Objective: Identify students at risk of course failure, provide early prediction of students at risk of withdrawal from the course, and identify patterns of students who pass with distinction. Predictive model/technique: Logistic Regression, SVM, Deep ANN classification model. Evaluation/method: the Deep ANN classification model achieved 93% accuracy. Data set: Open University Learning Analytics (OULA). Mode: Online (VLE).
[11] Objective: Predict whether a student will drop out of a course. Predictive model/technique: LOGIT_Act knowledge discovery system, which uses logistic regression modeling and classification. Evaluation/method: the LOGIT_Act knowledge system achieves an accuracy of 97.13%. Data set: Activity data from the Moodle DB of Madrid Open University. Mode: MOODLE.
[12] Objective: Predict dropout using an integrated framework with feature selection and feature generation. Predictive model/technique: FSPred framework, which uses feature selection plus a logistic regression model. Evaluation/method: the F1 score of FSPred is 84.69. Data set: XuetangX for KDD CUP 2015. Mode: MOOC.
[13] Objective: Design a student achievement prediction framework using a layer-supervised multi-layer perceptron (MLP) neural network-based method. Predictive model/technique: SVM, NB, LR, MLP neural network-based method. Evaluation/method: the F1 score of the MLP neural network-based method is 81.3%. Data set: University. Mode: Traditional.
[24] Objective: An innovative two-stage approach is proposed and its effectiveness is evaluated by applying it to two different but complementary datasets. Predictive model/technique: Gaussian RBF kernel and polynomial kernel applied to RF, deep neural network, and SVM. Evaluation/method: 95.53% accuracy achieved by the deep neural network. Data set: Higher education data set. Mode: Moodle learning management system.
[25] Objective: A simple model, Gradual At-risk (GAR), is presented to identify at-risk students. Predictive model/technique: Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Decision Tree (DT)-CART, Naïve Bayes (NB). Evaluation/method: SVM achieved an accuracy of 92.41%. Data set: Universitat Oberta de Catalunya. Mode: UOC LMS.
[26] Objective: Two models are proposed, namely the learning achievement model and the at-risk student model. Predictive model/technique: Generalized Linear Model (GLM), Gradient Boosting Machine (GBM), AdaBoost algorithm, Multi-Layer Perceptron (NNET2), Feedforward Neural Network with a single hidden layer (NNET1), Random Forest (RF). Evaluation/method: the GBM/AdaBoost algorithm achieved the highest accuracy, 89.4%. Data set: Harvard University and Massachusetts Institute of Technology online courses, Open University online courses. Mode: VLE.
[27] Objective: Predict the possibility of students dropping out by implementing machine and statistical learning methods using a deep neural network. Predictive model/technique: logistic regression, multilayer perceptron algorithm. Evaluation/method: accuracy = 77%. Data set: University in Taiwan. Mode: University's Institutional Research Database.
[28] Objective: Discover the impact of online activity data and assessment grades in the LMS on students' performance. Predictive model/technique: Sequential minimal optimization (SMO), logistic regression, multilayer perceptron (MLP), decision tree (J48), random forest. Evaluation/method: Random Forest achieved the highest accuracy, i.e., 99.17%. Data set: Deanship of E-Learning and Distance Education at King Abdulaziz University. Mode: LMS.
[29] Objective: Use DM techniques to predict students' academic performance and to help advise students. Predictive model/technique: Decision tree, Naive Bayes. Evaluation/method: J48 achieved the highest accuracy, 84.38%. Data set: Umm Al-Qura University in Makkah. Mode: Traditional.
[30] Objective: Developed a "University Students Result Analysis and Prediction System". Predictive model/technique: decision tree algorithms J48, REPTree, and Hoeffding Tree. Evaluation/method: the accuracy of J48 is the highest, 85.64%. Data set: university student database, collected from students through a Google Docs survey. Mode: Traditional.
[31] Objective: Proposed a multi-task learning framework to find the performance of students and their "mastery of knowledge points" in MOOCs using online assignment behavior. Predictive model/technique: "multi-task multi-layer LSTM with cross-entropy as the loss function", M-S-LSTM, M-F-LSTM, standard multi-layer perceptron (MLP), LSTM, standard logistic regression (LR), naïve Bayes (NB). Evaluation/method: the proposed model achieved an F1-score of 93.59. Data set: University. Mode: MOOC.
[32] Objective: Proposed a deep LSTM to find at-risk students by converting the problem into a sequential weekly format. Predictive model/technique: deep LSTM model, SVM, Logistic Regression, ANN. Evaluation/method: the proposed model achieved 90% accuracy. Data set: OULA. Mode: VLE.
[33] Objective: Analyze various EDM techniques for improving the accuracy of predicting student academic performance in a university course. Predictive model/technique: Random Forest (RF), k-Nearest Neighbour (k-NN), Logistic Regression, Naïve Bayes. Evaluation/method: Random Forest achieved the highest accuracy, 88%. Data set: University. Mode: Traditional.
[34] Objective: Applied ML methods to find the final grades of students using their previous grades. Predictive model/technique: decision tree algorithm. Evaluation/method: accuracy is 96.5%. Data set: engineering degree at an Ecuadorian university. Mode: Traditional.
[35] Objective: Behavioral data were analyzed from a learning management system used for distance learning courses in a public university; predictive models were developed, analyzed, and compared. Predictive model/technique: Naïve Bayes (NB), Support Vector Machine (SVM), Logistic Regression (LR), CART decision tree. Evaluation/method: Logistic Regression achieved the highest accuracy, 89.3%. Data set: University of Pernambuco Distance Learning Department (NEAD/UPE). Mode: Moodle LMS platform.
[36] Objective: Predicting student academic performance using a "multi-model heterogeneous ensemble" approach. Predictive model/technique: Decision tree (DT), artificial neural network (ANN), Support Vector Machine (SVM), and an ensemble hybrid model. Evaluation/method: the ensemble hybrid model achieved the highest accuracy, 77.69%. Data set: The University of the West of Scotland. Mode: LMS, Student Record System (SRS), and questionnaire.
[37] Objective: Predict the performance of students before the completion of the course; analyze the progress of the students throughout the course and combine it with the prediction results. Predictive model/technique: Decision Tree, 1-Nearest Neighbour, Naive Bayes, Neural Networks, Random Forest. Evaluation/method: Naive Bayes achieved the highest accuracy, 83.6%. Data set: Information Technology, Engineering University, Pakistan. Mode: Traditional.

Table 2: Advantages and disadvantages of various techniques used in predicting student's performance

k-NN [16][38-42]
Advantages: Simple algorithm and easy to understand, interpret, and implement. Makes no assumptions about the data, and is therefore helpful for nonlinear data. A versatile algorithm, as it can be used for both regression and classification.
Disadvantages: Because it stores all the training data, it becomes a computationally expensive algorithm and requires high memory storage. When the size of N increases, prediction becomes slow. k-NN fails if the data points in the dataset are randomly spread. If a data point is far away from the points in the dataset, its class label is uncertain. Not good for low-latency systems.

Naïve Bayes [17]
Advantages: Simple to understand and implement. If conditional independence of features holds, Naïve Bayes performs very well. A useful algorithm for high dimensions, for example text classification and email spam. Extensively used when we have categorical features. Run-time complexity, training-time complexity, and run-time space complexity are low. Interpretability is good.
Disadvantages: If conditional independence of features does not hold, Naïve Bayes performance degrades. Seldom used for real-valued features. Easily overfits (if the data changes slightly, the model changes drastically) if Laplace smoothing is not used.

Logistic Regression [18]
Advantages: Performs well if classes are almost linearly separable. Model interpretability is easy, as we can determine feature importance. For small dimensionality it performs very well. Memory efficient, and outliers have less impact because of the sigmoid function.
Disadvantages: If classes are not almost linearly separable, logistic regression fails. If dimensionality is large, it is prone to overfitting and L1 regularization has to be applied.

Linear Regression [19]
Advantages: Simple to implement. Model interpretability is easy. Performs very well for a linearly separable dataset. The impact of overfitting can be reduced by using regularization.
Disadvantages: High impact of outliers. Multicollinearity must be removed before applying LR. Prone to underfitting.

Support Vector Machine [20]
Advantages: The real strength of SVM is the kernel trick; with the right/appropriate kernel function, SVM solves complex problems. Very effective when the dimensionality is high. Can perform linearly inseparable classification with a global optimum.
Disadvantages: Not easy to find the right/appropriate kernel function. Training-time complexity is high for a large dataset. Difficult to interpret and understand the model, as we cannot obtain feature importance directly from the kernel. For RBF with a small sigma, outliers have a huge impact on the model.

Decision Tree [21]
Advantages: High interpretability. No need to perform feature standardization or normalization. Logical feature interaction is built into DT. DT extends naturally to multiclass classification. Feature importance is straightforward in DT. Space efficient.
Disadvantages: In the case of imbalanced data, we have to balance the data before applying DT. For large dimensionality, the time complexity to train a DT increases dramatically. If only a similarity matrix is given, DT does not work, as DT needs the features explicitly. As depth increases, the possibility of overfitting increases, interpretability decreases, and the impact of outliers can be significant.

Random Forest [22]
Advantages: Robust to outliers. No need to perform feature standardization or normalization. Logical feature interaction is built into RF. RF extends naturally to multiclass classification. Feature importance is straightforward in RF.
Disadvantages: Does not handle large dimensionality very well. Does not handle categorical features with many categories effectively. Training-time complexity is high.

Ensemble Methods [84-91]
Advantages: Capture linear and nonlinear relationships in the data. Robust and stable model. Minimize noise, bias, and variance.
Disadvantages: Interpretability of the model is reduced due to increased complexity. Training time is longer. Difficult to select the models to ensemble.

Neural Network [23][74-83]
Advantages: Non-linear model. Operates with insufficient data. Capable of updating and reasoning.
Disadvantages: Requires a large amount of information for training. Does not handle mixed variables well. Black-box nature.

4. Critical Analysis
The comparative analysis shows that results on the techniques used to predict student performance are quite indecisive, as different authors report different results. It is also evident from the comparative analysis of the data that most authors have used supervised learning techniques, whereas only a few have chosen unsupervised learning techniques for predicting student performance; researchers should therefore place more emphasis on unsupervised learning techniques. The analysis shows that the decision tree is the technique most used by authors, followed by neural networks and regression. It is also evident from the comparative analysis that most authors predicted student performance at the university level.

5. Conclusion
In this paper, we have reviewed EDM/LA techniques and their strengths and weaknesses for predicting student performance. From the analysis of these papers, we can draw some conclusions. The comparative analysis indicates ambivalent results on which technique can best predict student performance. Asif et al. [37] showed that Naïve Bayes achieved the highest classification accuracy, at 83.6%. However, Rodrigues et al. [35] noted that logistic regression outperformed the decision tree (CART), support vector machine, and Naïve Bayes with 89.3% prediction accuracy. Moreover, Adejo et al. [36] indicated that the ensemble hybrid model achieved the highest prediction accuracy, at 77.69%, compared with DT, ANN, and SVM. According to Ramaswami et al. [33], Random Forest outperformed NB, LR, and k-NN with 88% prediction accuracy. Baneres et al. [25] noted that SVM achieved the highest prediction accuracy, 92.41%, compared with KNN, CART, and NB. Hung et al. [24] noted that a deep NN achieved 95.53% prediction accuracy and outperformed RF and SVM. It therefore remains indecisive which technique predicts student performance most accurately, as different authors present different results. It is evident from the reviewed papers that DT (22%) is the technique most used by authors for predicting student performance, followed by neural networks and regression; in addition, Random Forest, SVM, NB, and ensemble methods have also been used. Moreover, it is evident from the data collected for this paper that most authors used supervised learning techniques, whereas only a few authors (2%) used unsupervised learning techniques for the prediction of student performance. This is an opportunity for researchers to conduct further research on unsupervised learning techniques. Also, 52% of the papers reviewed predicted student performance at the university level. It would be encouraging for researchers to apply the same line of predictive techniques in blended, VLE, LMS, MOODLE, and MOOC environments.

6. References
[1] Chae, B. K. (2019). A general framework for studying the evolution of the digital innovation ecosystem: The case of big data. International Journal of Information Management, 45, 83–94.
[2] Dhankhar, A., Solanki, K. (2019). A Comprehensive Review of Tools & Techniques for Big Data Analytics. International Journal of Emerging Trends in Engineering Research, 7(11), 556-562.
[3] Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P., & Uthurusamy, R. (Eds.). (1996, February). Advances in knowledge discovery and data mining. American Association for Artificial Intelligence.
[4] Romero, C., & Ventura, S. (2010). Educational data mining: a review of the state of the art. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 40(6), 601-618.
[5] Dhankhar, A., Solanki, K., Dalal, S., Omdev (2021). Predicting Students Performance Using Educational Data Mining and Learning Analytics: A Systematic Literature Review. In: Raj J.S., Iliyasu A.M., Bestak R., Baig Z.A. (eds) Innovative Data Communication Technologies and Application. Lecture Notes on Data Engineering and Communications Technologies, vol 59. Springer, Singapore. https://doi.org/10.1007/978-981-15-9651-3_11
[6] Petersen, K., Vakkalanka, S., Kuzniarz, L.: Guidelines for conducting systematic mapping studies in software engineering: An update. Information and Software Technology, 64, 1–18 (2015).
[7] Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., PRISMA Group: Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. BMJ, 6, 1–8 (2009).
[8] Chui, K. T., Fung, D. C. L., Lytras, M. D., & Lam, T. M.: Predicting at-risk university students in a virtual learning environment via a machine learning algorithm. Computers in Human Behavior, 107, 105584 (2020).
[9] Waheed, H., Hassan, S. U., Aljohani, N. R., Hardman, J., Alelyani, S., & Nawaz, R.: Predicting academic performance of students from VLE big data using deep learning models. Computers in Human Behavior, 104, 106189 (2020).
[10] Xing, W., Chen, X., Stein, J., & Marcinkowski, M.: Temporal predication of dropouts in MOOCs: Reaching the low hanging fruit through stacking generalization. Computers in Human Behavior, 58, 119-129 (2016).
[11] Burgos, C., Campanario, M. L., de la Peña, D., Lara, J. A., Lizcano, D., & Martínez, M. A.: Data mining for modeling students' performance: A tutoring action plan to prevent academic dropout. Computers & Electrical Engineering, 66, 541-556 (2018).
[12] Qiu, L., Liu, Y., & Liu, Y.: An integrated framework with feature selection for dropout prediction in massive open online courses. IEEE Access, 6, 71474-71484 (2018).
[13] Qu, S., Li, K., Zhang, S., & Wang, Y.: Predicting achievement of students in smart campus. IEEE Access, 6, 60264-60273 (2018).
[14] Olive, D. M., Huynh, D. Q., Reynolds, M., Dougiamas, M., & Wiese, D.: A quest for a one-size-fits-all neural network: Early prediction of students at risk in online courses. IEEE Transactions on Learning Technologies, 12(2), 171-183 (2019).
[15] Ramesh, A., Goldwasser, D., Huang, B., Daume, H., & Getoor, L.: Interpretable Engagement Models for MOOCs using Hinge-loss Markov Random Fields. IEEE Transactions on Learning Technologies (2018).
[16] https://towardsdatascience.com/machine-learning-basics-with-the-k-nearest-neighbors-algorithm-6a6e71d01761
[17] https://towardsdatascience.com/naive-bayes-classifier-81d512f50a7c
[18] https://machinelearningmastery.com/logistic-regression-for-machine-learning/
[19] https://towardsdatascience.com/linear-regression-detailed-view-ea73175f6e86
[20] https://towardsdatascience.com/support-vector-machine-introduction-to-machine-learning-algorithms-934a444fca47
[21] https://towardsdatascience.com/decision-tree-in-machine-learning-e380942a4c96
[22] https://towardsdatascience.com/understanding-random-forest-58381e0602d2
[23] https://towardsdatascience.com/understanding-neural-networks-19020b758230
[24] Hung, J. L., Shelton, B. E., Yang, J., & Du, X.: Improving predictive modeling for at-risk student identification: A multistage approach. IEEE Transactions on Learning Technologies, 12(2), 148-157 (2019).
[25] Baneres, D., Rodríguez-Gonzalez, M. E., & Serra, M.: An early feedback prediction system for learners at-risk within a first-year higher education course. IEEE Transactions on Learning Technologies, 12(2), 249-263 (2019).
[26] Al-Shabandar, R., Hussain, A. J., Liatsis, P., & Keight, R.: Detecting At-Risk Students With Early Interventions Using Machine Learning Techniques. IEEE Access, 7, 149464-149478 (2019).
[27] Tsai, S. C., Chen, C. H., Shiao, Y. T., Ciou, J. S., & Wu, T. N.: Precision education with statistical learning and deep learning: a case study in Taiwan. International Journal of Educational Technology in Higher Education, 17, 1-13 (2020).
[28] Alhassan, A., Zafar, B., & Mueen, A.: Predict Students Academic Performance based on their Assessment Grades and Online Activity Data. International Journal of Advanced Computer Science and Applications, 11(4) (2020).
[29] Alhakami, H., Alsubait, T., & Aliarallah, A.: Data Mining for Student Advising. International Journal of Advanced Computer Science and Applications, 11(3) (2020).
[30] Hoque, M. I., Kalam Azad, A., Tuhin, M. A. H., & Salehin, Z. U.: University Students Result Analysis and Prediction System by Decision Tree Algorithm. Advances in Science, Technology and Engineering Systems Journal, 5(3), 115-122 (2020).
[31] Qu, S., Li, K., Wu, B., Zhang, X., & Zhu, K.: Predicting Student Performance and Deficiency in Mastering Knowledge Points in MOOCs Using Multi-Task Learning. Entropy, 21(12), 1216 (2019).
[32] Aljohani, N. R., Fayoumi, A., & Hassan, S. U.: Predicting at-risk students using clickstream data in the virtual learning environment. Sustainability, 11(24), 7238 (2019).
[33] Ramaswami, G., Susnjak, T., Mathrani, A., Lim, J., & Garcia, P.: Using educational data mining techniques to increase the prediction accuracy of student academic performance. Information and Learning Sciences (2019).
[34] Buenaño-Fernández, D., Gil, D., & Luján-Mora, S.: Application of machine learning in predicting performance for computer engineering students: A case study. Sustainability, 11(10), 2833 (2019).
[35] Rodrigues, R. L., Ramos, J. L. C., Silva, J. C. S., Dourado, R. A., & Gomes, A. S.: Forecasting Students' Performance Through Self-Regulated Learning Behavioral Analysis. International Journal of Distance Education Technologies (IJDET), 17(3), 52-74 (2019).
[36] Adejo, O. W., & Connolly, T.: Predicting student academic performance using multi-model heterogeneous ensemble approach. Journal of Applied Research in Higher Education (2018).
[37] Asif, R., Merceron, A., Ali, S. A., & Haider, N. G.: Analyzing undergraduate students' performance using educational data mining. Computers & Education, 113, 177-194 (2017).
[38] Rubiano, S. M. M., & Garcia, J. A. D.: Analysis of data mining techniques for constructing a predictive model for academic performance. IEEE Latin America Transactions, 14(6), 2783-2788 (2016).
[39] Wakelam, E., Jefferies, A., Davey, N., & Sun, Y.: The potential for student performance prediction in small cohorts with minimal available attributes. British Journal of Educational Technology, 51(2), 347-370 (2020).
[40] Guerrero-Higueras, Á. M., Fernández Llamas, C., Sánchez González, L., Gutierrez Fernández, A., Esteban Costales, G., & González, M. Á. C.: Academic Success Assessment through Version Control Systems. Applied Sciences, 10(4), 1492 (2020).
[41] Al-Sudani, S., & Palaniappan, R.: Predicting students' final degree classification using an extended profile. Education and Information Technologies, 24(4), 2357-2369 (2019).
[42] Zhou, Q., Quan, W., Zhong, Y., Xiao, W., Mou, C., & Wang, Y.: Predicting high-risk students using Internet access logs. Knowledge and Information Systems, 55(2), 393-413.
[43] Injadat, M., Moubayed, A., Nassif, A. B., & Shami, A.: Systematic ensemble model selection approach for educational data mining. Knowledge-Based Systems, 105992 (2020).
[44] Ashraf, M., Zaman, M., & Ahmed, M.: An Intelligent Prediction System for Educational Data Mining Based on Ensemble and Filtering approaches. Procedia Computer Science, 167, 1471-1483 (2020).
[45] Huang, A. Y., Lu, O. H., Huang, J. C., Yin, C. J., & Yang, S. J.: Predicting students' academic performance by using educational big data and learning analytics: evaluation of classification methods and learning logs. Interactive Learning Environments, 28(2), 206-230 (2020).
[46] Francis, B. K., & Babu, S. S.: Predicting academic performance of students using a hybrid data mining approach. Journal of Medical Systems, 43(6), 162 (2019).
[47] Adekitan, A. I., & Noma-Osaghae, E.: Data mining approach to predicting the performance of first year student in a university using the admission requirements. Education and Information Technologies, 24(2), 1527-1543 (2019).
[48] Livieris, I. E., Tampakas, V., Karacapilidis, N., & Pintelas, P.: A semi-supervised self-trained two-level algorithm for forecasting students' graduation time. Intelligent Decision Technologies, 13(3), 367-378 (2019).
[49] Gershenfeld, S., Ward Hood, D., & Zhan, M.: The role of first-semester GPA in predicting graduation rates of underrepresented students. Journal of College Student Retention: Research, Theory & Practice, 17(4), 469-488 (2016).
[50] Strang, K. D.: Predicting student satisfaction and outcomes in online courses using learning activity indicators. International Journal of Web-Based Learning and Teaching Technologies (IJWLTT), 12(1), 32-50 (2017).
[51] Ellis, R. A., Han, F., & Pardo, A.: Improving learning analytics – combining observational and self-report data on student learning. Journal of Educational Technology & Society, 20(3), 158-169 (2017).
[52] Christensen, B. C., Bemman, B., Knoche, H., & Gade, R.: Pass or Fail? Prediction of Students' Exam Outcomes from Self-reported Measures and Study Activities. IxD&A, 39, 44-60 (2018).
[53] Yang, S. J., Lu, O. H., Huang, A. Y., Huang, J. C., Ogata, H., & Lin, A. J.: Predicting students' academic performance using multiple linear regression and principal component analysis. Journal of Information Processing, 26, 170-176 (2018).
[54] B. Raveendran Pillai, Gautham J.: Deep regressor: Cross subject academic performance prediction system for university level students. International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN: 2278-3075, 8(11S) (2019).
[55] Sothan, S. (2019). The determinants of academic performance: evidence from a Cambodian University. Studies in Higher Education, 44(11), 2096-2111.
[56] Moreno-Marcos, P. M., Pong, T. C., Muñoz-Merino, P. J., & Kloos, C. D.: Analysis of the factors influencing learners' performance prediction with learning analytics. IEEE Access, 8, 5264-5282 (2020).
[57] Zhang, X., Sun, G., Pan, Y., Sun, H., He, Y., & Tan, J.: Students performance modeling based on behavior pattern. Journal of Ambient Intelligence and Humanized Computing, 9(5), 1659-1670 (2018).
[58] Gitinabard, N., Xu, Y., Heckman, S., Barnes, T., & Lynch, C. F.: How widely can prediction models be generalized? Performance prediction in blended courses. IEEE Transactions on Learning Technologies, 12(2), 184-197 (2019).
[59] Moreno-Marcos, P. M., Pong, T. C., Muñoz-Merino, P. J., & Kloos, C. D.: Analysis of the factors influencing learners' performance prediction with learning analytics. IEEE Access, 8, 5264-5282 (2020).
[60] Dhankhar, A., Solanki, K., Rathee, A., & Ashish: Predicting Student's Performance by using Classification Methods. International Journal of Advanced Trends in Computer Science and Engineering, 8(4), 1532-1536 (2019).
[61] Evale, D.: Learning management system with prediction model and course-content recommendation module. Journal of Information Technology Education: Research, 16(1), 437-457 (2016).
[62] Tran, T. O., Dang, H. T., Dinh, V. T., & Phan, X. H.: Performance prediction for students: a multi-strategy approach. Cybernetics and Information Technologies, 17(2), 164-182 (2017).
[63] Seidel, E., & Kutieleh, S.: Using predictive analytics to target and improve first year student attrition. Australian Journal of Education, 61(2), 200-218 (2017).
[64] Kostopoulos, G., Kotsiantis, S., Pierrakeas, C., Koutsonikos, G., & Gravvanis, G. A.: Forecasting students' success in an open university. International Journal of Learning Technology, 13(1), 26-43 (2018).
[65] Jastini Mohd. Jamil, Nurul Farahin Mohd Pauzi, Izwan Nizal Mohd. Shahara Nee: An Analysis on Student Academic Performance by Using Decision Tree Models. The Journal of Social Sciences Research, ISSN(e): 2411-9458, ISSN(p): 2413-6670, Special Issue 6, 615-620 (2018).
[66] Bucos, M., & Drăgulescu, B.: Predicting student success using data generated in traditional educational environments. TEM Journal, 7(3), 617 (2018).
[67] Helal, S., Li, J., Liu, L., Ebrahimie, E., Dawson, S., Murray, D. J., & Long, Q.: Predicting academic performance by considering student heterogeneity. Knowledge-Based Systems, 161, 134-146 (2018).
[68] Yaacob, W. F. W., Nasir, S. A. M., Yaacob, W. F. W., & Sobri, N. M.: Supervised data mining approach for predicting student performance. Indonesian Journal of Electrical Engineering and Computer Science, 16, 1584-1592 (2019).
[69] Mimis, M., El Hajji, M., Es-Saady, Y., Guejdi, A. O., Douzi, H., & Mammass, D.: A framework for smart academic guidance using educational data mining. Education and Information Technologies, 24(2), 1379-1393 (2019).
[70] Huang, A. Y., Lu, O. H., Huang, J. C., Yin, C. J., & Yang, S. J.: Predicting students' academic performance by using educational big data and learning analytics: evaluation of classification methods and learning logs. Interactive Learning Environments, 28(2), 206-230 (2020).
[71] Gutiérrez, L., Flores, V., Keith, B., & Quelopana, A.: Using the Belbin method and models for predicting the academic performance of engineering students. Computer Applications in Engineering Education, 27(2), 500-509 (2019).
[72] Crivei, L. M., Ionescu, V. S., & Czibula, G.: An analysis of supervised learning methods for predicting students' performance in academic environments. ICIC Express Letters, 13, 181-190 (2019).
[73] Sadiq, H. M., & Ahmed, S. N.: Classifying and Predicting Students' Performance using Improved Decision Tree C4.5 in Higher Education Institutes. Journal of Computer Science, 15(9), 1291-1306.
[74] Jorda, E. R., & Raqueno, A. R.: Predictive Model for the Academic Performance of the Engineering Students Using CHAID and C 5.0 Algorithm. International Journal of Engineering Research and Technology, ISSN 0974-3154, 12(6), 917-928 (2019).
[75] Vora, D. R., & Rajamani, K.: A hybrid classification model for prediction of academic performance of students: a big data application. Evolutionary Intelligence, 1-14 (2019).
[76] Kokoç, M., & Altun, A.: Effects of learner interaction with learning dashboards on academic performance in an e-learning environment. Behaviour & Information Technology, 1-15 (2019).
[77] Ramanathan, L., Parthasarathy, G., Vijayakumar, K., Lakshmanan, L., & Ramani, S.: Cluster-based distributed architecture for prediction of student's performance in higher education. Cluster Computing, 22(1), 1329-1344 (2019).
[78] Adekitan, A. I., & Noma-Osaghae, E.: Data mining approach to predicting the performance of first year student in a university using the admission requirements. Education and Information Technologies, 24(2), 1527-1543 (2019).
[79] Pal, V. K., & Bhatt, V. K. K.: Performance Prediction for Post Graduate Students using Artificial Neural Network. International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN: 2278-3075 (2019).
[80] Guerrero-Higueras, Á. M., Fernández Llamas, C., Sánchez González, L., Gutierrez Fernández, A., Esteban Costales, G., & González, M. Á. C.: Academic Success Assessment through Version Control Systems. Applied Sciences, 10(4), 1492 (2020).
[81] Yang, T. Y., Brinton, C. G., Joe-Wong, C., & Chiang, M.: Behavior-based grade prediction for MOOCs via time series neural networks. IEEE Journal of Selected Topics in Signal Processing, 11(5), 716-728 (2017).
[82] Waheed, H., Hassan, S. U., Aljohani, N. R., Hardman, J., Alelyani, S., & Nawaz, R.: Predicting academic performance of students from VLE big data using deep learning models. Computers in Human Behavior, 104, 106189 (2020).
[83] Coussement, K., Phan, M., De Caigny, A., Benoit, D. F., & Raes, A.: Predicting student dropout in subscription-based online learning environments: The beneficial impact of the logit leaf model. Decision Support Systems, 113325 (2020).
[84] Wan, H., Liu, K., Yu, Q., & Gao, X.: Pedagogical Intervention Practices: Improving Learning Engagement Based on Early Prediction. IEEE Transactions on Learning Technologies, 12(2), 278-289 (2019).
[85] Xu, J., Moon, K. H., & Van Der Schaar, M.: A machine learning approach for tracking and predicting student performance in degree programs. IEEE Journal of Selected Topics in Signal Processing, 11(5), 742-753 (2017).
[86] Bhagavan, K. S., Thangakumar, J., & Subramanian, D. V.: Predictive analysis of student academic performance and employability chances using HLVQ algorithm. Journal of Ambient Intelligence and Humanized Computing, 1-9 (2020).
[87] Kamal, P., & Ahuja, S.: An ensemble-based model for prediction of academic performance of students in undergrad professional course. Journal of Engineering, Design and Technology (2019).
[88] Adekitan, A. I., & Noma-Osaghae, E.: Data mining approach to predicting the performance of first year student in a university using the admission requirements. Education and Information Technologies, 24(2), 1527-1543 (2019).
[89] Shanthini, A., Vinodhini, G., & Chandrasekaran, R. M.: Predicting Students' Academic Performance in the University Using Meta Decision Tree Classifiers. Journal of Computer Science, 14(5), 654-662 (2018).
[90] https://towardsdatascience.com/ensemble-methods-in-machine-learning-what-are-they-and-why-use-them-68ec3f9fef5f
[91] Dhankhar, A., & Solanki, K.: State of the art of learning analytics in higher education. International Journal of Emerging Trends in Engineering Research, 8(3), 868-877 (2020).