
Intelligent Methods and Models for Assessing Level of Student Adaptation to Online Learning

Vasyl Teslyuk

vasyl.m.teslyuk@lpnu.ua

Anastasiya Doroshenko

anastasia.doroshenko@gmail.com

Dmytro Savchuk

savchukd7@gmail.com

Lviv Polytechnic National University, S. Bandera str. 12, Lviv, 79013, Ukraine

Online learning is an increasingly pressing topic in our society. Due to COVID-19 and the war in Ukraine, educational institutions urgently need to move to online learning, and this paper aims to help them avoid mistakes during and after that transition. The paper's primary purpose is to investigate the effectiveness of machine learning tools for assessing student adaptation to online learning. These tools include intelligent methods and models, such as classification techniques and neural networks. This work uses data from an online survey of students at different levels: school, college, and university. The survey covers gender, age, level of education, whether the student lives in a city, class duration, quality of Internet connection, government or non-government educational institution, availability of a virtual learning environment, whether the student is familiar with IT, financial conditions, type of Internet connection, the device used for studying, etc. The following machine learning algorithms and models were used to obtain results on the effectiveness of online education: Random Forest (RF), Extra Trees (ET), Extreme, Light, and Simple Gradient Boosting (GB), Decision Trees (DT), K-Nearest Neighbors (KNN), Logistic Regression (LR), Support Vector Machine (SVM), Naïve Bayes (NB) classifier, and others. An intelligent neural network model (NNM) was also built to address the main issue.

Keywords: Prediction, classification, data, dataset, datastore, feature importance, Random Forest, SVM, Logistic Regression, Extra Trees, Gradient Boosting, Neural Network model

1. Introduction

Technological progress does not stand still. Each professional field is gradually developing, and online learning has not been spared either. The first appearance of COVID-19 in the world and in Ukraine significantly accelerated this direction. As a result, traditional education has undergone significant changes over the past few years. There was a transition from pen and paper to online forms and a keyboard. Now, instead of listening to the teacher directly in the classroom, people do it in Google Classroom.

Often, the transition to online learning is the only possible option. Before COVID-19, students were not sufficiently familiar with online education, and since its sudden appearance they have still been trying to adapt. The process and its results are not always good, although the majority succeeded, confirming the quality of online learning. The transition can depend on many factors, namely: the student's level of education, whether the student lives in an urban area, the length of online classes, the presence of a virtual learning environment (LMS), financial conditions, the type of Internet connection, and the device on which the student is studying. The dataset used in this work shows that only 8.3% of students adapt well to online learning, while the remaining 91.7% have a medium or low level of adaptation [1]. The main obstacles to online education during the COVID-19 pandemic, or other events that may challenge the very existence of the state and its society, are the lack of a permanent Internet connection, of digital skills, and of proper tools and instructions, together with electricity problems and poor network connectivity.

Therefore, this study was carried out to predict the level of student adaptation to the online education system.

Since the object of the research is data mining processes that solve classification problems, it is necessary to find out what classification is, how it is performed, and by what methods.

Usually, two forms of data analysis are used to obtain models that describe classes or predict future data trends:

• Classification;
• Prediction.

Researchers use classification and prediction to obtain models that predict the future data trend and how it will spread and change.

Thus, classification models predict categorical labels of classes, while predictive models predict continuous functions (numerical values). Sometimes, the number of labels can exceed two (as in our case), which means that we are dealing with non-binary (multi-class) classification. The main essence of classification is to determine the category or class label of a new value (observation).

Therefore, to start building a model, it is necessary to determine whether using the existing data set to create a classification model is appropriate. First, it is essential to establish whether there is enough data, i.e., its size (number of observations), and then whether the description of these observations is sufficient (number of independent characteristics).

With too few observations and, in turn, too few independent characteristics, obtaining good model accuracy and quality can be challenging.

The next step is to choose the classification model itself. The following types were used in this work: RF, ET, XGBoost, LightGBM, GB, DT, K-Nearest Neighbors (KNN), LR, SVM, NB, and NNM. Finally, the classification results were compared. This was done by drawing the following types of charts: ROC curve, scatter matrix, distribution of features, their importance, and others, which in turn help to present the results more clearly.

2. Data Preparation

2.1. Initial Dataset

The first step is to look at the original data set. We need to look at how many observations it contains, what characteristics it has, and what its target variable is.

In our case, the dataset was collected from a survey of students. It consisted of various questions, such as gender, age, level of education, whether they live in the city, duration of the lecture, quality of the Internet connection, type of educational institution, availability of LMS, whether the student is familiar with IT, financial conditions, type of Internet or device they use, and their level of adaptation. The total number of students surveyed is 1205, and the number of observations in the dataset also equals this number [1].

A detailed description of the target class labels is given in Table 1.

Now let's look at the independent characteristics of our dataset in more detail (Table 2).

The next step is to create graphs that show how variables are distributed in the dataset (Fig. 1 and Fig. 2). This can be achieved by building a histogram that shows, for each independent variable, how many observations take each of its unique values.
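The counting behind such histograms can be sketched with the standard library alone. The records, column names, and values below are hypothetical stand-ins rather than rows from the dataset in [1], and `value_counts` and `print_histogram` are illustrative helper names, not functions used by the authors.

```python
from collections import Counter

# Hypothetical survey records; the column names and values are illustrative
# stand-ins, not rows taken from the dataset in [1].
records = [
    {"Gender": "Boy", "Age": "21-25", "Device": "Mobile"},
    {"Gender": "Girl", "Age": "21-25", "Device": "Computer"},
    {"Gender": "Girl", "Age": "16-20", "Device": "Mobile"},
    {"Gender": "Boy", "Age": "11-15", "Device": "Mobile"},
]


def value_counts(rows):
    """Count how often each unique value occurs, separately for every feature."""
    counts = {}
    for row in rows:
        for feature, value in row.items():
            counts.setdefault(feature, Counter())[value] += 1
    return counts


def print_histogram(counts):
    """Render one text bar per unique value, grouped by feature."""
    for feature, counter in counts.items():
        print(feature)
        for value, n in counter.most_common():
            print(f"  {value:<10} {'#' * n} ({n})")


counts = value_counts(records)
print_histogram(counts)
```

In practice a plotting library would draw the bars, but the per-feature counts are the same quantity the histograms display.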

Next, it is advisable to plot the distribution of characteristics depending on the target class. Fig. 3 and Fig. 4 show the distribution of characteristic values depending on the level of students' adaptation (low, medium, high). These dependencies are built for age, gender, level of education, type of institution, IT studies, student's location, load shedding, economic conditions, type of Internet and network, length of classes, availability of LMS, and the type of device the student is studying on. The following conclusions can be drawn from these diagrams (Fig. 3 and Fig. 4):

• The level of students' adaptation does not depend on their gender, as the distribution is almost equal for women and men.
• Students aged 21-25, 11-15, and 1-5 are the best adapted, and the worst are those aged 26-30, 16-20, and 6-10.
• Students of universities and schools are better adapted to online education, especially when the institution is non-governmental.
• Studying IT does not increase the effectiveness of online learning.
• Living in the city improves the quality of online learning.
• High load shedding negatively affects students' adaptation.
• Wealthier students generally adapt better than other students.
• The type of Internet connection and network have a minor impact on student adaptation.
• The best performers are students with 1 to 3 hours of daily classes.
• Other characteristics have almost no impact on the quality of online education.

2.3. Target Class Research

2.4. Correlation Matrix

It is also necessary to build a correlation matrix to understand how the characteristics depend on each other. The result of building this type of diagram is shown below in Fig. 6.

From this chart, we can see that the most strongly correlated characteristics are the age and financial conditions of the student. For the other pairs, the relationship is relatively weak.
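How such a matrix can be computed for categorical survey answers is sketched below. The sketch assumes that each category is first mapped to an integer code (an ordinal encoding, which the paper does not specify) and that the Pearson coefficient is used. The table contents and the helper names `encode`, `pearson`, and `correlation_matrix` are hypothetical.

```python
from math import sqrt


def encode(column, order=None):
    """Map categories to integer codes; `order` fixes the ranking if given."""
    categories = order if order is not None else sorted(set(column))
    codes = {value: i for i, value in enumerate(categories)}
    return [codes[v] for v in column]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation is undefined, report 0
    return cov / sqrt(var_x * var_y)


def correlation_matrix(table, orders=None):
    """Return a nested dict with the pairwise correlation of all columns."""
    orders = orders or {}
    encoded = {name: encode(col, orders.get(name)) for name, col in table.items()}
    return {a: {b: pearson(encoded[a], encoded[b]) for b in encoded} for a in encoded}


# Hypothetical toy table (assumption: not data from [1]).
table = {
    "Age": ["11-15", "16-20", "21-25", "21-25", "16-20", "11-15"],
    "Financial Condition": ["Poor", "Mid", "Rich", "Rich", "Mid", "Poor"],
    "Gender": ["Boy", "Girl", "Boy", "Girl", "Boy", "Girl"],
}
orders = {"Financial Condition": ["Poor", "Mid", "Rich"]}
matrix = correlation_matrix(table, orders)
```

In this toy table the age and financial columns move together perfectly, while gender is uncorrelated with both. The choice of encoding affects the coefficients, so the ordering of categories should be stated whenever such a matrix is reported.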

This may indicate that no further operations on the dataset's characteristics need to be performed. Such operations may include principal component analysis (PCA) and different types of normalization or regularization (L1 or L2).

3. Model Creation and Classification

3.1. Model Types

The following machine learning algorithms and models were used to obtain results on the effectiveness of online education implementation [3]: Decision Tree, Random Forest, Extra Trees, Extreme, Light, and standard Gradient Boosting, K-Nearest Neighbors, Logistic Regression, AdaBoost, Support Vector Machine, Naïve Bayes classifier, Ridge, Dummy, Linear and Quadratic Discriminant Analysis, and finally a Sequential Neural Network. More details are given in Table 3.

Before starting to build models, it is necessary to divide the data set into two parts: data for modeling and data that can be used for prediction. As mentioned before, the total number of students surveyed is 1205, so the final split is nine to one, which equals 1084 observations for modeling and 121 for prediction, as shown below in Fig. 7.
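The arithmetic of the nine-to-one split can be checked with a short sketch. It assumes that the held-out part is rounded up, which reproduces the 1084/121 figures. The helper `holdout_split`, the seed, and the use of integer placeholders instead of real survey rows are illustrative assumptions.

```python
import math
import random


def holdout_split(items, holdout_fraction, seed=0):
    """Shuffle a copy of `items` and set aside a fraction of it.

    The held-out size is rounded up, so 10% of 1205 gives 121.
    Returns (remaining, held_out).
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_holdout = math.ceil(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


# Integers stand in for the 1205 survey observations.
observations = range(1205)
modeling, prediction = holdout_split(observations, 0.1)
print(len(modeling), len(prediction))  # 1084 121
```

The same helper could be reused on the modeling portion with a different fraction whenever a further train/test split is needed.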

The next step is to split the modeling data into training and test data. This is done separately for the classification models and for the neural network model.

Thus, 70% of all observations were used for training data and only 30% for test data.

3.3. Model with Classifiers

Fig. 8 shows a typical model creation and training process using standard classifiers. As we can see, the classifier was run ten times, and the average accuracy over these runs is 87%.
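The paper does not state how the ten runs were arranged. One common arrangement is 10-fold cross-validation, and the sketch below illustrates it under that assumption. The classifier here is a deliberately trivial majority-class rule, similar in spirit to the Dummy baseline listed among the models, and the labels are made up. The names `k_fold_indices`, `majority_class_fit`, and `cross_val_accuracy` are hypothetical.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def majority_class_fit(train_labels):
    """Toy 'classifier': always predicts the most frequent training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: most_common


def cross_val_accuracy(features, labels, k=10):
    """Return the accuracy obtained on each of the k held-out folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        predict = majority_class_fit([labels[i] for i in train_idx])
        hits = sum(predict(features[i]) == labels[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return scores


# Made-up labels: 12 "Moderate", 4 "Low", 4 "High" (assumption, not real data).
labels = ["Moderate", "Moderate", "Low", "Moderate", "High"] * 4
features = list(range(len(labels)))  # placeholders, ignored by the toy rule
scores = cross_val_accuracy(features, labels, k=10)
mean_accuracy = sum(scores) / len(scores)
```

Replacing `majority_class_fit` with a real training routine yields the per-run accuracies whose mean would be reported.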

The neural network model used in this paper is shown below in Fig. 10.

From Fig. 10, we can see that the neural network is sequential and has four layers: one input layer with thirteen neurons (one for each independent feature), two hidden layers with one hundred neurons each, and one output layer with three neurons, each of which corresponds to one label of the target class. Training runs for 50 epochs.
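As a quick check on the size of such a network, the number of trainable parameters can be computed from the layer widths. The sketch assumes fully connected layers with one bias per neuron, which the text above does not state explicitly. `dense_parameter_count` is an illustrative helper.

```python
def dense_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected feed-forward network."""
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output neuron
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Layer widths as described in the text: 13 inputs, two hidden layers, 3 outputs.
layer_sizes = [13, 100, 100, 3]
total = dense_parameter_count(layer_sizes)
print(total)  # 1400 + 10100 + 303 = 11803
```

Under these assumptions the model has 11,803 parameters, which is small relative to typical deep networks and in line with a dataset of roughly a thousand observations.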

The classification results using a neural network are shown below.

4. Model Comparison and Result Analysis

From Fig. 12, we can see that the best classification models were Random Forest with 87.85% accuracy, Extra Trees with 87.72% accuracy, and Extreme Gradient Boosting with 87.59% accuracy. The worst models for the given task were Naïve Bayes with 26.91% accuracy, Quadratic Discriminant Analysis with 53% accuracy, and the Dummy classifier with 55.67% accuracy.

Other classification quality metrics gave the same assessment: the area under the ROC curve (AUC), Recall, Precision, F1 score, Cohen's kappa, and the Matthews correlation coefficient (MCC).
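For reference, accuracy, Cohen's kappa, and the multi-class MCC can be computed directly from the true and predicted labels. The sketch below uses the standard definitions of these metrics. The label lists are made up for illustration and are not results from the paper.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e).

    Not defined when chance agreement p_e equals 1 (a single class everywhere).
    """
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)


def matthews_corrcoef(y_true, y_pred):
    """Multi-class Matthews correlation coefficient."""
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    cross = sum(true_counts[c] * pred_counts[c] for c in true_counts)
    denom = sqrt(
        (n * n - sum(v * v for v in pred_counts.values()))
        * (n * n - sum(v * v for v in true_counts.values()))
    )
    return (correct * n - cross) / denom if denom else 0.0


# Made-up labels for illustration only.
y_true = ["Low", "Low", "Moderate", "Moderate", "Moderate", "High"]
y_pred = ["Low", "Moderate", "Moderate", "Moderate", "Low", "High"]
acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
mcc = matthews_corrcoef(y_true, y_pred)
```

Here four of six predictions are correct, yet kappa and MCC are both 5/11 (about 0.45), because they discount the agreement expected by chance. This is why they are useful next to accuracy on a dataset with unbalanced classes.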

Overall, the best model is the self-built neural network, which provided a classification accuracy of 91% (Fig. 11).

Permutation Feature Importance of Models

In this paper, we investigated two types of feature importance: for a classification model and a neural network.

Feature Permutation Importance is a tool that helps explain machine learning models and describe their behavior. The resulting diagram summarizes and evaluates the importance of features [21]. This is done by studying their impact on the classification quality of the created model.

The following can be seen from the bar graph above (Fig. 13).

1. The most important characteristics are the first seven:
• Length of daytime study up to one hour;
• Gender: Male;
• Load-shedding: Low;
• Type of institution: Government;
• Location: not in the city;
• Presence of LMS;
• Belonging to IT: yes.

2. Characteristics of lesser importance include:
• Length of daytime study up to one hour;
• Type of Internet connection: WiFi;
• Length of daytime study from 1 to 6;
• Low financial conditions of the student;
• Phone as a device for learning;
• Age from 21 to 25.
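The idea behind permutation importance can be sketched in a few lines: shuffle one feature column, measure how much accuracy drops, and repeat to obtain a mean and a standard deviation. The toy model, rows, feature names, and the helper `permutation_importance` below are hypothetical. This is not the tooling from [21], only an illustration of the principle.

```python
import random
from statistics import mean, pstdev


def accuracy(model, rows, labels):
    """Share of rows for which the model returns the true label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(model, rows, labels, n_repeats=10, seed=0):
    """Return {feature: (mean accuracy drop, std of the drop)}."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    result = {}
    for feature in rows[0]:
        drops = []
        for _ in range(n_repeats):
            column = [r[feature] for r in rows]
            rng.shuffle(column)  # break the link between feature and label
            permuted = [{**r, feature: v} for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, permuted, labels))
        result[feature] = (mean(drops), pstdev(drops))
    return result


def toy_model(row):
    """Hypothetical rule that looks only at the LMS feature."""
    return "High" if row["lms"] == "Yes" else "Low"


# Made-up rows; labels follow the toy rule, so baseline accuracy is 1.0.
rows = [
    {"lms": lms, "gender": gender}
    for lms in ("Yes", "No")
    for gender in ("Boy", "Girl", "Boy", "Girl")
]
labels = [toy_model(r) for r in rows]
importance = permutation_importance(toy_model, rows, labels)
```

A feature the model ignores (`gender` here) gets an importance of exactly zero, while shuffling a feature the model relies on lowers accuracy. Ordering features by the mean drop gives the kind of ranking shown in the bar graph.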

The following histogram (Fig. 14) is slightly different from the one in Fig. 13 because it focuses on showing the importance of the features for each epoch of neural network training. These values allow us to calculate the average feature importance values (Y) and standard deviation (X).

Here, the X-axis represents the influence of a variable on the quality of the classification model, and the Y-axis lists the characteristics ordered by importance. For example, the most important are financial conditions, gender, type of Internet, LMS, and length of classes, and the least important are load-shedding, living in the city, and the device used by the student.

5. Conclusions

The paper investigates the effectiveness of machine learning tools that can solve the problem of assessing the level of student adaptation to online learning. These tools include intelligent methods and models, including classification methods and neural networks. The best performers were the Random Forest and the Sequential Neural Network, with accuracies of 88% and 91%, respectively.

The work can help to find the most effective intelligent model or method for assessing the level of adaptation of students to online education. The main practical value of the work is that the developed system makes it possible to assess this level accurately from a few simple questions. This, in turn, will help stakeholders who implement and apply online education to know how effective it will be.

The main direction for future work is to collect a more extensive dataset, which requires a survey of more students. This will help in creating and training the models and should allow the system to provide much better results. In addition, the system is flexible, which allows more classification models to be added if necessary.

Acknowledgements

This work was realized within the framework of the program Erasmus+ Jean Monnet Module «Augmented Reality for Education: implementation of European experience» (101085772 - AR4EDU - ERASMUS-JMO-2022-HEI-TCH-RSCH).

References

[1] Hasan Suzan, N. A. Samrin, A. A. Biswas and Pramanik, "Students' Adaptability Level Prediction in Online Education using Machine Learning Approaches," 2021 12th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kharagpur, India, 2021, pp. 1-7, doi: 10.1109/ICCCNT51525.2021.9579741.

[2] What is a correlation matrix?. Educative: Interactive Courses for Software Developers. URL: https://www.educative.io/answers/what-is-a-correlation-matrix (date of access: 02.02.2023).

[3] Savchuk and Doroshenko, "Investigation of machine learning classification methods effectiveness," 2021 IEEE 16th International Conference on Computer Sciences and Information Technologies (CSIT), Lviv, Ukraine, 2021, pp. 33-37, doi: 10.1109/CSIT52700.2021.9648582.

[4] What is a Decision Tree | IBM. IBM - Deutschland | IBM. URL: https://www.ibm.com/topics/decision-trees (date of access: 02.02.2023).

[5] Introduction to Random Forest in Machine Learning. Engineering Education (EngEd) Program | Section. URL: https://www.section.io/engineering-education/introduction-to-random-forest-in-machine-learning/ (date of access: 02.02.2023).

[6] How to Develop an Extra Trees Ensemble with Python - MachineLearningMastery.com. MachineLearningMastery.com. URL: https://machinelearningmastery.com/extra-trees-ensemble-with-python/ (date of access: 02.02.2023).

[7] Gradient Boosting. WallStreetMojo. URL: https://www.wallstreetmojo.com/gradient-boosting/.

[8] Christopher A. K-Nearest Neighbor. Medium. URL: https://medium.com/swlh/k-nearest-neighbor-ca2593d7a3c4 (date of access: 02.02.2023).

[9] Thorn J. Logistic Regression Explained. Medium. URL: https://towardsdatascience.com/logistic-regression-explained-9ee73cede081 (date of access: 02.02.2023).

[10] Yazhi Gao, Rong, Shen and Xiong, "Convolutional Neural Network based sentiment analysis using Adaboost combination," 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 2016, pp. 1333-1338, doi: 10.1109/IJCNN.2016.7727352.

[11] José Luis Rojo-Álvarez; Manel Martínez-Ramón; Jordi Muñoz-Marí; Gustau Camps-Valls, "Support Vector Machine and Kernel Classification Algorithms," in Digital Signal Processing with Kernel Methods, IEEE, 2018, pp. 433-502, doi: 10.1002/9781118705810.ch10.

[12] Yıldırım S. Naive Bayes Classifier – Explained. Medium. URL: https://towardsdatascience.com/naive-bayes-classifier-explained-50f9723571ed (date of access: 02.02.2023).

[13] H. Luo and Y. Liu, "A prediction method based on improved ridge regression," 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 2017, pp. 596-599, doi: 10.1109/ICSESS.2017.8342986.

[14] Tezcan B. Why Using a Dummy Classifier is a Smart Move. Medium. URL: https://towardsdatascience.com/why-using-a-dummy-classifier-is-a-smart-move-4a55080e3549 (date of access: 02.02.2023).

[15] J. Ghosh and S. B. Shuvo, "Improving Classification Model's Performance Using Linear Discriminant Analysis on Linear Data," 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kanpur, India, 2019, pp. 1-5, doi: 10.1109/ICCCNT45670.2019.8944632.

[16] E. Pȩkalska and B. Haasdonk, "Kernel Discriminant Analysis for Positive Definite and Indefinite Kernels," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 6, pp. 1017-1032, June 2009, doi: 10.1109/TPAMI.2008.290.

[17] A. Doroshenko and R. Tkachenko, "Classification of Imbalanced Classes Using the Committee of Neural Networks," 2018 IEEE 13th International Scientific and Technical Conference on Computer Sciences and Information Technologies (CSIT), Lviv, Ukraine, 2018, pp. 400-403, doi: 10.1109/STC-CSIT.2018.8526611.

[18] Performance Measures: Cohen's Kappa statistic. The Data Scientist. URL: https://thedatascientist.com/performance-measures-cohens-kappa-statistic/ (date of access: 02.02.2023).

[19] sklearn.metrics.matthews_corrcoef. scikit-learn. URL: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.matthews_corrcoef.html (date of access: 02.02.2023).

[20] What is a Correlation Matrix? Displayr. URL: https://www.displayr.com/what-is-a-correlation-matrix/ (date of access: 02.02.2023).

[21] Feature Permutation Importance Explanations – ADS 1.0.0 documentation. Moved. URL: https://docs.oracle.com/en-us/iaas/tools/ads-sdk/latest/user_guide/mlx/permutation_importance.html (date of access: 02.02.2023).