=Paper=
{{Paper
|id=Vol-3037/paper4
|storemode=property
|title=Predictive Model for Assigning Exercises to Students in Spreadsheet Functions Using Artificial Neural Networks
|pdfUrl=https://ceur-ws.org/Vol-3037/paper4.pdf
|volume=Vol-3037
|authors=Edwar Saire-Peralta
}}
==Predictive Model for Assigning Exercises to Students in Spreadsheet Functions Using Artificial Neural Networks==
Edwar Saire-Peralta

Universidad Nacional de San Agustín de Arequipa, Av. Independencia s/n, Arequipa, Perú

===Abstract===
This article develops a model that predicts which exercises a student can solve, and which the student cannot solve, in a basic-level Microsoft® Excel course on the topic of functions. The model is built with artificial neural networks. It is fed with data such as sex, age, academic degree, parents' level of education, type of school, and the grades the student obtains on previous topics while advancing through the course. The research approach is quantitative, experimental and applied, and the population consisted of 85 students. The results show that the model reaches a prediction accuracy of 72% in the assignment of exercises to students. The exercises that a student could not solve were provided with an aid for their resolution.

===Keywords===
Artificial neural networks, Supervised learning, Data mining, Cross validation

===1. Introduction===
The teaching-learning process is integral. [1] point out that if the conditions of students always differ, such as the rhythms, ways of learning and starting points of each student, then what is learned and what is evaluated cannot be standardized but must be differentiated according to each individual's characteristics. [2] indicates that students process information according to their capacity, motivation, environment and the guidance provided by the teacher. Learning rhythms are linked to academic performance, which is determined by personal, family, social and educational factors [3, 4].
A learning session in the classroom consists of several moments. One of them is practice, which mostly aims to have students solve exercises on the topic developed. It has been observed that many students have doubts and certain fears when facing new learning topics; these students find it difficult to adapt to the pace of progress and understanding set by the majority of students and even by the teacher. This reality can be measured through the results of the evaluations. [5] propose that the teacher should work at a safe level of demand that does not cause discouragement and low grades. [6] indicate that it is a mistake to apply the same contents, rhythms and evaluation to all students, because this can cause frustration and affects relationships with other students. The situation described is very common in classrooms, and many studies have used predictions to find the most suitable ways to get to know students from certain data about them and to give them help.

CISETC 2021: International Congress on Educational and Technology in Sciences, November 16-18, 2021, Chiclayo, Peru. EMAIL: esaire@unsa.edu.pe. ORCID: https://orcid.org/0000-0002-9526-0205. © 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

===2. Related work===
Related and prior work is characterized by the use of one or more classification algorithms. Some studies use as input the traces or interactions that students leave on virtual platforms; others use data collected through instruments designed for the purpose. The literature review shows an opportunity to make predictions with data that arise from the teaching-learning process itself. Table 1 summarizes the works related to this research.
Table 1: Summary of previous works

{| class="wikitable"
! Authors !! Title !! Contribution !! Opportunities for improvement
|-
| [7]
| ICT for education: adaptive system based on automatic learning mechanisms for the appropriation of technologies in high school students.
| A system was built that enables initial recommendations of educational content appropriate to the individual characteristics of students, administered according to their performance and the characteristics of the territory.
| The system was fed with data collected from virtual courses, from both students and educators, and used data analytics and automatic learning to make predictions and initial recommendations. It is limited for classroom courses, since the necessary data are not available there.
|-
| [8]
| Model to predict academic performance based on neural networks and learning analytics
| A predictive model of academic performance was proposed using data provided by a virtual interaction system with students. Using learning analytics through artificial neural networks, patterns were found that were determinant in the academic performance of students.
| The data collected and used to make the predictions come from the virtual system of courses they have; however, there is still an opportunity for improvement if personal, social and other data of interest to the research are included.
|-
| [9]
| Predicting academic performance by applying data mining techniques
| Shows a range of predictions to classify (pass, fail) prospective students enrolled in a course. Data mining techniques were used and results were compared using logistic regression, decision trees, neural networks and Bayesian networks. A prediction effectiveness of 70% was achieved.
| To obtain the classification, the data of the students enrolled in the General Statistics course at UNALM were used; however, the predictor variables were selected from the data already available, without taking into account that, according to research, there are other very influential variables in academic performance.
|-
| [10]
| Development of a computerized evaluation system using neural networks through R and Shiny
| Through the use of artificial neural networks, an environment of attention to the needs of each student was created, with different levels of difficulty for the exercises in their evaluation. This reduces the feeling of dissatisfaction and in many cases avoids the abandonment of the courses.
| Data generated by the same project were used. The results obtained for students in the exercises were correct; however, there is still room to analyze other determining factors and to take into account the levels reached by students in the previous topic, since this process is changing.
|}

===3. Problem===
Students in educational centers are unique and singular, and they belong to heterogeneous groups. In each learning session the teacher tries to improve his or her work with the students, especially in the practical part, where the teacher usually sets a group of exercises during the class that the whole group must solve in a given time. [11] indicate that a school model in which teachers teach the same contents, with the same level of complexity and at the same speed, does not attend to the differential needs of the students. It has been observed during classes that students, when solving the battery of exercises, need support, tutoring, and help with some formulas, their application and their syntax. The teacher regularly faces two situations: first, when a student asks for help or tutoring, time is always pressing; second, many students need help but do not ask for it. Diversity refers to heterogeneous groups of students in the classroom.
Students are unique and different in their learning styles, ways of thinking and speed of learning, within their limitations [2].

===4. Proposal===
To address the stated problem, a model has been implemented that predicts the assignment of exercises to students on an individualized basis and provides textual help for the exercises that the student cannot solve. The predictive model is based on artificial neural networks. An artificial neural network is a set of interconnected elements whose processing capacity is stored in weights, which are obtained by adapting to and learning from a set of patterns [12]. The proposed model takes as input the personal and social factors and the academic performance grades of the students, which allows the students to be classified with data mining. Data mining yields models that discover patterns and trends in student information [13]. The outline of the proposal is shown in Figure 1.

Figure 1: Structure of the model. Data source (inputs): questionnaire data, evaluation data. Construction of the model: classification algorithms, artificial neural networks. Predictive model (output): exercises without help (can solve), exercises with help (cannot solve).

===4.1. Population===
The students who participated in the research took the basic-level Microsoft® Excel course. The topics developed in the course include mathematical operators, mathematical functions and statistical functions, among others. There is no admission filter; anyone can enroll, so the groups are heterogeneous. In total, the population consisted of 85 students, which is also the sample.

===4.2. Questionnaire data===
An instrument based on the survey technique was constructed.
To prepare the questionnaire, the literature on the factors that influence students' handling of function topics was reviewed, and other factors contextualized to the problem being addressed were added. Academic achievement is multicausal; according to [14], its determinants can be grouped into social, personal and institutional. Table 2 shows the factors taken into account for the questionnaire.

Table 2: Classification of attributes. Column headings: Individual, Academic, Socioeconomic, Institutional. Attributes: Person with numerical skills; Age; Sex; Marital status; Academic degree; You were taught computer skills in school; Type of school; Experience with Excel; Hours of study during the day; Current occupation; How many hours a day do you work; Father's academic level; Mother's academic level; Reason for studying Excel.

The instrument was validated by a psychologist in Education and a professional in Educational Sciences. Its reliability was calculated with Cronbach's alpha; the result was 0.733, which represents good reliability. With validity and reliability established, the questionnaire was applied to the 85 students who took the course.

===4.3. Evaluation data===
The evaluation data come from 58 students. Evaluation grades are an indicator of students' academic performance. As stated by [15], academic performance is the level of knowledge a student has, reflected in a numerical value that measures the result of the teaching and learning process in which the student is the main actor. Exercises were prepared for each of the 8 topics of the course. For each topic, 10 types of exercises were designed, giving a total of 80 types of exercises.

===5. Application of the Methodology===
To develop the proposal, the KDD (Knowledge Discovery in Databases) data mining process was followed, as described in [16].
The KDD process is a non-trivial process for obtaining information that is hidden in the data, previously unknown and potentially very useful for users or companies [17].

===5.1. Integration and Collection Phase===
The data sources were merged. The first was obtained by applying the questionnaire, and the second was formed by the evaluation data, which record for each topic and type of exercise whether the student could solve the exercise. Figure 2 shows the recorded evaluations. A value of 1 indicates that the student was able to solve a type of exercise, and a value of 0 indicates that the student was not. As mentioned, 10 types of exercises were designed for each topic, labeled A to J.

Figure 2: Recorded evaluations

===5.2. Selection, cleaning and transformation phase===
The selection phase used all the questionnaire data and all the records collected. The cleaning phase addressed the records of students who did not have assessment scores. In the transformation phase, all the questionnaire data were replaced by numerical codes so that they could be processed. The only calculated field was the grade, with a value from 0 to 20. The two data sources were integrated and linked. Finally, the data were normalized to improve processing quality, since some values were extreme.

===5.3. Data mining phase===
The data mining technique applied in this project is classification. The collected data were split: 92% were assigned to training and 8% to model validation. An artificial neural network trained with the supervised backpropagation algorithm was used. This algorithm repeatedly adjusts the synaptic weights of the network in order to minimize the error between the expected and observed results [18].
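The normalization, 92%/8% split and backpropagation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the study: the data are synthetic, the labeling rule is invented so that there is something to learn, and the names `min_max_normalize` and `TinyMLP`, the sigmoid activation, the squared-error loss, the learning rate and the number of epochs are all hypothetical choices. The layer sizes (15 inputs, 20 hidden neurons, 10 outputs) follow the topology reported for the first topic.

```python
import math
import random


def min_max_normalize(rows):
    """Scale every column of a list of numeric rows to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """One-hidden-layer network trained with plain backpropagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each weight row carries one extra entry that acts as the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = hidden + [1.0]
        out = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        """Adjust the weights for one example and return its squared error."""
        hidden, out = self.forward(x)
        xb = list(x) + [1.0]
        hb = hidden + [1.0]
        # Output deltas: squared-error gradient passed through the sigmoid.
        d_out = [(o - t) * o * (1.0 - o) for o, t in zip(out, target)]
        # Hidden deltas: output deltas propagated back through w2.
        d_hid = [
            h * (1.0 - h) * sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
            for j, h in enumerate(hidden)
        ]
        for k, row in enumerate(self.w2):
            for j in range(len(row)):
                row[j] -= lr * d_out[k] * hb[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] -= lr * d_hid[j] * xb[i]
        return sum((o - t) ** 2 for o, t in zip(out, target)) / 2.0

    def predict(self, x):
        """Threshold each output at 0.5: 1 = can solve, 0 = cannot solve."""
        return [1 if o >= 0.5 else 0 for o in self.forward(x)[1]]


rng = random.Random(42)
# Synthetic stand-in data: 58 students, 15 coded attributes, 10 exercise types.
raw = [[rng.randint(0, 20) for _ in range(15)] for _ in range(58)]
# Invented labeling rule, used only so that the network has a pattern to learn.
labels = [[1 if row[k] >= 10 else 0 for k in range(10)] for row in raw]

features = min_max_normalize(raw)
cut = round(len(features) * 0.92)  # 92% for training, the rest held out
train_x, val_x = features[:cut], features[cut:]
train_y, val_y = labels[:cut], labels[cut:]

net = TinyMLP(n_in=15, n_hidden=20, n_out=10)
first_error = sum(net.train_step(x, y) for x, y in zip(train_x, train_y))
for _ in range(30):
    last_error = sum(net.train_step(x, y) for x, y in zip(train_x, train_y))

print(len(train_x), len(val_x))
print(net.predict(val_x[0]))
```

With 58 records, the 92% cut leaves 5 students for validation, which matches the size of the validation set reported later in the paper.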
To obtain the predictive model, the free artificial neural network tool Simbrain was used. Figure 3 shows the network topology for the first topic.

Figure 3: Topology of the first topic

Layer 1 is the input layer (the 15 attributes of the questionnaire), layer 2 is the hidden layer with 20 neurons, and layer 3 is the output layer with 10 outputs. To train the model for the second topic (mathematical functions), the topology must have 16 inputs: the 15 questionnaire attributes plus the grade obtained for the previous (first) topic. The number of inputs grows in the same way for the remaining topics. Several studies, such as [19, 20, 21], have found evidence that previous academic performance can condition future results. Table 3 shows the topology for the first 5 topics.

Table 3: Final topology of the proposed model

{| class="wikitable"
! Topic !! Input layer !! Hidden layer !! Output layer
|-
| Topic 01 || 15 || 20 || 10
|-
| Topic 02 || 16 || 30 || 10
|-
| Topic 03 || 17 || 20 || 10
|-
| Topic 04 || 18 || 20 || 10
|-
| Topic 05 || 19 || 20 || 10
|}

===5.4. Evaluation and Interpretation===
Cross-validation was performed with the 8% of the records that were initially set aside. The results for the first topic are shown in Table 4. Recall that the value 1 means that the student can solve the exercise and the value 0 means that the student cannot. The prediction on the set of 5 students agreed with the expected results in 72% of cases.

Table 4: Model prediction for students

{| class="wikitable"
! Student !! Model result !! Expected result !! Agreement
|-
| 1 || 1111111111 || 1111011100 || 70%
|-
| 2 || 1111111111 || 1111011101 || 80%
|-
| 3 || 1110000100 || 1010000111 || 70%
|-
| 4 || 1011011110 || 1111111101 || 60%
|-
| 5 || 1111011110 || 1011011111 || 80%
|}

===5.5. Dissemination and Use===
At the end of the evaluation, teachers and students were satisfied with the results, since an accuracy of 72% was achieved.
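The 72% figure can be reproduced from Table 4 by comparing, position by position, the model result with the expected result for each student and averaging the five percentages. The short Python sketch below does exactly that; the function name `agreement` is a hypothetical helper, and the bit strings are copied from Table 4.

```python
def agreement(predicted: str, expected: str) -> float:
    """Share of exercise types where prediction and observation coincide."""
    if len(predicted) != len(expected):
        raise ValueError("bit strings must have the same length")
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


# (model result, expected result) for the five held-out students in Table 4
table_4 = [
    ("1111111111", "1111011100"),
    ("1111111111", "1111011101"),
    ("1110000100", "1010000111"),
    ("1011011110", "1111111101"),
    ("1111011110", "1011011111"),
]

per_student = [agreement(p, e) for p, e in table_4]
overall = sum(per_student) / len(per_student)

print([f"{a:.0%}" for a in per_student])  # ['70%', '80%', '70%', '60%', '80%']
print(f"{overall:.0%}")                   # 72%
```

Each string has one position per type of exercise (A to J), so a 70% agreement means the model and the observed result coincide on 7 of the 10 exercise types for that student.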
The strength of the model is that it identifies the types of exercises where students have difficulty, and this is where the student is supported.

===6. Application and testing===
The effectiveness of the predictive model was tested with experimental groups (groups of students new to the course), to which the model was applied for the first two topics. In addition, control groups were selected (also groups of students new to the course), in which the first two topics were developed with the traditional model. A total of 3 experimental groups and 3 control groups were used for testing. Table 5 shows the averages obtained by the experimental and control groups; the average rose from 13.4 in the control groups to 17.2 in the experimental groups.

Table 5: Results of the groups

{| class="wikitable"
! Group !! Topic !! Experimental !! Control
|-
| Group 01 || Topic 01 || 16.3 || 13.8
|-
| Group 01 || Topic 02 || 17.4 || 12.0
|-
| Group 02 || Topic 01 || 18.3 || 15.4
|-
| Group 02 || Topic 02 || 16.8 || 12.0
|-
| Group 03 || Topic 01 || 17.9 || 15.2
|-
| Average || || 17.2 || 13.4
|}

To give further support to the reliability of the predictive model, the averages obtained with it were compared with the averages of students from previous periods (historical averages from 3 months of the previous year). The proposed model also improved on these averages, from 13.3 to 16.4.

===7. Discussion and conclusions===
To obtain efficient predictive models, it is necessary to work not only with more data but also with quality data, selected by experts in the discipline and supported by other research. Institutions often already have data in their virtual systems [22, 23], but the quantity of these data must be weighed against their quality. It has been shown that the proposed model can be successfully used to predict the types of exercises that a student can solve and the types of exercises where he or she has difficulties.
The model was subjected to cross-validation, in which its predictions agreed with the expected results in about 72% of cases. An increase in the students' average from 13.49 to 17.29 in their evaluations was observed. This research not only validated the model with 8% of the students but also tested it with new groups of students at the institution. The model can be improved by working with more students and with more variables related to academic performance, since learning would be more robust, and by assigning exercises gradually, i.e., classifying the exercises by level, such as basic, intermediate and advanced.

===8. References===
[1] R. Anijovich and G. Cappelletti. La evaluación como oportunidad. Buenos Aires: Paidós, 2017, pp. 21-22.

[2] M. L. Méndez, “Diversidad en el aula”, Revista Digital: Innovación y experiencias Educativas, 41, 1(9), 2011.

[3] N. Medina, J. Fereira and R. Marzol, “Factores personales que inciden en el bajo rendimiento académico de los estudiantes de geometría”, Telos: Revista de Estudios Interdisciplinarios en Ciencias Sociales, 20(1), 4-28, 2018. URL: https://dialnet.unirioja.es/descarga/articulo/6436353.pdf

[4] M. Saucedo, S. Herrera-Sánchez, J. Díaz, S. Bautista and H. Salinas, “Indicadores de reprobación: Facultad de Ciencias Educativas (UNACAR)”, Revista Iberoamericana para la Investigación y el Desarrollo Educativo RIDE, 5(9), 1-11, 2014. doi: https://doi.org/10.23913/ride.v5i9.7

[5] A. Anaya-Durand and C. Anaya-Huertas, “¿Motivar para aprobar o para aprender? Estrategias de motivación del aprendizaje para los estudiantes”, Tecnología, Ciencia, Educación, 25(1), 5-14, 2010. https://www.redalyc.org/articulo.oa?id=48215094002

[6] J. Tourón, R. Santiago and A. Díez. The Flipped Classroom: Cómo convertir la escuela en un espacio de aprendizaje. Grupo Océano, 2014.

[7] A. Otero, W. Rivera, C. Pedraza, and J. Canay.
“TIC para la educación: sistema adaptativo basado en mecanismos de aprendizaje automático para la apropiación de tecnologías en estudiantes de educación media”, Telos: Revista de Estudios Interdisciplinarios en Ciencias Sociales, 21(3), 526–543, 2019. doi: https://doi.org/10.36390/telos213.03

[8] N. Salgado Reyes, J. Beltrán Morales, J. Guaña Moya, C. Escobar Teran, D. Nicolalde Rodriguez, and G. Chafla Altamirano, “Modelo para predecir el rendimiento académico basado en redes neuronales y analítica de aprendizaje”, Revista Ibérica de Sistemas y Tecnologías de Información, 1, 258–266, 2019. https://uvirtual.uce.edu.ec/

[9] C. H. Menacho Chiok, “Predicción del rendimiento académico aplicando técnicas de minería de datos”, Anales Científicos, 78(1), 26, 2017. doi: https://doi.org/10.21704/ac.v78i1.811

[10] J. M. Gutiérrez Cárdenas and F. Casafranca Aguilar. Implementation of a Computerized Assessment System by using Backpropagation Neural Networks with R and Shiny, 2015. http://hdl.handle.net/11354/1087

[11] J. Tourón and R. Santiago, “El modelo Flipped Learning y el desarrollo del talento en la escuela”, Revista de Educación, 368, 196-231, 2015. doi: https://doi.org/10.4438/1988-592X-RE-2015-368-288

[12] K. Gurney. An introduction to neural networks. London, UK: UCL Press, 1997.

[13] C. Romero, S. Ventura and E. García, “Data mining in course management systems: Moodle case study and tutorial”, Computers and Education, 51(1), 368-384, 2008. doi: https://doi.org/10.1016/j.compedu.2007.05.016

[14] G. M. Garbanzo Vargas, “Factores asociados al rendimiento académico en estudiantes universitarios, una reflexión desde la calidad de la educación superior pública”, Revista educación, 31(1), 43-64, 2007. https://revistas.ucr.ac.cr/index.php/educacion/article/view/1252

[15] S. Cueto. “Una década evaluando el rendimiento escolar. Organización Grupo de Análisis para el Desarrollo”. Lima: GRADE, 2006.

[16] J. H. Orallo, M. J. R. Quintana, C. F. Ramírez.
Introducción a la minería de datos, Pearson Educación, S.A., Madrid, ISBN: 978-84-205-4091-7, 2004.

[17] U. M. Fayyad, G. Piatetsky-Shapiro and P. Smyth. From data mining to knowledge discovery in databases: an overview. AI Magazine, pp. 37-54, 1996.

[18] D. E. Rumelhart, G. E. Hinton and R. J. Williams. Learning representations by back-propagating errors. Nature, 323, 533-536, 1986.

[19] J. R. Betts and D. Morell, “The Determinants of Undergraduate Grade Point Average. The Relative Importance of Family Background, High School Resources, and Peer Group Effects”, The Journal of Human Resources, 34(2), 1999. doi: https://doi.org/10.2307/146346

[20] A. Porto and L. Di Gresia, “Rendimiento de Estudiantes Universitarios y sus Determinantes”, presented at the Asociación Argentina de Economía Política, 2001.

[21] R. A. Naylor and J. Smith, “Determinants of Educational Success in Higher Education”, International Handbook on the Economics of Education, Edward Elgar, 2004.

[22] R. Timarán-Pereira, J. Caicedo-Zambrano and A. Hidalgo-Troya, “Árboles de decisión para predecir factores asociados al desempeño académico de estudiantes de bachillerato en las pruebas Saber 11°”, Revista de Investigación, Desarrollo e Innovación, 9(2), 363-378, 2019. doi: 10.19053/20278306.v9.n2.2019.9184

[23] J. Zárate-Valderrama, N. Bedregal-Alpaca and V. Cornejo-Aparicio, “Modelos de clasificación para reconocer patrones de deserción en estudiantes universitarios”, Ingeniare. Revista chilena de ingeniería, 29(1), 168-177, 2021. doi: http://dx.doi.org/10.4067/S0718-33052021000100168