Perceptions Regarding Online Assessment in Times of COVID-19: a Study of University Students in Lima

Iván Montes-Iturrizaga1*, Gloria María Zambrano Aranda1, Yajaira Licet Pamplona-Ciro2, Klinge Orlando Villalba-Condori3

1 Pontificia Universidad Católica del Perú, Av. Universitaria 1801, San Miguel, Lima, 15088, Perú
2 Universidad Continental, Av. San Carlos 1980, Urb. San Antonio, Huancayo, 12000, Perú
3 Universidad Católica de Santa María, San José S/N, Arequipa, Perú

Abstract
Classroom assessment of learning requires relevant evidence (performance tests, as opposed to answer-selection tests) in order to establish adequate feedback processes. Performance tests such as oral, essay or case-based exams are thus considered to offer an adequate way to engage meaningfully with the study materials. In this context, a quantitative study was conducted to characterize students' perceptions of, and preferences regarding, the tests their teachers apply. The sample consisted of 240 university students from a private university in Lima. A questionnaire was administered through Google Forms to students who were exceptionally studying online due to COVID-19. It was found that students were mostly evaluated with multiple-choice tests combining rote questions and reasoning questions. In addition, students recognized essay exams as the most important for their training, yet they preferred multiple-choice exams.

Keywords
Assessment of learning, multiple-choice exam, essay exam, college students, online education, COVID-19

1. Introduction

The assessment of learning in the classroom is one of the most relevant processes for leading students towards the achievement of the competencies, skills, contents or abilities included in a curriculum or program [1]–[4]. This is because evaluation, if it is formative, should go hand in hand with two relevant components.
The first of these is associated with the quality of the evidence of learning generated through the review of assignments, oral participation and exams applied throughout a semester or academic cycle [5]. This quality of evidence is directly associated with the meaningfulness, realism and contextual validity of the assessment conditions set for students [2], [3], [6]. For example, an essay exam corresponds more closely to the intellectual demands of a given professional field than one where students must simply select the correct answer from a list of alternatives [1], [2], [7]. The second component is related to the possibility that university professors use this evidence (ideally from performance exams) to provide feedback on their teaching and learning processes. Only then could we speak of an evaluative process of a formative nature; otherwise, we would merely be measuring or contemplating how students are progressing in terms of their learning [8]. For this reason, it could be affirmed that formative assessment is a process oriented to students learning what they should learn. In other words, evaluation understood in this way is aimed at learning; this is in total opposition to practices that understand evaluation as a simple verification of what has been learned.

JINIS 2023: XXX International Conference on Systems Engineering, October 03–05, 2023, Arequipa, Peru
imontes@uc.cl (I. Montes-Iturrizaga); gzambrano@pucp.edu.pe (G. Zambrano Aranda); ypamplona90@gmail.com (Y. Pamplona-Ciro); kvillalba@ucsm.edu.pe (K. Villalba-Condori)
0000-0002-9411-4716 (I. Montes-Iturrizaga); 0000-0001-6021-5757 (G. Zambrano Aranda); 0000-0001-9024-4444 (Y. Pamplona-Ciro); 0000-0002-8621-7942 (K. Villalba-Condori)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073

On the other hand, it is worth mentioning that Peruvian universities, within the framework of the reform generated by the current University Law 30220 of 2014, have seen the need to deploy greater efforts towards better teaching. Under these influences, assessment would have improved significantly thanks to the development of courses, workshops and specializations in university teaching. A series of very determined actions had thus been initiated to optimize the teaching skills of professors and to investigate the impact of the measures undertaken; all this in a context of greater competitiveness among Peruvian universities to conquer better positions in terms of scientific production, respectability and standing in the higher education landscape [9]. However, these dynamics underwent abrupt changes and modifications as a result of the COVID-19 pandemic [2]. It is not that improvements in teaching were halted, but rather that more urgent issues had to be addressed at that time, such as the acquisition of platforms for online teaching, teacher training in the use of these platforms, and the transfer of teaching materials to a virtual space little known by the vast majority of teachers [7]. Likewise, programs had to be designed to establish teaching skills through digital platforms. Some of these training spaces would have been developed under a face-to-face education logic, and others from a perspective closer to engineering than to pedagogy. In other cases, situations have also been reported where some university owners ordered the merger of several sections or classrooms into one, so that a professor previously used to teaching 25 or 35 students suddenly had more than 100 students in a single virtual classroom [2], [3]. This would have occurred to a greater extent in private for-profit universities.
In addition, these facts would have occurred in other latitudes as well, with respect to the excessive increase in the number of students per classroom [10]. Thus, the situation described above would have had a significant impact on the quality of teaching and of assessment itself: since it is practically impossible to assess 100 or 125 students through essay exams, an excessive use of multiple-choice exams would have been resorted to. At the same time, it is to be expected that - in other cases - teachers transferred the good or the bad of their face-to-face teaching (and their ways of assessing) to the online modality. But this is also a problem for students, a significant percentage of whom graduate from secondary education without having achieved the basic competencies needed to enter the university world. This translates into problems in oral expression, in the search for reliable information, in writing and at the level of thinking in general, among others. This is largely explained by the results of PISA 2018 (Peru), which revealed the serious deficiencies of young people close to finishing high school. In addition, recent studies report problems that became widespread during the pandemic, such as plagiarism, lack of academic integrity, artificial grade inflation, the difficulty of grading group work, and the absence of face-to-face relationships between professors and students [10], [11]. At the same time, the evidence points to favorable perceptions of inclusion, well-being and satisfaction with online assessment in the context of the pandemic [10], [12]. On the other hand, other studies report that during the pandemic scores would have risen considerably on tests [13], and that students recognized that the online modality demanded a greater effort, but offered important advantages [14].
It should be noted that most of these studies compare scores before and during the pandemic, but by means of multiple-choice exams [15]; research that analyzes the types of exams used in online classes (such as the present study) is smaller in number. This matters under a formative assessment perspective, where the type of exam is especially important given the need to provide feedback to students [16]. Other studies have reported the need to transcend the traditional multiple-choice exam, which would have been used insistently during this health crisis [10], [13], [16]–[18]. The scenario described above has become the object of study in numerous recently published works that highlight the relevance of studying the evaluation process, as well as its instruments, tools and conditions. By virtue of the aforementioned, the general objective of the present research is to study the perceptions and preferences of university students with respect to the assessment of their learning in the classroom, in the context of the emergency online teaching given during the COVID-19 pandemic [5], [16], [19], [20].

2. Methods

A quantitative study was developed using the survey method, with a questionnaire constructed as the instrument [21]. In terms of its characteristics, this research is observational (non-experimental), cross-sectional, descriptive, comparative and correlational. The sample consisted of 240 undergraduate university students from a private for-profit university that serves vulnerable socioeconomic sectors of Metropolitan Lima (Peru). There were 181 (75.4%) women and 59 (24.6%) men. In terms of degree program, they came from: 9 from Nutrition (3.7%), 55 from Pharmacy (22.9%), 8 from Law (3.3%), 33 from Administration (13.8%), 23 from Marketing (9.6%), 23 from Accounting (9.6%), 48 from Psychology (20.0%) and 41 from Nursing (17.1%). Ages ranged from 16 to 54 years, with a mean of 25.4 and a standard deviation of 8.105.
An instrument was elaborated in which the first part collected identification data such as sex, age and degree program. The second part comprises 11 answer-selection items that explore students' perceptions regarding the types of tests (and their emphases) mostly applied by their university professors. The instrument also explores test preferences and students' considerations regarding the instruments most relevant for their professional training. It should be specified that this self-administered instrument was applied through Google Forms during the second wave of the COVID-19 health crisis. It is also important to note that at all times the questions referred to the assessment practiced by their teachers during emergency virtual education in Peru. The instrument had content validity, determined through the participation of 3 expert judges. The questionnaire was anonymous and answered voluntarily under informed consent. Statistical analyses were performed in SPSS version 27 for Windows, including descriptive, comparative and correlational reports.

3. Results

This study focuses on characterizing, from a quantitative approach, the perceptions that students at a university in Lima (Peru) have regarding the tests used by their professors in the context of emergency virtual education due to the health crisis caused by COVID-19. This, by virtue of the fact that each professional training program would present its own habits, idiosyncrasies and evaluative practices, associated with its own pedagogical traditions. In this way, these degree-specific emphases (and those that are common) would be installed in the teaching staff, who have the power to decide which tests they use (and which they exclude) in their classrooms.
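The paper reports content validity established through 3 expert judges but does not name the coefficient used. As an illustrative sketch only (not the authors' procedure), one common way to quantify expert-judge agreement per item is Aiken's V, V = S / (n(c-1)), where S sums each judge's rating distance from the scale minimum; the ratings below are hypothetical.

```python
# Illustrative sketch of Aiken's V for content validity by expert judges.
# NOT the authors' documented procedure; all ratings here are hypothetical.

def aikens_v(ratings, low=1, high=4):
    """Aiken's V for one item, given n judges' ratings on a low..high scale."""
    n = len(ratings)
    c = high - low + 1                   # number of scale categories
    s = sum(r - low for r in ratings)    # summed distance from the scale minimum
    return s / (n * (c - 1))

# Hypothetical ratings by 3 judges for one questionnaire item (1 = poor, 4 = excellent)
print(round(aikens_v([4, 3, 4]), 2))  # -> 0.89; values near 1.0 indicate strong agreement
```

Items with V below a chosen cutoff (often 0.70 or 0.80) would be revised or dropped before administering the questionnaire.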
By virtue of the above, it can be seen in Table 1 that the students indicated that they had mostly been asked to answer answer-selection tests (the so-called "objective" or multiple-choice tests, MC). It should be pointed out that law students are the ones who considered that these tests were least used on them (37.5%), while the students of all the other programs considered that these instruments were the ones mostly applied to them. In this panorama, psychology (91.7%) and nutrition (88.9%) students considered that these multiple-choice tests were the ones most applied by their professors.

Table 1
Types of tests applied by professors, from the students' perception, according to degree program

Degree program    Multiple-choice    Essay exam    Oral exam    Case-based exam
                  f (%)              f (%)         f (%)        f (%)
Administration    24 (72.7)          26 (78.8)     14 (42.4)    19 (57.6)
Marketing         13 (59.1)          17 (77.3)      4 (18.2)    11 (50.0)
Accounting        17 (73.9)          19 (82.6)      7 (30.4)    12 (52.2)
Law                3 (37.5)           7 (87.5)      3 (37.5)     4 (50.0)
Nursing           36 (87.8)          25 (61.0)     18 (43.9)    21 (51.2)
Pharmacy          48 (87.3)          37 (67.3)     23 (41.8)    24 (43.6)
Nutrition          8 (88.9)           7 (77.8)      3 (33.3)     2 (22.2)
Psychology        44 (91.7)          35 (77.8)     26 (54.2)    15 (31.3)

The purpose was also to know how students considered the multiple-choice tests (MC) applied by their professors, according to degree program. In this item, whose results are shown in Table 2, only one alternative could be marked, and students were asked how these tests were applied by their professors (in general terms). It was found that the most frequently applied MC tests combined rote items (because they demanded the memorization of irrelevant data) with items oriented to thinking or reasoning. This is followed by MC tests with items for understanding and thinking (reasoning) and, to a much lesser extent, exclusively rote tests, with much lower percentages (from 0.0% to 9.8%) across all degree programs.
It is important to mention, as something favorable, that no student of psychology, nutrition or law considered that exclusively rote multiple-choice tests were used in their university courses. As for the MC tests more exclusively oriented to thinking or reasoning, these were applied to a greater extent (from the students' perception) in nutrition (44.4%), administration (39.4%) and law (37.5%).

Table 2
Emphasis of the multiple-choice exams perceived by students, according to degree program

Degree program    Rote + thinking    Rote only    Thinking only
                  f (%)              f (%)        f (%)
Administration    18 (54.5)          2 (6.1)      13 (39.4)
Marketing         15 (65.2)          2 (8.7)       6 (26.1)
Accounting        16 (69.6)          1 (4.3)       6 (26.1)
Law                5 (62.5)          0 (0.0)       3 (37.5)
Nursing           25 (61.0)          4 (9.8)      12 (29.3)
Pharmacy          41 (74.5)          1 (1.8)      13 (23.6)
Nutrition          5 (55.6)          0 (0.0)       4 (44.4)
Psychology        29 (60.4)          0 (0.0)      19 (39.6)

We were also interested in the preferences of students across all degree programs between two types of instruments for evaluating their learning throughout their professional training (Table 3): essay tests (ET) and multiple-choice tests (MC). We found that the preferences of university students pointed more to MC than to ET. This preference for MC was present in all degree programs except accounting (26.1%) and marketing (43.5%). Likewise, we identified that nursing (68.3%) and pharmacy (65.5%) students were the ones who preferred MC to the greatest extent.
Table 3
Students' preferences regarding the type of exam, according to degree program

Degree program    Essay exam    Multiple-choice exam
                  f (%)         f (%)
Administration    15 (45.5)     18 (54.5)
Marketing         13 (56.5)     10 (43.5)
Accounting        17 (73.9)      6 (26.1)
Law                4 (50.0)      4 (50.0)
Nursing           13 (31.7)     28 (68.3)
Pharmacy          19 (34.5)     36 (65.5)
Nutrition          4 (44.4)      5 (55.6)
Psychology        22 (45.8)     25 (52.1)

However, when students were asked about the most important tests for their university education, we found that in all degree programs the essay tests stand out (Table 4). In other words, MC tests are preferred in almost all degree programs even though students recognize them as the least relevant for their preparation for the professional field. In this regard, it should be noted that 85% of the students at this university come (according to data from its academic office) from the public schools with the lowest learning results in Metropolitan Lima (Peru), as measured by the tests applied by the Ministry of Education. As is well known, students who graduate from these basic education schools evidence poor study habits, less ability to understand what they read, and incipient resources to express themselves adequately in writing. Thus, the paradox found would reveal a clear understanding that essay tests (ET) are the most important, but they are not preferred because students would be anticipating problems in facing them successfully, given the deficiencies they would bring from high school.
Table 4
Students' perceptions of the most important type of exam for their university education, according to degree program

Degree program    Essay exam    Multiple-choice exam
                  f (%)         f (%)
Administration    28 (84.8)      5 (15.2)
Marketing         19 (82.6)      4 (17.4)
Accounting        16 (69.6)      7 (30.4)
Law                7 (87.5)      1 (12.5)
Nursing           28 (68.3)     13 (31.7)
Pharmacy          34 (61.8)     21 (38.2)
Nutrition          6 (66.7)      3 (33.3)
Psychology        31 (64.6)     16 (33.3)

We also wanted to know whether age (ordinal variable) was associated with the preference for multiple-choice or essay tests. There was no association between these variables (χ² = 0.430, df = 1, p = 0.512). Similarly, we did not find a significant association between the age of the students and their consideration of the most relevant tests for professional training (χ² = 1.566, df = 1, p = 0.211). In this sense, it was determined that the marked preference for multiple-choice tests and the consideration of essay tests as the most relevant for professional training were not associated with age; in other words, perceptions are similar across all age groups. Along the same lines, we found that the sex of the students was not associated with preferences for either multiple-choice or essay tests (χ² = 3.175, df = 2, p = 0.204). Nor was there an association between the sex of the students and the consideration of essay tests (ET) as the most important for their professional training (χ² = 1.500, df = 2, p = 0.472).

4. Discussion

This quantitative research was oriented to explore the perceptions and preferences regarding classroom evaluation among students of a private for-profit university that preferentially serves vulnerable populations. In this context, the figures of this university report that 15% of its students are in poverty and 85% of them completed their high school studies in public schools in the peripheral areas of the city of Lima.
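The chi-square tests of independence reported above were run in SPSS; the same analysis can be sketched in open tooling. The counts below are hypothetical (the paper does not publish the raw contingency tables), shown only to illustrate the kind of age-group-by-preference test reported.

```python
# Illustrative sketch of a chi-square test of independence, as reported in the
# paper (computed there with SPSS). Counts are hypothetical, NOT the study's data.
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: rows = age group, columns = preferred exam type
observed = np.array([
    [70, 50],   # younger students: multiple-choice, essay
    [65, 55],   # older students:   multiple-choice, essay
])

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, df = {dof}, p = {p:.3f}")
# A p-value above 0.05 would, as in the paper, indicate no significant
# association between age group and exam-type preference.
```

With two categories per variable the test has df = (2-1)(2-1) = 1, matching the degrees of freedom reported for the age analyses.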
This data is very important because we are probably dealing with a student population (represented in the sample) with largely unsatisfactory school experiences and poor cultural stimuli (books, reflective dialogues and student role models). This fact has most probably generated the paradox found: multiple-choice tests (MC) are preferred while, at the same time, essay tests (ET) are recognized as the most adequate for university education. Given this, it is possible that for these students it is more complicated to have to compose and write out an answer because, most likely, in basic education they had few opportunities to develop this competence. On the other hand, answering a multiple-choice test - even a rote one - would be more within reach given the low cognitive demand of these tests; even more so when much of the Peruvian educational system still has teachers who are more focused on having their students retain large amounts of data, dates, principles and formulas in their memory [2], [7], [16], [19]. Added to this is the perceived emphasis of the tests applied by the university professors in the study, where it was clearly identified that the most frequent instruments would be multiple-choice tests and, within these, those that combine rote items with items that explore thinking or reasoning [16]. In any case, it will be necessary to develop qualitative studies aimed at knowing the perceptions, attributions and reasons professors have for using some types of tests and not others. Such studies would have to consider the professors of the specialty courses of each degree program, since the current study has found that not only the students, but also the professors, would differ in terms of the evaluative process [22].
Also, other variables or characteristics of the students should be considered in future studies in order to discriminate more clearly among the phenomena under study (even more so when we only found the degree program to be a relevant variable). For example, considering the semester or cycle and the type of subject (general studies versus specialty) would make it possible to characterize and understand more precisely the perceptions regarding the tests applied in the classroom [22]. It would also be pertinent to study the university professors themselves, to learn about the tests they apply and their reasons for choosing them for their classroom work. Finally, it is important to note that we have no studies on classroom assessment at this university prior to the pandemic, which would have served as a baseline for establishing reliable comparisons to indicate whether assessment in virtual emergency education is the same as, or different from, that practiced in face-to-face settings. Nor have we considered evidence of the academic performance of the students in our sample: either their grades or their scores on the tests administered by their teachers. However, it is important to clarify that this study focuses on perceptions about the tests applied by teachers in their classes and on preferences in terms of assessment [15]. In view of the above, and although we can hypothesize that classroom evaluation would have suffered a setback during this health crisis at the university studied, we do not have a study similar to the present one that could serve as a relevant comparator for understanding evaluation in these times [23].

Acknowledgements

Special thanks are extended to Professor Christian Pérez Sánchez for his collaboration in the application of the questionnaire within the framework of this study. Gratitude is also expressed to each of the university students who agreed to answer the questionnaire.

References

[1] M. Z.
Joya Rodríguez, “La evaluación formativa, una práctica eficaz en el desempeño docente,” Revista Scientific, vol. 5, no. 16, pp. 179–193, 2020, doi: 10.29394/scientific.issn.2542-2987.2020.5.16.9.179-193.
[2] I. Montes-Iturrizaga, “La evaluación en la universidad en tiempos de la virtualidad: ¿retroceso u oportunidad?,” Revista Signo Educativo, Lima, Dec. 2020.
[3] I. Montes-Iturrizaga, Evaluación Educativa: reflexiones para el debate. Madrid: UDL Editores, 2020. [Online]. Available: https://www.amazon.com/-/es/Iván-Montes-Iturrizaga-ebook/dp/B08KRZ5DJ5
[4] R. Jáuregui, L. Carrasco, and I. Montes, “Evaluando, evaluando: ¿Qué piensa y qué hace el docente en el aula?,” Informe Final de Investigación, Universidad Católica Santa María, Perú, 2003.
[5] C. Rosales, Evaluar es reflexionar sobre la enseñanza, 3rd ed. Madrid: Narcea Ediciones, 2014.
[6] W. J. Popham, Evaluación trans-formativa: El poder transformador de la evaluación formativa. Narcea Ediciones.
[7] J. A. Román, “La educación superior en tiempos de pandemia: una visión desde dentro del proceso formativo,” Revista Latinoamericana de Estudios Educativos, vol. 50, no. ESPECIAL, pp. 13–40, 2020, doi: 10.48102/rlee.2020.50.especial.95.
[8] R. Stake, “The countenance of educational evaluation,” Teachers College Record, pp. 523–540, Apr. 1967.
[9] I. Montes-Iturrizaga, “Apreciaciones en torno a la propuesta de nueva Ley Universitaria,” Revista Signo Educativo, Lima, pp. 26–28, May 2014.
[10] A. H. Al-Maqbali and R. M. R. Hussain, “The impact of online assessment challenges on assessment principles during COVID-19 in Oman,” Journal of University Teaching and Learning Practice, vol. 19, no. 2, pp. 73–92, 2022, doi: 10.53761/1.19.2.6.
[11] A. Friedman, I. Blau, and Y. Eshet-Alkalai, “Cheating and Feeling Honest: Committing and Punishing Analog versus Digital Academic Dishonesty Behaviors in Higher Education,” Interdisciplinary Journal of e-Skills and Lifelong Learning, vol. 12, pp. 193–205, 2016, doi: 10.28945/3629.
[12] M. A. Rahman, D. Novitasari, C. Handrianto, and S. Rasool, “Challenges In Online Learning Assessment During The COVID-19 Pandemic,” KOLOKIUM Jurnal Pendidikan Luar Sekolah, vol. 10, no. 1, pp. 15–25, Apr. 2022, doi: 10.24036/kolokium.v10i1.517.
[13] D. Domínguez-Figaredo, I. Gil-Jaurena, and J. Morentin-Encina, “The Impact of Rapid Adoption of Online Assessment on Students’ Performance and Perceptions: Evidence from a Distance Learning University,” The Electronic Journal of e-Learning, vol. 20, no. 3, pp. 224–241, 2022. [Online]. Available: www.ejel.org
[14] “Online learning and assessment during the Covid-19 pandemic: exploring the impact on undergraduate”.
[15] B. Hassan et al., “Online assessment for the final year medical students during COVID-19 pandemics; the exam quality and students’ performance,” Oncology and Radiotherapy.
[16] Montes-Iturrizaga and Franco-Chalco, “Formative learning assessment Covid 19.”
[17] W. J. Popham, Classroom Assessment: What Teachers Need to Know, 8th ed. Los Angeles: Pearson, 2017.
[18] L. Ali and N. A. H. H. al Dmour, “The shift to online assessment due to COVID-19: An empirical study of university students, behaviour and performance, in the region of UAE,” International Journal of Information and Education Technology, vol. 11, no. 5, pp. 220–228, May 2021, doi: 10.18178/ijiet.2021.11.5.1515.
[19] F. J. García-Peñalvo, A. Corell, V. Abella-García, and M. Grande, “Online assessment in higher education in the time of COVID-19,” Education in the Knowledge Society, vol. 21, pp. 1–26, 2020, doi: 10.14201/eks.23013.
[20] W. J. Popham, Test Better, Teach Better: The Instructional Role of Assessment. Alexandria, Virginia: Association for Supervision and Curriculum Development, 2004, doi: 10.5860/choice.42-0445.
[21] I. Montes Iturrizaga, L. M. Sime Poma, E. Salcedo Lobatón, E. Soria Valencia, and D. Briceño Vela, Investigación educativa: técnicas para el recojo y análisis de la información. Escuela de Posgrado, Maestría en Educación. [Online]. Available: https://posgrado.pucp.edu.pe/maestria/educacion/
[22] D. B. Wayne, M. Green, and E. G. Neilson, “Medical education in the time of COVID-19,” Sci Adv, vol. 6, no. 31, 2020, doi: 10.1126/sciadv.abc7110.
[23] “Experience of e-learning and online assessment during the COVID-19 pandemic at the College of Medicine, Qassim University”.