=Paper=
{{Paper
|id=Vol-1850/TEFA2016_paper_3-5
|storemode=property
|title=Towards a Personalized Placement Assessment Method to Support Students’ Knowledge Gaps Identification
|pdfUrl=https://ceur-ws.org/Vol-1850/TEFA2016_paper_3-5.pdf
|volume=Vol-1850
|authors=Jerry Medeiros,Marlon Da Costa Monçores,Samara Werner
}}
==Towards a Personalized Placement Assessment Method to Support Students’ Knowledge Gaps Identification==
Towards a Personalized Placement Assessment Method to Support Students' Knowledge Gaps Identification

Jerry Fernandes Medeiros 1,2, Marlon da Costa Monçores 1,2, and Samara Werner 1

1 Tamboro Educacional, Rua Visconde de Caravelas, 111 - Botafogo, Rio de Janeiro
2 Universidade Federal do Estado do Rio de Janeiro, Av. Pasteur, 458 - Botafogo, Rio de Janeiro

Abstract. As students of basic education in Brazil advance in grade without mastering all the content, teachers need to recall content from previous years. These gaps can become irrecoverable, as learners may have fallen behind their peers, or because middle and high school teachers may not have expertise in teaching basic academic skills. In this paper, we propose a graph-based adaptive assessment technique for identifying learning gaps. The method described here does not rely on the calibration of items, but on the structure in which the content is taught, which may be customized according to the school's pedagogical project.

Keywords: Adaptive Assessment, Computerized Adaptive Testing, Learning Gaps

1 Introduction

Educational assessment is the process of making reasonable inferences about what learners know based on evidence found from observation of what they say, do or make in selected situations [4].

The enhancement of the assessment process with adaptive features is important because it makes the assessment more dynamic and individualized, as it adapts to the learner's performance. Furthermore, the number of questions required to estimate the learner's knowledge level is usually reduced, resulting in a less monotonous assessment process.

A Computer Adaptive Test (CAT) is a test administered by a computer where the presentation of each question, as well as the decision to finish the test, is dynamically adapted based on the answers of the test taker. The test items are dynamically adjusted to the student's performance level. As a result, the test tends to be shorter and more accurate [7]. Numerous methodological approaches have been developed for CAT, most of them evolved from the principles of psychometric measurement. Linacre et al. [3] detail the theoretical background involved in adaptive testing.

Numerous procedures have been developed for implementing the basic tasks needed to select and score adaptive tests. No method has been ideal for all assessment situations and scenarios; what works best depends on the unique characteristics of a given testing goal. More directly, in adaptive testing methods, answering questions correctly or incorrectly implies that easier or more difficult questions are administered to the test taker. Questions are selected so that their difficulty matches the learner's inferred knowledge level in a given subject. Questions that provide more information about what the learner knows are usually those with difficulty close to the learner's knowledge level. The estimation of the learner's knowledge level depends on the number of questions answered correctly and on the difficulty of the answered questions.

The technique presented in this paper uses a pool of questions which are highly structured and always related to a subject. It does not rely on the calibration of the items, but on the prerequisites defined by the school. The reasons for this decision are addressed in Section 3. An important consideration affecting the design of the assessment is the purpose for which it will be used. In this paper, we present a technique developed for finding learners' knowledge gaps.
The main purpose of this approach is to empower the teacher with the information needed to help each learner catch up with the desired progress.

2 Learning Gaps

A learning gap is the existence of a difference between what a learner knows and what the learner is expected to know at a certain point in his education, such as a given age or school grade [1].

The learning gaps passed from one grade to another can become a significant problem that forces teachers to step in. Because of the gaps, teachers are usually forced to teach less on-grade content and focus on content from past grades. Closing these gaps can become an unsolvable long-term problem, as the student's learning gaps interfere with mastering new subjects. Although some learning gaps are closed in following grades, the effort is high. As a consequence, less on-grade material can be discussed and learned.

One of the more consequential issues of the existence of learning gaps is their tendency to increase and become more severe over time. Furthermore, if basic skills such as reading, writing, and math are not acquired by students early on in their education, it may be more difficult for them to acquire them later on. As students progress through their education, closing learning gaps tends to become more difficult, because learners may have fallen behind their peers, or because middle or high school teachers may not have expertise in teaching basic academic skills [1].

To bridge individual learning gaps, it is important that teachers have tools that help them identify and classify gaps in the classroom. In this context, the purpose of this paper is to describe an adaptive assessment method used to identify learning gaps. The motivation and the context of use are described in Section 3.

3 Application Scenario

The development of this work has focused on elementary school (corresponding to the first nine years of basic education in Brazil).

The Programme for International Student Assessment (PISA) is a program that assesses randomly selected 15-year-old students from around the world. It is applied every three years in public and private schools and was first implemented in 2000. The aim of PISA is to collect data in order to help in the development of policies to improve the education systems of the enrolled countries. The outcome of the mathematics test is divided into seven levels of proficiency, ranging from "below level 1" to level 6. Each level is associated with a set of defined skills (except "below level 1"). The seven levels and the distribution of Brazilian students are shown in Table 1.

| Level/Year    | 2003 | accum. % | 2006 | accum. % | 2009 | accum. % | 2012 | accum. % |
|---------------|------|----------|------|----------|------|----------|------|----------|
| Below level 1 | 54.4 | 54.4     | 46.6 | 46.6     | 38.1 | 38.1     | 35.2 | 35.2     |
| Level 1       | 21.7 | 76.1     | 25.9 | 72.5     | 31.0 | 69.1     | 31.9 | 67.1     |
| Level 2       | 13.9 | 90.0     | 16.6 | 89.1     | 19.0 | 88.1     | 20.4 | 87.5     |
| Level 3       | 6.5  | 96.5     | 7.1  | 96.2     | 8.1  | 96.2     | 8.9  | 96.4     |
| Level 4       | 2.5  | 99.0     | 2.8  | 99.0     | 3.0  | 99.2     | 2.9  | 99.3     |
| Level 5       | 0.8  | 99.8     | 0.8  | 99.8     | 0.7  | 99.9     | 0.7  | 100.0    |
| Level 6       | 0.2  | 100.0    | 0.2  | 100.0    | 0.1  | 100.0    | 0.0  | 100.0    |

Table 1. The evolution of Brazilian students' math skills over the years (PISA) [5] [6]

Based on this data, we can see that almost all students are in the first three levels of proficiency. The cumulative percentages show that between 87.5% and 90% of the students are in the first three levels. This concentration in the lower region of the scale did not change significantly over the years, which may indicate that students are progressing through the grades while lacking some knowledge.
Brazilian students ranked 58th out of the 65 countries enrolled in the last edition.

The content administered to Brazilian students is in part guided by Brazilian law, which defines a common minimum curriculum that must be administered by all schools in the nation. In addition to this common curriculum, each school is responsible for a complementary curriculum. The aim of this complement is to address the needs and characteristics of the local community and its culture [2].

It is expected that, when passing from one grade to the next, the student masters all content that was given in the previous year. However, for several reasons, this does not occur. It is important to monitor student progress to prevent gaps from becoming more pronounced with the passing years. This same law states that learners who have missed some content must catch up. However, it is up to the school, based on the principle of its autonomy and its right to define its pedagogical proposal, to decide how to address the bridging of students' gaps [2].

Given the presented scenario, where students progress to the next grade without mastering all required skills from previous years, and also considering the fact that there is no unique curriculum, we have defined a method for assessing and finding students' learning gaps. Our method does not rely on the calibration of the items, but on the structure of the content, allowing the test to be easily customized to fit the school's or teachers' needs.

4 Proposed Solution

This section describes the operation of the adaptive assessment for gap discovery.

4.1 Graph Based Adaptive Assessment

We use a directed acyclic graph as the basis for the adaptive test. Let G = (V, E) be a graph, where V is the set of vertices and E is the set of edges. Each vertex represents a subject that will be administered by the assessment and each edge represents a dependency relationship between two subjects. Thus, it is assumed that the source vertex is a prerequisite to the destination vertex. Cyclic dependencies are not allowed: starting from a vertex V1, there must not exist any path that leads back to V1. This restriction is important because the existence of a cycle would make it impossible to define a precedence order among the elements belonging to the cycle.

The graph is constructed to represent the relationships between the subjects that are part of the content of a field of study. The initial content of the field is represented by initial vertices (no incoming edges), which are called roots, while subjects that are part of the final content are represented by end vertices (no outgoing edges), which are called leaves. The remaining subjects are also represented as vertices with both incoming and outgoing edges, and are simply referred to as vertices. In addition to the vertices and their relationships, one must associate the items (questions) with each of the vertices. Each vertex must have at least one associated item; there is no maximum limit. Figure 1 shows an example of a valid graph.

Fig. 1. Example of a directed graph with 2 roots, 2 leaves, 3 other vertices, and no cycle.
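To make the structure of Section 4.1 concrete, the sketch below shows one possible Python representation of the prerequisite graph. The class and method names (SubjectGraph, add_subject, add_prerequisite) are illustrative assumptions and do not come from the paper; the sketch only enforces the properties stated above (at least one item per vertex, no cycles, roots and leaves defined by in- and out-degree).

```python
# Minimal sketch of the prerequisite graph described in Section 4.1.
# All names are illustrative, not taken from the paper.
from collections import defaultdict

class SubjectGraph:
    def __init__(self):
        self.items = {}                     # subject -> list of items (questions)
        self.successors = defaultdict(set)  # subject -> subjects that depend on it
        self.predecessors = defaultdict(set)

    def add_subject(self, subject, items):
        # Each vertex must have at least one associated item.
        assert items, "every subject needs at least one item"
        self.items[subject] = list(items)

    def add_prerequisite(self, prerequisite, subject):
        # Edge direction: prerequisite -> subject. Reject edges that close a cycle.
        if self._reachable(subject, prerequisite):
            raise ValueError("cyclic dependencies are not allowed")
        self.successors[prerequisite].add(subject)
        self.predecessors[subject].add(prerequisite)

    def _reachable(self, start, target):
        # Depth-first search over outgoing edges.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.successors[node])
        return False

    def roots(self):
        # Subjects with no incoming edges (initial content of the field).
        return [s for s in self.items if not self.predecessors[s]]

    def leaves(self):
        # Subjects with no outgoing edges (final content of the field).
        return [s for s in self.items if not self.successors[s]]
```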
4.2 Setting up the Graph

After the graph is created with all subjects of the field and their relationships, it is recommended to pre-calibrate it. This pre-calibration can be made using different models of student profiles, i.e., students with low, average, and high performance. Because of their different paces, these students also have different gaps. A good pre-calibration therefore allows a more accurate gap identification with fewer items presented to the student during the assessment.

The pre-calibration algorithm aims to find out which vertices are more important to be asked for each student profile. In this regard, the algorithm simulates many students of a particular profile. For each student simulation, the algorithm simulates the student's answers according to his profile. The hit probability of an item in a vertex (every time a test taker answers a question correctly it is called a hit; a wrong answer is called a miss) is inversely proportional to the number of direct and indirect prerequisites of the vertex, i.e., items in roots are more prone to be hit and items in leaves are more prone to be missed. In addition, students with higher profiles are more likely to hit more questions.

The algorithm takes a graph G as a parameter, containing the set of subjects and their relationships with each other. Another parameter, called QUANTITY, is an integer representing the number of students that will be used in the simulation. The third parameter, called LEVEL, represents the average simulated student level, and the last parameter, called SD, represents the standard deviation of the students' level.

After the simulation of each student, the algorithm highlights the vertices that were on the student's knowledge borderline. Those vertices are identified by: (a) vertices that the student knows having edges to vertices that the student does not know, or (b) vertices that the student does not know that are preceded by vertices that the student knows. At the end of the simulation of all students, the algorithm calculates, for each vertex, its OPTIMAL VALUE (OV), defined as the number of times the vertex appears in the border region divided by the number of simulated students. Thus, the OV of a vertex is a number ranging from 0 to 1: when the OV is 0 the vertex does not belong to the border of any simulated student, and when the OV is 1 the vertex belongs to the border of all simulated students. After pre-calibration, the OV of each vertex is stored and can be used for real students whose profile is similar to the LEVEL used in the simulation.
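The following sketch illustrates the pre-calibration loop described above, reusing the hypothetical SubjectGraph structure from the previous sketch. The paper does not give an explicit hit-probability formula, so the one used here (level divided by level plus the number of prerequisites) is only an assumption chosen to match the stated properties: items in roots are more likely to be hit and higher-level students hit more questions.

```python
# Illustrative pre-calibration sketch (Section 4.2). The hit-probability
# formula below is an assumption; the paper only states it is inversely
# proportional to the number of direct and indirect prerequisites and
# grows with the simulated student's level.
import random
from collections import defaultdict

def count_prerequisites(graph, subject):
    # Number of direct and indirect prerequisites of a vertex.
    stack, seen = list(graph.predecessors[subject]), set()
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.predecessors[node])
    return len(seen)

def precalibrate(graph, quantity, level, sd):
    border_counts = defaultdict(int)
    for _ in range(quantity):
        student_level = max(0.0, random.gauss(level, sd))
        # Simulate which subjects this student "knows" (assumed formula).
        knows = {}
        for subject in graph.items:
            n_prereq = count_prerequisites(graph, subject)
            hit_probability = student_level / (student_level + n_prereq + 1.0)
            knows[subject] = random.random() < hit_probability
        # A vertex is on the knowledge borderline if it is known and has an
        # unknown successor, or unknown and has a known predecessor.
        for subject in graph.items:
            on_border = (
                (knows[subject] and any(not knows[s] for s in graph.successors[subject]))
                or (not knows[subject] and any(knows[p] for p in graph.predecessors[subject]))
            )
            if on_border:
                border_counts[subject] += 1
    # OPTIMAL VALUE: fraction of simulated students for whom the vertex
    # appeared in the border region (always between 0 and 1).
    return {subject: border_counts[subject] / quantity for subject in graph.items}
```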
4.3 Vertex Choosing

Vertex choosing is the central element of the adaptive assessment. A good vertex choice allows the student's gaps to be found with few items. To make this choice more efficient, we combine three elements:

1. OPTIMAL VALUE (OV): Represents the fraction of simulated students for whom the vertex was on the knowledge borderline. The higher the OV, the higher the vertex's preference in the selection process.
2. VERTEX CENTRALITY (VC): The betweenness centrality calculated for each vertex, a measure of the centrality of a vertex in the graph. The higher the VC, the higher the vertex's preference in the selection process.
3. VERTEX SCORE (VS): Initially the score of all vertices is 0. Every answer given by the student leads to a vertex score update. In general, the vertex score increases when the student hits a question and decreases when the student misses. Vertices with a value close to 0 are preferred in the selection process.

Our intention is to make each of those values equally weighted during the vertex choosing process, so it is necessary that all elements have the same possible range of values. Thus, we normalize their values between 0 and 1. For OV nothing needs to be done, since its values are already in the defined range. VC is normalized by dividing each vertex's VC by the highest existing VC, so the vertex with the highest VC has the value 1. The VS normalization is made in two steps: first we take the absolute value of VS, then we compute 1 / (1 + |VS|). Thus, the values are always between 0 and 1 and the priority goes to vertices with a score closer to 0.

The selection process also uses three constants that can be calibrated to increase or decrease the weight of any of these elements. These constants are called OPTIMAL WEIGHT (OW), CENTRALITY WEIGHT (CW) and SCORE WEIGHT (SW). Thus, the value of each vertex is calculated according to Equation 1:

(OW * OV) + (CW * VC) + (SW * VS)    (1)

4.4 Graph Painting Technique

The goal of the adaptive test is to identify the student's learning gaps. For this identification, the proposed algorithm uses a technique that we have called graph painting. In this technique, each vertex has an associated value and a color: whenever a vertex has value 0 it is black, when the value is negative the vertex is red, and when the value is positive the vertex is blue. The semantics of each color is simple: blue vertices are those where the algorithm identified that the student knows the subject, red vertices are those where the algorithm identified that the student does not know the subject, and black vertices are those about which the algorithm has no information. Thus, the student's gap is defined by all red and remaining black vertices at the end of the assessment.

When a student starts the assessment, all vertices are black and have their VS equal to zero. Then, following the equation described in Section 4.3, a vertex is selected. The student then has to answer items related to the selected vertex. If there is more than one item associated with the vertex, the choice is random, discarding items that the student has already answered previously (vertices can be revisited during the assessment). Thus, the number of items displayed for each selected vertex is variable and depends on three factors: (1) the number of available items in the vertex, (2) the predefined maximum number of items that can be asked per vertex, and (3) the student's performance. For the algorithm to consider that the student knows the vertex, the student must hit all the questions shown for the vertex; if the student misses a question, the algorithm can move on to another subject.

After the vertex is completed, the algorithm assigns a value to it. This value is controlled by the parameter DIRECT WEIGHT (DW). The assigned value is DW if the algorithm considers that the student knows the subject, or -DW otherwise. This change in the vertex value causes the vertex to be painted blue or red. In addition to painting the selected vertex, the algorithm also changes the value of some other vertices following a simple rule: if the selected vertex is painted blue, all predecessor vertices are also painted blue; if the selected vertex is painted red, all successor vertices are also painted red. However, the value assigned to these other vertices is defined by the parameter INDIRECT WEIGHT (IW), which by default is set to DW * 0.1. Figure 2 shows two examples of the graph painting algorithm: the first shows the result after a hit and the second shows the result after a miss. The green dot at the center represents the selected vertex in each situation.
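A minimal sketch of Equation 1 and the painting rule follows, again assuming the hypothetical SubjectGraph structure from the earlier sketches. Betweenness centrality values are assumed to be precomputed (for example with a graph library), and because the paper does not state whether painting propagates only to direct neighbours or to all ancestors and descendants, the version below propagates transitively; that choice is an assumption.

```python
# Sketch of the vertex selection score (Equation 1) and the graph painting
# update (Section 4.4). `vs` is a dict mapping each subject to its VERTEX
# SCORE, initialized to 0; all names are illustrative.

def selection_score(vertex, ov, centrality, vs, ow=1.0, cw=1.0, sw=1.0):
    # Normalize VC by the highest centrality so the most central vertex gets 1.
    max_vc = max(centrality.values()) or 1.0
    vc_norm = centrality[vertex] / max_vc
    # Normalize VS with 1 / (1 + |VS|): scores near 0 are preferred.
    vs_norm = 1.0 / (1.0 + abs(vs[vertex]))
    return ow * ov[vertex] + cw * vc_norm + sw * vs_norm

def choose_vertex(graph, ov, centrality, vs):
    # Pick the subject with the highest combined value (Equation 1).
    return max(graph.items, key=lambda v: selection_score(v, ov, centrality, vs))

def _transitive(neighbors, start):
    # All vertices reachable from `start` via the given adjacency mapping.
    stack, seen = list(neighbors[start]), set()
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(neighbors[node])
    return seen

def paint(graph, vs, vertex, student_knows, dw=1.0, iw=None):
    # DIRECT WEIGHT goes to the answered vertex; INDIRECT WEIGHT (by default
    # DW * 0.1) goes to the affected vertices. The paper says "all predecessors"
    # / "all successors"; here this is read transitively (an assumption).
    iw = dw * 0.1 if iw is None else iw
    if student_knows:
        vs[vertex] += dw                              # painted blue
        for ancestor in _transitive(graph.predecessors, vertex):
            vs[ancestor] += iw                        # prerequisites assumed known
    else:
        vs[vertex] -= dw                              # painted red
        for descendant in _transitive(graph.successors, vertex):
            vs[descendant] -= iw                      # dependent subjects assumed unknown

def color(vs, vertex):
    # Value 0 -> black (no information), negative -> red (gap), positive -> blue.
    if vs[vertex] > 0:
        return "blue"
    if vs[vertex] < 0:
        return "red"
    return "black"
```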
4.5 Graph Based Model Benefits

As the subjects are mapped in a directed graph, the algorithm can infer information about subjects that were not asked, creating the possibility of evaluating the gaps of a broad curriculum while asking students only a few items. Another benefit is that the items do not need to be calibrated, because the most important factor is the relationship between subjects. Thus, organizations using a different subject teaching order can use the same set of items; the only adjustment needed is to configure the specific graph, that is, the graph can be set to identify gaps in the same order in which subjects are presented in a specific course.

Fig. 2. Example of a graph being painted after a hit and after a miss. The green center shows the selected vertex.

4.6 Considerations and Future Work

As final considerations, we emphasize that the research is at an early stage. The proposed test model has not yet been applied to real students. However, we already have a graph containing more than 5,000 items covering all mathematics subjects taught in years 5-9 of Brazilian elementary school. Thus, the next step in the research is applying an assessment test using the described algorithm to students enrolled in these school grades. In our preliminary study we are simulating students with different paces and performances from different grades; the algorithm is able to paint more than 80% of the graph with fewer than 30 subject selections. We are also working on reports based on the results obtained from the assessment. These reports will help teachers in decision-making about tutoring classes for addressing the gaps.

References

1. Abbott, S., Guisbond, L., Levy, J., Sommerfeld, M.: The glossary of education reform. Hidden curriculum. Retrieved (2014), http://edglossary.org/learning-gap/
2. Lei de Diretrizes e Bases da Educação Nacional (1996)
3. Linacre, J.M., et al.: Computer-adaptive testing: A methodology whose time has come. In: Chae, S., Kang, U., Jeon, E., Linacre, J.M. (eds.): Development of Computerised Middle School Achievement Tests, MESA Research Memorandum (69) (2000)
4. Pellegrino, J.W., Chudowsky, N., Glaser, R., et al.: Knowing what students know: The science and design of educational assessment. National Academies Press (2001)
5. Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira (Inep): Resultados nacionais PISA 2006 (2008)
6. Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira (Inep): Relatório nacional PISA 2012 (2012)
7. Thissen, D., Mislevy, R.J.: Testing algorithms. In: Computerized adaptive testing: A primer, 2nd edn., pp. 101–133 (2000)