The algorithm for knowledge assessment based on the Rasch model

Alexander A. Kostikov 1, Kateryna V. Vlasenko 2,3, Iryna V. Lovianova 4, Sergii V. Volkov 5 and Evgeny O. Avramov 1

1 Donbass State Engineering Academy, 72 Academichna Str., Kramatorsk, 84313, Ukraine
2 National University of "Kyiv Mohyla Academy", 2 Hryhoriya Skovorody Str., Kyiv, 04655, Ukraine
3 Technical University "Metinvest Polytechnic" LLC, 71A Sechenov Str., Mariupol, 87524, Ukraine
4 Kryvyi Rih State Pedagogical University, 54 Gagarin Ave., Kryvyi Rih, 50086, Ukraine
5 The Institute of Chemical Technologies of the East Ukrainian Volodymyr Dahl National University, 31 Volodymyrska Str., Rubizhne, 93009, Ukraine

Abstract
In this paper, an algorithm for adaptive testing of students' knowledge in distance learning is proposed and its effectiveness in the educational process is assessed. The paper provides an overview of the results of applying modern test theory, a description and block diagram of the proposed algorithm, and the results of its application in a real educational process. The effectiveness of using this algorithm for the objective assessment of students' knowledge has been shown experimentally.

Keywords
adaptive algorithm, Rasch model, Item Response Theory (IRT), information function of test item, latent variables

CoSinE 2021: 9th Illia O. Teplytskyi Workshop on Computer Simulation in Education, co-located with the 17th International Conference on ICT in Education, Research, and Industrial Applications: Integration, Harmonization, and Knowledge Transfer (ICTERI 2021), October 1, 2021, Kherson, Ukraine
Email: alexkst63@gmail.com (A. A. Kostikov); vlasenkokv@ukr.net (K. V. Vlasenko); lirihka22@gmail.com (I. V. Lovianova); sergei.volkov@ukr.net (S. V. Volkov); avramzenek@gmail.com (E. O. Avramov)
URL: http://www.dgma.donetsk.ua/index.php?option=com_content&Itemid=635&id=3686&lang=uk&layout=edit&view=article (A. A. Kostikov); http://formathematics.com/uk/tyutori/vlasenko/ (K. V. Vlasenko); https://kdpu.edu.ua/personal/ilovianova.html (I. V. Lovianova); http://formathematics.com/uk/tyutori/sergij-volkov/ (S. V. Volkov)
ORCID: 0000-0003-3503-4836 (A. A. Kostikov); 0000-0002-8920-5680 (K. V. Vlasenko); 0000-0003-3186-2837 (I. V. Lovianova); 0000-0001-7938-3080 (S. V. Volkov); 0000-0002-8405-7164 (E. O. Avramov)

1. Introduction

1.1. Problem statement

Modern approaches to assessing students' academic achievements are based on the use of classical test theory and Item Response Theory (IRT). The mathematical background of pedagogical measurement theory was created by Andersen [1, 2], Andrich [3], Avanesov [4], Birnbaum [5], Guttman [6], Linacre [7], Lord et al. [8], Maslak et al. [9], Masters [10], Rasch [11] and other scientists. IRT relies on the concept of a latent variable. The term "latent variable (parameter)" is usually understood as a theoretical concept that characterizes a hidden property or quality (for example, the level of a student's ability or the difficulty of a test item) that cannot be measured directly. The advantages of classical test theory are that it provides information about the quality of the participants' knowledge, its calculations are transparent, and the processed data are easy to interpret.
The main disadvantage is the dependence of the estimated participant parameters on the difficulty of the proposed items. Application of IRT based on Rasch models makes the calculated values of the latent parameter "ability level" θ_i of the participants independent of the values of the "item difficulty" β_i. This increases the objectivity of the obtained assessments of the students' ability levels and makes it possible to build effective algorithms for assessing knowledge.

The purpose of this paper is to develop an algorithm of adaptive testing for the objective assessment of students' knowledge in distance learning, which has become especially relevant under COVID-19 quarantine conditions.

1.2. State of the art and review

The educational standards of the new generation are based on a competency-based approach to assessing the quality of a student's training: it is not so much the student's knowledge that is tested as their readiness to apply it in practice, to act productively in a non-standard situation, and to choose the required mode of action. Therefore, the quality of training is understood as the degree of the student's readiness to demonstrate the relevant competencies. Generalizing the world experience of implementing the competence-based approach to assessing learning outcomes leads to the following conclusions, which determine the main approaches to assessing the level of competence mastery:

• competencies are dynamic, since they are not an invariable quality in the structure of a pupil's personality, but are able to develop, improve or completely disappear in the absence of an incentive to manifest them. Therefore, we can talk about the level of competence, assess it quantitatively, and monitor it.
• when assessing learning outcomes, it is necessary to consider them in dynamics, which requires diagnostics of the educational process using monitoring procedures.
• the level of possession of a competence is a hidden (latent) parameter of the pupil that is not amenable to direct measurement. It can only be estimated with a certain probability; therefore, a probabilistic approach should be used when evaluating it.

It follows that, in order to create tools for the automated assessment of learning outcomes, it is necessary, first of all, to solve two problems: 1) to develop theoretical and methodological foundations for modeling and parameterizing the learning process and the diagnostic tools used to evaluate its results; 2) to theoretically substantiate and implement software and algorithmic means for processing the results of participants' diagnostics (testing, questionnaires), as well as tools for assessing learning outcomes and the quality of the diagnostic tools.

The theoretical and methodological basis for solving these problems was provided, first of all, by the studies of Brown [12], Cronbach [13], Guilford [14], Gulliksen [15], Guttman [6], Kuder and Richardson [16], Luce and Tukey [17], Lord et al. [8], Sax [18] and Spearman [19]. They developed the theoretical foundations for creating diagnostic materials and the classical approach to processing, analyzing and interpreting diagnostic results: the conceptual apparatus of classical test theory, criteria and indicators of the quality of diagnostic tools, and the methodological basics of their design and quality expertise. The issues of scaling and comparing the processed data have also been deeply investigated.
The theoretical basis for the creation of tools for automatic assessment of the results of the educational process received further development with the creation of Item Response Theory (IRT), the foundations of which are set out by Andrich [3, 20], Bezruczko [21], Bond and Fox [22], Bond et al. [23], Eckes [24], Fischer and Molenaar [25], Andrich et al. [26], Ingebo [27], Kim and Baker [28], Lazarsfeld [29], van der Linden and Hambleton [30], Lord [31], Luce and Tukey [17], Perline et al. [32], Smith and Smith [33], Rasch [11], Wilson [34], Wright [35], Wright and Masters [36], Wright and Stone [37], Wright and Linacre [38].

2. Algorithm of adaptive testing based on the Rasch model

Adaptive testing is a type of testing in which the order of presentation of test items and the difficulty of the next item depend on the participant's answers to the previous items. Adaptive testing systems are based on statistical models. Very easy and very difficult items are inherently uninformative; therefore, for most tests the optimal difficulty level is that of an item to which about half of the test participants give the correct answer. The difficulty of a test item is determined experimentally: the measurement consists of determining the percentage of participants who were able to give the correct answer to the item in previous experiments. The problem of developing adaptive algorithms has been considered by Al-A'ali [39] and Weiss [40, 41].

The Rasch model was used to construct the adaptive testing algorithm. This model is defined by the formula

$$P_{ni} = \frac{\exp(\theta_n - \beta_i)}{1 + \exp(\theta_n - \beta_i)}, \qquad (1)$$

where P_ni is the probability that participant n, n = 1, ..., N, with ability θ_n correctly performs item i, i = 1, ..., I, with difficulty β_i.

To start the algorithm it is necessary to determine the initial levels of difficulty. To this end, at the beginning of the testing session primary information about the participant's level of preparation is accumulated: the participant receives N_p items of average difficulty. The items used to determine the initial level of the participant are chosen by the teacher. Then, using the received answers, the initial estimate of the student's ability level is calculated, and the current values of the difficulty levels of the test items are recalculated. The initial assessment of the ability level of the i-th student (in logits) is based on the formula

$$\theta_i^0 = \ln\left(\frac{p_i}{q_i}\right), \quad i = 1, 2, \dots, N, \qquad (2)$$

where N is the number of test participants, p_i is the proportion of correct answers of the i-th participant to all items, and q_i is the proportion of incorrect answers (q_i = 1 − p_i). The difficulty level of the test items in logits is determined by the formula

$$\beta_j^0 = \ln\left(\frac{q_j}{p_j}\right), \quad j = 1, 2, \dots, M, \qquad (3)$$

where M is the number of test items, p_j is the proportion of correct answers of all participants to the j-th test item, and q_j is the proportion of incorrect answers.
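As a minimal illustration, the following Python sketch (the helper names are ours, not from the original paper) computes the Rasch probability (1) and the initial logit estimates (2) and (3) from a binary response matrix, assuming rows correspond to participants and columns to items.

```python
import numpy as np

def rasch_probability(theta, beta):
    """Formula (1): probability that a participant with ability theta
    answers an item of difficulty beta correctly."""
    return np.exp(theta - beta) / (1.0 + np.exp(theta - beta))

def initial_logits(responses):
    """Initial estimates (2) and (3) from a binary N x M response matrix
    (1 = correct answer). Proportions equal to 0 or 1 produce infinite
    logits, which correspond to the perfect scores discussed below."""
    responses = np.asarray(responses, dtype=float)
    p_i = responses.mean(axis=1)            # proportion of correct answers per participant
    p_j = responses.mean(axis=0)            # proportion of correct answers per item
    with np.errstate(divide="ignore"):
        theta0 = np.log(p_i / (1.0 - p_i))  # formula (2)
        beta0 = np.log((1.0 - p_j) / p_j)   # formula (3)
    return theta0, beta0
```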
At the next stage, the initial values (in logits) of the participants' ability levels θ_i^0 and of the item difficulty levels β_j^0 are reduced to a common interval scale [8]. The formula for this transition is based on the idea of reducing the impact of item difficulty on the assessments of the test participants. Pre-calculating the average value of the initial logits of the students' ability levels

$$\bar{\theta} = \frac{\sum_{i=1}^{N} \theta_i^0}{N}$$

and the standard deviation V of the distribution of the initial values of the parameter θ,

$$V^2 = \frac{\sum_{i=1}^{N} \left(\theta_i^0 - \bar{\theta}\right)^2}{N - 1},$$

we obtain a formula for calculating the difficulty level logit of the j-th item:

$$\beta_j = \bar{\theta} + Y \cdot \beta_j^0, \quad j = \overline{1, M}, \qquad (4)$$

where

$$Y = \sqrt{1 + \frac{V^2}{2.89}}.$$

Similarly, calculating

$$\bar{\beta} = \frac{\sum_{j=1}^{M} \beta_j^0}{M}, \qquad W = \sqrt{\frac{\sum_{j=1}^{M} \left(\beta_j^0 - \bar{\beta}\right)^2}{M - 1}},$$

we get the formula for calculating the ability level logit of the i-th student:

$$\theta_i = \bar{\beta} + X \cdot \theta_i^0, \quad i = \overline{1, N}, \qquad (5)$$

where X = (1 + W²/2.89)^{1/2}.

The obtained values make it possible to compare the students' ability levels with the difficulty levels of the test items. If θ_i − β_j is negative and large in absolute value, then an item of difficulty β_j is too difficult for a student with ability level θ_i, and it will not be useful for measuring the knowledge level of the i-th student. If this difference is positive and large in absolute value, then the item is too easy: the student has long since mastered it. If θ_i = β_j, then the probability that the student completes the item correctly is equal to 0.5.

The information function I_i(θ) of the i-th item for the Rasch model (1) is defined as the product of the probability of the correct answer P_i(θ) to this item and the probability of the incorrect answer Q_i(θ) [8]:

$$I_i(\theta) = P_i(\theta) \cdot Q_i(\theta). \qquad (6)$$

Figure 1 shows the information function of the i-th item.

Figure 1: Information function of the test item.

Figure 1 shows that a test item whose answer all students know provides no information, and neither does an item whose answer no one knows. Useful information is obtained when some participants know the answer to the item and some do not. The information function of the test is calculated as the sum of the information functions of the test items [8]:

$$I(\theta) = D^2 \cdot \sum_{j=1}^{M} I_j(\theta), \qquad (7)$$

where D is the correction factor (D = 1.7) needed to bring the logistic distribution close to the normal distribution.

After calculating the information function, the measurement error SE is calculated; its value is used to check the stopping condition of the testing procedure. In the Rasch model, the measurement error depends on the ability level θ and is calculated by the formula [8]:

$$SE(\theta) = \frac{1}{\sqrt{I(\theta)}}. \qquad (8)$$

If the error takes a value less than the threshold set by the teacher, the adaptive testing algorithm ends. Otherwise, the next test item is selected. To select the next item, the value of θ_i calculated by formula (5) is used: the next item is the one whose difficulty level is closest to the current estimate of the participant's ability level. This item makes the largest information contribution, and its choice reduces the total number of required test items.
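The scaling, information function, measurement error and item selection steps can be sketched in code as follows. This is a hypothetical Python implementation of formulas (4)-(8) exactly as stated above (not the authors' software); infinite initial logits are excluded from the means and variances, as in the worked example in section 3.

```python
import numpy as np

D = 1.7  # correction factor in formula (7)

def scale_to_common_interval(theta0, beta0):
    """Transformations (4) and (5): bring initial ability and difficulty
    logits to a common interval scale."""
    t = theta0[np.isfinite(theta0)]
    b = beta0[np.isfinite(beta0)]
    V2, W2 = t.var(ddof=1), b.var(ddof=1)
    Y = np.sqrt(1.0 + V2 / 2.89)
    X = np.sqrt(1.0 + W2 / 2.89)
    beta = t.mean() + Y * beta0    # formula (4)
    theta = b.mean() + X * theta0  # formula (5)
    return theta, beta

def item_information(theta, beta):
    """Formula (6): information of one item for the Rasch model."""
    p = np.exp(theta - beta) / (1.0 + np.exp(theta - beta))
    return p * (1.0 - p)

def test_information(theta, betas):
    """Formula (7): information of the administered items at ability theta."""
    return D**2 * sum(item_information(theta, b) for b in betas)

def measurement_error(theta, betas):
    """Formula (8): standard error of the ability estimate."""
    return 1.0 / np.sqrt(test_information(theta, betas))

def select_next_item(theta, betas, administered):
    """Choose the not-yet-administered item whose difficulty is closest
    to the current ability estimate, i.e. |theta - beta_j| = min."""
    candidates = [j for j in range(len(betas)) if j not in administered]
    return min(candidates, key=lambda j: abs(theta - betas[j]))
```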
Thus, the developed adaptive testing algorithm consists of the following stages:

1. Selection of 5 items of average difficulty from the bank of questions; these items are determined by the teacher.
2. Finding the initial level of the student's ability θ_i^0 and the initial difficulty levels of the items β_j^0 by formulas (2) and (3).
3. Reduction of the obtained initial values θ_i^0 and β_j^0 to a single interval scale using formulas (5) and (4).
4. Calculation of the information function of the test items the student has answered, by formulas (6) and (7).
5. Finding the measurement error by formula (8).
6. If the measurement error is less than the threshold, the adaptive testing is completed.
7. If not, the next item is selected from the condition |θ_i − β_j| = min.
8. The algorithm is then repeated starting from stage 3.

The block diagram of the algorithm is shown in figure 2.

Figure 2: Block diagram of the adaptive testing algorithm.

3. Results

Let us consider the procedure for calculating the student ability level parameters θ_i and the item difficulty parameters β_i from empirical data. As initial data we take the results of testing students in the Moodle system in the discipline "Higher Mathematics" at the Mathematics and Modeling Department of the Donbass State Engineering Academy (table 1). Table 1 shows the records of the first 10 test participants; a total of 50 participants took part in the testing.

Table 1
Test results in the Moodle system in the discipline "Higher Mathematics" of the Mathematics and Modeling Department of the Donbass State Engineering Academy

| Participant's number | Score | Number of correct answers | p_i  | q_i  | θ_i^0    |
|----------------------|-------|---------------------------|------|------|----------|
| 1                    | 90    | 18                        | 0.9  | 0.1  | 2.197225 |
| 2                    | 75    | 15                        | 0.75 | 0.25 | 1.098612 |
| 3                    | 85    | 17                        | 0.85 | 0.15 | 1.734601 |
| 4                    | 100   | 20                        | 1    | 0    | ∞        |
| 5                    | 75    | 15                        | 0.75 | 0.25 | 1.098612 |
| 6                    | 100   | 20                        | 1    | 0    | ∞        |
| 7                    | 90    | 18                        | 0.9  | 0.1  | 2.197225 |
| 8                    | 90    | 18                        | 0.9  | 0.1  | 2.197225 |
| 9                    | 70    | 14                        | 0.7  | 0.3  | 0.847298 |
| 10                   | 85    | 17                        | 0.85 | 0.15 | 1.734601 |

The test in this discipline consisted of 20 questions. First, it is necessary to calculate the proportions of correct (p_i) and incorrect (q_i) answers of the participants. These values are calculated by the formulas

$$p_i = \frac{R_i}{n}, \qquad q_i = 1 - p_i, \qquad (9)$$

where R_i is the number of correct answers of the i-th participant, i = 1, 2, ..., N, and n is the number of items in the test. For example, for the first participant we have

$$p_1 = \frac{18}{20} = 0.9, \qquad q_1 = 1 - 0.9 = 0.1.$$

The values p_i and q_i are given in table 1. Next, the initial values θ_i^0 of the participants' ability levels are calculated by formula (2). For the first participant we have

$$\theta_1^0 = \ln\frac{0.9}{0.1} = 2.197.$$

Using the statistical module of Moodle, the following characteristics were obtained for the test items: facility index (F), standard deviation (SD), random guess score (RGS), intended weight, effective weight, discrimination index, and discriminative efficiency. These data are shown in table 2.

Table 2
Statistical characteristics obtained using the statistical module of the Moodle system based on the results of final testing in the discipline "Higher Mathematics"

| Q# | F       | SD     | RGS    | Intended weight | Effective weight | Discrimination index | Discriminative efficiency |
|----|---------|--------|--------|-----------------|------------------|----------------------|---------------------------|
| 1  | 98.00%  | 14.14% | 33.33% | 5.00%           | –                | -11.54%              | -28.62%                   |
| 2  | 94.00%  | 23.99% | 33.33% | 5.00%           | 3.41%            | 6.93%                | 11.28%                    |
| 3  | 90.00%  | 30.30% | 16.67% | 5.00%           | 6.75%            | 44.07%               | 65.85%                    |
| 4  | 94.00%  | 23.99% | 20.00% | 5.00%           | 4.66%            | 22.91%               | 39.11%                    |
| 5  | 96.00%  | 19.79% | 20.00% | 5.00%           | 3.34%            | 11.72%               | 23.66%                    |
| 6  | 90.00%  | 30.30% | 14.29% | 5.00%           | 3.18%            | -1.53%               | -2.22%                    |
| 7  | 92.00%  | 27.40% | 14.29% | 5.00%           | 6.32%            | 43.38%               | 70.79%                    |
| 8  | 84.00%  | 37.03% | 20.00% | 5.00%           | 6.48%            | 26.08%               | 35.44%                    |
| 9  | 88.00%  | 32.83% | 20.00% | 5.00%           | 5.32%            | 17.26%               | 23.76%                    |
| 10 | 74.00%  | 44.31% | 20.00% | 5.00%           | 9.75%            | 68.31%               | 84.84%                    |
| 11 | 98.00%  | 14.14% | 20.00% | 5.00%           | 2.85%            | 14.64%               | 35.69%                    |
| 12 | 100.00% | 0.00%  | 16.67% | 5.00%           | 0.00%            | –                    | –                         |
| 13 | 94.00%  | 23.99% | 33.33% | 5.00%           | 4.93%            | 27.00%               | 45.87%                    |
| 14 | 90.00%  | 30.30% | 33.33% | 5.00%           | 5.51%            | 23.81%               | 34.88%                    |
| 15 | 88.00%  | 32.83% | 25.00% | 5.00%           | 5.32%            | 17.26%               | 23.76%                    |
| 16 | 90.00%  | 30.30% | 33.33% | 5.00%           | 5.51%            | 23.81%               | 33.33%                    |
| 17 | 42.00%  | 49.86% | 20.00% | 5.00%           | 5.23%            | -2.60%               | -3.57%                    |
| 18 | 80.00%  | 40.41% | 33.33% | 5.00%           | 8.11%            | 45.46%               | 56.25%                    |
| 19 | 56.00%  | 50.14% | 20.00% | 5.00%           | 7.01%            | 13.80%               | 17.23%                    |
| 20 | 82.00%  | 38.81% | 20.00% | 5.00%           | 6.32%            | 21.10%               | 27.68%                    |
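As an illustration (a hypothetical sketch, not the authors' code), the initial logits can be reproduced directly from the Moodle records: the participants' scores from table 1 give θ_i^0 by formulas (9) and (2), and the facility indices from table 2 give the initial item difficulties by formula (3). Perfect scores and a 100% facility index yield infinite logits, which are excluded later when the parameters are scaled.

```python
import numpy as np

n_items = 20  # the "Higher Mathematics" test consisted of 20 questions

# Number of correct answers of the first 10 participants (table 1)
correct = np.array([18, 15, 17, 20, 15, 20, 18, 18, 14, 17])

p_i = correct / n_items               # formula (9)
with np.errstate(divide="ignore"):
    theta0 = np.log(p_i / (1 - p_i))  # formula (2); participants 4 and 6 give +inf

# Facility indices F of the 20 items (table 2), as proportions
facility = np.array([0.98, 0.94, 0.90, 0.94, 0.96, 0.90, 0.92, 0.84, 0.88, 0.74,
                     0.98, 1.00, 0.94, 0.90, 0.88, 0.90, 0.42, 0.80, 0.56, 0.82])
with np.errstate(divide="ignore"):
    beta0 = np.log((1 - facility) / facility)  # formula (3); item 12 gives -inf

print(np.round(theta0, 6))  # 2.197225, 1.098612, 1.734601, inf, ...
print(np.round(beta0, 5))   # -3.89182, -2.75154, -2.19722, ...
```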
Based on the data in table 2, we can estimate the initial values of the item difficulty parameter. By formula (3) for the first item we obtain

$$\beta_1^0 = \ln\frac{0.02}{0.98} = -3.891.$$

The results of the calculation of the initial values of the item difficulty parameter are given in table 3.

Table 3
Initial values β_i^0 of the item difficulty parameter

| Q# | Progress | p_i  | q_i  | β_i^0    |
|----|----------|------|------|----------|
| 1  | 98.00%   | 0.98 | 0.02 | -3.89182 |
| 2  | 94.00%   | 0.94 | 0.06 | -2.75154 |
| 3  | 90.00%   | 0.90 | 0.10 | -2.19722 |
| 4  | 94.00%   | 0.94 | 0.06 | -2.75154 |
| 5  | 96.00%   | 0.96 | 0.04 | -3.17805 |
| 6  | 90.00%   | 0.90 | 0.10 | -2.19722 |
| 7  | 92.00%   | 0.92 | 0.08 | -2.44235 |
| 8  | 84.00%   | 0.84 | 0.16 | -1.65823 |
| 9  | 88.00%   | 0.88 | 0.12 | -1.99243 |
| 10 | 74.00%   | 0.74 | 0.26 | -1.04597 |
| 11 | 98.00%   | 0.98 | 0.02 | -3.89182 |
| 12 | 100.00%  | 1.00 | 0.00 | -∞       |
| 13 | 94.00%   | 0.94 | 0.06 | -2.75154 |
| 14 | 90.00%   | 0.90 | 0.10 | -2.19722 |
| 15 | 88.00%   | 0.88 | 0.12 | -1.99243 |
| 16 | 90.00%   | 0.90 | 0.10 | -2.19722 |
| 17 | 42.00%   | 0.42 | 0.58 | 0.322773 |
| 18 | 80.00%   | 0.80 | 0.20 | -1.38629 |
| 19 | 56.00%   | 0.56 | 0.44 | -0.24116 |
| 20 | 82.00%   | 0.82 | 0.18 | -1.51635 |

As can be seen from table 3, all participants in the quiz answered the 12th item correctly, so its estimate is equal to minus infinity. In practice, for β < −6 the probability P_i(β) is close to one: such items are completed by all participants and become redundant. Items with β > 6 are also useless: no participant will complete them, and they carry no information about differences in the students' ability levels.

In tables 1 and 3, the parameter values θ_i^0 and β_i^0 are on different interval scales. In order to reduce them to a single scale of standard estimates, it is necessary to calculate the variances V² and W² using the data from tables 1 and 3; infinite values are excluded from consideration. Calculating the variances, we obtain

$$V^2 = \frac{\sum_{i=1}^{N}\left(\theta_i^0 - \bar{\theta}\right)^2}{N-1} = 0.634, \qquad W^2 = \frac{\sum_{j=1}^{M}\left(\beta_j^0 - \bar{\beta}\right)^2}{M-1} = 4.873.$$

Next, we calculate the angular coefficients [8]:

$$Y = \sqrt{1 + \frac{V^2}{2.89}} = 1.104, \qquad X = \sqrt{1 + \frac{W^2}{2.89}} = 1.63.$$

Then, using the formulas

$$\theta_i = -2.103 + 1.104\,\theta_i^0, \qquad \beta_i = 1.86 + 1.63\,\beta_i^0,$$

we calculate the scaled values β_i and θ_i. The scaled parameter values are given in tables 4 and 5.

Table 4
Scaled values of the item difficulty parameter β_i

| Q# | β_i^0    | β_i      |
|----|----------|----------|
| 1  | -3.89182 | -4.48367 |
| 2  | -2.75154 | -2.625   |
| 3  | -2.19722 | -1.72148 |
| 4  | -2.75154 | -2.625   |
| 5  | -3.17805 | -3.32023 |
| 6  | -2.19722 | -1.72148 |
| 7  | -2.44235 | -2.12103 |
| 8  | -1.65823 | -0.84291 |
| 9  | -1.99243 | -1.38766 |
| 10 | -1.04597 | 0.155071 |
| 11 | -3.89182 | -4.48367 |
| 13 | -2.75154 | -2.625   |
| 14 | -2.19722 | -1.72148 |
| 15 | -1.99243 | -1.38766 |
| 16 | -2.19722 | -1.72148 |
| 17 | 0.322773 | 2.386121 |
| 18 | -1.38629 | -0.39966 |
| 19 | -0.24116 | 1.466906 |
| 20 | -1.51635 | -0.61165 |

Table 5
Scaled values of the ability level θ_i

| Participant's number | θ_i^0    | θ_i      |
|----------------------|----------|----------|
| 1                    | 2.197225 | 0.322736 |
| 2                    | 1.098612 | -0.89013 |
| 3                    | 1.734601 | -0.188   |
| 5                    | 1.098612 | -0.89013 |
| 7                    | 2.197225 | 0.322736 |
| 8                    | 2.197225 | 0.322736 |
| 9                    | 0.847298 | -1.16758 |
| 10                   | 1.734601 | -0.188   |

The sum of the scaled difficulty levels of the test items is −27.93. This means that the test items are very easy: the test is not balanced and contains many easy items. One should strive to make this sum close to zero. Thus, the assessment of latent parameters makes it possible to identify non-informative items that should be excluded from the quiz. The use of the developed adaptive algorithm will make it possible to assess the level of students' knowledge objectively.

The graph of the information function of the test items and of the test as a whole, defined by formulas (6) and (7), is shown in figure 3.
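For reference, the curves in figures 3 and 4 can be reproduced from the scaled difficulties in table 4. The sketch below (hypothetical code, not taken from the paper) evaluates the test information function (7) and the measurement error (8) over a grid of ability levels.

```python
import numpy as np

D = 1.7  # correction factor from formula (7)

# Scaled item difficulties from table 4 (item 12 excluded)
betas = np.array([-4.48367, -2.625, -1.72148, -2.625, -3.32023, -1.72148,
                  -2.12103, -0.84291, -1.38766, 0.155071, -4.48367, -2.625,
                  -1.72148, -1.38766, -1.72148, 2.386121, -0.39966,
                  1.466906, -0.61165])

theta_grid = np.linspace(-8, 8, 161)                      # ability levels, as in figure 3
p = 1.0 / (1.0 + np.exp(-(theta_grid[:, None] - betas)))  # Rasch probabilities (1)
item_info = p * (1.0 - p)                                 # item information functions (6)
test_info = D**2 * item_info.sum(axis=1)                  # test information function (7)
se = 1.0 / np.sqrt(test_info)                             # measurement error (8)

print("information maximum at theta =", theta_grid[test_info.argmax()])
print("SE at theta = 3:", round(float(se[np.argmin(np.abs(theta_grid - 3.0))]), 3))
```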
Figure 3: Information functions of the test and of the test items (information versus ability level).

Figure 3 shows that the information function has one clearly expressed maximum. This is a sign of a "good" test. However, it can be seen that the test contains many easy items with difficulties in the interval (−3; −2), which can be excluded from the test. The test also contains many easy items with identical difficulties, which can likewise be excluded without reducing its information content. At the same time, there are clearly too few more difficult items (with difficulties of 1-2 logits), so more complex items need to be added.

The graph of the measurement error as a function of the ability level is shown in figure 4.

Figure 4: Measurement error graph (measurement error versus ability level).

It can be seen from the graph that the measurement error is large for ability values in the interval (2, 4), which is associated with the lack of test items of increased difficulty.

4. Discussion

The purpose of this paper was to automate the process of testing students' knowledge, which is especially relevant for distance learning. To achieve this goal, an adaptive testing algorithm based on the Rasch model was proposed, and the process of assessing students' knowledge with this algorithm was modeled. The results of testing students' knowledge in the course "Higher Mathematics", obtained in the Moodle system, were taken as the initial values of the item difficulties and the students' ability levels.

As a result of the modeling, the students' ability levels were recalculated, the information functions of the test items and of the entire test were built, and the standard measurement error was calculated as a function of the student's ability level. The analysis of the obtained results allows us to conclude that the test is not balanced and contains too many easy items; in this case, these are the items with numbers 1, 3 and 11. Removing them from the test will reduce the number of test items and speed up the process of determining the student's level of training. A change in the assessment of the student's ability level as a result of testing indicates the need to introduce an adaptive testing system into the educational process, which will improve the quality of the assessment of students' knowledge.

These conclusions are confirmed by the works of other authors. Thus, Al-A'ali [39] showed that the use of adaptive testing based on IRT made it possible to reduce the number of test items and increase the reliability of determining the level of student readiness. The effectiveness of using adaptive testing to improve the quality of pedagogical measurements is also evidenced by Weiss [40, 41].

5. Conclusions

As a result of this work, the following results were obtained:

1. An algorithm of adaptive knowledge assessment based on IRT approaches was proposed. The algorithm consists of an initial assessment of the difficulty levels of the test items and the students' abilities, scaling of these parameters, selection of the next question by minimizing the absolute value of their difference, and estimation of the measurement error of the knowledge level via the information function of the proposed question.

2. The test parameters were evaluated on the basis of IRT, which identified non-informative test questions that should be excluded from the set of test items.
3. The results of the study showed the effectiveness of using IRT to assess knowledge.

References

[1] E. B. Andersen, The asymptotic distribution of conditional likelihood ratio tests, Journal of the American Statistical Association 66 (1971) 630–633. doi:10.1080/01621459.1971.10482321.
[2] E. B. Andersen, A goodness of fit test for the Rasch model, Psychometrika 38 (1973) 123–140. doi:10.1007/BF02291180.
[3] D. Andrich, Rasch Models for Measurement, Thousand Oaks, 2021. URL: https://methods.sagepub.com/book/rasch-models-for-measurement. doi:10.4135/9781412985598.
[4] V. S. Avanesov, The problem of psychological tests, Soviet Education 22 (1980) 6–23. doi:10.2753/RES1060-939322066.
[5] A. Birnbaum, Combining independent tests of significance, Journal of the American Statistical Association 49 (1954) 559–574. doi:10.1080/01621459.1954.10483521.
[6] L. Guttman, A basis for scaling qualitative data, American Sociological Review 9 (1944) 139–150. doi:10.2307/2086306.
[7] J. M. Linacre, Predicting responses from Rasch measures, Journal of Applied Measurement 11 (2010) 1–10.
[8] F. M. Lord, M. R. Novick, A. Birnbaum, Statistical theories of mental test scores, Addison-Wesley, Oxford, 1968.
[9] A. A. Maslak, G. Karabatsos, T. S. Anisimova, S. A. Osipov, Measuring and comparing higher education quality between countries worldwide, Journal of Applied Measurement 6 (2005) 432–442.
[10] G. N. Masters, Educational measurement: Prospects for research and innovation, The Australian Educational Researcher 15 (1988) 23–34. doi:10.1007/BF03219425.
[11] G. Rasch, Studies in mathematical psychology: I. Probabilistic models for some intelligence and attainment tests, Nielsen & Lydiche, 1960.
[12] W. Brown, Some experimental results in the correlation of mental abilities, British Journal of Psychology, 1904-1920 3 (1910) 296–322. doi:10.1111/j.2044-8295.1910.tb00207.x.
[13] L. J. Cronbach, Coefficient alpha and the internal structure of tests, Psychometrika 16 (1951) 297–334. doi:10.1007/BF02310555.
[14] J. P. Guilford, Fundamental statistics in psychology and education, McGraw-Hill, New York, 1942.
[15] H. Gulliksen, Perspective on Educational Measurement, Applied Psychological Measurement 10 (1986) 109–132. doi:10.1177/014662168601000201.
[16] G. F. Kuder, M. W. Richardson, The theory of the estimation of test reliability, Psychometrika 2 (1937) 151–160. doi:10.1007/BF02288391.
[17] R. D. Luce, J. W. Tukey, Simultaneous conjoint measurement: A new type of fundamental measurement, Journal of Mathematical Psychology 1 (1964) 1–27. doi:10.1016/0022-2496(64)90015-X.
[18] G. Sax, Principles of educational and psychological measurement and evaluation, 3rd ed., Wadsworth Pub. Co., Belmont, 1989.
[19] C. Spearman, Correlation calculated from faulty data, British Journal of Psychology, 1904-1920 3 (1910) 271–295. doi:10.1111/j.2044-8295.1910.tb00206.x.
[20] D. Andrich, The Rasch model explained, in: Applied Rasch measurement: A book of exemplars, Springer, 2005, pp. 27–59.
[21] N. Bezruczko (Ed.), Rasch measurement in health sciences, Jam Press, Maple Grove, MN, 2005.
[22] T. Bond, C. Fox, Applying the Rasch model: Fundamental measurement in the human sciences, second ed., 2007. doi:10.4324/9781410614575.
[23] T. Bond, Z. Yan, M. Heene, Applying the Rasch model: Fundamental measurement in the human sciences, fourth ed., Routledge, 2020.
[24] T. Eckes, Introduction to Many-Facet Rasch Measurement, Peter Lang, Bern, Switzerland, 2011. URL: https://www.peterlang.com/view/title/13347.
[25] G. H. Fischer, I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications, Springer Science & Business Media, 1995. doi:10.1007/978-1-4612-4230-7.
[26] D. Andrich, B. Sheridan, G. Luo, RUMM2010: Rasch unidimensional measurement models, 2001. URL: http://www.rummlab.com.au/.
[27] G. S. Ingebo, Probability in the Measure of Achievement, Mesa Press, 1997.
[28] S.-H. Kim, F. B. Baker, birtr: A Package for "The Basics of Item Response Theory Using R", Applied Psychological Measurement 42 (2018) 403–404. doi:10.1177/0146621617748327.
[29] P. F. Lazarsfeld, Regression analysis with dichotomous attributes, Social Science Research 1 (1972) 25–34. doi:10.1016/0049-089X(72)90056-7.
[30] W. J. van der Linden, R. K. Hambleton (Eds.), Handbook of Modern Item Response Theory, Springer Science & Business Media, 1997. doi:10.1007/978-1-4757-2691-6.
[31] F. M. Lord, Applications of Item Response Theory To Practical Testing Problems, Routledge, 1980. doi:10.4324/9780203056615.
[32] R. Perline, B. D. Wright, H. Wainer, The Rasch model as additive conjoint measurement, Applied Psychological Measurement 3 (1979) 237–255. doi:10.1177/014662167900300213.
[33] E. V. Smith, R. M. Smith (Eds.), Introduction to Rasch measurement: Theory, models and applications, JAM Press, 2004.
[34] M. Wilson, Constructing Measures: An Item Response Modeling Approach, Routledge, 2005.
[35] B. D. Wright, Solving Measurement Problems with the Rasch Model, Journal of Educational Measurement 14 (1977) 97–116. URL: http://www.jstor.org/stable/1434010.
[36] B. D. Wright, G. N. Masters, Rating scale analysis, Mesa Press, Chicago, 1982.
[37] B. D. Wright, M. H. Stone, Best test design, Mesa Press, Chicago, 1979. URL: https://www.rasch.org/BTD_RSA/pdf%20[reduced%20size]/Best%20Test%20Design.pdf.
[38] B. D. Wright, J. M. Linacre, Dichotomous Rasch model derived from specific objectivity, Rasch Measurement Transactions 1 (1987) 5–6. URL: https://www.rasch.org/rmt/rmt11a.htm.
[39] M. Al-A'ali, IRT-Item Response Theory Assessment for an Adaptive Teaching Assessment System, in: Proceedings of the 10th WSEAS International Conference on APPLIED MATHEMATICS, MATH'06, World Scientific and Engineering Academy and Society (WSEAS), Stevens Point, Wisconsin, USA, 2006, pp. 518–522.
[40] D. J. Weiss, Improving measurement quality and efficiency with adaptive testing, Applied Psychological Measurement 6 (1982) 473–492. doi:10.1177/014662168200600408.
[41] D. J. Weiss, Computerized adaptive testing for effective and efficient measurement in counseling and education, Measurement and Evaluation in Counseling and Development 37 (2004) 70–84. doi:10.1080/07481756.2004.11909751.