Pedagogical Diagnostics with Use of Computer Technologies

Lyudmyla Bilousova, Oleksandr Kolgatin and Larisa Kolgatina

Kharkiv National Pedagogical University named after G.S. Skovoroda, Kharkiv, Ukraine
lib215@list.ru, kolgatin@ukr.net, larakl@ukr.net

Abstract. A technology of automated pedagogical diagnostics is analysed. A testing strategy oriented to the purposes of pedagogical diagnostics and a grading algorithm that corresponds to the Ukrainian school grading standards are suggested. The "Expert 3.05" software for automated pedagogical testing is designed. Methods of administering the database of test items are proposed. Some tests on mathematical topics have been prepared with "Expert 3.05", and their approbation in the educational process of Kharkiv National Pedagogical University named after G.S. Skovoroda is analysed.

Keywords. E-learning, Diagnostics, Test

Key terms. InformationCommunicationTechnology, Teaching Process

1 Introduction

Pedagogical diagnostics is an integral part of adaptive e-learning courses. An unquestionable merit of testing is its high informative capacity; however, in practice a large part of the test information is often left unused. Computer technologies give us the possibility to organise qualitative pedagogical diagnostics at a new level. Modern automated systems, which can be qualified as expert systems, are capable of supplying comprehensive algorithms of testing and of analysis of the test results. Testing with the use of computers allows a teacher to obtain summary characteristics of the knowledge and skills of a group of pupils and to use this information to choose the teaching methods. The study of such algorithms is a wide field of scientific work. Therefore, the aim of our paper is to design methods of pedagogical diagnostics which satisfy the following demands:

• different forms of the intellectual activities of an examinee are engaged in the process of testing;
• the automated system of pedagogical diagnostics retains its diagnostic abilities over wide differences in the examinees' mastering;
• processing of the test results provides maximum information for an examinee and a teacher to correct the educational process.

2 Objectives

The first stage of organising pedagogical diagnostics is the construction of an idealised pedagogical model, that is, the allocation of the basic elements of knowledge and skills, as well as of the levels of their mastering. The second stage is the creation of a system of problems which covers all elements of knowledge and skills and all levels of their mastering.

We cannot design the test as a system of test items of equal difficulty, in spite of the recommendations of the classic test theory. Such an approach gives the best tests for discrimination of examinees into several groups. However, a test with equal items has low validity for examinees with bad mastering because of the guessing of answers; its validity is also low for examinees with good mastering because of lapses of attention. Therefore, it is certainly necessary to include problems of different difficulty in the test.

How to design a test item of advanced difficulty? What is difficulty? Why do most examinees fail to solve some problems? We cannot use problems of the reproductive level as items of advanced difficulty: there are no difficult facts and easy facts. Our educational process should be organised to provide steady knowledge of all compulsory facts.
If most examinees do not know some compulsory facts, it means that we should correct our teaching. We are against using items which correspond to facts that are studied only fragmentarily and which are not basic for the tested topic. Therefore, all problems corresponding to reproductive knowledge must have equal difficulty.

One can increase the difficulty of an item by combining several operations in it. Such an approach increases the influence of lapses of attention on the test results, makes weight coefficients necessary and decreases the measuring accuracy of the test. In our opinion, an item of advanced difficulty should instead be connected with the use of more difficult, non-reproductive kinds of intellectual activities [1], [2].

Full and high-quality pedagogical diagnostics should be built on a system of test items of all levels, both reproductive and productive. By analogy with the levels of educational achievements [2], which are standardised by the Ukrainian Ministry of Education and Science [3], we propose the following levels of test items:

1. Initial level: very simple problems which assume a reproductive character of the student's activities, mainly recognition. The difficulty index of these items is about 1; most of the examined students execute them correctly.
2. Average level: problems which assume reproductive activities; these problems cover all basic facts and elementary skills according to the curriculum. A database of items of this level is designed the most naturally. According to the Ukrainian standards [3], a student can continue education if he or she knows not less than 50% of the compulsory facts determined by the curriculum. Therefore, by a linear estimation, the average difficulty index of the reproductive items must be near 75%.
3. Sufficient level: items which assume that the examinee applies his or her knowledge and skills to solve problems in a standard situation.
4. High level: practical problems which assume the execution of a new algorithm, carrying knowledge into a new, non-standard situation, etc. These items can lose their creative nature if the method of solving them has been explained in the process of learning; therefore, the database of items of level 4 requires continuous analysis and modernisation.

We propose vector processing of the test results, that is, separate calculation of the score for the items of every level. It allows us to avoid artificial weight coefficients and to provide a comprehensive algorithm for an adaptive strategy of testing and grading. We also propose separate processing of the results for the test items according to the elements of knowledge and skills (a minimal sketch of such processing is given below).

Using a computer for test administration allows the system to analyse the examinee's results directly in the process of testing and to suggest to the examinee the items which best correspond to his or her level of educational achievements. Such an approach is often called adaptive or quasi-adaptive testing [4].
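To make the proposed vector processing concrete, here is a minimal sketch in Python. The record format, tuples of (level, element, correct), is an assumption made for illustration only; it is not the storage format of "Expert 3.05".

```python
from collections import defaultdict

def vector_scores(answers):
    """Separate scores for every item level and for every element of
    knowledge; answers is an iterable of (level, element, correct)."""
    by_level = defaultdict(lambda: [0, 0])    # level -> [correct, total]
    by_element = defaultdict(lambda: [0, 0])  # element -> [correct, total]
    for level, element, correct in answers:
        by_level[level][0] += int(correct)
        by_level[level][1] += 1
        by_element[element][0] += int(correct)
        by_element[element][1] += 1
    ratio = lambda ct: ct[0] / ct[1]
    # No artificial weight coefficients: S1..S4 and the per-element
    # scores are kept as separate components of one vector.
    return ({lv: ratio(ct) for lv, ct in by_level.items()},
            {el: ratio(ct) for el, ct in by_element.items()})
```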
3 Model and Algorithm

The choice of the item level from which to start testing is an important question of the adaptive strategy. Testing usually starts from the simplest items. Such an approach decreases psychological discomfort and creates an atmosphere of competition, a feeling of growth as the problems become more complicated. Taking this consideration into account, we propose to start testing from the items of level 2, which are the simplest ones for the examinees who will obtain a positive grade.

There is an additional argument for choosing level 2 as the start level of testing: the test items of level 2 reflect the compulsory facts of the topic under study, so these problems cannot be excluded from the test process. It is not worthwhile to start the test from the items of level 3, because productive and, especially, creative problems are based on a sufficiently wide spectrum of knowledge, and it is not always possible to detect which exact element of the curriculum has not been mastered by an examinee. The items of level 1 are intended for students whose mastering is not satisfactory; therefore, there is no need to suggest these items to all examinees.

Our testing strategy and grading algorithm are presented in fig. 1. Here are some comments to fig. 1. The testing starts with the items of level 2. An examinee solves the compulsory minimum of items at level 2; the automated system calculates S2, his or her score at level 2, and estimates the error of this score. If the accuracy is sufficient, the automated system chooses a grade or increases the level of the items being suggested to the examinee. Otherwise, items of level 2 continue to be suggested until the accuracy becomes satisfactory. It should be underlined that the accuracy depends not only on the number of items but also on the individual test score [5]. The necessary accuracy is also conditioned by the distance between the test score and the key points at which the decision about grading or about a rise of the level is made.

Fig. 1. Grading algorithm

Testing at levels 3, 4 and 1 is carried out by analogy with level 2, with the scores S3, S4 and S1 accordingly.

Such an algorithm of testing improves the correspondence between the level of the suggested items and the level of the examinee's mastering. The influence of lapses of attention on the test score of examinees with excellent mastering is decreased. The examinees with bad mastering solve easy problems which correspond to the most important parts of the educational content; the psychological discomfort connected with constantly incorrect answers is excluded, but even such easy items give the possibility to determine the structure of the knowledge and skills, as well as to distinguish the examinees who have not mastered the compulsory minimum according to the curriculum. In any case the result of the pedagogical diagnostics will be more accurate in comparison with testing without special selection of items. (A sketch of such an adaptive control loop is given below.)
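The following Python sketch illustrates the control loop just described. It is a simplified illustration, not the "Expert 3.05" implementation: the helper ask(level), the minimum numbers of items and the stopping margin are our assumptions; only the key points (0.5 and 0.8 for S2, etc.) follow fig. 1.

```python
import math

def score_error(score: float, n: int) -> float:
    """Rough standard error of a proportion-type score after n items."""
    return math.sqrt(max(score * (1.0 - score), 0.01) / n)

def run_level(ask, level, key_points, min_items=5):
    """Suggest items of one level until the score can be compared
    reliably with the nearest key (decision) point."""
    correct = n = 0
    while True:
        correct += int(ask(level))  # ask() presents one random item
        n += 1
        score = correct / n
        if n >= min_items:
            # Required accuracy depends on the distance between the
            # current score and the nearest grading decision point.
            margin = min(abs(score - k) for k in key_points)
            if score_error(score, n) < max(margin, 0.05):
                return score

def diagnose(ask):
    """Adaptive order of levels: start at level 2, go down to level 1
    for weak mastering, up to levels 3 and 4 otherwise."""
    s = {2: run_level(ask, 2, [0.5, 0.8])}
    if s[2] < 0.5:
        s[1] = run_level(ask, 1, [0.5, 0.8])
    else:
        s[3] = run_level(ask, 3, [0.3, 0.5, 0.8])
        if s[3] >= 0.5:
            s[4] = run_level(ask, 4, [0.3, 0.5, 0.7])
    return s  # the vector of level scores is then mapped to a grade
```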
4 Software

The computer support of the offered technology is provided by the information system "Expert 3.05", designed by us as a distributed database in the Microsoft Access 2003 environment. An important advantage of our information system, in our opinion, is the modular principle of its construction, which allows the author of the test items (probably with the help of a programmer) to create new forms of test items and to add them to the database.

The central database of the information system is the test items database. The test items are grouped by topics for convenience of viewing. The elements of the educational content are picked out in each topic, and for each element the author provides a comment for a student who has not mastered this content. Several blocks of test items of different levels of difficulty are offered to verify the student's mastering of each element of the educational content. For every block the author specifies the following parameters: the level of the test items (0-4), a weight factor and the maximum time of exposition of one item. All items of a block should be of one type, that is, have the identical dialogue form. In the process of testing the student is offered one or several items from each block by random choice. The quantity of item blocks and the filling of these blocks are determined by the required quality of diagnostics.

The database which contains the information on the answers of each examinee to each item is formed by the results of testing. This database includes the following fields: the code of the test item, the correctness of the given answer, the level of the item, the probability of casual guessing of a correct answer, and the time of item solving. Additional service information (the time and date of testing, the examinee's grade, etc.) is also stored; a sketch of such a record is given at the end of this section.

The examinee receives (fig. 2) the diagnostic data on each element of knowledge; a chart which reconstructs the structure of his or her knowledge; and recommendations for independent work. The author receives the statistical analysis of each item's difficulty and discrimination and its correlation with the test score (grade). The diagram which shows the dependence of item difficulty on the examinee's test score (grade) is very useful (fig. 3). The author also has the opportunity to generate database queries by means of the Access environment and to pass the data for further analysis to spreadsheets.

Fig. 2. Diagnostic data for the examinee

Fig. 3. Statistical analysis of a test item for the author
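For illustration, one record of this answers database can be sketched as a Python data class; the field names are our assumptions, while the actual system keeps these data in Microsoft Access tables.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class AnswerRecord:
    item_code: str        # code of the test item
    correct: bool         # correctness of the given answer
    level: int            # level of the item (0-4)
    p_guess: float        # probability of casual guessing of a correct answer
    solving_time: float   # time of item solving, in seconds
    # additional service information
    tested_at: datetime   # time and date of testing
    grade: Optional[int] = None  # examinee's grade, once assigned
```

Records of this kind are sufficient both for the item statistics shown to the author and for the validity filters described in the next section.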
5 Experience of Diagnostics with "Expert 3.05"

We have prepared systems of test items with "Expert 3.05" for several courses: "Mathematical methods in psychology", "Theoretical basis of informatics" and "Architecture of a personal computer". Now we are able to start the third stage of the preparation of the system of pedagogical diagnostics: approbation and verification of the test items.

Our technology of verification is based on the requirements of the Standard of the Ukrainian Ministry of Education and Science [6] and takes into account the features of automated pedagogical diagnostics.

The analysis of the test begins with detecting the level of educational achievements of the students on the basis of an expert rating; for example, it may be a traditional examination. The complete verification procedure assumes that the experts determine this rating irrespective of the verifying test. The approbation should involve enough examinees to guarantee a sufficient number of answers for every test item. Determination of the students' rating by experts cannot be organised as often as is necessary for the constant updating of the test items database. Therefore, for the current verification it is possible to detect the level of the students' educational achievements with the help of the same automated system of pedagogical testing and to check the correlation of a separate item with the test score. In such a case the approbation data are accumulated continuously, including during the independent work of the students with the automated system. The validity of the automated system of testing as a whole is checked through comparison of the integrated results of testing with the results of other kinds of control: interview, examination, execution of practical works, etc.

To maintain the reliability of the current verification, the automated system does not include in the analysis the answers of unregistered examinees: teachers and other users whose names are not in the lists of the student groups. The answers of an unfinished test pass are not analysed either. An answer is not taken into account if the time of its execution is smaller than is necessary for acquaintance with the text of the problem. There is an opportunity to specify additional conditions for the selection of valid answers by means of the Access environment (for example, the date and time of testing, the educational group, the variant of the test, etc.).

After the distribution of the students according to their educational achievements, the conformity between the level of a test item and its empirical difficulty index is checked. According to the requirements of the Ukrainian educational standards, a student with an average level of educational achievements (grade 4 on the 12-grade scale) "... knows about half of educational content, is capable to reproduce it, to repeat after the model the certain operation ..." [3]. Just the items of level 2 in our classification have such content. Thus, for such students the difficulty index of the items of level 2 cannot be less than 0.5, and we consider it to be within the range 0.5-0.9. It is necessary to note that such a range of difficulty is not convenient from the viewpoint of improving the statistical parameters of the test. However, the items of level 2 represent the set of facts of the educational content which are obligatory for mastering; therefore, the author of the problems cannot change their difficulty without changing the curriculum.

Thus, for the items of level 2, on the sample of students with an average level of educational achievements, we have the following algorithm of analysis of the item quality (a condensed sketch in code is given after the list):

• An item does not require correction if its difficulty index is within 0.5-0.9 and its discrimination index is higher than 0.25, that is, the discrimination index satisfies the requirements of the standard [6] (fig. 4).
• The item difficulty index is more than 0.9. This item should be analysed with the help of the diagram of the dependence of difficulty on the educational achievements of the students. It should be determined whether this item has discrimination ability for the students with the initial level of educational achievements; if so, the item level should be changed accordingly (fig. 5). Otherwise, this item should be removed from the test.
• The difficulty index is less than 0.5; it is necessary to analyse the content of the item. The following situations are possible:
─ The item is not reproductive, and its discrimination ability satisfies the requirements of the standard. It is necessary to increase the item level (fig. 6).
─ The item has low discrimination ability for all students; this signifies that some mistakes have been made in the formulation of the item. This item should be removed or corrected (fig. 7). A situation is also possible when all experts agree that the problem is correctly designed and satisfies the curriculum; in such a case the mastering of the students should be checked by some other method and, maybe, the quality of the educational process should be analysed.
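The listed rules for a level-2 item can be condensed into a short Python function; the thresholds are exactly those given above, while the returned verdict labels are illustrative.

```python
def check_level2_item(difficulty: float, discrimination: float) -> str:
    """Quality verdict for a level-2 item, computed on the sample of
    students with an average level of educational achievements."""
    if 0.5 <= difficulty <= 0.9 and discrimination > 0.25:
        return "ok"  # satisfies the standard [6]
    if difficulty > 0.9:
        # Inspect the difficulty-vs-achievement diagram: if the item
        # still discriminates students of the initial level, lower its
        # level; otherwise remove it from the test.
        return "move to level 1 or remove"
    if discrimination > 0.25:  # too hard, yet discriminating
        return "not reproductive: increase the item level"
    # Low discrimination in any range signals a flawed formulation.
    return "remove or correct the formulation"
```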
The best range of the difficulty index for the items of levels 1, 3 and 4 is 0.5-0.6 (0.3-0.7 is allowable) on the sample of the examinees of the appropriate level of mastering. The analysis of the difficulty of these items is carried out by analogy with level 2.

Fig. 4. A typical item of the level 2

Fig. 5. A typical item of the level 1

Fig. 6. A typical item of the level 3. It is not a reproductive problem if the students do not learn the table of the binary codes for the numbers up to 15.

After finishing the three stages of preparation, the system of pedagogical diagnostics is ready for practical use. The stage of practical application of the system combines the procedures of testing and statistical processing of the obtained results, including the interpretation of the results for students, teachers and authors of the test problems. The expert system of pedagogical diagnostics needs continuous modernisation of its database; naturally, this requires returning to the previous stages of the work with the system.

Fig. 7. An unsuccessful item

The "Expert 1.01-3.05" software has been used in Kharkiv National Pedagogical University named after G.S. Skovoroda since 2001. Here are some of the latest results of approbation. The test on the mathematical methods of statistical analysis of pedagogical diagnostics data was suggested to future teachers of informatics, mathematics and chemistry as an element of the courseware "Information systems in pedagogical activities" in the 2012-2013 academic year. The purpose of the testing was the self-diagnostics of the students, so a student could pass the test many times, studying the problem elements of the learning content and improving his or her results. We took into account the best results of testing and compared them with the examination results. The Pearson correlation was 0.7 on a sample of 51 students, and we consider it as a measure of the test validity.

The test results gave us the possibility to study the structure of the students' knowledge of the basic questions of statistical analysis of pedagogical diagnostics data (fig. 8). The error estimation for the data in fig. 8 was evaluated as half of the 95% confidence interval, Δy = 1.95·s/√n, where s is the estimation of the standard deviation and n is the number of test items on the given learning element which were passed by the students. The errors are different for every point in fig. 8, because the numbers of test items on the various elements differ, so we show the ranges of the errors in table 1.

The results (fig. 8) show that the problems of choosing the scales for pedagogical evaluations are the most difficult for the students. The problems of the reproductive level, where a student should choose the method or the formula for the estimation of some parameter of a statistical distribution, are the easiest. But at the productive level, when the students should explain the influence of the values and of the number of variants in a sample on the estimated parameters, such problems are the most difficult.
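As a worked example of this error estimation, the half-width can be computed as in the following sketch; the per-element outcome lists are hypothetical.

```python
import statistics

def half_width_95(element_results):
    """Half of the 95% confidence interval, dy = 1.95 * s / sqrt(n),
    where element_results holds the 0/1 outcomes of the n test items
    answered on one learning element."""
    n = len(element_results)
    s = statistics.stdev(element_results)  # estimation of the standard deviation
    return 1.95 * s / n ** 0.5

# e.g. half_width_95([1, 0, 1, 1, 0, 1, 1, 0]) gives about 0.36
```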
Fig. 8. Difficulty index (probability of a correct answer) as a function of the student's grade for problems of various elements of the learning content

Table 1. Errors of the difficulty index estimation at different student's grades

Student's grade    Estimation error
1                  0.01-0.04
2                  0.04-0.1
3                  0.02-0.05
4-5                0.01-0.07
6-7                0.02-0.16
8                  0.03-0.18
9                  0-0.4
10-12              0-0.6

6 Conclusions

1. A new comprehensive algorithm of testing and grading is suggested. This algorithm takes into account the possibilities of computer technologies and the requirements of the Ukrainian standards.
2. An automated system of pedagogical diagnostics is designed.
3. Methods of administrating the items database are proposed and used in practice in the educational process of Kharkiv National Pedagogical University.

References

1. Bloom, B.S.: Taxonomy of Educational Objectives. Book 1: Cognitive Domain. Longman, Inc., New York (1956)
2. Bespalko, V.P.: Basis of the Pedagogical Systems Theory: Problems and Methods of Psychological and Pedagogical Providing of Technical Teaching Systems. Voronezh University Press, Voronezh (1977)
3. Criterions of Grading of the Educational Achievements of Pupils in the System of Secondary Education. Education of Ukraine, 6 (2001)
4. Zaitseva, L.V., Prokofyeva, N.O.: Models and Methods of Adaptive Knowledge Diagnostics. Educational Technology & Society, 7, http://ifets.ieee.org/russian/depository/v7_i4/pdf/1.pdf (2003)
5. Kolgatin, O.G.: The Statistical Analysis of the Test with Different Forms of Items. Means of Teaching and Research Work, 20, KhSPU, Kharkiv (2003)
6. Means of Diagnostics of a Level of Educational and Professional Training. The Tests of the Objective Assessment of a Level of Educational and Professional Training. Order of the Ministry of Education and Science of Ukraine, № 285, 31 July 1998 (1998)