Operational statistical analysis of the results of computer-based testing of students


Abstract


The article is devoted to the statistical analysis of the results of computer-based testing used to evaluate students’ educational achievements. The topic is relevant because computer-based testing in Russian universities has become an important method for evaluating both students’ educational achievements and the quality of the modern educational process. Using modern methods and programs for the statistical analysis of computer-based testing results and for assessing the quality of the developed tests is a pressing problem for every university teacher. The article shows how the authors solve this problem using their own program, “StatInfo”. For several years the program has been successfully applied in a credit system of education at such technological stages as loading computer-based testing protocols into a database, forming queries, and generating reports, lists, and answer matrices for the statistical analysis of test item quality. The methodology, experience, and some results of its use by university teachers are described in the article. Related topics (test development, models, algorithms, technologies, and software for large-scale computer-based testing) have been discussed by the authors in their previous publications, which are presented in the reference list.


The development and use of pedagogical tests have been studied in a wide range of scientific publications. Among the topics covered are theories of pedagogical tests [1; 21; 23] and their terminology [3], test design [2], methodological rules for constructing pedagogical tests [4; 12], test administration [6; 21], and the statistical analysis and interpretation of obtained results [20; 22]. Research on the use of computer tests for assessing students’ academic achievements has been widely presented in publications covering the comparability of paper-based and computer-based testing [18], computer adaptive testing [5; 14], the development of models, algorithms, complex technologies, and information-computational systems for computer testing [10], the administration of mass testing sessions of students in computer classes and over the Internet [13; 17], the statistical analysis of students’ computer testing results [19; 20], the use of computer testing in teachers’ practical work [8; 11], and the state registration of computer testing programs and banks of test questions [16; 19].

The results of computer testing characterize students’ academic achievements and support the evaluation of the quality of the educational process. Under these conditions, the operative statistical analysis and correct evaluation of test results, as well as quality control of test materials, have become very important for teachers.

Statistical Analysis in Computer-Based Testing.

The collection and analysis of statistical information provide the feedback without which testing cannot be considered a scientific method for controlling knowledge and evaluating academic achievements. Conditionally, all statistics can be divided into two parts: 1) information about participants and test results; 2) data on the quality of test materials. When working with the first part, a computer testing system usually allows selecting the object of statistical processing.
For the selected object you can obtain the number of participants by category, the distribution of participants by score, and the percentage of correct answers for each test question. Based on these data, the computer testing system can build diagrams and perform a comparative analysis of the testing results for the chosen objects. It also shows the list of participants and, if needed, supports quick search and the desired sorting. Statistical forms and lists, or parts of them, can be printed, and the data of interest can be exported to a specialized statistics package for further analysis.

To assess the quality of the test questions, methods of statistical analysis can be used. For this purpose the test must be taken by a sufficiently large number of participants. Based on the collected and processed protocols, conclusions can be drawn about the quality of the tests; this should be done to improve the tests and to correct questions before they are used in subsequent sessions of computer testing. The computer testing system typically reports the minimum, mean, and maximum of the obtained test scores, together with their standard deviation and coefficient of variation. You can also determine the number of tests in which a question of interest appeared and the number of participants who answered it. The following characteristics of test items can be calculated:

1. The difficulty of a question, defined as the percentage of participants who answered it correctly. Difficulty should lie in the range from 20% to 80%. Questions answered correctly by fewer than 20% of participants (too difficult) or by more than 80% (too easy) should be excluded from the bank or have their text corrected.

2. The differentiating ability of a question, which is a measure of its validity. It is calculated as the difference between the question’s difficulty for a strong group of participants and its difficulty for a weak one. The higher the differentiating ability of a question, the better it separates the tested by level of preparation.

3. The point-biserial correlation, which estimates how performance on a given question is connected with performance on the entire test. In other words, it is an indicator of the question’s suitability for the test.

Distractor analysis determines how each of the proposed answers “operates” in each task. Its data make it possible to identify “good” questions; badly formulated questions that participants do not understand; questions that require material students have not yet studied; and questions that do not contain a correct answer. Distractor analysis includes the following characteristics: 1) the frequency of choosing the correct response (in percent); 2) the response rate of each answer (in percent); 3) the differentiating ability of each response; 4) the deviation of the frequency of a wrong answer’s selection from the mean value, etc.

To determine complex characteristics such as reliability, validity, and test effectiveness, specialized statistical packages are used. For example, reliability analysis in SPSS [12] can help select the most suitable test questions: a preliminary version of the test with an excessive number of questions is developed, administered to a fairly representative number of students, and then a reliability analysis is carried out, which allows eliminating unsuitable questions.

The bank of test questions for computer testing should clearly reflect the structure of the test for a specific subject, and each test question should have a set of variants of different difficulty levels.
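The item statistics described above (difficulty, differentiating ability, and the point-biserial correlation) can be illustrated with a minimal Python sketch. The 0/1 response matrix and all names here are invented for the example; they are not data or code from the “StatInfo” program.

```python
# Minimal sketch of classical item analysis on a hypothetical 0/1 response
# matrix: rows are participants, columns are test questions.
from statistics import mean, pstdev

# Invented data: 6 participants x 4 questions (1 = correct, 0 = wrong).
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]

totals = [sum(row) for row in responses]  # total score per participant
n = len(responses)

def difficulty(j):
    """Percentage of participants who answered question j correctly."""
    return 100.0 * sum(row[j] for row in responses) / n

def discrimination(j):
    """Difficulty in the strong half minus difficulty in the weak half."""
    order = sorted(range(n), key=lambda i: totals[i], reverse=True)
    strong, weak = order[: n // 2], order[n - n // 2:]
    def share(group):
        return sum(responses[i][j] for i in group) / len(group)
    return share(strong) - share(weak)

def point_biserial(j):
    """Correlation between the 0/1 item column and the total scores."""
    item = [row[j] for row in responses]
    mi, mt = mean(item), mean(totals)
    cov = mean((x - mi) * (y - mt) for x, y in zip(item, totals))
    sd = pstdev(item) * pstdev(totals)
    return cov / sd if sd else 0.0

for j in range(4):
    print(f"Q{j + 1}: difficulty={difficulty(j):.0f}%  "
          f"discrimination={discrimination(j):+.2f}  r_pb={point_biserial(j):+.2f}")
```

On this toy matrix, a question answered correctly by fewer than 20% of participants would be flagged for exclusion or correction, exactly as the 20%–80% rule above prescribes.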
All statistical parameters of these question variants must be real: they can be determined on the basis of a statistical analysis of the quality of trial tests used in different groups of students. Normally, the analyzed sample should contain up to several hundred results for each question. The algorithm for adding questions to the bank is based on calculating the specified statistical parameters and excluding inappropriate questions.

“StatInfo” Program.

To solve the problems that a teacher faces, the authors of this article developed a technology for the operative statistical analysis of computer testing results [9]. It is based on the functionality of the “StatInfo” program, which has a state registration certificate. The effectiveness and reliability of the technology have been proven by many years of applying it to assess students’ academic achievements in the credit system of education. In the article [19], the authors presented, as an example, the results of an operative statistical analysis of the real results of final computer testing of students in the “Informatics” course over 15 semesters.

The “StatInfo” program runs under the MS Windows operating system. Its informational basis is a table-based (relational) database. The material accumulated in the database can be used for scientific and practical work in the field of computer testing and in monitoring the quality of education. The program can be installed locally or on a network.

The “StatInfo” program allows a teacher to select the object of statistical processing (group, faculty, etc.) and to obtain the number of participants for this object, their distribution by obtained scores or by points (taking into account the conversion of test scores into credit points), and the percentage of correct answers to each test question.
Based on the received data, it is possible to build diagrams, perform a comparative analysis of the testing results for various objects of statistical processing, view the lists of participants, and print out statistical forms and lists (or parts of them). If necessary, any statistical form or list can be saved in a separate text file or exported to a spreadsheet to build the necessary diagrams and tables.

The list of table fields with the results of computer testing is available for forming queries. An essential feature of the program is the “flexibility” of a possible query: the “Query” dialog contains a filter table in which, for each column of the information database, you can specify the range of values allowed in the query. To view the results, the “Statistics” button opens a dialog where you can select the necessary statistical form. The following options are available: 1) the number of test participants; 2) received scores; 3) received points; 4) correct answers.

An important role in the system is assigned to the lists of participants. The “List” button switches to working with them. In this mode, you can: 1) select the necessary attributes of the test participants and determine the order in which they appear on the screen or printer (the “Columns” button); 2) order the list of participants by any combination of attributes (the “Sort” button; by default, entries are sorted by last name); 3) search for participants in the database by specifying any combination of attributes, such as name, subject code, or number of points scored (the “Search” button). Before starting a search, the list must be sorted by the corresponding attributes.
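The range-based filtering performed by the “Query” dialog can be sketched as follows. This is a hypothetical illustration: the field names and the in-memory representation are invented for the example and are not StatInfo’s actual database schema.

```python
# Hypothetical sketch of range-based query filtering, in the spirit of the
# "Query" dialog: for each column an allowed range of values may be set.

# Invented records standing in for rows of the testing-results table.
records = [
    {"name": "Ivanov", "group": "IS-101", "score": 78},
    {"name": "Petrov", "group": "IS-101", "score": 41},
    {"name": "Sidorov", "group": "IS-102", "score": 93},
]

def run_query(rows, ranges):
    """Keep rows whose fields fall inside every (low, high) range in `ranges`."""
    def ok(row):
        return all(lo <= row[field] <= hi for field, (lo, hi) in ranges.items())
    return [row for row in rows if ok(row)]

# Example: participants who scored between 50 and 100 points.
passed = run_query(records, {"score": (50, 100)})
print([r["name"] for r in passed])  # Ivanov and Sidorov match
```

An empty filter table (no ranges set) returns every record, which matches the dialog’s behavior of restricting a query only where a range is explicitly specified.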
Storing all the necessary information (the results of computer testing) in a single relational (table-based) database makes it possible to significantly speed up the calculation of basic values and allows the author of a test to move quickly from a figure in a report to the test question it came from. For calculations in MS Excel, a response matrix is often required. It is a rectangular table in which each cell holds a testing participant’s response; usually the row number corresponds to the number of the participant and the column number to the number of the test question. To make a matrix of answers in the “StatInfo” program, you select either the “Responses” column (students’ answers to the test questions) or the “Verification” column (the results of checking those answers) and, in the “Export” mode, produce a text file with “;” as the separator. From the matrix of responses one can additionally determine the median, mode, kurtosis, skewness, and other statistical characteristics of the test tasks.

Conclusion.

The technology of operative statistical analysis of computer testing results developed by the authors for assessing students’ academic achievements is based on the functional capabilities of the “StatInfo” program. Using the outcome of the statistical processing of students’ computer testing results, errors in test questions (such as an overlong formulation or the lack of a correct answer) are identified, and the sections of the course that students have apprehended well or poorly are determined.
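To illustrate the response-matrix statistics mentioned above (median, mode, skewness, kurtosis), here is a short sketch that parses a “;”-separated verification matrix of the kind the “Export” mode writes. The matrix contents are invented for the example, and the moment formulas use the plain population form (not the excess-adjusted kurtosis).

```python
# Sketch: per-question statistics from a ";"-separated verification matrix
# (1 = correct, 0 = wrong). The matrix text below is invented.
import io
from statistics import mean, median, mode, pstdev

csv_text = "1;0;1;1\n1;1;1;0\n0;0;1;1\n1;0;1;1\n1;1;0;1\n"

rows = [[int(v) for v in line.split(";")]
        for line in io.StringIO(csv_text) if line.strip()]
columns = list(zip(*rows))  # one tuple of answers per question

def skewness(xs):
    """Third standardized moment (population form)."""
    m, s = mean(xs), pstdev(xs)
    return mean(((x - m) / s) ** 3 for x in xs) if s else 0.0

def kurtosis(xs):
    """Fourth standardized moment (population form, not excess-adjusted)."""
    m, s = mean(xs), pstdev(xs)
    return mean(((x - m) / s) ** 4 for x in xs) if s else 0.0

for j, col in enumerate(columns, start=1):
    print(f"Q{j}: median={median(col)} mode={mode(col)} "
          f"skew={skewness(col):+.2f} kurt={kurtosis(col):.2f}")
```

The same file loads directly into MS Excel or a statistics package, which is the workflow the export step is designed for.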

Viktor Ivanovich Nardyuzhev

Peoples’ Friendship University of Russia (RUDN University)

Author for correspondence.
Email: vin111@mail.ru
Miklukho-Maklaya str., 6, Moscow, 117198, Russian Federation

candidate of engineering sciences, associate professor of the department “Computer technologies” of the philological faculty of the Peoples’ Friendship University of Russia

Ivan Viktorovich Nardyuzhev

Software development department, JSC “The Seventh Continent”

Email: inard@rambler.ru
Building, 21, MKAD 47, v. Govorovo, Moscow, Russia, 142784

candidate of engineering sciences, programmer of the software development department of JSC “The Seventh Continent”

Victoria Evgenievna Marfina

Peoples’ Friendship University of Russia (RUDN University)

Email: vika434221@gmail.com
Miklukho-Maklaya str., 6, Moscow, 117198, Russian Federation

master’s degree student at the department of comparative educational policy of the Peoples’ Friendship University of Russia

Ivan Nikolayevich Kurinin

Peoples’ Friendship University of Russia (RUDN University)

Email: kurinin_in@pfur.ru
Miklukho-Maklaya str., 6, Moscow, 117198, Russian Federation

candidate of economic sciences, associate professor, head of the department “Computer technologies” of the philological faculty of the Peoples’ Friendship University of Russia

  • Anastazi A. Psikhologicheskoe testirovanie [Psychological Testing]. M.: Pedagogika, 1982. 320 p.
  • Avanesov V.S. Kompozitsiya testovykh zadanii [Composition of test questions]. M.: Arena, 2002. 240 p.
  • Balykhina T.M. Slovar’ terminov i ponyatii testologii [Testology terms and concepts dictionary]. M.: Izd-vo RUDN, 2000. 164 p.
  • Vasil’ev V.I., Demidov A.N., Malyshev N.G., Tyagunova T.N. Metodologicheskie pravila konstruirovaniya komp’yuternykh pedagogicheskikh testov [Methodological rules of designing computer pedagogical tests]. M.: Izd-vo VTU, 2000. 64 p.
  • Vasil’ev V.I., Tyagunova T.N. Osnovy kul’tury adaptivnogo testirovaniya [The basics of adaptive testing culture]. M.: Ikar, 2003. 584 p.
  • Efremova N.F. Sovremennye testovye tekhnologii v obrazovanii [Modern test technologies in education]: uchebnoe posobie. M.: Logos, 2003. 176 p.
  • Kurinin I.N., Nardyuzhev V.I., Nardyuzhev I.V. Komp’yuternoe testirovanie v otsenke uchebnykh dostizhenii studentov [Computer testing in assessing students’ academic achievements]: uchebnometodicheskoe posobie. M.: Izd-vo RUDN, 2008. 308 p.
  • Kurinin I.N., Nardyuzhev V.I., Nardyuzhev I.V. Sbornik testovykh zadanii po kursam “Informatika” i “Komp’yuternye tekhnologii v nauke i obrazovanii” [Corpus of test question for courses “Informatics” and “Computer Technologies in Science and Education”]. M.: Izd-vo RUDN, 2010. 306 p.
  • Kurinin I.N., Nardyuzhev V.I., Nardyuzhev I.V. Operativnyi statisticheskii analiz rezul’tatov komp’yuternogo testirovaniya v kreditnoi sisteme obucheniya [Operative statistical analysis of the results of computer testing in the credit system of education]. Vestnik Rossijskogo universiteta druzhby narodov. Serija «Informatizacija obrazovanija» [Bulletin of the Russian university of friendship of the people. “Education Informatization” series]. 2013. No. 1. Pp. 115—125.
  • Kurinin I.N., Nardyuzhev V.I., Nardyuzhev I.V. Kompleksnaya tekhnologiya komp’yuternogo testirovaniya [Comprehensive technology of computer testing]. Vestnik Rossijskogo universiteta druzhby narodov. Serija «Informatizacija obrazovanija» [Bulletin of the Russian university of friendship of the people. “Education Informatization” series]. 2013. No. 2. Pp. 112—121.
  • Kurinin I.N., Marfina V.E., Nardyuzhev V.I., Nardyuzhev I.V. Informatizatsiya prakticheskoi raboty prepodavatelya [Informatization of the teacher’s practical work]. Vestnik Rossijskogo universiteta druzhby narodov. Serija «Informatizacija obrazovanija» [Bulletin of the Russian university of friendship of the people. “Education Informatization” series]. 2015. No. 1. Pp. 42—52.
  • Maiorov A.N. Testy shkol’nykh dostizhenii: konstruirovanie, provedenie, ispol’zovanie [School achievements tests: design, administration, usage]. SPb.: Obrazovanie i kul’tura, 1996. 304 p.
  • Nardyuzhev V.I., Nardyuzhev I.V. Testirovanie na komp’yuterakh cherez Internet [Computer testing in the internet]. Trudy Tsentra testirovaniya. Vypusk 2. M.: Prometei, 1999. Pp. 139—157.
  • Nardyuzhev V.I., Nardyuzhev I.V. Dostoinstva i nedostatki komp’yuternogo adaptivnogo testirovaniya [Advantages and disadvantages of computer adaptive testing]. Razvitie sistemy testirovanija v Rossii: tezisy dokladov 2 Vserossijskoj nauchno-prakticheskoj konferencii [The development of the testing system in Russia: abstracts of 2 all-Russian scientific-practical conference]. M.: Prometei, 2000. Vol. 3. Pp. 47—48.
  • Nardyuzhev V.I., Nardyuzhev I.V. Modeli i algoritmy informatsionno-vychislitel’noi sistemy komp’yuternogo testirovaniya [Models and algorithms of information technology computer-based testing system]: monografiya. M.: Prometei, 2000. 148 p.
  • Nardyuzhev V.I., Nardyuzhev I.V. Svidetel’stvo ROSPATENT № 2000610068 ot 27 yanvarya 2000 g. ob ofitsial’noi registratsii programmy dlya EVM “Operativnyi statisticheskii analiz rezul’tatov testirovaniya na komp’yuterakh cherez Internet (STATINFO)” [Certificate ROSPATENT No. 2000610068 dated January 27, 2000 on the Official Registration of the Computer Program “Operational Statistical Analysis of Computer-based Test Results in the Internet (STATINFO)”].
  • Nardyuzhev V.I., Nardyuzhev I.V. Analiz urovnya komp’yuternoi gramotnosti uchastnikov tsentralizovannogo komp’yuternogo testirovaniya [Analysis of the level of computer literacy of participants in centralized computer testing]. M.: Tsentr testirovaniya Minobrazovaniya Rossii, 2001. Pp. 86—93.
  • Nardyuzhev V.I., Nardyuzhev I.V. Sravnenie rezul’tatov tsentralizovannogo testirovaniya na blankakh i na komp’yuterakh [Comparison of the results of centralized testing on blank sheets and on computers]. M.: Narodnoe obrazovanie, 2002. 10 p.
  • Nardyuzhev V.I., Nardyuzhev I.V., Marfina V.E., Kurinin I.N. Svidetel’stvo № 2016620662 ot 24 maya 2016 g. o gosudarstvennoi registratsii bazy dannykh “Testovye zadaniya. Komp’yuternye tekhnologii v nauke i obrazovanii” [Certificate No. 2016620662 Dated May 24, 2016 on the State Registration of the Database “Test tasks: Computer Technologies in Science and Education”]. Vydano Federal’noi sluzhboi po intellektual’noi sobstvennosti (RosPatent) [RosPatent].
  • Nasledov A. SPSS 19: professional’nyi statisticheskii analiz dannykh [SPSS 19: Professional Statistical Analysis of Data]. SPb.: Piter, 2011. 400 p.
  • Neiman Yu.M., Khlebnikov V.A. Vvedenie v teoriyu modelirovaniya i parametrizatsii pedagogicheskikh testov [Introduction to the theory of modeling and parametrization of pedagogical tests]. M.: Prometei, 2000. 168 p.
  • Tyurin Yu.N., Makarov A.A. Statisticheskii analiz dannykh na komp’yutere [Statistical analysis of data on the computer]. M.: INFRA-M, 1998. 528 p.
  • Chelyshkova M.B. Razrabotka pedagogicheskikh testov na osnove sovremennykh matematicheskikh modelei [Development of pedagogical tests based on modern mathematical models]: uchebnoe posobie. M.: Issledovatel’skii tsentr problem kachestva podgotovki spetsialistov, 1995. 32 p.



Copyright (c) 2017 Nardyuzhev V.I., Nardyuzhev I.V., Marfina V.E., Kurinin I.N.

This work is licensed under a Creative Commons Attribution 4.0 International License.