Applying item response theory with two-parameter and three-parameter models in the evaluation of multiple-choice tests
Abstract
The article presents the results of analyzing and evaluating multiple-choice items based on Item Response Theory (IRT) with two-parameter and three-parameter models, using analyses carried out in R software (package ltm). The data are the answers of 590 students to the 50 multiple-choice items of the English 1 test administered at Dong Thap University in 2018. By evaluating each item's difficulty and discrimination parameters, together with the guessing parameter, under the two models, the study identified good items to add to the item bank and pointed out items that are not yet optimal and should be reconsidered before being put into use. Reviewing and analyzing the items under both models makes the evaluation of items more comprehensive and the selection of items more accurate. In addition, the results show that if a test is evaluated only on the subjective opinions of subject lecturers, without an IRT-based analysis, poor items can enter the test undetected.
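As an illustration only, and not the authors' exact script, the two models reported here can be fitted in R roughly as sketched below with the ltm package; the object resp and the file name responses.csv are hypothetical placeholders for a 590 x 50 matrix of dichotomously scored answers.

# Minimal sketch, assuming resp is a data frame of 0/1 item scores
library(ltm)

resp <- read.csv("responses.csv")   # hypothetical file: 590 examinees x 50 items

# Two-parameter logistic (2PL) model: difficulty and discrimination per item
fit2pl <- ltm(resp ~ z1)
coef(fit2pl)                        # columns Dffclt, Dscrmn

# Three-parameter logistic (3PL) model: adds a lower-asymptote (guessing) parameter
fit3pl <- tpm(resp, type = "latent.trait")
coef(fit3pl)                        # columns Gussng, Dffclt, Dscrmn

# Item characteristic curves for inspecting individual items
plot(fit2pl, type = "ICC")
plot(fit3pl, type = "ICC")

Screening items on these estimated difficulty, discrimination, and guessing values is the kind of review the abstract describes before items are admitted to the item bank.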
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Keywords
Item Response Theory, two-parameter model, three-parameter model, multiple-choice items
References
Bui, N. Q. (2017). Evaluation of the quality of multiple choice test bank for the module of Introduction to Anthropology by using the RASCH model and QUEST software. Science of Technology Development, 20(X3), 42-54.
Bui, A. K., & Bui, N. P. (2018). Using IATA to analyze, evaluate and improve the quality of the multiple-choice items in chapter power functions, exponential functions and logarithmic functions. Can Tho University Journal of Science, 54(9C), 81-93.
Doan, H. C., Le, A. V., & Pham, H. U. (2016). Applying three-parameter logistic model in validating the level of difficulty, discrimination and guessing of items in a multiple choice test. Ho Chi Minh City University of Education Journal of Science, 7(85), 174-184.
Duong, T. T. (2005). Test and measure academic achievement. Hanoi: Social Sciences Publishing House.
Lam, Q. T., Lam, N. M., Le, M. T., & Vu, D. B. (2007). VITESTA software and analysis of test data. Vietnam Journal of Education, 176, 10-12.
Lam, Q. T. (2011). Measurement in Education - Theory and Application. Hanoi: Vietnam National University Publishing House.
Le, A. V., Pham, H. U., Doan, H. C., & Le, T. H. (2017). Using Gibbs Sampler to evaluate item difficulty in Rasch model. Ho Chi Minh City University of Education Journal of Science, 14(4), 119-130.
Nguyen, B. H. T. (2008). Using Quest software to analyze objective test questions. Journal of Science and Technology, Da Nang University, 2, 119-126.
Nguyen, P. H. (2017). Using GSP chart and ROC method to analyze and select multiple-choice items. Dong Thap University Journal of Science, 24(2), 11-17.
Nguyen, P. H., & Du, T. N. (2015). The analysis and selection of objective test items based on S-P chart, Grey Relational Analysis, and ROC curve. Ho Chi Minh City University of Education Journal of Science, 6(72), 163-173.
Nguyen, T. H. M., & Nguyen, D. T. (2006). Measurement Assessment in the objective test: Question difficulty and Examinees’ ability. Vietnam National University Journal of Science, 4, 34-47.
Nguyen, V. C., & Nguyen, Q. T. (2020). Applying ConQuest software with the two-parameter IRT model to evaluate the quality of multiple-choice test. HNUE Journal of Science, 65(7), 230-242.
Pham, T. M., & Bui, D. N. (2019). The IATA software for analyzing, evaluation of multiple-choice questions at Ha Noi Metropolitan University. Scientific Journal of Ha Noi Metropolitan University, 20, 97-108.
Rizopoulos, D. (2006). ltm: An R package for latent variable modeling and item response theory analyses. Journal of Statistical Software, 17(5), 1-25.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen, Denmark: Danish Institute for Educational Research.
Most read articles by the same author(s)
- Van Canh Nguyen, Measuring student satisfaction of the quality of Dong Thap University library's services, Dong Thap University Journal of Science: Vol. 9 No. 4 (2020): Social Sciences and Humanities Issue (Vietnamese)
- Van Canh Nguyen, Van Tac Pham, Thi Bich Van Le, Measurement of students’ satisfaction in online courses at Dong Thap University, Dong Thap University Journal of Science: Vol. 12 No. 7 (2023): Social Sciences and Humanities Issue (English)
- Van Canh Nguyen, Assessing the students’ satisfaction level on training services at Dong Thap University, Dong Thap University Journal of Science: No. 40 (2019): Part A - Social Sciences and Humanities
- Van Canh Nguyen, Measuring the job requirements satisfaction level of teacher training majored graduates: A study based on employer feedbacks, Dong Thap University Journal of Science: Vol. 11 No. 6 (2022): Social Sciences and Humanities Issue (Vietnamese)
- Van Canh Nguyen, Van Tac Pham, Quoc Tuan Nguyen, Analysis and evaluation of question items: A solution to enhance the quality of multiple-choice test, Dong Thap University Journal of Science: Vol. 12 No. 3 (2023): Social Sciences and Humanities Issue (English)
- Thi Bich Van Le, Van Canh Nguyen, Children’s outdoor activity expressions at some kindergartens in Cao Lanh City, Dong Thap Province, Dong Thap University Journal of Science: No. 32 (2018): Part A - Social Sciences and Humanities