Research: The impact of item flaws, testing at low cognitive level, and low distractor functioning on multiple-choice question quality

  • Title: The impact of item flaws, testing at low cognitive level, and low distractor functioning on multiple-choice question quality
  • Authors: Syed Haris Ali, Kenneth G. Ruit
  • Access the original paper here

Paper summary

This research article examines how improving multiple-choice questions (MCQs) in medical education affects assessment validity. The researchers divided MCQs into three groups: in one, item flaws were corrected and the tested cognitive level was raised; in another, non-functioning distractors were replaced; and a control group was left unchanged. The study measured the impact of these changes on item difficulty and discriminatory ability, finding that the improved MCQs discriminated significantly better between high- and low-performing students. These findings highlight the importance of carefully constructed MCQs for accurate and valid medical assessments. The study also notes limitations, including a small sample size and pre-existing differences between the groups.

What are the key implications for teachers in the classroom?

The sources suggest several key implications for teachers in the classroom, particularly those who use multiple-choice questions in their assessments:

  • Item Flaws: Teachers should be aware of common item-writing flaws that can affect the validity of scores. These flaws, which can introduce construct-irrelevant variance, include grammatical or logical cues, absolute terms, a correct answer that is noticeably longer than the distractors, words repeated between the stem and the correct answer, and cues exploitable by convergence strategies. Avoiding these flaws helps ensure that student performance reflects knowledge of the content rather than test-wiseness. Teachers should also avoid long, complicated, or double options; inconsistent presentation of numeric data; vague terms; non-parallel language in the options; a non-logical order of options; use of ‘none of the above’; tricky or unnecessarily complicated stems; and hinged questions.
  • Cognitive Level: Teachers should strive to assess higher cognitive functions, such as the application of knowledge, rather than mere factual recall. Clinical vignettes can help assess a student’s ability to apply knowledge. Questions that test lower cognitive functions are also more likely to contain item flaws. The study found that a higher proportion of items in the control group tested at low cognitive levels than in the experimental groups, which may have affected the results.
  • Distractor Functioning: Teachers should pay attention to how well distractors are functioning in multiple-choice questions. A functioning distractor is an incorrect option that is selected by at least 5% of students and is chosen more often by low-performing than by high-performing students. Teachers should avoid or replace non-functioning distractors to improve the discriminatory ability of the questions.
  • Item Analysis: Teachers should use item analysis data to evaluate the quality of multiple-choice questions. This data can help identify questions that are too easy or too difficult, or that discriminate poorly. Specifically, teachers can look at the item difficulty index, the point-biserial correlation, and the number of functioning distractors; a sketch of how these statistics can be computed appears after this list.
  • Intervention: Teachers can improve the quality of their multiple-choice questions by correcting item flaws, enhancing the cognitive level of the questions, and replacing or removing non-functioning distractors. According to the study, these interventions can increase the number of functioning distractors, improve the discriminatory ability of the questions, and help prevent construct-irrelevant variance from affecting the validity of scores.
  • Professional Development: The findings suggest a need for faculty development in item writing and assessment. Improving the quality of multiple-choice assessments is a challenging task that requires skill and resources, and professional development can help teachers learn how to create and improve such assessments.
  • Iterative Improvement: Teachers can first administer a question in free-response format to identify recurring incorrect responses, which can then be used to develop distractors for the multiple-choice version of the question. This iterative approach can further refine the quality of multiple-choice questions.
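
To make the item-analysis statistics above concrete, here is a minimal sketch in Python, assuming each student’s answer to an item is logged as a (chosen option, total test score) pair. The function name `item_statistics`, the median split into low- and high-performing students, and the data layout are illustrative assumptions rather than details from the study; only the 5% threshold and the “chosen more often by low performers” check follow the definitions summarised above.

```python
# Minimal item-analysis sketch (illustrative; not code from the paper).
from collections import Counter
from statistics import mean, pstdev

def item_statistics(responses, key, distractor_threshold=0.05):
    """Compute basic item-analysis statistics for one MCQ.

    responses: list of (chosen_option, total_test_score) pairs, one per student.
    key: label of the correct option, e.g. "B".
    """
    n = len(responses)
    scores = [score for _, score in responses]
    correct = [1 if choice == key else 0 for choice, _ in responses]

    # Item difficulty index: proportion of students answering correctly.
    difficulty = sum(correct) / n

    # Point-biserial correlation between item score (0/1) and total test score:
    # r_pb = (M_correct - M_incorrect) / s * sqrt(p * q), with p the difficulty index.
    sd = pstdev(scores)
    if sd == 0 or difficulty in (0.0, 1.0):
        point_biserial = 0.0  # undefined when scores or item responses have no spread
    else:
        mean_correct = mean(s for s, c in zip(scores, correct) if c)
        mean_incorrect = mean(s for s, c in zip(scores, correct) if not c)
        p, q = difficulty, 1.0 - difficulty
        point_biserial = (mean_correct - mean_incorrect) / sd * (p * q) ** 0.5

    # Distractor functioning: an incorrect option counts as functioning if at least
    # 5% of students chose it and low performers chose it more often than high
    # performers (here, a simple median split on total score).
    cutoff = sorted(scores)[n // 2]
    counts = Counter(choice for choice, _ in responses)
    functioning = []
    for option, count in counts.items():
        if option == key:
            continue
        low = sum(1 for c, s in responses if c == option and s < cutoff)
        high = sum(1 for c, s in responses if c == option and s >= cutoff)
        if count / n >= distractor_threshold and low > high:
            functioning.append(option)

    return {
        "difficulty": difficulty,
        "point_biserial": point_biserial,
        "functioning_distractors": sorted(functioning),
    }
```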

In summary, teachers should focus on writing clear, unbiased questions that assess higher-order thinking and include functional distractors, and use item analysis data to continuously improve the quality of their multiple-choice assessments.
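
As a usage example building on the hypothetical `item_statistics` helper sketched above, the snippet below scores a small, made-up answer bank and flags items that look too easy or too hard, that discriminate poorly, or that lack functioning distractors. The cut-offs (difficulty roughly between 0.30 and 0.90, point-biserial of at least 0.20) are common rules of thumb from the item-analysis literature, not values reported in this study.

```python
# Hypothetical usage: assumes item_statistics() from the sketch above is in scope.
answer_bank = {
    "Q1": {"key": "B", "responses": [("B", 18), ("B", 16), ("A", 9), ("C", 7),
                                     ("B", 14), ("D", 6), ("A", 8), ("B", 17),
                                     ("C", 5), ("B", 12)]},
    # An item everyone answers correctly: too easy, zero discrimination,
    # and no functioning distractors.
    "Q2": {"key": "A", "responses": [("A", s) for s in (18, 16, 9, 7, 14,
                                                        6, 8, 17, 5, 12)]},
}

for item_id, item in answer_bank.items():
    stats = item_statistics(item["responses"], key=item["key"])
    flags = []
    if not 0.30 <= stats["difficulty"] <= 0.90:  # rule-of-thumb difficulty range
        flags.append("difficulty out of range")
    if stats["point_biserial"] < 0.20:           # rule-of-thumb discrimination floor
        flags.append("low discrimination")
    if not stats["functioning_distractors"]:
        flags.append("no functioning distractors")
    print(item_id, stats, flags or "OK")
```

Running this kind of check over every item in a test bank is one way to apply the iterative improvement the paper recommends: flagged items can be revised for flaws, cognitive level, and distractor quality, then re-evaluated after the next administration.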

Quote

Correction of item flaws, removal or replacement of non-functioning distractors, and enhancement of tested cognitive level positively impact the discriminatory ability of multiple-choice questions