Design, Development and Scoring of Alternative Item Types in a Medical Examination
Track: Test Development and Administration
As computer technology becomes increasingly sophisticated, alternative item types such as multiple-response and drag-and-drop questions are being used in assessments more often. Their appeal lies in the ability to assess constructs with high fidelity and to measure higher-order thinking, as well as the ease of administration and scoring relative to traditional paper-and-pencil assessments that use constructed-response questions. However, successful implementation can present many challenges.
The purpose of this presentation is to use a case study to illustrate the design, development, and scoring-model decisions for an assessment containing various alternative item types. The presentation will walk the audience from the first steps of selecting appropriate alternative item types (e.g., aligning constructs to item types) through to scoring, with the expectation that the steps outlined may apply to other certification and licensure exam programs. Item template designs will be shared to demonstrate how to standardize the construction of each new item format and to improve production efficiency. In addition, expanded item-writer guidelines grounded in best-practice standards will be discussed. Pilot study data will also be shared to demonstrate how to analyze data in order to compare and contrast scoring models (e.g., dichotomous versus polytomous scoring) by item type. These results will indicate the most appropriate scoring model given the purpose and design of the exam. Lastly, challenges, lessons learned, and the final scoring decisions for the alternative item types in the case study will be discussed.
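To make the scoring-model contrast concrete, the sketch below shows one common way a multiple-response item can be scored dichotomously (all-or-nothing) versus polytomously (partial credit per option). This is a minimal illustration with invented option labels and function names; it is not the case study's actual rubric, and other partial-credit rules (e.g., penalizing over-selection differently) are possible.

```python
def score_dichotomous(key, response):
    """All-or-nothing: 1 point only for an exact match with the key."""
    return 1 if set(response) == set(key) else 0

def score_polytomous(key, options, response):
    """Partial credit: one point per option classified correctly
    (selected when keyed, or left unselected when not keyed)."""
    key, response = set(key), set(response)
    return sum(1 for opt in options if (opt in key) == (opt in response))

options = ["A", "B", "C", "D", "E"]
key = ["A", "C", "D"]
response = ["A", "C"]  # examinee missed keyed option D

print(score_dichotomous(key, response))          # 0 (not an exact match)
print(score_polytomous(key, options, response))  # 4 (of 5 options classified correctly)
```

Under dichotomous scoring the near-miss response earns nothing, while partial-credit scoring preserves information about how much of the item the examinee got right; comparing such models against pilot data is the kind of analysis the presentation describes.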
Overall, this presentation will provide a practical guide for test developers who currently use, or are considering, alternative item types. It will highlight issues to consider and activities that can be incorporated into the design, development, and scoring of alternative item types in an assessment. The guidance provided will also outline approaches that support the validity of test score interpretations.