Use of Multimedia in USMLE Step 1 and Step 2 (Acad Med, 2009)

Use of Multimedia on the Step 1 and Step 2 Clinical Knowledge Components of USMLE: A Controlled Trial of the Impact on Item Characteristics

Kathleen Z. Holtzman, David B. Swanson, Wenli Ouyang, Kieran Hussie, and Krista Allbee





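Since the USMLE was introduced in the early 1990s, all three Steps have used MCQs written as patient vignettes. These items ask examinees to interpret a given situation from a basic science perspective, reach a diagnosis, or specify the next step. Such items are a form of low-fidelity clinical simulation for assessing decision-making skills. With the development of CBT, the authenticity of patient situations has increased slowly but steadily, and multimedia is being added to item stems to raise fidelity further.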

Since the introduction of the United States Medical Licensing Examination (USMLE) in the early 1990s, all three Steps have commonly used multiple-choice questions (MCQs) that take the form of patient vignettes describing a clinical situation and challenge examinees to interpret the situation from a basic science perspective, reach a diagnosis, or specify the next step in patient care. Such test items are reasonably viewed as low-fidelity clinical simulations designed to assess medical decision-making skills.1 Since the advent of computer-based testing (CBT) for USMLE in 1999, the authenticity with which patient situations are described has slowly but steadily increased, and further improvements are planned as all three Steps incorporate multimedia into item “stems,” enriching the fidelity with which patient findings can be presented.




Method


Heart sounds


Using these lists, NBME staff worked with an external vendor to obtain recorded auscultation findings and develop an interactive Flash-based format (Figure 1) for examinees to use in eliciting auscultation results and viewing related physical findings (i.e., movement of the chest and neck veins).



 


 

Step 2 CK study items


Step 1 study items


Procedure


The USMLE mixes unscored items in with scored items so that items can be checked before they are used for scoring. Examinees are told about this in advance.

USMLE routinely includes unscored material intermingled with scored test items to obtain information about (pretest) items and new item formats prior to scored use, and examinees are notified prospectively about this practice in registration materials.



Item characteristics studied


For each version of each study item, six indices were calculated from USMG and IMG item responses.

  • The first was the item difficulty (P value), calculated as the proportion of examinees who responded to the item correctly.

  • The second was a logit transform of the item difficulty, log[p / (1 - p)], where p is the item difficulty. This nonlinear transformation is commonly used because the “distance” from an item difficulty of 0.50 to 0.60 is much smaller than the “distance” from 0.85 to 0.95.

  • The third was an index of item discrimination: the item-total (biserial) correlation, calculated as the correlation between the item (scored 0/1 for incorrect/correct) and the reported total score.

  • The fourth was an r-to-z transformation of the biserial correlation, also commonly used to correct for nonlinearities in the magnitude of correlation coefficients.

  • The fifth index was the mean response time in seconds, and

  • the sixth was the mean of the natural logs of response times, a transformation commonly used to normalize response times.
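The six indices above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names (`logit`, `biserial`, `item_indices`) and the sample data are hypothetical, and the biserial correlation uses the standard textbook formula, which is assumed (not confirmed by the source) to match what was computed in the study.

```python
import math
from statistics import NormalDist, fmean, pstdev


def logit(p):
    """Logit transform of an item difficulty p, defined for 0 < p < 1."""
    return math.log(p / (1 - p))


def biserial(correct, totals):
    """Textbook biserial correlation between a 0/1 item and total scores.

    Assumes at least one correct and one incorrect response.
    """
    p = sum(correct) / len(correct)  # proportion answering correctly
    right = [t for c, t in zip(correct, totals) if c == 1]
    wrong = [t for c, t in zip(correct, totals) if c == 0]
    # Height of the standard normal density at the point that splits
    # the distribution into proportions p and 1 - p.
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (fmean(right) - fmean(wrong)) / pstdev(totals) * p * (1 - p) / y


def item_indices(correct, totals, seconds):
    """Return the six indices for one version of one item."""
    p = sum(correct) / len(correct)
    r = biserial(correct, totals)
    return {
        "p_value": p,                    # 1. item difficulty
        "logit_p": logit(p),             # 2. logit of difficulty
        "biserial": r,                   # 3. item discrimination
        "fisher_z": math.atanh(r),       # 4. r-to-z; fails if |r| >= 1
        "mean_seconds": fmean(seconds),  # 5. mean response time
        "mean_log_seconds": fmean([math.log(s) for s in seconds]),  # 6.
    }


# Hypothetical responses from ten examinees (illustration only).
correct = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
totals = [210, 190, 230, 200, 205, 220, 180, 215, 195, 225]
seconds = [62, 95, 70, 58, 120, 66, 140, 75, 80, 110]

for name, value in item_indices(correct, totals, seconds).items():
    print(f"{name}: {value:.3f}")
```

The logit also makes the "distance" remark concrete: moving from 0.50 to 0.60 is a step of about 0.41 logits, while moving from 0.85 to 0.95 is a step of about 1.21 logits.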



Results


 


Item difficulty


Table 1 provides means for USMGs and IMGs for item difficulty, item discrimination, and response time for both Step 1 and Step 2 CK study items.



Item discrimination


Items using multimedia were less discriminating.

Multimedia items were less discriminating than matched text versions for both groups in each Step.


Response time


The differences in response time were very large.

Differences in mean response times were very large from a practical perspective, with multimedia versions of items requiring, on average, 30 to 60 seconds longer for a response than text versions (P < .0001 for both groups in each Step).


Discussion


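Presenting auscultation findings through multimedia had a considerable effect on item difficulty and response time, and a moderate effect on item discrimination. Examinees interpreted auscultation findings more easily when they were given as text than when they were given in an authentic, undigested form. On average, having examinees listen to heart sounds increased response time by about 50 seconds per item.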

Use of multimedia for presentation of auscultation findings has a sizable impact on item difficulty and response time, as well as a more modest impact on item discrimination. Examinees can more readily interpret auscultation findings described textually using standard medical terminology than the same findings presented in a more authentic, undigested format. On average, requiring examinees to listen to heart sounds, rather than read medical terminology accurately interpreting them, increased response times by roughly 50 seconds per item.



 



Acad Med. 2009 Oct;84(10 Suppl):S90-3. doi: 10.1097/ACM.0b013e3181b37b0b.

Use of multimedia on the step 1 and step 2 clinical knowledge components of USMLE: a controlled trial of the impact on item characteristics.

Author information

  • National Board of Medical Examiners, 3750 Market Street, Philadelphia, PA 19104, USA. kholtzman@nbme.org

Abstract

BACKGROUND:

During 2007, multimedia-based presentations of selected clinical findings were introduced into the United States Medical Licensing Examination. This study investigated the impact of presenting cardiac auscultation findings in multimedia versus text format on item characteristics.

METHOD:

Content-matched versions of 43 Step 1 and 51 Step 2 Clinical Knowledge (CK) multiple-choice questions describing common pediatric and adult clinical presentations were administered in unscored sections of Step 1 and Step 2 CK. For multimedia versions, examinees used headphones to listen to the heart on a simulated chest while watching video showing associated chest and neck vein movements. Text versions described auscultation findings using standard medical terminology.

RESULTS:

Analyses of item responses for first-time examinees from U.S./Canadian and international medical schools indicated that multimedia items were significantly more difficult than matched text versions, were less discriminating, and required more testing time.

CONCLUSIONS:

Examinees can more readily interpret auscultation findings described in text using standard terminology than those same findings presented in a more authentic multimedia format. The impact on examinee performance and item characteristics is substantial.


