CIV and IWF in MCQs: Do the Principles Make a Difference? (Acad Med, 2002)

Construct-irrelevant Variance and Flawed Test Questions: Do Multiple-choice Item-writing Principles Make Any Difference?

STEVEN M. DOWNING

EVALUATION METHODS: WHAT DO WE KNOW? 

Moderator: Reed G. Williams, PhD




Messick defined CIV as follows.

Messick defines construct-irrelevant variance (CIV) as

‘‘. . . excess reliable variance that is irrelevant to the interpreted construct.’’2


Testwiseness, teaching to the test, and test irregularities (cheating) are all examples of CIV.

Testwiseness, teaching to the test, and test irregularities (cheating) are all examples of CIV that tend to inflate test scores by adding measurement error to scores.



Review of the Literature


An NBME study showed that violations of basic item-writing principles are common.

Yet, a recent study from the National Board of Medical Examiners (NBME)6 shows that violations of the most basic item-writing principles are very common in achievement tests used in medical education.


Individual IWFs have been studied, but their cumulative effect has not.

While several individual item flaws have been studied (negative stems,6 multiple true–false items,7 none of the above option8), the cumulative effect of grouping flawed items together as scales measuring the same ability has not been investigated.


Method



Three independent raters, blinded to item-performance data, classified the items using the standard principles of effective item writing as the universe of item-writing principles.4


Absolute passing standards were established for this test by the faculty responsible for teaching this instructional unit using a modified Nedelsky method.9
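The paper does not spell out the modification used, but the core Nedelsky logic can be sketched as follows (a minimal illustration, assuming the unmodified method: for each item, a judge marks how many wrong options a minimally competent examinee could eliminate, and the chance of a correct answer is one over the options remaining):

```python
def nedelsky_passing_score(eliminated_counts, n_options=5):
    """Absolute passing score under the Nedelsky method: for each item,
    the minimally competent examinee's chance of answering correctly is
    1 / (options remaining after eliminating recognizably wrong ones).
    The cutoff is the sum of these per-item probabilities."""
    return sum(1.0 / (n_options - e) for e in eliminated_counts)

# e.g., three items where judges felt 2, 3, and 0 distractors could be ruled out
cutoff = nedelsky_passing_score([2, 3, 0], n_options=5)
```

In practice, multiple judges' cutoffs are averaged; the function and example counts here are illustrative, not taken from the paper.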


The following were computed:

Typical item-analysis data were computed for each scale:

  • means,
  • standard deviations,
  • mean item difficulty,
  • mean biserial discrimination indices, and
  • Kuder-Richardson 20 (K-R 20) reliability coefficients,

together with the absolute passing score and the passing rate (proportion of students passing).
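Two of the listed statistics can be sketched from a 0/1 response matrix (rows = examinees, columns = items). This is a minimal illustration of item difficulty (p values) and K-R 20, with assumed variable names; it is not the NBME's analysis code:

```python
def item_analysis(responses):
    """Compute per-item difficulty (proportion correct) and K-R 20
    reliability from a list of 0/1 response rows."""
    n = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    # item difficulty: proportion of examinees answering each item correctly
    p = [sum(row[j] for row in responses) / n for j in range(n_items)]
    # K-R 20 = (k/(k-1)) * (1 - sum(p*q) / variance of total scores)
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    kr20 = (n_items / (n_items - 1)) * (1 - sum(pj * (1 - pj) for pj in p) / var_total)
    return p, kr20
```

Biserial discrimination would additionally correlate each item with the total score; it is omitted here for brevity.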


Results


Comparing the 22 standard items with the 11 flawed items, K-R 20 was .62 versus .44.

Comparing the standard (22 items) and the flawed (11 items) scales,

  • the observed K-R 20 reliability was .62 versus .44.
  • The standard-scale mean p value was .70; the flawed-scale mean p value was .63 (t197 = 6.274, p < .0001).
  • The standard-scale items were slightly more discriminating than the flawed items, rbis = .34 versus .30 (using the total test score as criterion).
  • The flawed and the standard scales were correlated r = 0.52 (p < .0001). 
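Since the two scales differ in length (22 versus 11 items), part of the reliability gap is a length artifact. One interpretive step, not taken in the paper itself, is to project the flawed scale to 22 items with the Spearman-Brown prophecy formula:

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a scale's length is multiplied by
    length_factor (Spearman-Brown prophecy formula)."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# 11-item flawed scale (K-R 20 = .44) projected to 22 items
projected = spearman_brown(0.44, 2.0)
```

The projection comes to roughly .61, close to the standard scale's .62, suggesting the observed reliability difference largely reflects scale length rather than the flaws themselves.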



Discussion and Conclusions


At least one IWF was found in one third of the items.

One third of the questions in this test have at least one item flaw.


Flawed items showed increased difficulty: poorly constructed items add artificial difficulty. This CIV interferes with accurate and meaningful interpretation of test scores and negatively affects passing rates.

The increased test and item difficulty associated with the use of flawed item forms is an example of CIV, because poorly crafted test questions add artificial difficulty to the test scores. This CIV interferes with the accurate and meaningful interpretation of test scores and negatively impacts students' passing rates, particularly for passing scores at or just above the mean of the test score distribution.









Acad Med. 2002 Oct;77(10 Suppl):S103-4.

Construct-irrelevant variance and flawed test questions: Do multiple-choice item-writing principles make any difference?

Downing SM. University of Illinois at Chicago, College of Medicine, Department of Medical Education, 808 South Wood Street, Chicago, IL 60612-7309, USA.

PMID: 12377719 [PubMed - indexed for MEDLINE]

