The impact of large-scale licensing examinations in highly developed countries: a systematic review (BMC Med Educ, 2016)

The impact of large scale licensing examinations in highly developed countries: a systematic review

Julian Archer1*, Nick Lynn1, Lee Coombes2, Martin Roberts1, Tom Gale1, Tristan Price1 and Sam Regan de Bere1




Background


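Historically, medical regulation meant establishing who could call themselves a 'doctor' and keeping out 'barbers' and 'charlatans'. Regulation has since moved away from this static approach (simply holding a register) toward a more dynamic and prospective one.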

Medical regulation has historically involved establishing who is appropriately qualified to call themselves a medical doctor and keeping certain people, such as “barbers” [1] and “charlatans” [2], out. But medical regulators have moved away from this static approach, of simply holding a register, to a more dynamic and prospective one.


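Much of the attention falls on the start of clinical practice. This transition from medical school into the workplace is also the point at which international medical graduates (IMGs) join the workforce.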

Much of the attention has perhaps understandably focused at the beginning of clinical practice. This transition point from medical school into the workplace is also often a point at which international medical graduates enter the workforce.


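It is easy to see why licensing examinations are considered important. They sit at the point where medical schools graduate their students, and only those who meet the required standard may practice in that jurisdiction. Advocates of national licensing examinations (NLEs) therefore argue that they reassure the public that only doctors who can practice safely become qualified.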

It is easy to understand why the concept of a licensing exam is hailed as important. They sit at the point at which medical schools graduate their students. Only those who achieve the required standards are then allowed to practice in the jurisdiction. In this way, the advocators argue, a nation’s population is reassured that only capable doctors who can practice safely are qualified.


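However, exactly what form NLEs should take, what they should cover, and whom they should assess remain contested, all the more so as doctors and other healthcare workers increasingly move across national and regional boundaries.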

But exactly what form NLEs should take, what they should cover and who they should assess remains a source of debate [5–11], as doctors and other healthcare workers increasingly wish to move across national or regional (state) boundaries [5, 12–16].


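The UK currently has no NLE and has historically relied on external examiners (visiting medical educators from other organizations) and on GMC inspections to assure the quality of its medical schools. Doctors from overseas follow a different route, mainly through the PLAB examination.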

The United Kingdom (UK) does not currently have a NLE but has historically relied on external examiners – visiting medical educators from other organizations – and General Medical Council (GMC) inspections to assure quality across UK medical schools. Doctors from overseas take a different route into licensure in the UK; predominately through the Professional and Linguistic Assessments Board (PLAB) examination [17].


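At the end of 2014, however, the GMC announced plans for an NLE, and in 2015 it set out a timeframe for introducing, by 2021, a Medical Licensing Assessment (MLA) to be taken by all UK graduates and non-EEA graduates. Under current European law, EEA graduates are exempt.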

However at the end of 2014 the GMC announced that it planned to establish a NLE and in June 2015 it laid out a timeframe for the introduction of a ‘Medical Licensing Assessment’ (MLA) which will ultimately be taken by all UK graduates and non-European Economic Area (EEA) graduates who wish to practice in the UK by 2021 [18]. As European Law stands EEA graduates will be exempt under freedom of movement legislation [19].

 


 


Methods


Data sources and searches


Study selection


Data extraction, synthesis and analysis


 

Results



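We found a great deal of debate in the literature but far less evidence. When we mapped the papers to the validity framework, only 24 of the 73 contained validity evidence for licensing examinations. The remaining 50 papers were informed opinion, editorials, or simply contributions to the ongoing debate.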

We found a lot of debate in the literature but much less evidence. After we mapped the papers to the validity framework, only 24 of the 73 papers were found to contain validity evidence for licensing examinations. The remaining 50 papers consisted of informed opinion, editorials, or simply described and contributed to the continuing debate. We summarize the overall review process in Fig. 1.

 


 

Table 2 lists the 24 papers. Of these, four addressed content validity, three response process, and four internal structure.

Table 2 summarizes the 24 papers mapped to the validity framework. Of these reviewed papers only four offered evidence for content validity [27–30], three for response process [27, 29, 31], and four for internal structure [27–29, 32].


Relationship to other variables as evidence for validity


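Papers examining the relationship to other variables fell into three groups: prior and future performance; the relationship to patient outcomes and complaints; and differences between home-trained doctors and IMGs.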

The papers that explored the relationship to other variables, as evidence for validity, we sub-grouped into three areas of enquiry: prior and future performance by individuals in examinations; relationship to patient outcomes and complaints; and specifically the variation in performance between home-trained doctors and IMGs.


1. Students who did well in medical school also did well in subsequent examinations.

First, several authors explored the relationship between medical school examination performance and subsequent established large scale testing e.g. the USMLE [28, 33–35]. Overall they found, perhaps not surprisingly, that those who do well in medical school examinations also do well in subsequent testing.


2. The evidence is mixed; one review concluded that there is little evidence that NLEs affect quality of care.

Second, there is mixed evidence on the relationship with other variables when NLE test scores are compared with criterion based outcomes around complaints and patient welfare. Sutherland & Leatherman concluded in a 2006 review that “there is little evidence available about [national licensing examinations’] impact on quality of care” across the international healthcare system [37].


3. IMGs perform somewhat less well. This may reflect limited English proficiency, but in a Swiss study the areas where IMGs scored lower were not communication skills.

Third, a series of papers each demonstrated that IMGs do less well in large scale testing [32, 35, 40, 41]. Some argue that the differences were due to a lack of proficiency in spoken English [32, 40], but a paper from Switzerland found that while IMGs did less well than Swiss candidates in their Federal Licensing Examination, the IMGs’ lower scores were in areas other than communication skills [27].

 


 

Consequential validity



One important finding of this review is the lack of evidence that patient outcomes improve with the introduction of NLEs.

An important finding of this review is the lack of evidence that patient outcomes improve as a consequence of the introduction of national licensing exams.


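Although the studies by Norcini et al. and Tamblyn et al. argue well for the importance of testing, their findings establish only correlation, not causation. In other words, there is evidence that better doctors do better in NLEs, but none that doctors improve as a result of NLEs; this kind of before-and-after comparison is missing from the literature. A further confounder is that those who score well on the USMLE get better jobs at better institutions.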

Although the aforementioned studies by Norcini et al. [38] and Tamblyn et al. [39] demonstrate excellent arguments for the importance of testing, and medical education more generally, their findings are limited to establishing correlations between testing and outcomes and not causation. In other words, there is evidence that better doctors do better in NLEs, but not that doctors improve as a consequence of introducing NLEs; this kind of before and after evidence is absent in the extant literature. One confounding factor to a causal link between testing performance and subsequent care is the fact that those who do well in the USMLE get the better jobs in the better institutions [36, 42].


Overall, some authors argue that NLEs are not a real barrier to entering the profession and therefore do not protect the public. For example, nearly every candidate who takes the USMLE eventually passes.

Overall, some authors argue that NLEs are not real barriers to entry into the profession, and therefore do not protect the public. For example nearly everyone who takes the USMLE passes it in the end [43].


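Nor is there a clear picture of how NLEs affect medical school curricula. In one study, over one third of respondents reported changes to the “objectives, content, and/or emphasis of their curriculum”. That study looked only at one new component added to an already established NLE, but it raises the question of whether NLEs can direct medical schools' attention toward nationally identified skills or knowledge shortages, as appears to have happened with the clinical skills component.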

There is no clear picture from the literature as to the impact of NLEs on the medical school curricula. The study found that over one third of respondents reported changes to the “objectives, content, and/or emphasis of their curriculum” (p.325) [46]. While the study focuses only on the introduction of one new component of a licensure exam, within an already well established NLE, it does raise the question of whether NLEs can be used to focus medical schools’ attention to nationally identified skills/knowledge shortages, as appears to be the case with the clinical skills component in this study [46].


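At the same time, this raises the question of whether NLEs encourage homogeneity or stifle innovation in curriculum design. Apart from one dental example in Florida, however, there is no evidence that NLEs encourage homogeneity.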

At the same time however, this raises the question that NLE exams may encourage homogeneity or a lack of innovation in curriculum design. Yet aside from one dental example in Florida [34], there appears to be no empirical evidence that NLEs encourage homogeneity




Discussions


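For example, studies show that candidates with lower NLE scores tend to end up working in less respected or poorer-performing institutions. Moreover, in a review of the role of regulation in healthcare, the authors found only “sparse” evidence that NLE pass scores predict patient care or future disciplinary action. Our review supports that conclusion.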

For example, studies demonstrate that candidates with lower NLE scores tend to end up working in less respected institutions [36, 42] and poorer performing organizations [51]. Moreover, a comprehensive review on the role of regulation in improving healthcare by Sutherland and Leatherman [37], found “sparse” evidence to support the claims that NLE pass scores are a predictor of patient care or future disciplinary actions. Our review supports that conclusion.



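There is validity evidence for a correlation, though not causation, between NLEs and doctors' performance. This may itself be an argument for NLEs, but it depends on the purpose of the NLE. Schuwirth recently stated that “the purpose of national licensing is to reassure the public that licensed doctors are safe, independent practitioners”. Similarly, Swanson and Roberts describe the role of NLEs as “reassuring patients, the public, and employing organisations that, regardless of where their doctor trained, they can be sure of a minimum level of competence”. As Schuwirth notes, however, public reassurance is “at least partly, based on public perception”. The danger lies in a potential gap between what the public and policy-makers perceive NLEs to do and what NLEs actually achieve. Misplaced trust that NLEs improve patient safety, when all they actually do is reassure the public, may divert attention from other important aspects of medical regulation.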

That there is validity evidence for the correlation, as opposed to causation, between NLEs and doctors’ performance may in itself be an argument for national licensing [4], but this will depend on the policy purpose of the NLE. Schuwirth has recently pronounced that, “In essence the purpose of national licensing is to reassure the public that licensed doctors are safe, independent practitioners” [52]. Similarly, Swanson and Roberts point to the role of NLEs in “reassuring patients, the public, and employing organisations that, regardless of where their doctor trained, they can be sure of a minimum level of competence” [4]. However, as Schuwirth notes, public reassurance is, “at least partly, based on public perception” [52]. The danger here is a potential disjuncture between what the public, and indeed policy-makers, perceive that NLEs do, and what they actually achieve; misplaced trust in the impact of national licensing to enhance patient safety, when what they actually do is simply reassure the public, may potentially divert attention from other important aspects of medical regulation.


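Finally, there are difficult questions of inclusion, exclusion, and fairness regarding IMG doctors. In Sweden, IMGs' experiences suggest that the system actively disadvantages competent IMGs; participants saw it as flawed, overlong, and frustrating. Similar difficulties have been reported in a number of Canadian studies.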

Lastly, there are difficult questions raised about inclusion, exclusion, and fairness in respect to IMG doctors [14]. In Sweden, which has a regulatory system similar to other countries across Europe and elsewhere, IMGs’ experiences suggest that the Swedish system may actively disadvantage competent IMG practitioners; participants viewed the Swedish system as flawed, overlong, and frustrating [44]. Such difficulties have also been highlighted by a number of Canadian studies [13, 53], providing some descriptive evidence of the way in which practitioners, provincial licensing authorities, and employers use the system to balance the demands arising from physician shortages, making it difficult for both IMGs and those that employ them to negotiate the licensing system.



Strengths and weaknesses of the study


 

 

Implications for clinicians and policymakers


 

The evidence base is weak both for those who oppose NLEs and for those who support them.

The weakness of the evidence base exists for those who argue against national licensure examinations [7], as well as for those who advocate such a system [55–57].


Unanswered questions and future research


There is strong statistical evidence that IMGs perform less well, but the reasons remain unclear.

Whilst a strong body of statistical evidence exists to show IMGs perform less well in licensure examinations than candidates from the host countries [27, 59], the reasons for this phenomenon remain unclear.


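Evidence that introducing NLEs improves doctor performance or patient safety is weak, whereas the correlation between test results and overall performance is strong. The benefit of introducing an NLE will therefore depend on how effectively the existing regulatory system works. Policy-makers and regulators should move beyond a one-size-fits-all approach and examine the evidence in light of existing regulatory systems.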

We have argued that the evidence for NLEs improving doctor performance and patient safety as a consequence of their introduction is weak, whereas the evidence for a correlation between test results and overall performance is strong. As such, the relative benefits of introducing a NLE may well be contingent upon the efficacy of existing regulatory systems. As such, policy-makers and regulators may consider moving beyond a one size fits all approach to NLE; evidence should be examined in light of existing regulatory systems.


Conclusions


The main conclusion of our review is that the debate on licensure examinations is characterized by strong opinions but is weak in terms of validity evidence.


9. Ricketts C, Archer J. Are national qualifying examinations a fair way to rank medical students? Yes. BMJ. 2008;337:a1282.


51. Noble ISG. Are national qualifying examinations a fair way to rank medical students? No. Br Med J. 2008;337:a1279.



49. McMahon GT, Tallia AF. Perspective: Anticipating the challenges of reforming the United States medical licensing examination. Acad Med. 2010;85(3):453–6.


26. Cook DA, Lineberry M. Consequences validity evidence: evaluating the impact of educational assessments. Acad Med. 2016;91(6):785–95.



Ahn, D., & Ahn, S. (2007): Reconsidering the Cut Score of the Korean National Medical Licensing Examination



 

 




The effects of violating standard item writing principles on tests and students: the consequences of using flawed test items on achievement examinations in medical education.

Author information

  • 1Department of Medical Education (MC 591), College of Medicine, University of Illinois at Chicago, 60612-7309, USA. sdowning@uic.edu

Abstract

The purpose of this research was to study the effects of violations of standard multiple-choice item writing principles on test characteristics, student scores, and pass-fail outcomes. Four basic science examinations, administered to year-one and year-two medical students, were randomly selected for study. Test items were classified as either standard or flawed by three independent raters, blinded to all item performance data. Flawed test questions violated one or more standard principles of effective item writing. Thirty-six to sixty-five percent of the items on the four tests were flawed. Flawed items were 0-15 percentage points more difficult than standard items measuring the same construct. Over all four examinations, 646 (53%) students passed the standard items while 575 (47%) passed the flawed items. The median passing rate difference between flawed and standard items was 3.5 percentage points, but ranged from -1 to 35 percentage points. Item flaws had little effect on test score reliability or other psychometric quality indices. Results showed that flawed multiple-choice test items, which violate well established and evidence-based principles of effective item writing, disadvantage some medical students. Item flaws introduce the systematic error of construct-irrelevant variance to assessments, thereby reducing the validity evidence for examinations and penalizing some examinees.


