Assessment of personal qualities in medical school admission (Med Educ, 2005)

Assessment of personal qualities in relation to admission to medical school

Mary Ann Lumsden,1 Miles Bore,2,3 Keith Millar,4 Rachael Jack1 & David Powis2







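The purpose of a medical school selection procedure is to choose those who will do well in undergraduate medical education and become good doctors, and to exclude those who would bring the profession into disrepute.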

The aim of a medical school admissions procedure is to select those who will perform well as undergraduates and make good doctors in the future, and to exclude those who will bring the profession into disrepute.


In addition, the social, cultural and ethnic make-up of medical graduates should reflect the diversity of the patient population.

In addition, the social, cultural and ethnic backgrounds of medical graduates should reflect the diversity of the patient population.


In the UK, school-leavers are admitted to medical school mainly on high academic attainment, through procedures that are often secretive and varied.

The admission of school-leavers to medical schools in the UK is usually based upon high academic attainment, the procedures often being secretive and varied.2


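However, high academic thresholds can discriminate against applicants from disadvantaged backgrounds, and they discourage both applicants and teachers. Moreover, such applicants often find current entry requirements hard to meet because they have not had the opportunity or encouragement to reach their full academic potential.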

However, high academic thresholds may discriminate against those from disadvantaged backgrounds7,8 as these discourage both applicants and teachers. Furthermore, applicants from disadvantaged backgrounds find current admissions requirements difficult to attain because they often lack opportunity or encouragement to maximise their academic potential. 


More broadly, doctors need to be competent, empathic and ethical.

More broadly, it is generally accepted that doctors need to be competent, empathic and ethical. (Leaving aside particular psychomotor skills, generic competence in this context refers to the intellectual ability to solve problems by applying acquired knowledge and using logical reasoning.)


Psychometric tests can measure personality characteristics and abilities rather than learned material, so performance on them should be less influenced by social background.

Psychometric tests can be used to measure personality characteristics and abilities rather than learned material, and accordingly, performance should be less influenced by social background.


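Psychometric testing of medical school applicants has been used in the USA and Australia for some years. Cognitive ability scores, alone or combined with the applicant's academic record, have been shown to be reliable predictors of medical school performance.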

Psychometric testing of applicants to medicine has been used in the USA and Australia for some years. Cognitive ability scores alone, and in combination with the academic record of the applicant, have been shown to be reliable predictors of medical school performance.10



The PQA was developed in Australia and consists of three tests.

A test battery called the Personal Qualities Assessment (PQA) has been developed by researchers in Australia. The battery consists of 3 tests, of which the

  • first is designed to measure individual differences in cognitive reasoning ability, 
  • the second to identify an Involved (empathic and confident with others) or Detached (narcissistic and aloof) personality trait, and 
  • the third to determine ethical ⁄ moral orientation.

Further information on the PQA

Further details are available from the PQA website.13


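The personality and ethical/moral orientation tests can identify individuals in whom certain traits are strongly developed, for example those who are highly empathic or self-confident and those who are particularly narcissistic or aloof. Combining the scores from the two tools gives the personality types shown in Table 1.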

The personality and ethical ⁄ moral orientation tests can be used to identify individuals in whom certain characteristics are strongly developed. For example, they allow for identification of those who are very empathic or self-confident, as well as those who are particularly narcissistic or aloof. In addition, the scores from these 2 assessment tools can be combined, enabling an empirically-based definition of different personality types (Table 1).



In summary, the PQA assesses the following.

In summary, the PQA is used to determine: 

• cognitive reasoning ability; 

• the degree to which individuals are involved or detached, and 

• the extent to which individuals value individual freedom at the expense of societal needs or the needs and expectations of society at the expense of the individual.



The tests ⁄ questionnaires


Participants completed the following tests.


  • Test 1: Cognitive Skills (TUNRA, University of Newcastle, Australia): This comprised a 38-item test to measure the ability to reason logically and to apply logic to problem solving.
  • Test 2: Mojac Scale11,14–16: This provides a relative measure of a libertarian-dual-communitarian dimension of moral orientation (LibCom score).
  • Test 3: NACE (ECAN) Scale12,16: This provides scores on narcissism (N), aloofness (A), self-confidence (C) and empathy (E). An overall ECAN score was calculated: high scores indicate a tendency to be involved with others (high empathy and self-confidence) while low scores indicate a tendency to be detached from others (high aloofness and narcissism).


Moral types


For this study, the types were defined by an arbitrary cut-off of 1.5 SD from the mean.

For this research the types were defined arbitrarily as a score on either dimension greater than 1.5 standard deviations from the mean. The raw scores were converted to T scores, having a mean of 50 with an SD of 10. Accordingly, outlying scores lie below 35 or above 65 on either dimension.
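
As a rough illustration of the cut-off described above, the following Python sketch converts raw scores to T scores and flags those lying more than 1.5 SD from the mean. The raw scores, the function names, and the use of the cohort's population SD are assumptions made for the example; the paper does not describe its computation at this level of detail.

```python
from statistics import mean, pstdev


def to_t_scores(raw_scores):
    """Convert raw scores to T scores (mean 50, SD 10) within the cohort."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)  # population SD of the cohort (an assumption)
    return [50 + 10 * (x - mu) / sigma for x in raw_scores]


def is_outlying(t_score, cutoff_sd=1.5):
    """True if the T score lies more than cutoff_sd SDs from the mean of 50."""
    return abs(t_score - 50) > 10 * cutoff_sd


# Hypothetical raw scores, for illustration only.
raw = [12, 15, 18, 20, 21, 22, 23, 25, 28, 40]
t_scores = to_t_scores(raw)
flags = [is_outlying(s) for s in t_scores]
```

With the default cut-off, a T score of exactly 35 or 65 is not flagged; only scores strictly below 35 or above 65 are.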




Statistical analysis


Results are expressed as means (SD) for normally distributed data. The Kruskal–Wallis and Mann–Whitney tests were used to examine non-parametric data.
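
For readers unfamiliar with rank-based tests, here is a minimal sketch of the U statistic that underlies the Mann–Whitney test, written in plain Python. The scores and function names are hypothetical, and the sketch stops at the statistic itself; it does not compute a p-value and is not the authors' analysis code.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) computed from the rank sum of the pooled sample."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a


# Hypothetical trait scores for two groups, for illustration only.
men = [48, 52, 45, 50]
women = [55, 53, 58, 50]
u_men, u_women = mann_whitney_u(men, women)
```

A small U for one group means its scores mostly rank below the other group's; a real analysis would use a statistics package to obtain the corresponding p-value.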




Because parental occupation data were unavailable, deprivation category (DepCat) was used instead.

As data on parental occupation were not available, social class was defined by deprivation category (DepCat), whereby the individual’s postcode serves as a proxy for socioeconomic class.19


DepCats run from 1 to 7 and were grouped as 1–2, 3–5 and 6–7.

DepCats range from 1 to 7, the latter indicating areas of severe deprivation. DepCat data were available on 427 individuals. Due to the small numbers in DepCats 6 and 7 particularly, the cohort was divided into 3 groups for analysis.19 Group 1 comprised those in DepCats 1 and 2 (n = 164, 38.4%); group 2 comprised those in DepCats 3, 4 and 5 (n = 229, 53.6%) and group 3 comprised those in DepCats 6 and 7 (n = 34, 8.0%).
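
The grouping can be written down compactly. The sketch below, with a hypothetical helper name, maps a DepCat to its analysis group and recomputes the reported percentages from the group sizes given above.

```python
def depcat_group(depcat):
    """Map a DepCat (1-7) to the three analysis groups described above."""
    if depcat not in range(1, 8):
        raise ValueError("DepCat must be an integer from 1 to 7")
    if depcat <= 2:
        return 1  # DepCats 1 and 2
    if depcat <= 5:
        return 2  # DepCats 3, 4 and 5
    return 3      # DepCats 6 and 7


# Group sizes as reported in the text.
group_sizes = {1: 164, 2: 229, 3: 34}
total = sum(group_sizes.values())
shares = {g: round(100 * n / total, 1) for g, n in group_sizes.items()}
```

The three group sizes sum to the 427 individuals with DepCat data, and the rounded shares match the percentages quoted in the excerpt.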




Profile of the PQA


Cognitive skills


The range and normal distribution of the scores shown in Fig. 1 indicate that there are widely different cognitive abilities in this sample, which is otherwise uniform in terms of its members having achieved high grades in school-leaving examinations.



MOJAC and NACE scores


Figure 2 shows the distribution of applicants by moral type:


Gender


Educational background


Deprivation category (DepCat)








The effect of incorporating the PQA into the selection process


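Applicants scoring more than 2 SD below the mean on cognitive skills may be considered less suited to medicine. Excluding them, together with those at the extremes of the personality scales, would have denied a place to 109 applicants (23%). Only 15 of these were actually rejected under the standard admissions criteria.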

It may be hypothesised that those applicants with relatively poor cognitive skills (> 2 SD below the cohort mean score) are less suited to a career in medicine. If these individuals are excluded, together with those whose personality traits lie at the extreme ends of the scale, 109 (23%) of the Scottish applicants would not have obtained a place. Only 15 of these applicants were in fact excluded on the standard admissions criteria applied. Figure 4 shows the actual acceptance and rejection rates of the present sample in their applications to study medicine, and their application outcomes if the PQA or standard criteria had been applied.
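
One way to picture this hypothetical screen is the sketch below. It assumes that "extreme" means a T score below 35 or above 65, as in the moral-type definition earlier, which the passage itself does not state; the applicants, field names and function name are invented for illustration.

```python
def fails_pqa_screen(cognitive_z, libcom_t, ecan_t, t_low=35.0, t_high=65.0):
    """Hypothetical rule: poor cognitive score or an extreme trait score."""
    # More than 2 SD below the cohort mean on the cognitive skills test.
    low_cognitive = cognitive_z < -2.0
    # Assumed definition of "extreme": T score outside the 35-65 band.
    extreme_trait = any(t < t_low or t > t_high for t in (libcom_t, ecan_t))
    return low_cognitive or extreme_trait


# Invented applicants, for illustration only.
applicants = [
    {"id": "A", "cognitive_z": 0.3, "libcom_t": 52.0, "ecan_t": 47.0},
    {"id": "B", "cognitive_z": -2.4, "libcom_t": 50.0, "ecan_t": 55.0},
    {"id": "C", "cognitive_z": 1.1, "libcom_t": 68.0, "ecan_t": 49.0},
]
excluded = [
    a["id"]
    for a in applicants
    if fails_pqa_screen(a["cognitive_z"], a["libcom_t"], a["ecan_t"])
]
```

Here applicant B is screened out on cognitive skills and applicant C on an extreme LibCom score, while A passes both checks.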







DISCUSSION


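The cognitive skills scores followed a Gaussian distribution, showing considerable variation in cognitive ability even though academic attainment across the cohort was extremely high. This matters in light of concerns, voiced by the media and bodies such as the Council of the Heads of Medical Schools, that A-levels are becoming less useful as a selection criterion as more candidates achieve top grades.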

The scores of the cognitive skills test component of the PQA follow a Gaussian distribution and it is possible to distinguish different degrees of cognitive ability despite the fact that academic attainment across the cohort is extremely high. This is an important finding in light of the concerns expressed by both the media and organisations such as the Council of the Heads of Medical Schools that A-levels are becoming less useful as a selection criterion because increasing numbers of candidates achieve the top grades.


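Besides cognitive ability, the PQA also assesses the NACE traits and Mojac moral orientation. The qualities it measures were not affected by educational background, although there were minor differences by DepCat: applicants from more deprived areas had higher ECAN scores, indicating that they were more empathic and confident and less narcissistic and aloof.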

As well as cognitive ability, the PQA has also been developed to identify empathy and related and opposing traits (NACE) and moral/ethical orientation (Mojac). The qualities identified by the PQA were not influenced by educational background, although some minor differences with deprivation category were noted: those from areas of greater deprivation produced higher ECAN scores, indicating that they were more involved (empathic and confident) and less detached (narcissistic and aloof) than those from areas of lesser deprivation.




13 Personal Qualities Assessment (PQA). http://www.pqa.net.au







Med Educ. 2005 Mar;39(3):258-65.

Author information

  • 1Division of Developmental Medicine, Glasgow Royal Infirmary, Glasgow, UK. M.A.Lumsden@clinmed.gla.ac.uk

Abstract

BACKGROUND:

Recently there has been much scrutiny of the medical school admissions process by universities, the General Medical Council and the public. Improved objectivity, fairness and effectiveness of selection procedures are desirable. The ultimate outcome sought is the graduation of competent doctors who reflect the values of and are in tune with the communities they serve.

METHODS:

Applicants to the Scottish medical schools sat a battery of psychometric tests to measure cognitive ability, personality traits and moral/ethical reasoning (Personal Qualities Assessment, PQA). Analysis determined the potential impact of the latter variables, and those of educational background and socioeconomic class (assessed by residential 'deprivation category'), upon success in gaining a place to study medicine.

RESULTS:

Cognitive ability did not vary significantly as a function of gender or educational background, although there was a trend for it to be lower in individuals from more deprived backgrounds. Women as a group were more empathic, with a greater communitarian orientation, than men. There was no significant difference between individuals attending independent and state-funded schools in respect of any of the qualities measured by the PQA. Applicants from deprived backgrounds and those attending state-funded schools would not be disadvantaged by an admissions process based on the PQA.

CONCLUSION:

The incorporation of an assessment tool such as the PQA may have positive implications for widening access and the objective selection of suitable medical students, resulting in the training of doctors who are more representative of the community at large. A longterm follow-up of the professional careers of those medical students who completed the PQA will be undertaken.

PMID: 15733161


Core personal competencies needed for entering medical students' success: what are they, and how can they be assessed at admission? (Acad Med, 2013)

Core Personal Competencies Important to Entering Students’ Success in Medical School: What Are They and How Could They Be Assessed Early in the Admission Process?

Thomas W. Koenig, MD, Samuel K. Parrish, MD, Carol A. Terregino, MD, Joy P. Williams, Dana M. Dunleavy, PhD, and Joseph M. Volsch, MPA




There is general agreement on the academic competencies that entering medical students should have.

There is general agreement in the medical education community about the academic competencies that medical students should demonstrate when they matriculate.


Although the medical education community has agreed on the personal competencies required at graduation, there is no consensus on those required at entry.

Although the community has agreed on the personal competencies that medical students should demonstrate at graduation,1 it has not reached consensus on those that are important at entry or how to incorporate them into the admission process.


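Albanese and colleagues estimated that more than 87 different personal qualities are assessed during admission. This lack of consensus among schools is surprising given research linking certain personal competencies to positive admission and medical school outcomes.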

Albanese and colleagues2 estimated that more than 87 different personal qualities are assessed during the admission process. This lack of consensus among schools is surprising given that research has linked certain personal competencies to positive admission and medical school outcomes.


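For example, Carrothers and colleagues found that admission committee members frequently cited the following as desirable attributes in prospective medical students.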

For example, Carrothers and colleagues3 found that 

    • having good interpersonal skills, 
    • knowing one’s emotions, 
    • recognizing emotions in others, 
    • possessing the ability to manage one’s emotions in difficult situations, and 
    • being able to motivate oneself 

were frequently cited by medical school admission committee members as desirable attributes for prospective medical students. 


Similarly, Adams and colleagues found that medical school faculty, residents and students named the following as important to success in medical school.

Similarly, Adams and colleagues4 found that demonstrating 

    • motivation, 
    • a desire to learn, 
    • integrity and ethics, 
    • self-management, and 
    • strong interpersonal and 
    • teamwork skills 

were reported by medical school faculty members, residents, and students as being important to success in medical school.



Research has also linked some of these personal characteristics to patient care outcomes and to patients' ratings of their physicians.

Researchers have related some of these personal characteristics and skills to improved patient care outcomes and to patients’ ratings of their physicians.5,6


Related findings include the following.

  • For example, good teamwork and collaboration are correlated with improved patient outcomes, patient satisfaction, and greater job satisfaction among physicians.7 
  • Patients who report being treated with dignity by their physicians are more likely to adhere to treatment plans and to be satisfied with their care.8 
  • Similarly, physicians who “adopt a warm, friendly, and reassuring manner” with their patients are more effective than those who keep consultations formal and do not offer reassurances.9 
  • Recently, Hojat and colleagues10 found that patients of physicians with high levels of empathy have better health outcomes than patients of physicians with moderate and low levels of empathy. 
  • Moreover, when physicians’ personal skills are lacking, negative professional outcomes are likely. 
  • For instance, Papadakis and colleagues11 showed that unprofessional behavior in medical school (e.g., irresponsibility, lack of capacity for self-improvement) predicts later disciplinary action by state medical boards.




The Role of Personal Competencies in Medical School Admissions


Leaders of the AAMC and others in medical education have called for greater emphasis on applicants' personal competencies.

Leaders of the Association of American Medical Colleges (AAMC)12 and others in the medical education community have called for more emphasis to be placed on applicants’ personal competencies in the admission process.


The data show that much of the screening happens before interviews: in 2011 the average applicant applied to 14 medical schools but received fewer than 2 interview invitations.

Data show that a significant part of admission screening takes place before interviews: In 2011, the average applicant submitted 14 applications but received less than 2 interview invitations.13


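To address this, the community must first agree on which personal competencies to measure and on a set of tools that balances the admission community's needs and goals against practical and psychometric issues.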

To meet this challenge, the medical education community must first agree on a universal set of personal competencies to measure as well as a set of tools that balances the needs and goals of the admission community with practical (e.g., cost, accessibility) and psychometric issues.



Defining Core Personal Competencies for Entering Medical Students


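Although there have been attempts to define these competencies (the AAMC explored this in the 1970s and 1990s), there has been little effort to build consensus within the medical education community. The AAMC therefore undertook a rigorous, multiyear effort to identify the core personal competencies.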

Although there have been attempts to systematically define the personal competencies that medical school matriculants should demonstrate on entry (e.g., the AAMC explored this in the 1970s and 1990s),16 there has been little effort to build consensus about these competencies in the wider medical education community. Therefore, the AAMC undertook a rigorous, multiyear process to research and identify core personal competencies for students entering medical school in the 21st century. This process (Table 1) 


16 Etienne PMJ, Julian ER. Assessing the personal characteristics of premedical students. In: Camara WKE, ed. Choosing Students: Higher Education Admissions Tools for the 21st Century. Mahwah, NJ: Lawrence Erlbaum Associates; 2005.





Identifying personal characteristics important to success in medical school


The MR5 Committee conducted two surveys.

The MR5 Committee began the process by conducting two surveys

      • In 2008, U.S. and Canadian admission officers were asked to describe their school’s admission process and to rate the importance of 41 personal characteristics to success in medical school.
      • In 2009, U.S. and Canadian academic affairs officers were asked to rate the importance of 72 characteristics to success in medical school. Data from these two surveys17,18


Developing the set of core personal competencies


Next, the ILWG conducted a job analysis.

Next, the ILWG conducted a multistep job analysis to identify the core set of personal competencies


The following questions were asked about each personal characteristic.

We then asked the following questions about each personal characteristic: 

    • 1. Is this characteristic related to medical student performance, particularly the behaviors associated with success in medical school? 
    • 2. Do students need to display this characteristic at entry into medical school? 
    • 3. Is it reasonable to assume that medical school applicants can demonstrate this characteristic? (Is it developmentally appropriate?) 
    • 4. Is this characteristic fixed, or is it malleable? Is it something that medical education can build on as the student matures and is exposed to new experiences?

A subset was selected on the basis of the answers to these questions.
On the basis of the answers to these questions, we selected a subset of personal characteristics to develop into core personal competencies.


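On average the competencies were rated very to extremely important, but respondents were not satisfied with the quality of the related information available during admission.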

As shown in Table 2, on average, all of the draft personal competencies were rated by admission officials as “very important” to “extremely important.” Respondents were not, however, satisfied with the quality of information available about these competencies during the admission process.




Collecting feedback on the core personal competencies


The ILWG’s recommendation served as the foundation for the AAMC Admissions Initiative. One of that group’s first projects was to review the ILWG’s draft definitions of the recommended core personal competencies.





Approving the nine core personal competencies for entering medical students


In February 2013, the AAMC COA finalized the list of nine competencies.

In February 2013, the AAMC’s COA endorsed the final list of nine core personal competencies for entering medical students (defined in Table 4): 

      • ethical responsibility to self and others; 
      • reliability and dependability; 
      • service orientation; 
      • social skills; 
      • capacity for improvement; 
      • resilience and adaptability; 
      • cultural competence; 
      • oral communication; and 
      • teamwork.





Exploring Tools to Assess the Core Personal Competencies Early in the Admission Process


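Although the ILWG survey suggested that admission officers want tools for assessing personal competencies early in the process, many questions about the use and value of such tools remain unanswered.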

Although the ILWG survey suggested a desire among admission officers for tools that assess applicants’ personal competencies early in the admission process, there are many unanswered questions about the use and value of such measures in medical school admissions.22


More than 50 articles and six unpublished technical reports were reviewed.

We identified more than 50 seminal articles (including several meta-analyses) and six nonpublished technical reports about tools currently used to measure personal competencies in higher education and employment settings. We made subjective, holistic judgments about tools’ potential to provide information on applicants’ core personal competencies for use in the pre-interview screening stage of the admission process. 


Six types of tools were evaluated against eight criteria.

We judged six types of tools according to the following eight criteria: 

    • validity, 
    • reliability, 
    • group differences, 
    • susceptibility to faking and coaching, 
    • applicant reactions, 
    • user reactions, 
    • cost/resource utilization, and 
    • scalability for use in pre-interview screening (Appendix 1).



Situational judgment tests


How SJTs work: examinees choose or give a response, and the format can vary.

In situational judgment tests (SJTs), examinees are asked to indicate how they would (or should) respond to dilemmas presented in text-based, video, or animated scenarios. Response formats vary

      • Examinees may be asked to select from multiple-choice options, 
      • identify the most and least effective responses, 
      • and/or answer open-ended questions. 


SJTs are in use in Canada, Belgium and Israel.

SJTs have been used in medical school admission processes in Canada (the CASPer assessment23), Belgium,24–26 and Israel.27


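The employment literature provides strong evidence for the reliability and validity of SJTs, as does research from Belgium, where an SJT has been used in medical school admission since 1997. Research from the United Kingdom further shows that SJT scores predict ratings of physician performance and add incremental validity beyond a clinical problem-solving test. Applicants also view SJTs favorably.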

The employment literature28 provides strong evidence for the reliability and validity of SJTs, as does research conducted in Belgium,26 where an SJT has been used in the medical school admission process since 1997. Additionally, research from the United Kingdom shows that SJT scores predict competency-based ratings of physician performance and provide incremental validity above and beyond a clinical problem-solving test.29 Further, applicants hold generally positive attitudes about SJTs.30


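There may be small racial/ethnic group differences in SJT performance. However, research from law school admissions indicates that including SJTs may increase the proportion of African American and Latino matriculants.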

There is some evidence suggesting that there may be small racial/ethnic group differences in performance on SJTs that emphasize decision making.31 However, research conducted by the College Board and the Law School Admission Council indicates that including these tests in the admission process may increase the percentage of African American and Latino matriculants compared with using academic data alone, and that performance on SJTs is the best predictor of “lawyering effectiveness.”32,33


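SJTs are somewhat expensive to develop because creating and scoring scenarios requires technical expertise. However, they scale to pre-interview screening, can be given to large numbers of applicants, and produce data that are easy to use.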

SJTs are somewhat expensive to develop because of the technical expertise needed to create and score scenarios. However, they are scalable for use in pre-interview screening because they can be administered to a large number of applicants before the interview. Further, when SJTs are scored, data are presented in a format that is easy to consume.


Standardized evaluations of performance

Raters use a graphic, comparative, or behaviorally anchored rating scale to evaluate applicants on a set of competencies.

In standardized evaluations of performance (SEPs), raters use a graphic, comparative, or behaviorally anchored rating scale to evaluate applicants on a set of competencies.


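Most medical school admission processes use nonstandardized letters of recommendation, which have poor interrater reliability for nonacademic variables, poor predictive validity, and no comparative data. Other graduate and professional programs use SEPs, and in 2009 ETS introduced the Personal Potential Index (PPI).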

Although most medical school admission processes use nonstandardized letters of recommendation—which have poor interrater reliability for nonacademic variables,34 have poor predictive validity, and lack comparative data—other graduate and professional program (e.g., veterinary medicine, optometry, physical therapy) admission processes use SEPs. In 2009, the Educational Testing Service introduced the Personal Potential Index,35 an SEP for use in graduate admissions, but there is no published literature to date on its psychometric properties.


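Research on the Medical Student Performance Evaluation shows small but significant positive correlations between standardized evaluations and clinical performance exam results. Admission officers are likely to view SEPs favorably because raters must include specific behavioral examples illustrating applicants' personal competencies.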

Research on the Medical Student Performance Evaluation shows small but significant observed positive correlations between standardized evaluations and performance on comprehensive clinical performance exams.36 Admission officers are likely to have positive attitudes about SEPs because raters must include specific examples of behaviors illustrating applicants’ personal competencies.37


Raters may be lenient, however, which would reduce the variance in ratings.

There is potential for rater leniency and consequent lack of variance in ratings, though.



Accomplishment records


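Accomplishment records resemble personal statements: they are standardized descriptions of achievements and experiences in which applicants describe behaviors related to important personal competencies.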

Accomplishment records, also known as autobiographical questionnaires, are standardized descriptions of achievements and experiences. Applicants are asked to describe behaviors related to a set of important personal competencies.


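Reliability is best when accomplishment records are collected in proctored settings and scored by multiple raters. Validity data for their use in admissions are not available. Applicants and schools may be lukewarm because of the added workload. Little has been published on unscored accomplishment records, but they are inexpensive to develop and can be administered to large numbers of applicants.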

Reliability is best when accomplishment records are collected in proctored settings and are scored by multiple raters.39 Validity data are not available with respect to their use in admissions. Applicants and users may have lukewarm reactions to them because of the added workload. There is little published research on unscored accomplishment records, but they are inexpensive to develop and can be administered to large numbers of applicants.


Personality and biographical data inventories


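Personality inventories and biographical data inventories ask applicants to rate, usually on a Likert-type scale, how well a series of statements describes them. These tools are relatively inexpensive.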

Personality inventories and biographical data inventories ask applicants to indicate the extent to which a series of statements accurately describe them, typically using a Likert-type response scale. These tools are relatively inexpensive to develop and can be administered to large numbers of applicants.


However, there are concerns about their use in a high-stakes context, chiefly coaching and faking.

However, there are concerns about their use in a high-stakes admission context. A primary concern is the potential for coaching and faking responses.


Applicants of low socioeconomic status who lack access to such coaching may be disadvantaged.

Applicants from low socioeconomic backgrounds who do not have access to such coaching may be at a disadvantage.



Local interviews


Interviews range from unstructured to structured, but most are semistructured.

Interview types range from unstructured to structured, but most medical school interviews are semistructured.


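Interviews have many limitations. Unstructured interviews have poor reliability, and giving interviewers access to applicants' application data introduces bias. Interviews are also subject to rater error, and scores can depend more on the interviewer than on the applicant.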

Local interviews have a number of limitations, however. Reliability for unstructured interviews is poor, and the practice of providing interviewers with access to applicants’ application data introduces bias.42,43 In addition, local interviews are subject to rater error, and ratings may have more to do with the interviewer than the interviewee.43 


Examples of structured interviews include the following.

structured interviews conducted at the University of Iowa Carver College of Medicine,47 and “behavioral event interviews” used by the Scholarly Excellence, Leadership Experiences, Collaborative Training program at the Morsani College of Medicine48


Coaching and faking are a concern here as well.

One concern about interviews is the potential for coaching and faking.



Assessment centers


Assessment centers can use a variety of standardized exercises.

Assessment centers can employ several standardized exercises (e.g., interviews, role-plays, in-baskets, group discussions) to provide multiple opportunities for multiple raters to evaluate applicant behaviors.


They yield rich data but are resource intensive.

Data from assessment centers provide important information about applicants’ personal competencies, but such centers are resource intensive. Thus, it is not feasible to conduct them on a national level to provide data in time for pre-interview screening.



Tools recommended for future study



The recommended tools are SJTs, SEPs and accomplishment records.

After reviewing the literature and evaluating potential tools on the eight criteria, we suggested that the AAMC further investigate three tools for possible use in assessing applicants’ core personal competencies during the admission process: SJTs, SEPs, and accomplishment records. We recommended these tools because each of them


The reasons are as follows.

• provides data about personal competencies in a format that is easy to use and would be available in time for pre-interview screening, 

• allows for multiple sources of assessment, 

• has acceptable validity and is likely to provide predictive value beyond UGPAs and MCAT scores in predicting nonacademic outcomes, 

• demonstrates less potential risk of coaching and faking effects compared with other tools, 

• is likely to be accepted by applicants and admission officers, and 

• avoids exorbitant costs that would likely be passed on to applicants.



No tool is perfect for every situation, so using multiple tools is recommended. SJTs, SEPs and accomplishment records should be used together as a toolbox.

No tool is perfect for all situations, so we recommend that multiple tools be employed to assess personal competencies to enable admission officers to evaluate the information collected (just as they currently consider both UGPAs and MCAT scores in context). SJTs, SEPs, and accomplishment records should be used together—as part of an “admissions toolbox”—along with data on applicants’ academic competencies, in deciding which applicants to interview.


More research on SJTs is needed.

Future research on the use of SJTs in medical school admissions should explore different formats for 

    • presenting scenarios (e.g., actors, avatars), 
    • alternative response formats (e.g., rank order, narrative responses), 
    • validity, and 
    • the impact of coaching/faking on validity and user acceptance.





2 Albanese MA, Snow MH, Skochelak SE, Huggett KN, Farrell PM. Assessing personal qualities in medical school admissions. Acad Med. 2003;78:313–321.


22 Bardes CL, Best PC, Kremer SJ, Dienstag JL. Perspective: Medical school admissions and noncognitive testing: Some open questions. Acad Med. 2009;84:1360–1363.


SJT 23 Dore KL, Reiter HI, Eva KW, et al. Extending the interview to all medical school candidates—Computer-Based Multiple Sample Evaluation of Noncognitive Skills (CMSENS). Acad Med. 2009;84(10 suppl):S9–S12. 


SJT 24 Lievens F, Sackett PR, Buyse T. The effects of response instructions on situational judgment test performance and validity in a high-stakes context. J Appl Psychol. 2009;94:1095–1101. 


SJT 25 Lievens F, Sackett PR. Situational judgment tests in high-stakes settings: Issues and strategies with generating alternate forms. J Appl Psychol. 2007;92:1043–1055. 


SJT 26 Lievens F, Sackett PR. The validity of interpersonal skills assessment via situational judgment tests for predicting academic success and job performance. J Appl Psychol. 2012;97:460–468. 


SJT 27 Ziv A, Rubin O, Moshinsky A, et al. MOR: A simulation-based assessment centre for evaluating the personal and interpersonal qualities of medical school candidates. Med Educ. 2008;42:991–998.


SEP 35 ETS. ETS Personal Potential Index. http://www.ets.org/ppi. Accessed January 25, 2013.


Structured Interview 48 Carney A. Building a better doctor. USF Magazine. 2011:53(3). http://www.magazine.usf.edu/2011-fall/building-a-better-doctor.aspx. Accessed January 14, 2013.



















Acad Med. 2013 May;88(5):603-13. doi: 10.1097/ACM.0b013e31828b3389.

Author information

  • 1Johns Hopkins University School of Medicine, Baltimore, Maryland, USA.

Abstract

Assessing applicants' personal competencies in the admission process has proven difficult because there is not an agreed-on set of personal competencies for entering medical students. In addition, there are questions about the measurement properties and costs of currently available assessment tools. The Association of American Medical Colleges' Innovation Lab Working Group (ILWG) and Admissions Initiative therefore engaged in a multistep, multiyear process to identify personal competencies important to entering students' success in medical school as well as ways to measure them early in the admission process. To identify core personal competencies, they conducted literature reviews, surveyed U.S. and Canadian medical school admission officers, and solicited input from the admission community. To identify tools with the potential to provide data in time for pre-interview screening, they reviewed the higher education and employment literature and evaluated tools' psychometric properties, group differences, risk of coaching/faking, likely applicant and admission officer reactions, costs, and scalability. This process resulted in a list of nine core personal competencies rated by stakeholders as very or extremely important for entering medical students: ethical responsibility to self and others; reliability and dependability; service orientation; social skills; capacity for improvement; resilience and adaptability; cultural competence; oral communication; and teamwork. The ILWG's research suggests that some tools hold promise for assessing personal competencies, but the authors caution that none are perfect for all situations. They recommend that multiple tools be used to evaluate information about applicants' personal competencies in deciding whom to interview.

PMID: 23524928


A Perspective on Medical School Admission Research and Practice Over the Last 25 Years (Teach Learn Med, 2013)

Clarence D. Kreiter and Rick D. Axelson

Department of Family Medicine, University of Iowa Carver College of Medicine, Iowa City, Iowa, USA






Despite the diversity of viewpoints, admission programs are quite conservative. The selection process at North American medical schools has barely changed over the last 25 years.

Despite a wide diversity of viewpoints, admission programs tend to be quite conservative and to perpetuate the professional status quo. In fact, a comparison between current admission practices at North American medical schools and those used 25 years ago suggests that procedures have changed very little.1–4


A more likely explanation lies in conceptual and organizational barriers.

A more likely explanation is that conceptual and organizational barriers have slowed evidence-based progress.



BARRIERS TO THE USE OF RESEARCH FINDINGS


In our view, the factor that has most held back progress is the lack of a shared agenda and conceptual foundation.

In our view, what has most slowed progress has been the lack of a shared agenda and conceptual foundation for guiding scholarship on admission issues.


The first conceptual problem hindering progress is ambiguity about what medical schools want to assess in applicants. The variety of personal qualities assessed across schools indicates little agreement on the desired attributes.

The primary conceptual problem hindering admission process refinement has been ambiguity regarding what medical schools are looking for in candidates. The wide range of personal qualities that are purportedly being assessed by different colleges indicates that there is little agreement regarding the desired attributes.7


The AAMC's Committee on Admissions (COA) added four reasoning and two science core competencies to the ILWG's work.

The AAMC’s Committee on Admissions, building upon the work of the Innovation Lab Working Group, specified four reasoning and two science core competencies 


Reaching agreement on this list of competencies should help resolve this problem and other issues tied to the debate over desired applicant qualities.

Agreement on this list of competencies will likely be an important step toward resolution of this matter and other issues conflated with the debate over desired candidate qualities.


A less recognized obstacle is the complexity of admission goals. The goals are varied and often compete, and each contains numerous subsidiary issues, so the decision environment for selection is multifaceted. This complexity has led some to question whether validity evidence even applies to selection procedures. Although validity theory is neutral about what goals a program sets, the difference between a conservative view of change and advocacy of new evidence-based methods often hinges on whether one believes psychometric validity is biased toward a particular goal or objective.

A less recognized obstacle to evidence-based progress relates to the complexity of admission goals. The broad and often competing concerns for social justice, educational quality, and healthcare outcomes each contain numerous subsidiary issues that ultimately create a multifaceted and complex decision environment. This complexity has led some to question whether validity evidence even applies to selection procedures. Although validity theory is neutral regarding what goals are set for a program, often the difference between a conservative viewpoint toward change and one that advocates new evidence-based methods hinges upon whether one believes that psychometric validity is inherently biased in favor of a particular goal or objective.


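A related obstacle to using evidence-based methods is growing political and legal pressure. Although the selection process must respond to such pressure, we believe scientific validity evidence should be central to defining fairness and efficiency in the legal and political arenas as well.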

A related obstacle for utilizing evidence-based methods stems from the increasingly strong political and legal pressures placed on medical schools. Although it is clear that the admissions process must respond to such pressures, we believe that scientific validity evidence should be instrumental in defining fairness and efficiency in both the legal and political arenas.


Lastly, there is reluctance to evaluate the role of human judgment objectively. The perceptions that validity evidence is irrelevant to evaluating human judgment and that quantitative methods threaten individual decision-making power have led to important research findings being ignored.

Lastly, there has been a reluctance to objectively evaluate the role of human judgment. The perception that validity evidence is irrelevant in evaluating human judgments and that quantitative methods threaten the individual’s decision-making power has led to important research findings being ignored.



A PERSPECTIVE ON THE FIVE AREAS OF INQUIRY


Admission procedures have had a large impact on North American medical education. Effective educational interventions typically yield effect sizes of only about 0.20 or less, whereas evidence-based selection is far more powerful: a well-designed selection procedure can improve performance by more than 1 SD.

Admission procedures have had a profound impact on North American medical education. Although effective educational interventions typically produce only small gains in learning, usually with effect sizes of .20 or less, evidence-based selection is comparatively far more powerful. In fact, when well designed, selection procedures in medical education can achieve performance gains easily exceeding 1 standard deviation.14–16



1. The Interview and Related Techniques


Because no measure's validity can exceed the square root of its reliability, the reproducibility of the preadmission interview score is a necessary condition for validity.

Because the maximum validity displayed by any measure cannot exceed the square root of its reliability, establishing the reproducibility of the preadmission interview score is a necessary precondition for validity.
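The bound stated above can be made concrete with a small calculation. The following is only a minimal sketch: the function name and the reliability values are hypothetical and chosen for illustration, not taken from the article.

```python
import math


def validity_ceiling(reliability: float) -> float:
    """Upper bound on a validity coefficient implied by a measure's reliability."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return math.sqrt(reliability)


# Hypothetical reliabilities, used only to illustrate the bound.
for r in (0.25, 0.49, 0.81):
    print(f"reliability={r:.2f} -> validity can be at most {validity_ceiling(r):.2f}")
```

Under this bound, an interview score with a reliability of 0.25 could not correlate with any outcome above 0.50, however relevant its content.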


Because a comprehensive test-retest type of reliability is most relevant to the interview's utility, a G coefficient from a multifaceted generalizability analysis, rather than an interrater or internal consistency index, is the most theoretically relevant for establishing the interview's validity.

Because a comprehensive test–retest type of reliability is the most relevant for determining the utility of the interview, a G coefficient from a multifaceted generalizability analysis, rather than an interrater or internal consistency index, has the most theoretical relevance for establishing the validity of the interview.


Research shows that the traditional admission interview score has very low overall reproducibility, and studies confirm the validity implications of this low reliability: the interview is ineffective in predicting important outcomes. These findings matter for the majority of North American medical schools, which continue to emphasize the interview. Clearly, the traditional interview should not be an influential component of selection, and using an interview score for the final decision may violate applicants' expectation of fair and valid assessment.

Research suggests that the overall reproducibility of the traditional preadmission interview score is very low, and the validity implications of this low reliability have been confirmed by studies demonstrating that the interview is ineffective in predicting important outcomes.18–22 These findings have significance for the majority of North American medical schools as they continue to maintain a strong emphasis on the preadmission interview.1,2 Clearly, the traditional interview should not be an influential component in selection, and the use of an interview score to make the final decision may violate an applicant’s expectation of fair and valid assessment practice.9


The MMI's most important innovation is its use of OSCE-style measurement. By combining multiple independently rated samples of behavior, it has consistently shown acceptable reliability and promising validity evidence. Although the MMI has not been widely adopted, perhaps because of its procedural complexity, it shows that a reliable nonacademic assessment is feasible. The need for multiple independent measures is clear, but views differ on which behaviors to observe and rate.

The MMI has as its most important innovation the use of objective structured clinical examination–style measurement techniques.23 This well-examined assessment method, using multiple independently rated samples of behavior, has consistently produced summary measures displaying acceptable reliability and promising validity evidence.24–28 Although the MMI has not been widely adopted, perhaps due to the complexity of the measurement process, it does demonstrate the feasibility of generating a reliable nonacademic assessment from an interview-like procedure. Although it is clear that multiple independent measures of performance are needed, there are competing concepts regarding which behaviors should be observed and rated.19,26
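Why multiple independent ratings help can be illustrated with the Spearman-Brown prophecy formula. The article does not name this formula; it is brought in here as an assumption, treating each station as a parallel measurement with the same single-station reliability. The numbers are hypothetical.

```python
def spearman_brown(single_reliability: float, k: int) -> float:
    """Projected reliability of the mean of k parallel measurements."""
    if k < 1:
        raise ValueError("k must be at least 1")
    r = single_reliability
    return k * r / (1 + (k - 1) * r)


# Hypothetical single-station reliability of 0.20 (an illustrative value only).
for stations in (1, 4, 8, 12):
    print(stations, round(spearman_brown(0.20, stations), 2))
```

Under these assumptions a single rating with reliability 0.20 rises to about 0.67 when eight independent stations are averaged, which is the kind of gain the MMI design relies on.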

 


2. Admission Tests


Although admission tests yield the highest predictive coefficients, views conflict on the validity of using them. For example, because the MCAT has been reported to be ineffective in predicting certain clinical skills and achievement outcomes from the later years of medical school, residency, and practice, some argue that it is of limited use in selection.

Despite generating the highest predictive coefficients, there have been conflicting views regarding the validity of using an aptitude/achievement test for selecting applicants to study medicine. For example, because the MCAT has been reported to be ineffective in predicting certain clinical skill and achievement outcome measures from the later years of medical school, residency, and practice, some have maintained that the MCAT is of limited use in selection.31–35


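Meta-analytic validity generalization research shows that the MCAT's predictive power stays relatively consistent across the medical school years and beyond, and that the MCAT predicts clinical skills almost as well as written test-based academic achievement.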

Using meta-analytic techniques, validity generalization research has convincingly demonstrated that the predictive power of the MCAT remains relatively consistent across the medical school years and beyond and that the MCAT tends to predict clinical skills almost as well as written test-based academic achievement outcomes.15


Admission tests, whatever they are called, correlate highly with general intelligence and reasoning ability. This implies that selection based on admission tests also selects on general intelligence.

Admission tests, whether labeled as aptitude or achievement assessments, are highly correlated with general intelligence and the ability to reason.38,39 This implies that selection based on admission testing also selects on general intelligence.


Compared with other selection contexts, medical education admission professionals select from a large pool of highly motivated and intelligent applicants. For this reason, informed selection methods can achieve far more than identifying applicants who will merely pass medical school and licensure requirements.

Compared to other selection contexts, medical education admission professionals are in the enviable position of being able to select from a large pool of highly motivated and intelligent applicants. For this reason, informed selection methods are capable of achieving far more than identifying applicants who will merely pass the requirements of medical school and the associated licensure and certification examinations.42


As discussed in Section 4, the practical implication is that, when pursuing diverse admission program objectives, the most valid models maximize the average MCAT within the constraints imposed by other class composition goals.

As we discuss in Section 4, the practical implication of this conclusion is that when attempting to achieve diverse admission program objectives, the most valid models are those that maximize the average MCAT within the constraints imposed by other class composition goals.43,44


An important issue with the MCAT is its high correlation with uGPA. Multicollinearity between the two measures means the MCAT adds only modestly to the prediction of medical school grades and USMLE scores. It might therefore seem reasonable to drop the MCAT and use uGPA alone, but there are reasons to retain it. The meaning of college grades varies greatly by institution and major, so using uGPA alone would be unfair and could damage the educational process. The MCAT is an effective means of equating uGPA across institutions.

An additional and important interpretive issue concerning MCAT research relates to the fact that admission test scores correlate highly with undergraduate grade point average (uGPA). Multicollinearity between the two measures results in the MCAT contributing only modestly to the incremental variance explained in medical school grades and USMLE scores.14 Although it might therefore seem reasonable to simply use uGPA without MCAT, there are compelling reasons to retain MCAT in the selection process. One of the primary shortcomings of college grades relates to the fact that the meaning of uGPA varies dramatically across undergraduate institutions and majors.45 The variability in standards across institutions suggests that it is unfair, and likely damaging to the educational process, to use uGPA in isolation. Because MCAT is an effective means of equating uGPA across institutions, it also serves a vital role in maintaining the integrity of our educational metrics.46



3. Other Measures of Personal Competencies


Letters of reference are generally required for medical school admission. A spring 2012 survey of U.S. medical school admission deans confirmed that letters are widely used and important for assessing personal competencies. All 99 responding deans reported using them and rated them, on average, as "important" sources of information. Nevertheless, the quality of information in letters remains a concern and appears to depend on the degree of structure given to letter writers.

Letters of reference/evaluation for candidates are generally required for medical school admission. A spring 2012 survey of U.S. medical school admission deans confirmed their widespread use and importance in admission processes for assessing personal competencies.47 All 99 deans responding to the survey (70% of the 142 medical schools) reported using letters of evaluation in their admissions processes and, on average, rated them as “important” sources of information for deciding whom to interview and admit. Nevertheless, the quality of the information provided in letters is a source of ongoing concern. The information quality appears to vary substantially depending upon the degree of structure provided to letter writers.


Nonstandardized letters show low interrater reliability and low predictive validity. Consequently, interest is growing in structured formats that use ratings and narrative descriptions to address core competencies.

Nonstandardized letters of reference have generally been found to have low interrater reliability and little predictive validity.21,48 Consequently, there is considerable interest (75% of the aforementioned medical school admission dean respondents) in moving toward a structured format employing more detailed letter writing instructions and/or the use of ratings and narrative descriptions to address a set of core competencies. The AAMC’s Admission Initiative has provided a sample set of guidelines to support such efforts.49


Personality tests and inventories are another method.

Personality tests/inventories are another method used for gathering information about candidates’ personal competencies.


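Although their average predictive validity is quite low, Tett and Christiansen argue that well-constructed personality tests used under the proper conditions can reach useful levels of predictive validity. The question is then whether the proper tests and conditions can be developed for medical school selection.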

Although their average predictive validity appears to be quite low, Tett and Christiansen50 argued that when well-constructed personality tests are used under the proper conditions, they can achieve useful levels of predictive validity. For our purposes, the issue then becomes whether the proper test(s) and conditions can be developed for selecting medical school applicants with sufficient predictive validity. 


There is skepticism about whether these conditions can be met in medical student selection. Bardes et al. noted that, although there is some evidence linking noncognitive measures to medical school performance, it comes from medical students rather than applicants. Given the high-stakes nature of selection, applicants may give "desirable" answers rather than candid ones.

There is considerable skepticism regarding whether these conditions can be attained in medical student selection processes. As Bardes et al.52 noted, although there is some evidence of links between noncognitive measures and medical student performance, these studies were based on medical students rather than applicants. Given the high stakes nature of the selection process, there is strong incentive for applicants to give the answers desired by an admissions committee rather than answering questions candidly.


Autobiographical essays are a third method.

Autobiographical essays are a third method for collecting information about nonacademic attributes.


Medical schools vary widely in how they collect and score this information, and research at McMaster University suggests that essay scores correlate little with outcome measures, in part because of low interrater reliability.

There appears to be much variation in how medical schools collect and score this information. Research conducted at McMaster University suggests that scores on the essays have little correlation with outcome measures, at least in part, because of their low interrater reliability.22 Subsequent research found that the reliability and predictive potential for these scores was improved by administering the essays in an onsite, proctored, time-controlled environment and by rating the resulting essays using a horizontal scoring method.53


Thus, of these three methods, standardized letters of recommendation written to guidelines currently appear to be the most promising way to assess personal competencies.

Thus at present, standardized letters of recommendation guided by instructions such as those developed by the AAMC’s Admission Initiative appear to be the most promising of these three methods for gathering reliable and valid information about candidates’ personal competencies through admission processes.


Situational judgment tests (SJTs) and Sternberg's successful intelligence test are further options.

Situational judgment tests (mentioned earlier) and Sternberg’s successful intelligence test54 are two examples of ability-based tests that might capture additional information about applicants’ personal competencies. Successful intelligence includes creative and practical skills that enable individuals to envision, evaluate, and implement practical solutions to problems. Using a sample of 793 college students, Sternberg54 showed that such measures substantially improved the prediction of students’ college GPA.



4. The Decision Process



Ultimately, the integrity of the final decision determines the success of the selection process. As assessment standards written for medical education recognize, the validity of the decision procedure is critically important. Yet validity evidence has received little recognition in designing the process.

Ultimately, the integrity of the final decision is what defines the success of the selection process. As recognized by assessment standards created specifically for medical education, the validity of the decision procedure is every bit as important as the validity of the evaluative measures used to describe the applicant.55 Yet, as discussed in the introduction, there has been little recognition of validity evidence in designing the admission process that ultimately generates the final decision.


The final decision requires combining applicant information using either human judgment or a mathematical formula. Over the last decade there has been a national effort to promote holistic techniques, and at the vast majority of North American medical schools an admissions committee makes the final decision. However, there is little evidence for these practices, and research suggests that labor-intensive committee procedures may compromise reliability and validity. Holistic methods may also contribute to admissions being seen as secretive. To promote transparency, validity, and reliability, the research evidence supports actuarial models.

The final decision requires that applicant information be combined using either human judges (i.e., holistic, clinical methods) or a mathematical formula (i.e., statistical, actuarial methods). Over the last decade there has been an effort at the national level to promote the use of holistic techniques, and the admissions committee is currently used by the vast majority of North American medical schools to make the final admission decision.4,56 Unfortunately, there is little evidence to support these practices, and the research evidence suggests that labor-intensive admission committee procedures may ultimately compromise both the reliability and validity of the admission process.57 The use of holistic methods might also contribute to admissions being characterized as “secretive” at many institutions.58 For promoting transparency, validity, and reliability, the research evidence clearly supports the use of actuarial models over holistic methods.59 Actuarial models can take many forms and can be tailored to meet the specific goals of the college. A recently introduced actuarial model that uses constrained optimization methods has been shown to be efficient and flexible in responding to unique and varied admission goals and may be capable of enhancing diversity.43,44
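The idea of an actuarial rule that maximizes an average score within class composition constraints can be sketched in a few lines. This is not the model from references 43 and 44; it is a toy version with a single hypothetical constraint (a minimum number of admits from one flagged group), and the `Applicant` fields, names, and scores are all invented for illustration. Applicant names are assumed to be unique.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Applicant:
    name: str
    score: float           # e.g. an admission test score
    in_target_group: bool  # hypothetical class-composition flag


def select_class(applicants, class_size, min_target):
    """Choose class_size applicants with the highest mean score, subject to
    admitting at least min_target members of the target group."""
    target = sorted((a for a in applicants if a.in_target_group),
                    key=lambda a: a.score, reverse=True)
    if min_target > len(target) or min_target > class_size or class_size > len(applicants):
        raise ValueError("constraints cannot be satisfied")
    # Reserve seats for the best-scoring members of the target group first.
    reserved = target[:min_target]
    reserved_names = {a.name for a in reserved}
    # Fill the remaining seats with the best-scoring applicants left over.
    rest = sorted((a for a in applicants if a.name not in reserved_names),
                  key=lambda a: a.score, reverse=True)
    return reserved + rest[:class_size - min_target]


pool = [
    Applicant("A", 520, False), Applicant("B", 515, False),
    Applicant("C", 510, True), Applicant("D", 505, False),
    Applicant("E", 500, True), Applicant("F", 495, True),
]
print(sorted(a.name for a in select_class(pool, class_size=3, min_target=2)))
```

With one lower-bound constraint this greedy rule gives the highest attainable mean score; several interacting constraints would call for a general integer programming solver instead. Unlike committee deliberation, the rule is explicit and reproducible, so the cost of a composition goal in average score can be read off directly.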



5. Defining and Measuring the Criterion


Regrettably, few resources have gone into developing outcome measures over the last 25 years. Most studies acknowledge that many elements define a successful student or physician, yet for reasons of measurement precision and convenience they settle for grades, licensure scores, and certification.

Regrettably, over the last 25 years, there have been few resources allocated toward the development of outcome measures. Most studies acknowledge that there are many elements that define a successful student or physician, but for reasons of measurement precision and convenience, settle for grades, licensure scores, and/or successful certification.


One overlooked approach is to build a composite from the many available but imperfect indicators. Statistical insight suggests that if the components all vary in the same direction as what we define as a successful physician, combining them is likely to yield a reliable and valid composite.

One overlooked method for building an index that could possibly reflect physician success might be found in a composite measure constructed from a large set of available but imperfect indicators. Statistical insights suggest that if the components of a composite measure all vary in the same direction as what we are trying to define/predict (a successful physician), a reliable and valid composite is likely to evolve.


One might ask how such diverse components can be combined without knowing each one's relative contribution. Statistical research shows that, although some components may be more important and more reliable than others, a simple sum of standardized scores correlates almost perfectly with an optimally weighted composite.

One might wonder how weights for such diverse components could possibly be generated when we have so little knowledge about each indicator’s relative contribution in defining the construct. Fortunately, statistical research demonstrates that this problem is easily solved. Although it is reasonable to regard some of these component indicators as more important and reliable than others, statistical evidence clearly indicates that a sum of standardized scores (simple unit weighting) will generate a composite almost perfectly correlated with any optimally weighted composite we might derive with more complete information.60
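The unit-weighting claim can be checked with a small simulation. Everything here is an assumption made for illustration: four indicators are generated as one shared latent trait plus noise of differing size, and the weights 4:3:2:1 stand in for a hypothetical "optimal" weighting that would not be known in practice.

```python
import random
import statistics


def zscores(values):
    """Standardize a list of numbers to mean 0 and (population) SD 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def pearson(x, y):
    """Pearson correlation computed from standardized scores."""
    return sum(a * b for a, b in zip(zscores(x), zscores(y))) / len(x)


def composite(indicators, weights):
    """Weighted sum of standardized indicators, one value per person."""
    standardized = [zscores(column) for column in indicators]
    n = len(indicators[0])
    return [sum(w * column[i] for w, column in zip(weights, standardized))
            for i in range(n)]


def simulate(n=2000, seed=0):
    rng = random.Random(seed)
    noise_sds = [0.5, 0.8, 1.0, 1.2]  # hypothetical indicator quality
    latent = [rng.gauss(0, 1) for _ in range(n)]
    indicators = [[t + rng.gauss(0, sd) for t in latent] for sd in noise_sds]
    unit = composite(indicators, [1, 1, 1, 1])      # simple unit weighting
    weighted = composite(indicators, [4, 3, 2, 1])  # stand-in "optimal" weights
    return pearson(unit, weighted)


print(round(simulate(), 3))
```

Under these assumptions the two composites correlate well above 0.9, even though the weights differ by a factor of four. The result depends on the indicators being positively correlated with one another, which matches the condition in the text that all components vary in the same direction.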



19. Axelson RD, Kreiter CD. Rater and occasion impacts on expected preadmission interview reliability. Medical Education 2009;43:1198–202.


26. Ziv A, Rubin O, Moshinsky A, Gafni N, Kotler M, Dagan Y, et al. MOR: A simulation-based assessment centre for evaluating the personal and interpersonal qualities of medical school candidates. Medical Education 2008;42:991–8.


43. Kreiter CD. The use of constrained optimization to facilitate admission decisions. Academic Medicine 2002;77:148–51.


44. Kreiter CD, Stansfield B, James PA, Solow C. A model for diversity in admissions: A review of issues and methods and an experimental approach. Teaching and Learning in Medicine 2003;15:116–22.


48. Siu E, Reiter H. Overview: What’s worked and what hasn’t as a guide towards predictive admissions tool development. Advances in Health Sciences Education 2009;14:759–75.


49. Association of American Medical Colleges. Guideline for writing letter of evaluation for a medical school applicant. 2011. Available at: https://www. aamc.org/initiatives/admissionsinitiative/letters./ Accessed July 15, 2013.


54. Sternberg RJ. Assessing students for medical school admissions: Is it time for a new approach?. Academic Medicine 2008;83(Suppl 10):S105–10.

59. McGaghie WC, Kreiter CD. Holistic versus actuarial student selection. Teaching and Learning in Medicine 2005;17:89–91.






Teach Learn Med. 2013;25 Suppl 1:S50-6. doi: 10.1080/10401334.2013.842910.

A perspective on medical school admission research and practice over the last 25 years.


Abstract

Over the last 25 years a large body of research has investigated how best to select applicants to study medicine. Although these studies have inspired little actual change in admission practice, the implications of this research are substantial. Five areas of inquiry are discussed: (1) the interview and related techniques, (2) admission tests, (3) other measures of personal competencies, (4) the decision process, and (5) defining and measuring the criterion. In each of these areas we summarize consequential developments and discuss their implication for improving practice. (1) The traditional interview has been shown to lack both reliability and validity. Alternatives have been developed that display promising measurement characteristics. (2) Admission test scores have been shown to predict academic and clinical performance and are generally the most useful measures obtained about an applicant. (3) Due to the high-stakes nature of the admission decision, it is difficult to support a logical validity argument for the use of personality tests. Although standardized letters of recommendation appear to offer some promise, more research is needed. (4) The methods used to make the selection decision should be responsive to validity research on how best to utilize applicant information. (5) Few resources have been invested in obtaining valid criterion measures. Future research might profitably focus on composite score as a method for generating a measure of a physician's career success. There are a number of social and organization factors that resist evidence-based change. However, research over the last 25 years does present important findings that could be used to improve the admission process.

PMID: 24246107







Undergraduate Medical Education and the Foundation of Physician Professionalism (JAMA, 2015)


Darrell G. Kirch, MD, Association of American Medical Colleges, Washington, DC.

Maryellen E. Gusic, MD, Association of American Medical Colleges, Washington, DC.

Cori Ast, MHSA, Association of American Medical Colleges, Washington, DC.





Professionalism is defined as follows.

Professionalism is “the demonstrated commitment to carrying out professional responsibilities and an adherence to ethical principles.”1


Although there is debate over how professional organizations should ensure professionalism among practicing physicians, in undergraduate medical education a shared governance model provides the framework for developing this critical competency.

Although there is current controversy regarding how diverse professional organizations should ensure professionalism among practicing physicians, during undergraduate medical education a shared governance model, as described below, provides the framework for developing and assessing this critical competency.


The Foundation of Professionalism Begins Before Medical School


Through years of personal experience, aspiring physicians build the preprofessional foundation for competence in professionalism, which makes it important to assess preprofessional attributes in medical school admissions.

Aspiring physicians, through many years of personal experiences prior to medical school, establish the “preprofessional” foundation for competence in professionalism, making it important to assess preprofessional attributes in medical school admissions.


Nine core interpersonal and intrapersonal competencies have been articulated to ground these decisions.

Providing the groundwork for these decisions, 9 core interpersonal and intrapersonal competencies have been articulated for entering medical students: 

    1. ethical responsibility to self and others; 
    2. reliability and dependability; 
    3. service orientation;
    4. social skills; 
    5. capacity for improvement; 
    6. resilience and adaptability; 
    7. cultural competence; 
    8. oral communication; and
    9. teamwork.2

These personal competencies have been shown to predict success at the majority of medical schools, both in clinical rotations and in later practice.

Importantly, these personal competencies for entering students have been shown to be predictive of success at the majority of medical schools, both in clinical rotations and later in practice.2


The experiences that develop these preprofessional competencies arise first in the student's family and community and are reinforced in K-12, undergraduate, and (in some cases) graduate education. In addition, a growing number of applicants are "nontraditional", with prior work or service experience that enhances their preprofessional attributes.

The experiences that contribute to the development of these preprofessional competencies occur initially within the context of a student’s family and community and, hopefully, are reinforced by experiences in K-12, undergraduate, and (in some cases) graduate education. In addition, increasing numbers of applicants are “nontraditional,” having had work or service-oriented experiences prior to medical school that enhance preprofessional attributes.


Admission to Medical School Reinforces the Commitment to Professionalism


Evaluating preprofessional competencies is central to the "holistic review" of medical school applicants.

The evaluation of preprofessional competencies is central to the “holistic review” of medical school applicants.


Central to the admissions process is the MCAT, which was recently revised to emphasize the broader portfolio of skills that 21st-century physicians need.

Central to the admissions process, the Medical College Admissions Test (MCAT) has recently been revised to emphasize the broader portfolio of skills required by physicians in the 21st century. 


The revised MCAT has been accompanied by other changes in the admissions process, including guidelines for letters of recommendation and a standardized application form that asks applicants to document relevant personal experiences in addition to coursework.

The revised MCAT has been coupled with other changes in the admissions process, including the creation of guidelines for letters of recommendation to ensure inclusion of information about the core competencies for entering medical students and, in the standardized application form, asking applicants to document relevant personal experiences in addition to their coursework.


Although still at an early stage of implementation, MMIs appear to predict future performance in medical school.

In early stages of implementation, MMIs appear predictive of future performance in medical school.4


Although still under assessment in the United States, an SJT has been used for medical school admission in Belgium since 1997.

Although still under assessment in the United States, an SJT has been used by Belgium for medical school admission since 1997.2



Promoting Professionalism Within Undergraduate Medical Education


It is increasingly recognized that professionalism is not formed by role-modeling in clinical settings alone; it must be taught early, longitudinally, and deliberately through targeted instruction and experiential learning. The LCME accreditation standards formalize this by requiring medical schools to maintain a learning environment that cultivates professionalism.

Increasingly, it is recognized that professionalism is not cultivated solely by role-modeling in clinical settings but rather that professionalism must be taught early, longitudinally, and deliberately using both targeted instruction and experiential learning. The LCME accreditation standards formalize the institutional responsibility by requiring that medical schools maintain a learning environment that cultivates the development of professionalism among learners.


Importantly, responsibility for students' professional development is shared among administrators, faculty, and students.

Importantly, this responsibility for students’ professional development is shared with administrators, faculty, and students serving on various administrative committees.


To assess professionalism, more than half of U.S. and Canadian medical schools rely on the following.

To support these assessments, more than half of medical schools in the United States and Canada rely on 


“defined, written standards of non-cognitive behavior, [including] 

    • honesty; 
    • professional behavior; 
    • dedication to learning; 
    • professional appearance; 
    • respect for law and others; 
    • [and adherence to standards related to] confidentiality; and 
    • [lack of issues related to] substance abuse,” 


in addition to the academic standards for promotion established by each school.6 


Assessing professionalism may be less precise than assessing knowledge of biochemistry, but a variety of tools exist.

Although assessing competence in professionalism may be less precise than assessing competence in the knowledge of biochemistry, today there are tools to assess professionalism in students, including 

  • patient evaluations, 
  • self and peer assessments, 
  • behavioral observation, 
  • psychological testing, and 
  • even structured examinations.6

Promotion committees are formally responsible for adherence to standards, but identifying deficiencies in professionalism is a shared obligation.

While promotion committees have formal responsibility for adherence to standards, identifying deficiencies in professionalism is a shared obligation among individual faculty educators and extends throughout medical school.



Shared Governance Is a Successful Model for Professional Formation

The saying that "it takes a village to raise a child" also applies to educating a physician.

The dictum that “it takes a village to raise a child” also appears to be true of educating a physician. Although the ultimate responsibility for professionalism rests with the physician aspirant, multiple parties are involved in shaping the preprofessional attributes of aspiring physicians, as well as those in the undergraduate process of training, including 

committees for admissions, curriculum, and progression that engage administrators, faculty, and students in the oversight, development, and assessment of professionalism. Affiliated organizations, including the LCME and AAMC, play a significant role in setting standards and providing tools related to teaching and assessing professional development.



4. Pau A, Jeevaratnam K, Chen YS, Fall AA, Khoo C, Nadarajah VD. The Multiple Mini-Interview (MMI) for student selection in health professions training – a systematic review. Med Teach. 2013;35(12):1027-1041.



6. Boon K, Turner J. Ethical and professional conduct of medical students: review of current assessment measures and controversies. J Med Ethics. 2004;30(2):221-226.





JAMA. 2015 May 12;313(18):1797-8. doi: 10.1001/jama.2015.4019.

Undergraduate medical education and the foundation of physician professionalism.

PMID: 25965213


Selecting Tomorrow’s Physicians: The Key to the Future Health Care Workforce (Acad Med, 2013)

Kelly E. Mahon, MA, Mackenzie K. Henderson, and Darrell G. Kirch, MD





Recent health care reform in the United States has three aims: improving health care for individuals, improving population health, and lowering costs. To achieve them, physicians, who have traditionally practiced with considerable autonomy, are increasingly required to become members of team-based care models.

Recent health care reform efforts in the United States have focused on the “triple aim”1 of improving health care for individuals, improving population health, and lowering costs. Physicians, who traditionally have practiced with considerable autonomy, will be required to become members of the team-based patient care models that are necessary to achieve these goals.



Medical School Admissions: A Historical Legacy



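When Flexner evaluated medical schools across North America in the early 20th century, he criticized the laxity of premedical education requirements and medical school admission procedures. He described the proliferation of substandard medical schools in the United States as fertile ground for unforeseen harm to medical education and medical practice.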

When Flexner traveled across North America in the early 20th century, he decried the lack of rigor in premedical education requirements and medical school admission processes, describing the proliferation of substandard medical schools in the United States as “the fertile source of unforeseen harm to medical education and to medical practice.”3 


A key legacy of the Flexner Report is the principle that future physicians should have a minimum level of knowledge in the basic sciences.

The key enduring legacy of the Flexner Report is its argument that future physicians should possess a minimum threshold of knowledge in the basic and natural sciences.4 


The MCAT exam has been used not only to assess applicants' mastery of scientific knowledge but also, together with GPA, to predict performance in medical school and on licensing examinations.

The MCAT exam has become the tool of choice not only to measure medical school applicants’ mastery of scientific content, in conjunction with their grade point averages, but also to act as a reliable predictor of success in medical school and initial licensure examinations.5 


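There has long been a desire to assess applicants by criteria beyond GPA and MCAT scores, and by the early 1980s the live interview had emerged as a way to evaluate an applicant as a person rather than merely as a prospective scholar. Although recent improvements are strengthening the interview as a screening tool, interviews have traditionally been regarded as a weak, subjective, and inconsistent instrument.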

There long has existed a clear need to assess applicants beyond their grades and MCAT scores. By the early 1980s, live interviews emerged as a tool to help admissions officers get to know an applicant as a person and not merely as a scholar.6 Although recent innovations, as we will discuss below, are showing great promise to strengthen the interview as a screening tool,7 interviews traditionally have been a relatively weak, subjective, and inconsistent means by which to assess medical school applicants.8



Selecting the Future Physician Workforce: A Key to Health Care Reform


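The admission system built in response to Flexner's findings has succeeded in giving 20th- and 21st-century physicians a solid foundation in the traditional natural sciences. Paradoxically, it has been weak at identifying innovative physicians who can transform the health care system.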

Although the admission system created in response to Flexner’s findings has been successful in ensuring that 20th- and 21st-century physicians are grounded in the natural and traditional life sciences, it has fallen short in identifying the innovative physicians who can transform the health care system.


Health care in the United States is expensive, yet its outcomes are poor.
The United States has the highest health care spending when compared with similar developed nations, yet it has poor outcomes on numerous measures, including life expectancy, infant mortality, and obesity.9

Moreover, health disparities in the United States are notoriously severe.

In addition to high costs and poor outcomes, the United States suffers from pernicious health disparities along the lines of race, ethnicity, and geographic location.12 


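In February 2013, more than 100 leaders of medical schools and teaching hospitals met to discuss the unsustainability of current health care costs. The consensus was that building a truly high-value health system takes more than raising revenue and cutting expenses; it requires a true redesign.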

In February 2013, more than 100 leaders of medical schools and teaching hospitals convened at a summit hosted by the Association of American Medical Colleges (AAMC) to address the unsustainability of current health care costs. A consensus emerged that creating a truly high-value health system will require more than revenue expansion and expense reduction; it will entail a true redesign.15


Physicians must be capable of systems-based thinking and of working in teams to lead positive change in the nation's health care system.

Physicians must have the capacity to engage in systems-based thinking and work in teams to lead positive change in the nation’s health care system.




Rethinking Medical School Admissions: A Confluence of Factors


Several factors have converged to shape thinking about medical school admissions.

Several major factors have converged to influence thinking about medical school admissions. 

  • national debate surrounding health care reform 
  • passage of the Patient Protection and Affordable Care Act
  • issues regarding professionalism 

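With these factors converging, the AAMC saw an opportunity for a broader reform of the admissions process that goes beyond its regularly scheduled review of the MCAT exam. It launched the Admissions Initiative (AI), whose aims are as follows.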

In this confluence of factors, the AAMC recognized an opportunity to consider a broader transformation of the medical school admissions process beyond its regularly scheduled review of the MCAT exam.19 The association launched its Admissions Initiative (AI), aimed at transforming the way in which medical school applicants are assessed and selected in order to identify those who will become the kinds of physicians best suited to practice in a dynamic health care environment. Specifically, the AI is designed to 

  • support the implementation of holistic admissions, 
  • explore ways to ease the transition to competency-based learning and assessment in undergraduate medical education, and 
  • examine new and better ways to measure core, entry-level competencies for medical students.20 

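The AAMC and the nation at large increasingly recognized that the admission system that had served medical schools for the past 100 years should be improved to identify future physicians who have not only a solid foundation in the natural sciences but also a good bedside manner: a high degree of professionalism, well-honed communication skills, and the ability to understand and interact with future patients.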

There was increased recognition at the AAMC and nationally that the admission system that had served medical schools well for the past century could be improved to identify those future physicians with both a strong foundation in the natural sciences and a “good bedside manner,” that is, a high degree of professionalism, well-honed communication skills, and an ability to interact with and understand their future patients.21–23



Supporting Holistic Admissions


Holistic admissions is a core element of the AI and is defined as follows.

Holistic admissions, an integral component of the AI, refers to a “flexible, highly individualized process by which balanced consideration is given to the multiple ways in which applicants may prepare for and succeed as medical students and doctors.”24


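This process follows the U.S. Supreme Court's holistic review rubric, established in 2003, which requires that each applicant's individual review consider how that applicant might contribute to the diversity of the educational environment. Evaluation criteria for holistic review must be mission driven, broad based, institution-specific, and applied consistently across the whole applicant pool. It has three goals.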

This process complies with the U.S. Supreme Court’s “holistic review” rubric, which was established in 2003 by Grutter v. Bollinger, and calls for an individualized review of each applicant that considers how that applicant might contribute to a diverse educational environment. Evaluation criteria for a holistic review process must be mission driven, broad based, institution-specific, and applied across the applicant pool consistently.25 Holistic review has three goals: 

  • to assess applicants’ academic readiness for medical school, 
  • to identify and assess applicants’ interpersonal and intrapersonal competencies, and 
  • to encourage diversity in medical education.




Redefining academic readiness


Defining competencies in medical education

To define medical education competencies, two working groups identified the skills and knowledge that future physicians should possess on entry to or completion of medical school. 

  • Issued in 2009, “Scientific Foundations for Future Physicians” 
  • The companion report, “Behavioral and Social Science Foundations for Future Physicians,”

More recently, the AAMC and five other associations formed IPEC.

More recently, the AAMC and five other health associations representing schools of osteopathic medicine, dentistry, nursing, pharmacy, and public health jointly created the Interprofessional Education Collaborative (IPEC).


IPEC defined interprofessional competencies and published a report.

This group initially defined four interprofessional competencies that health professions students should acquire over the course of their training: 

  • values and ethics, 
  • understanding roles and responsibilities, 
  • interprofessional communication, and 
  • teamwork. 

The result of IPEC’s efforts, “Core Competencies for Interprofessional Collaborative Practice,” represents the first time consensus has been reached about competencies required for team-based practice in a variety of settings, including in the clinic and at the bedside.27


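Because the MCAT is an important tool in medical student selection, the MR5 committee recommended revisions to the MCAT exam beginning in 2015. The most prominent change is the inclusion of concepts from the behavioral and social sciences.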

In recognition of the MCAT exam’s status as an important tool for medical student selection,2 the fifth MCAT review (MR5) committee recommended, and the AAMC Board of Directors approved in February 2012, revisions to the MCAT exam beginning in 2015.19,28–30 One of the most prominent changes is that the 2015 exam will add a section that tests knowledge of concepts from the behavioral and social sciences to complement testing in the basic and natural sciences.31


Understanding behavior, perception, culture, poverty, and other concepts from psychology and sociology will help produce the 'good doctor.'

An understanding of behavior, perception, culture, poverty, and other concepts from psychology and sociology included on the new MCAT exam contributes to the creation of the “good doctor.”32


The 2015 MCAT adds a new Critical Analysis and Reasoning Skills section.

The 2015 MCAT exam also adds a “Critical Analysis and Reasoning Skills” section, which is designed to help medical schools assess how applicants reason.29


The new section reflects the view that, in today's big-data environment, the ability to find and reason through information matters more than rote memorization.

The new MCAT section reflects the understanding that, in today’s environment of big data, students’ ability to seek and reason through information is more important than their capacity for rote memorization.



Identifying and assessing interpersonal and intrapersonal competencies


The second goal of holistic review is to identify applicants who will become well-rounded physicians. The AAMC identified the desirable competencies as follows.

Holistic review’s second goal is to identify applicants who possess the traits, experiences, and attributes that will lead them to become well-rounded physicians. In 2013, the AAMC identified the most desirable interpersonal and intrapersonal competencies for entering medical students34,35 (see Table 1).





In April 2013, the AAMC issued standardized guidelines for letters of recommendation. They advise writers to assess, rather than advocate for, the applicant's suitability for medical school and to focus on specific observed behaviors and their consequences.

In April 2013, the association issued standardized guidelines to aid writers of letters of recommendation. These new guidelines recommend that evaluators assess rather than advocate for the applicant’s suitability for medical school, and focus on specific observed behaviors and their consequences when writing letters of recommendation.38


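The AAMC is also considering two further methods to help assess applicants' interpersonal and intrapersonal competencies. The first is to have applicants describe, in AMCAS, their reflections on their own interpersonal and intrapersonal competencies.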
Additionally, the AAMC is considering two other methods to help medical school admission committees assess students’ interpersonal and intrapersonal competencies. The first is a potential revision to the American Medical College Application Service (AMCAS) to include a “Reflections on Interpersonal and Intrapersonal Competencies” section, where applicants would be prompted to reflect on experiences in which they have demonstrated some or all of these competencies.


The second is assessment with a situational judgment test (SJT), described as follows.

Secondly, the AAMC is exploring the development of a situational judgment test (SJT) as another tool to probe applicants’ interpersonal and intrapersonal competencies.39 SJTs, which “confront applicants with written or video-based scenarios and ask them to indicate how they would react by choosing an alternative from a list of responses,” have shown great promise in identifying interpersonal skills.40


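As noted earlier, new interview techniques are emerging. The MMI was first introduced at McMaster's medical school and is now used by most Canadian medical schools and by more than 22 U.S. medical schools.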

As mentioned earlier, new interview techniques are emerging to allow medical schools to probe better dimensions of applicants’ competencies, ranging from how applicants respond to novel situations to their reactions to an ethical conflict. The multiple mini-interview (MMI) was pioneered by the Michael DeGroote School of Medicine at McMaster University and is now employed by the majority of Canadian medical schools and more than 22 U.S. medical schools.7,39





Supporting diversity in medical education


This change will require tomorrow's physicians to have a high degree of cultural competence, which is defined as follows.

This change will require that tomorrow’s physicians possess a high degree of cultural competence, which has been defined as “a set of congruent behaviors, knowledge, attitudes, and policies that come together in a system, organization, or among professionals that enables effective work in cross-cultural situations.”43


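Page showed that groups made up of people from varied backgrounds are better at problem solving and outperform any individual. In addition, medical students value diversity among their classmates and report that it enhances both their academic experience and their ability to work with patients.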

Page44 has shown that diverse groups of people from varied backgrounds do better at problem solving and, in many ways, are smarter than any individual. Further evidence shows that “students in medical schools value diversity in their classmates and find both the academic experiences and their abilities to work with patients from differing backgrounds enhanced by this diversity.”45


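A century ago, selecting students with a strong scientific background through standardized testing was seen as the definitive answer to the problems revealed by the Flexner Report. A hundred years on, however, circumstances are changing.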

A century ago, the academic medicine community concluded that providing physicians with a rich scientific background, verified through the use of standardized tests, was the definitive answer to addressing the problems revealed by the Flexner Report. As the last 100 years have demonstrated, however, changing circumstances in the health care landscape necessitate constant transformation.









23 Swick HM. Toward a normative definition of medical professionalism. Acad Med. 2000;75:612–616.

24 Addams AN, Bletzinger RB, Sondheimer HM, White SE, Johnson LM. Roadmap to Diversity: Integrating Holistic Review Practices Into Medical School Admission Processes. Washington, DC: Association of American Medical Colleges; 2010. https://members.aamc.org/eweb/upload/Roadmap%20to%20Diversity%20Integrating%20Holistic%20Review.pdf. Accessed August 21, 2013.

25 Witzburg R, Sondheimer H. Holistic review: Shaping the profession of medicine one applicant at a time. N Engl J Med. 2013;368:1565–1567. http://www.nejm.org/doi/pdf/10.1056/NEJMp1300411. Accessed August 21, 2013.

27 Interprofessional Education Collaborative Expert Panel. Core Competencies for Interprofessional Collaborative Practice: Report of an Expert Panel. Washington, DC: Interprofessional Education Collaborative Expert Panel; 2011. http://www.aacn.nche.edu/education-resources/ipecreport.pdf. Accessed August 21, 2013.

38 Association of American Medical Colleges. Letters of evaluation guidelines. https://www.aamc.org/initiatives/admissionsinitiative/332572/lettersofevaluationguidelines.html. Accessed August 21, 2013.











Acad Med. 2013 Dec;88(12):1806-11. doi: 10.1097/ACM.0000000000000023.

Selecting tomorrow's physicians: the key to the future health care workforce.

Author information

  • 1Ms. Mahon is a speechwriter, American Nurses Association, Silver Spring, MD. At the time of writing, she was senior executive communications specialist, Association of American Medical Colleges, Washington, DC. Ms. Henderson is senior engagement solutions specialist, Association of American Medical Colleges, Washington, DC. At the time of writing, she was research and policy analyst to the president, Association of American Medical Colleges. Dr. Kirch is president and CEO, Association of American Medical Colleges, Washington, DC.

Abstract

Recent U.S. health care reform efforts have focused on three main goals: improving health care for individuals, improving population health, and lowering costs. Physicians, who traditionally have practiced with considerable autonomy, will be required to become members of the team-based patient care models necessary to achieve these goals. In this perspective, the authors assert that medical school admissions, the selection of the future physician workforce, is a key component of health care reform. They review the historical context for medical school admission processes, which have placed a premium on grades and standardized test scores, and examine how admission practices are undergoing fundamental changes in order to select physicians with both the academic and interpersonal and intrapersonal competencies necessary to operate in the health care system of the future. The authors describe how new techniques, such as holistic review and multiple mini-interviews, are contributing to the shift toward competency-based medical education. Innovations underway at the Association of American Medical Colleges to transform medical school admissions also are explored. The authors conclude by arguing that although the admission process has great potential to transform the future health care workforce, major overhauls of the health care payment and delivery systems must be achieved alongside innovations in health professions education to truly transform the U.S. health care system.

PMID: 24128626 [PubMed - indexed for MEDLINE]



Twenty Questions game performance on medical school entrance predicts clinical performance (Med Educ, 2015)

Reed G Williams1 & Debra L Klamen2

What does it mean to have knowledge? There are four kinds:

- knowing who

- knowing what

- knowing how

- knowing when

White,2 in a careful analysis of what it means to possess knowledge, concluded that knowledge is manifested as ability and that ability takes many forms. The more typical and easily measured forms of knowledge involve knowing...

      • who (e.g. who invented the transistor) or 
      • what (e.g. what significant event occurred in New York City on 11 September 2001). 


However, knowing also includes knowing 

      • how (e.g. how to ride a bike) and 
      • when (e.g. when to add elements to a mixture in a process). 


Knowing is not merely the ability to repeat previously known answers; it also means the facility to find answers to new problems.

Most importantly, White argued that knowing does not merely involve the ability to produce old answers previously acquired, but also the facility to find new answers to new problems.2



Prior knowledge can be combined into at least five forms (facts, concepts and principles, procedures, strategies, and beliefs).

To cope successfully with the tasks and demands of everyday life, humans must be proficient in combining previously learned knowledge, skills and attitudes (beliefs) into at least five forms: 

  • facts (bits of information); 
  • concepts and principles (e.g. knowledge of cause-and-effect relationships); 
  • procedures (e.g. knowledge of step-by-step processes regarding how and when to carry out an action, such as in how and when to carry out long division computations); 
  • strategies (general methods for approaching problems such as by breaking a problem into parts), and 
  • beliefs (e.g. stable attitudes that lead to predictable behaviours, such as beliefs about the factors that lead to patient behaviour changes). 

Tasks are carried out by drawing on various combinations of these five elements.

All five fit White’s2 construction of knowledge as ability. Various combinations of these five elements are compiled and drawn upon to meet the various tasks and demands placed on people.


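This study starts from the premise that performance in the Twenty Questions game reveals the knowledge people have gained in everyday life and how well they have organized and stored it so that it can be retrieved, combined, and used efficiently and effectively.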

The present study is based on the premise that the TQ parlour game tests the knowledge people have acquired in the course of their everyday lives and how well their organising and storing of that knowledge allows them to efficiently and effectively retrieve, combine and use it to address the challenges posed in everyday life.


A possible further benefit is that the task can serve as a measure of perseverance: some students asked all 20 questions even without reaching the answer, whereas others gave up partway.

A further benefit of this task is that it provides a measure of perseverance. Students who fail to solve the problem posed and quit before they have used their entire quota of questions may be providing evidence of low perseverance, which may be a negative indicator of potential success as a medical student.




Study design

This was a prospective, longitudinal, observational cohort study. All students entering Southern Illinois University School of Medicine in 2009 were invited to play a single game of TQ on a non-medical topic during the first week of medical school.



Description of TQ tasks and game process

Each participating entering student played a single game of TQ in a one-to-one encounter with the investigator at the time of his or her orientation to medical school. The TQ tasks posed were based on non-medical knowledge acquired through normal life experiences. A number of objects (correct answers) were selected in advance. The object (correct answer) for each participant was selected using a random selection process.


The investigator kept an essentially verbatim record of the number and nature of the questions asked and guesses offered by the student. The investigator also kept detailed notes about the strategies used in playing the game.


Approaches to questioning were classified into four groups.

Based on the notes taken about student performance on the TQ task, a second investigator, blinded to SCCX and diagnosis justification (DJ) performance, classified the performance into one of four groups based on the student’s approach to the task: Essentially random; Somewhat random; Somewhat logical, and Logical.


A logical approach looks like this.

These performances: 

(i) started out with broad questions in order to define the right path; 

(ii) built new questions based on previous answers, and 

(iii) offered educated guesses rather than random (off-the-wall) guesses.
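
The three features above describe a narrowing strategy. The following is a minimal sketch, not part of the study, of why broad questions that build on earlier answers are efficient: if each yes/no question splits the remaining candidates roughly in half, 20 questions are enough to single out one object among 2^20 = 1,048,576, whereas unrelated direct guesses eliminate only one candidate each. The function names and the pool sizes are hypothetical and chosen purely for illustration.

```python
def questions_needed_halving(pool_size: int) -> int:
    """Count yes/no questions needed when each question halves the pool."""
    questions = 0
    remaining = pool_size
    while remaining > 1:
        # A broad question keeps at most the larger half in play.
        remaining = (remaining + 1) // 2
        questions += 1
    return questions


def worst_case_direct_guessing(pool_size: int) -> int:
    """Worst case when every question is a direct guess at one object."""
    # The last candidate is known by elimination, so it needs no guess.
    return max(pool_size - 1, 0)


if __name__ == "__main__":
    # Hypothetical pool sizes; the study did not define a fixed candidate pool.
    for size in (16, 1000, 2 ** 20):
        print(size, questions_needed_halving(size), worst_case_direct_guessing(size))
```

This idealised model ignores the fact that real questions rarely split the candidates evenly; it only illustrates why the order and breadth of questions matter.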



Diagnosis justification exercise

For eight cases, students were required to provide a written justification for their final diagnosis as part of the post-encounter exercise. The specific instructions given to students were as follows: ‘Please explain your thought processes in getting to your final diagnosis; how you used the data you collected from the patient and from laboratory work to move from your initial differential diagnoses to your final diagnosis. Be thorough in listing your key findings (both pertinent positives and negatives) and explaining how they influenced your thinking’.


Each response was blindly read and rated by two physician judges (the case author and one additional physician who was an expert in diagnostic reasoning). More details on this task and its use have been published in earlier manuscripts.11,12





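The Logical group outperformed the two lowest groups on SCCX and DJ performance


The Logical group performed best on SCCX and DJ, and a trend was evident across groups


Differences according to whether students asked all 20 questions even when they did not reach the answer


Comparison with MCAT score in predicting DJ exercise performance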




Effect size interpretations (large and medium) are based on conventions described by Kirk.13




Confirmation in other medical schools and cultures is needed.

As with any research results, confidence in these results will increase if the study can be successfully replicated in this and other medical schools. It is especially important to determine whether similar results are observed in other cultures because it is certainly possible, if not probable, that the results will vary based on the child-rearing and educational practices observed elsewhere.









Med Educ. 2015 Sep;49(9):920-7. doi: 10.1111/medu.12758.

Twenty Questions game performance on medical school entrance predicts clinical performance.

Author information

  • 1Department of Surgery, Indiana University School of Medicine, Indianapolis, IN, USA.
  • 2Department of Medical Education, Southern Illinois University, Springfield, IL, USA.

Abstract

CONTEXT:

This study is based on the premise that the game of 'Twenty Questions' (TQ) tests the knowledge people acquire through their lives and how well they organise and store it so that they can effectively retrieve, combine and use it to address new life challenges. Therefore, performance on TQ may predict how effectively medical school applicants will organise and store knowledge they acquire during medical training to support their work as doctors.

OBJECTIVES:

This study was designed to determine whether TQ game performance on medical school entrance predicts performance on a clinical performance examination near graduation.

METHODS:

This prospective, longitudinal, observational study involved each medical student in one class playing a game of TQ on a non-medical topic during the first week of medical school. Near graduation, these students completed a 14-case clinical performance examination. Performance on the TQ task was compared with performance on the clinical performance examination.

RESULTS:

The 24 students who exhibited a logical approach to the TQ task performed better on all senior clinical performance examination measures than did the 26 students who exhibited a random approach. Approach to the task was a better predictor of senior examination diagnosis justification performance than was the Medical College Admission Test (MCAT) Biological Science Test score and accounts for a substantial amount of score variation not attributable to a co-relationship with MCAT Biological Science Test performance.

CONCLUSIONS:

Approach to the TQ task appears to be one reasonable indicator of how students process and store knowledge acquired in their everyday lives and may be a useful predictor of how they will process the knowledge acquired during medical training. The TQ task can be fitted into one slot of a mini medical interview.

© 2015 John Wiley & Sons Ltd.

PMID: 26296408 [PubMed - in process]


Predicting performance: relative importance of students’ background and past performance (Med Educ, 2015)

Karen M Stegers-Jager,1 Axel P N Themmen,1,2 Janke Cohen-Schotanus3 & Ewout W Steyerberg4






Pre-admission characteristics reported to be associated with academic failure

Pre-admission characteristics that have been reported to relate to academic failure are ethnic minority status,6,7 maturity,8,9 male gender,7,8,10 and lower levels of previous academic performance, in particular low Medical College Admission Test (MCAT) scores and low science grade point averages (GPAs).7–9


Studies have also shown a relationship with performance in the first months of medical school.

Several studies have confirmed the relationship between student performance during the first months at university and subsequent performance.15–17 A recent study by Winston et al. showed that results on an examination administered after the first 2 weeks of medical school represented a strong early predictor of success or failure.4



However, the risk factors for poor performance differ at each stage of medical education.

In addition, it has been shown that risk factors for poor performance vary at different stages of the medical course.7,19,20




Participants and procedure

Reasons for selecting the cohorts

Students from six consecutive cohorts (2002–2007) at Erasmus MC Medical School were included in this study (n = 2357). We selected these six cohorts for two reasons: (i) the curriculum did not change during this period, and (ii) data on the ethnicity of these cohorts were available from a national database of students in higher education in the Netherlands (1cijferHO). Data on academic performance were derived from the university student administration system and confidentiality was guaranteed. Because data were collected as part of regular academic activities, individual consent was not necessary.



All variables were entered into a logistic regression.

As all variables are known to be associated with medical school performance, all were entered simultaneously in a multivariable logistic regression model. We also included cohort as a stratification variable. Statistical interaction terms were used to study the potentially differential effects of one predictor by values of another predictor.










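The effect of GPA in the first 4 months of Year 1 was more pronounced in female than in male students. Among students with a high 4-month GPA, males were less likely than females to complete Year 1 on time; among those with a low GPA, males were more likely to do so.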

The effect of GPA at 4 months on the Year 1 completion rate was less prominent for males than for females: male students with a high GPA at 4 months were less likely to complete the Year 1 course on time than female students with a similar GPA, whereas male students with a low GPA at 4 months (< 4.5 on a scale of 1–10) were more likely to complete the Year 1 course on time than female students with a similar GPA.
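
The pattern described above is what an interaction term in a logistic model produces. The following is a minimal sketch with made-up coefficients (illustrative assumptions, not the study's estimates): the log-odds of completing Year 1 on time are taken as b0 + b_gpa·GPA + b_male·male + b_int·GPA·male, so a negative b_int flattens the GPA slope for males and makes the male and female curves cross. The coefficients were picked so that the crossing falls at 4.5 on the 1–10 scale, echoing the threshold mentioned in the text.

```python
import math

# Hypothetical coefficients chosen only to reproduce the qualitative pattern;
# they are NOT estimates from the study.
B0 = -6.0      # intercept (female, GPA = 0)
B_GPA = 1.0    # GPA slope for female students
B_MALE = 1.8   # shift in log-odds for male students at GPA = 0
B_INT = -0.4   # GPA-by-male interaction: weaker GPA slope for males


def p_complete(gpa: float, male: bool) -> float:
    """Predicted probability of completing Year 1 on time."""
    m = 1.0 if male else 0.0
    log_odds = B0 + B_GPA * gpa + B_MALE * m + B_INT * gpa * m
    return 1.0 / (1.0 + math.exp(-log_odds))


def crossover_gpa() -> float:
    """GPA at which male and female predictions coincide."""
    # Male minus female log-odds is B_MALE + B_INT * gpa; set it to zero.
    return -B_MALE / B_INT


if __name__ == "__main__":
    for gpa in (4.0, 4.5, 7.0):
        print(gpa, round(p_complete(gpa, True), 3), round(p_complete(gpa, False), 3))
    print("curves cross near GPA", round(crossover_gpa(), 2))
```

Below the crossing point the hypothetical male student has the higher predicted probability, and above it the female student does, which is the shape the authors report.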



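This study shows the importance of the most recent academic performance, whether before or after admission, in predicting future performance. In the clinical training phase, however, the student's background becomes the main predictor.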

This study confirms the importance of the most recent past performance – either before or at medical school – as a predictor of future performance in pre-clinical training. However, it also reveals that in clinical training the student’s background becomes the main predictor of performance as students from all minority groups and first-generation university students had a higher risk of achieving lower clinical grades in a model that included pre-clinical performance.



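Pre-admission GPA was a strong predictor on its own but lost significance in the model that included performance after admission. In other words, performance during the first 4 months of medical school matters far more. This supports identifying at-risk students from their results on the first examinations at medical school, and shows that results arising from the interaction between student and academic environment are more accurate than assessments of qualifications at entry.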

However, our study offers a more nuanced picture. Although pu-GPA was the most important predictor in the model that included only preadmission data, this factor was rendered insignificant by the addition of early performance at medical school. In other words, the factor pu-GPA was greatly outweighed by study performance data that became available during the first 4 months at medical school. This confirms the suggestion that the identification of at-risk students based on the first results of the interaction between a student and the academic environment is more accurate than identification using entry qualifications and is also in line with the findings of others.4,18



Male gender, however, was an exception: early grades predicted Year 1 performance less well for male than for female students. Higher self-efficacy among male students may be the cause. High grades in the first 4 months may lead male students to overestimate themselves, while low grades are less detrimental to their self-confidence than they are for female students.

Apparently, the differences found in Year 1 performance for these subgroups can be explained to a large extent by performance during the first months at medical school. An exception to this is male gender, as being male remains a predictor of poorer Year 1 performance after the addition of early medical school performance. The interaction effect we found between gender and GPA at 4 months suggests that early performance is less predictive of later performance for males than it is for females. This may be explained by higher self-efficacy in males than in females, which has been reported previously in medical students.30,31 In male students, high grades during the first 4 months may lead to an over-estimation of their own ability, whereas achieving low early grades may be less detrimental to their self-confidence than it is in female students.



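Ethnicity and social background were important predictors of clinical performance, which suggests that clinical training operates through mechanisms different from those in pre-clinical education. One possible explanation is cultural capital, defined following Bourdieu as quoted below. The cultural capital of non-traditional students, that is, ethnic minority and first-generation university students, may not fit the institutional habitus. Because more subjective assessment is used in clinical training, cultural capital may play a stronger role there than during pre-clinical education.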

The finding that ethnicity and social background were important predictors of clinical performance, even after adjusting for pre-clinical performance, suggests that performance in clinical training is explained by mechanisms other than those referred to in pre-clinical training. A possible mechanism refers to the concept of cultural capital, which, according to Bourdieu, can be understood as ‘knowledge of the norms, styles, conventions and tastes that pervade specific social settings and allow individuals to navigate them in ways that increase their odds of success’ (see Massey et al.33). The cultural capital of non-traditional – ethnic minority and first-generation university – medical students is less likely than that of traditional medical students to be recognised and positively valued within medical school; that is, it does not fit the ‘institutional habitus’ (see Thomas34). As more subjective examination methods are used in clinical training than in pre-clinical training,35 it may be that the role of cultural capital is more prominent during clinical than pre-clinical training. Although further research is required to confirm these proposed effects of cultural capital, our assumption is supported by our finding that having a medical doctor as a parent is related to poorer performance in pre-clinical but better performance in clinical training.


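Another possible mechanism in clinical training, related to cultural capital, is cultural bias on the part of examiners. People tend to view members of their own group more favourably (in-group bias) and to trust those who resemble themselves or people they like (the similarity principle). Unless examiners are aware of these reactions and try to control them, ethnic minority and first-generation university students are likely to receive lower grades than "traditional" students.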

Another possible mechanism in clinical training, which is related to cultural capital, is cultural bias on the part of the examiners. Inevitably, people will have more positive views of those they believe to be part of their group (referred to as ‘in-group bias’36) and people tend to trust those who are similar to themselves or who are similar to people they like (a phenomenon known as the ‘similarity principle’37). Unless traditional examiners are aware of and attempt to control these automatic reactions,38 it is likely that ethnic minority and first-generation university students will receive lower grades than their traditional counterparts. More detailed experimental studies may assist in elucidating the processes underlying judgement and decision making in clinical assessments.


A first limitation of the study is that data on some factors were collected for only a limited number of students; multiple imputation was used to handle the missing values.

A first limitation of our study is that data on the pre-admission factors ‘first-generation university student’ and ‘medical doctor as parent’ were collected for a restricted number of participants. However, to deal with the missing values, we used the technique of multiple imputation, which is widely accepted as suitable.26 As they allow the use of data that are available for other predictors that would otherwise be lost, imputation methods, especially multiple imputations, are superior to complete case analysis.26,41,42 The ORs calculated in the imputed dataset in our study were similar and, if different, generally more conservative than the ORs in the unimputed dataset (Table S1).
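
As a rough illustration of how estimates from multiply imputed datasets are commonly combined, the sketch below pools log odds ratios with Rubin's rules. The excerpt does not say how the authors pooled their ORs, so the pooling rule, the helper name pool_log_odds, and all numbers are assumptions made for illustration only.

```python
import math
from statistics import mean, variance


def pool_log_odds(estimates, std_errors):
    """Pool per-imputation log odds ratios with Rubin's rules (hypothetical helper)."""
    m = len(estimates)
    pooled = mean(estimates)                        # pooled point estimate
    within = mean([se ** 2 for se in std_errors])   # average within-imputation variance
    between = variance(estimates)                   # between-imputation variance
    total = within + (1 + 1 / m) * between          # total variance of the pooled estimate
    return pooled, math.sqrt(total)


# Made-up log ORs and standard errors from three imputed datasets.
log_or, se = pool_log_odds([0.0, 0.2, 0.4], [0.1, 0.1, 0.1])
odds_ratio = math.exp(log_or)
```

The between-imputation term widens the standard error whenever the imputed datasets disagree, which is one reason pooled ORs can be more conservative than those from a single dataset.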




 2015 Sep;49(9):933-45. doi: 10.1111/medu.12779.

Predicting performance: relative importance of students' background and past performance.

Author information

  • 1Institute of Medical Education Research Rotterdam, Erasmus MC University Medical Centre Rotterdam, Rotterdam, the Netherlands.
  • 2Department of Internal Medicine, Erasmus MC University Medical Centre Rotterdam, Rotterdam, the Netherlands.
  • 3Centre for Research and Innovation in Medical Education, University Medical Centre Groningen, University of Groningen, Groningen, the Netherlands.
  • 4Centre for Medical Decision Making, Department of Public Health, Erasmus MC-University Medical Centre Rotterdam, Rotterdam, the Netherlands.

Abstract

CONTEXT:

Despite evidence for the predictive value of both pre-admission characteristics and past performance at medical school, their relative contribution to predicting medical school performance has not been thoroughly investigated.

OBJECTIVES:

This study was designed to determine the relative importance of pre-admission characteristics and past performance in medical school in predicting student performance in pre-clinical and clinical training.

METHODS:

This longitudinal prospective study followed six cohorts of students admitted to a Dutch, 6-year, undergraduate medical course during 2002-2007 (n = 2357). Four prediction models were developed using multivariate logistic regression analysis. Main outcome measures were 'Year 1 course completion within 1 year' (models 1a, 1b), 'Pre-clinical course completion within 4 years' (model 2) and 'Achievement of at least three of five clerkship grades of ≥ 8.0' (model 3). Pre-admission characteristics (models 1a, 1b, 2, 3) and past performance at medical school (models 1b, 2, 3) were included as predictor variables.

RESULTS:

In model 1a - including pre-admission characteristics only - the strongest predictor for Year 1 course completion was pre-university grade point average (GPA). Success factors were 'selected by admission testing' and 'age > 21 years'; risk factors were 'Surinamese/Antillean background', 'foreign pre-university degree', 'doctor parent' and male gender. In model 1b, number of attempts and GPA at 4 months were the strongest predictors for Year 1 course completion, and male gender remained a risk factor. Year 1 GPA was the strongest predictor for pre-clinical course completion, whereas being male or aged 19-21 years were risk factors. Pre-clinical course GPA positively predicted clinical performance, whereas being non-Dutch or a first-generation university student were important risk factors for lower clinical grades. Nagelkerke's R² ranged from 0.16 to 0.62.

CONCLUSIONS:

This study not only confirms the importance of past performance as a predictor of future performance in pre-clinical training, but also reveals the importance of a student's background as a predictor in clinical training. These findings have important practical implications for selection and support during medical school.

© 2015 John Wiley & Sons Ltd.

PMID: 26296410 [PubMed - in process]



Sources of variation in personal attributes assessed by the MMI (Medical Teacher, 2014)

Variance in attributes assessed by the multiple mini-interview

NIKKI BIBLER ZAIDI1, CHRISTOPHER SWOBODA2, LEIGH LIHSHING WANG2 & R. STEPHEN MANUEL3

1University of Michigan Medical School, USA, 2University of Cincinnati, USA, 3University of Cincinnati College of Medicine, USA






Introduction

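What form of admissions interview is most reliable? This question frames the discussion of the MMI, whose multi-sampling approach has been welcomed as a way to overcome the low reliability of traditional interviews.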

The medical school preadmission interview (MSPI) remains a widely used tool in medical school admissions (Monroe et al. 2013); therefore, discussions regarding the most reliable and valid MSPI format continue to evolve (Edwards et al. 1990; Goho & Blackman 2006). Over the past decade, the multiple mini-interview (MMI) has gained considerable attention as an alternative to more traditional MSPI formats. The MMI is a multi-sampling, structured interview format in which interviewers, referred to as “raters,” assess specific applicant attribute(s) using multiple 5–15 minute interview stations. Each interview station is assigned a different discussion prompt, referred to as a “scenario;” likewise, each station has a different rater who is tasked with assigning applicants scores for a set of items on an evaluation tool (Eva et al. 2004c; Pau et al. 2013). The MMI’s multi-sampling technique has been celebrated for increasing the low reliability estimates that plague traditional MSPIs (Eva et al. 2004b, c; Uijtdehaage et al. 2011), and some studies suggest that MMI scores can be used to predict performance during medical school clerkships and on medical licensure examinations (Eva et al. 2004a, 2009, 2012).


The reliability of the MMI is commonly estimated using G theory. The goal of any measurement, including the MMI, is to reduce the unwanted variance that obscures true scores. Most G theory studies of the MMI have modeled raters and stations as facets.

Reliability of the MMI is commonly estimated using Generalizability (G) theory because of the multi-faceted nature of the measurements. In any measurement process, including the MMI, the goal is to reduce the unwanted variance in observed scores that can obscure true scores. G theory can simultaneously capture multiple sources of unwanted variance, referred to as “facets,” to provide an estimate of generalizability – or reliability (Brennan 2001). Consequently, MMI reliability is expressed as a G coefficient and represents a “universe of admissible observations” – a universe that is defined by the specific facet(s) that the researcher decides to include in the model. The decision regarding facets for inclusion is based on the context of a measurement to which the researcher plans to generalize findings. For instance, if some raters are always more lenient or more severe than other raters, then raters are a source of unwanted variance in MMI scores, and the rater facet would be modeled in a subsequent G study if the researcher wishes to generalize across raters. Most MMI studies associated with medical school admissions have modeled raters and/or stations as facets (Eva et al. 2004b, c; Uijtdehaage et al. 2011).


Each facet is made up of conditions, which correspond to the levels of a factor in CTT. Researchers assume that these conditions can be changed without lowering the quality of the measurement.

Each facet that defines the universe of admissible observations is comprised of “conditions” (Brennan 2001). These conditions are analogous to the levels of a factor in classical test theory (CTT). Overall, the MMI literature reports facets with a range of corresponding conditions and it is presumed, as an assumption for most applications of G theory, that the varying conditions represent random samples from these facets. Therefore, it is also generally assumed by researchers that these conditions can be altered without making the measurement any less acceptable. Although G theory can examine the extent to which such changes in a facet’s conditions make the measurement more or less acceptable (Shavelson & Webb 1991), this concept of interchangeability has not been examined for all potential facets of the MMI.


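G coefficient studies of the MMI have modeled raters and stations as facets but have largely ignored which attributes are assessed. These attributes have rarely been modeled, yet they may be a substantial source of variance. The literature identifies up to 87 different attributes required of physicians, but an admissions interview can assess only a few of them. The specific attributes assessed therefore vary across medical schools, for example leadership, cultural sensitivity, interpersonal skills, and critical thinking. They are usually rated on a Likert-type scale, under the implicit assumption that the attributes chosen are a random selection from the many characteristics important for a physician. In other words, MMI studies assume that these attributes are interchangeable conditions within the item facet, so that a score for "leadership" is equivalent to and interchangeable with a score for "cultural sensitivity." The researchers question the face validity of this assumption and argue that it warrants further study.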

The extant literature reports moderate to high G coefficients for medical school MMIs ranging 0.58–0.81 (Eva et al. 2004b, c; Uijtdehaage et al. 2011). These reports, however, are based on studies that have modeled raters and stations as facets but have essentially ignored the impact of the attributes assessed. The attributes assessed by the MMI have the potential to introduce additional, and largely unaccounted-for, variance in MMI scores. The medical literature identifies up to 87 different attributes considered important for an aspiring physician (Price et al. 1971; Albanese et al. 2003); yet, an MSPI, including the MMI, can only reasonably capture a handful of these attributes. Therefore, the specific attributes assessed by an MMI will vary across medical schools and can range from leadership potential, cultural sensitivity, interpersonal skills, and critical thinking to a single, overall performance score (Eva et al. 2004c; Reiter et al. 2007; Uijtdehaage et al. 2011). These attributes are generally assessed as items on a Likert-like scale (Eva et al. 2004c) and carry the implicit assumption that an institution’s choice of attribute(s) can be considered a random selection from the domain of characteristics deemed important for the medical profession. Subsequently, MMI studies have largely considered attributes to be interchangeable conditions within the item facet. This would suggest that it is reasonable to believe that scores for the item, “leadership potential,” are parallel and interchangeable with scores for the item, “cultural sensitivity.” Consequently, the researchers question the face validity of this assumption and believe it warrants further investigation.


Furthermore, the composition of attributes assessed by the MMI reaches beyond the item facet into the station facet. Each MMI station runs on a specific scenario built around a particular topic, and applicants are rated on predetermined attributes that appear as items on the evaluation form. Differences among station scenarios can therefore introduce further unwanted variance. Earlier studies included the station facet but treated it only as a measurement occasion. The well-established finding that MMI generalizability rises with the number of measurement occasions may therefore simply reflect the Spearman-Brown prophecy formula from CTT.

Furthermore, the impact of the composition of attributes assessed by the MMI has the potential to reach beyond the item facet into the station facet. Each MMI station is assigned a specific scenario that focuses on topics such as “knowledge of the healthcare system” or “critical thinking” and is intended to elicit information regarding a set of predetermined attributes that are captured as items on an evaluation form (Eva et al. 2004c). Consequently, differences among station scenarios have the potential to introduce further unwanted variance in the attributes assessed. Previous studies model the station facet into G studies (Eva et al. 2004b, c; Uijtdehaage et al. 2011); however, these studies generally recognize the station facet in terms of a measurement occasion only. Therefore, while it is well-established in the literature that increasing the number of measurement occasions increases generalizability estimates for the MMI, this finding merely aligns with the CTT’s Spearman-Brown prophecy formula. 
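
The Spearman-Brown prophecy formula mentioned above can be written out in a few lines. This is a minimal sketch; the function name and the example reliability are illustrative assumptions, not values taken from the MMI studies cited.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when the number of parallel measurement
    occasions (e.g. stations) is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Illustrative only: a measure with reliability 0.5, doubled in length.
doubled = spearman_brown(0.5, 2)
unchanged = spearman_brown(0.5, 1)
```

The formula treats every added occasion as a parallel replicate, which is why it says nothing about whether the content of a station matters.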


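As a result, the existing literature has largely ignored the influence of station scenarios on how the MMI assesses attributes. Given that the MMI was developed to dilute context specificity, this needs further study. In Eva's pilot study, variance from the candidate-station interaction was five times greater than variance from the candidate alone, which suggests that station content may be an important source of error in MMI scores. No study has yet examined how station content, that is, the scenario, affects the assessment of applicants on specific attributes (items). This study examines whether items, defined as specific attributes on the MMI evaluation form, are assessed consistently across MMI stations regardless of scenario.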

Consequently, the extant literature has largely ignored the potential influence of the stations’ scenario on the assessment of attributes within an MMI. Given the fact that the MMI was created in large part to dilute the effects of context specificity (Eva 2003), this warrants further investigation. In fact, Eva et al.’s (2004c) pilot study concluded that variance attributable to the candidate–station interaction was five times greater than that assigned to the candidate alone. This finding suggests that station content may introduce the most significant source of error in MMI scores. Yet, to the best of the authors’ knowledge, no study has examined how the MMI station content – the scenario – may influence the assessment and evaluation of medical school applicants on a set of predetermined attributes (i.e. items). This study will explore whether items, defined as specific attributes on an MMI evaluation form, are assessed consistently across MMI stations regardless of station scenario.



Methods

This study examines one aspect of psychometric evidence from one United States (US) medical school that has fully adopted the MMI process as a replacement for the traditional MSPI. Using G theory, this study examines the variance attributable to the item facet and the scenario-item interaction. Data used for analysis represent MMI scores that were collected for the sole purpose of making admissions decisions. These data come from a US medical school that receives approximately 4000 admissions applications annually and interviews approximately 625 applicants each year. This institution fully adopted the MMI to select the entering class of 2009. With IRB approval (# 10-06-08-01), all applicants who participated in the MMI from 2009 to 2013 are included in the dataset used for analysis. This empirically collected dataset represents a nested design; therefore, only a small subset of applicants was used in this analysis in order to create a fully crossed design because in G theory, nested facets make it impossible to estimate all variance components separately (Brennan 2001).


"Communication" and the six specific attributes used to assess it, plus one overall score

After a comprehensive blueprinting process, the school’s Admissions Committee identified one overarching characteristic – communication – to assess through the MMI. The rationale for choosing this single construct was largely rooted in literature that suggests that one of the chief patient complaints concerns poor communication between the patient and physician (Wofford et al. 2004). Communication was selected as the single construct from the larger domain of attributes deemed important for an aspiring physician. To operationalize this construct, six specific attributes and one “overall score” were used as sub score items in the MMI evaluation tool. The specific attributes assessed by this MMI included (1) multiple perspectives, (2) reflection of scenario, (3) articulation, (4) interest in dilemma, (5) non-verbal communication, and (6) interpersonal skills. These seven items were measured on a seven-point Likert-like scale that assumes equal intervals between the anchors (Unsatisfactory-1; Below Average-2; Slightly Below Average-3; Average-4; Slightly Above Average-5; Above Average-6; Outstanding-7).


G-String IV software (Bloch & Norman, Hamilton, Ontario, Canada) was used to estimate variance components attributable to the facets of measurement. In G theory, the object of measurement is not considered a facet. Therefore, this study’s two-facet design includes the object of measurement – applicants (p) – and two facets of generalization – scenario (s) and item (i).


Facet of differentiation

The object of measurement is considered the facet of differentiation. This facet of differentiation is analogous to the dependent variable and is considered the only desired source of variation. In other words, the object of measurement is the “universe” or “true” score (Brennan 2001). The facet of differentiation is the person, the applicant (p) facet, which represents the true MMI score for the applicant. Therefore, this variance should be large and other modeled sources of variance are expected to be small.


Facets of generalization

The facets of generalization are analogous to the independent variables and they contribute unwanted sources of error to the universe score, or for this study – MMI scores. These facets of generalization include the sources of measurement error that the researcher intends to generalize from the sample to the universe of admissible observations. Because this study intends to generalize applicant scores from one scenario to applicant scores from a much larger set of scenarios, scenario (s) is considered a facet of generalization. Likewise, because this study intends to generalize from applicant scores on one attribute item to applicant scores on a much larger set of items, item (i) is also a facet of generalization. In line with G theory assumptions, both the scenario facet and the item facet are considered random and conditions within these facets are deemed interchangeable (Shavelson & Webb 1991).
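
To make the role of the two facets of generalization concrete, the sketch below computes a relative G coefficient for applicants in a fully crossed p × s × i random design, using the standard textbook form in which only the interactions involving the applicant count as relative error. The function name, dictionary keys, and variance values are assumptions for illustration and are not the estimates reported in Table 1.

```python
def relative_g_coefficient(var, n_s, n_i):
    """Relative G coefficient for applicants (p) averaged over
    n_s scenarios and n_i items in a crossed random design."""
    relative_error = (var["ps"] / n_s
                      + var["pi"] / n_i
                      + var["psi,e"] / (n_s * n_i))
    return var["p"] / (var["p"] + relative_error)


# Made-up variance components (NOT the values in Table 1).
example = {"p": 1.0, "ps": 2.0, "pi": 0.0, "psi,e": 0.0}
g_two = relative_g_coefficient(example, n_s=2, n_i=7)
g_six = relative_g_coefficient(example, n_s=6, n_i=7)
```

With these toy values, moving from two to six scenarios shrinks the applicant-by-scenario error term, which is the mechanism by which adding stations raises generalizability.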


Confounded facet

Because there is one rater assigned to each scenario at this US medical school, the variance attributable to rater cannot be disentangled from variance attributable to scenario. Therefore, rater and scenario variance are completely confounded. For the purposes of this study, however, this confounded effect will be recognized as a limitation and the variance accounted for by this confounded facet will be considered attributable to the scenario (s).


Sample

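The data have a nested structure because MMI scores were collected from applicants under different scenarios. Purposive sampling was therefore used to find a subset in which the scenario facet is fully crossed with the applicant facet: applicants who were rated in the same scenarios on the same items, giving a completely crossed design.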

This study uses actual admissions data; therefore, the data structure represents a pragmatic design in which inevitable nesting and confounding occurs. A fully crossed G study elicits the most information; however, the existing data set represents a nested structure in which MMI scores are collected for applicants by using different scenarios. Therefore, a purposive sampling method was employed to generate a subset of data in which the scenario (s) facet (confounded but representing the same raters within scenario combinations) was fully crossed with the applicant (p) facet. Consequently, a subset of the full dataset was intentionally sampled for applicants rated within the same scenario using the same items to ensure a completely crossed design. The sample included 16 applicants who were evaluated within the same six scenarios and scored on the same seven items. This small, purposive sample was necessary in order to examine the variance attributable to the main effect of the scenario (s) facet and the scenario-item (si) interaction (Shavelson & Webb 1991), which is information pertinent to the study’s objectives.
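
A fully crossed design with one score per cell is what allows every variance component to be estimated separately. The sketch below shows one textbook way to do this, solving the expected-mean-square equations of a three-way random-effects ANOVA. It is not the G-String IV procedure used by the authors; the function name, the nested-list data layout, and the toy scores are assumptions for illustration.

```python
from itertools import product


def variance_components(x):
    """Estimate variance components for a fully crossed p x s x i design.

    x[p][s][i] is the single score of applicant p on item i in scenario s.
    """
    n_p, n_s, n_i = len(x), len(x[0]), len(x[0][0])
    ps_, ss_, is_ = range(n_p), range(n_s), range(n_i)
    cells = list(product(ps_, ss_, is_))
    grand = sum(x[p][s][i] for p, s, i in cells) / len(cells)

    # Marginal means for main effects and two-way combinations.
    m_p = [sum(x[p][s][i] for s in ss_ for i in is_) / (n_s * n_i) for p in ps_]
    m_s = [sum(x[p][s][i] for p in ps_ for i in is_) / (n_p * n_i) for s in ss_]
    m_i = [sum(x[p][s][i] for p in ps_ for s in ss_) / (n_p * n_s) for i in is_]
    m_ps = [[sum(x[p][s][i] for i in is_) / n_i for s in ss_] for p in ps_]
    m_pi = [[sum(x[p][s][i] for s in ss_) / n_s for i in is_] for p in ps_]
    m_si = [[sum(x[p][s][i] for p in ps_) / n_p for i in is_] for s in ss_]

    # Mean squares from the three-way ANOVA decomposition.
    ms_p = n_s * n_i * sum((m - grand) ** 2 for m in m_p) / (n_p - 1)
    ms_s = n_p * n_i * sum((m - grand) ** 2 for m in m_s) / (n_s - 1)
    ms_i = n_p * n_s * sum((m - grand) ** 2 for m in m_i) / (n_i - 1)
    ms_ps = n_i * sum((m_ps[p][s] - m_p[p] - m_s[s] + grand) ** 2
                      for p in ps_ for s in ss_) / ((n_p - 1) * (n_s - 1))
    ms_pi = n_s * sum((m_pi[p][i] - m_p[p] - m_i[i] + grand) ** 2
                      for p in ps_ for i in is_) / ((n_p - 1) * (n_i - 1))
    ms_si = n_p * sum((m_si[s][i] - m_s[s] - m_i[i] + grand) ** 2
                      for s in ss_ for i in is_) / ((n_s - 1) * (n_i - 1))
    ms_res = sum((x[p][s][i] - m_ps[p][s] - m_pi[p][i] - m_si[s][i]
                  + m_p[p] + m_s[s] + m_i[i] - grand) ** 2
                 for p, s, i in cells) / ((n_p - 1) * (n_s - 1) * (n_i - 1))

    # Solve the expected-mean-square equations of the random-effects model.
    raw = {
        "p": (ms_p - ms_ps - ms_pi + ms_res) / (n_s * n_i),
        "s": (ms_s - ms_ps - ms_si + ms_res) / (n_p * n_i),
        "i": (ms_i - ms_pi - ms_si + ms_res) / (n_p * n_s),
        "ps": (ms_ps - ms_res) / n_i,
        "pi": (ms_pi - ms_res) / n_s,
        "si": (ms_si - ms_res) / n_p,
        "psi,e": ms_res,
    }
    # Negative estimates are conventionally set to zero.
    return {name: max(value, 0.0) for name, value in raw.items()}


# Toy data (not the study's): 4 applicants x 3 scenarios x 2 items,
# where the score depends only on the scenario.
toy = [[[float(s)] * 2 for s in range(3)] for _ in range(4)]
components = variance_components(toy)
total = sum(components.values())
shares = {name: value / total for name, value in components.items()}
```

In the toy data every bit of variance lands in the scenario component, an extreme version of the pattern the study reports. With nested data some of the marginal means above could not be formed, which is why the authors restricted the analysis to a crossed subset.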



Results

While the true score (p) should represent a sizable amount of variance, Table 1 shows that the applicant (p) represents only 6% of total variance. The estimated variance components from the G study suggest that the greatest amount of variance is attributable to the main effect of the scenario (s) facet and the interaction between scenario and applicant (ps). Collectively, these two variance components account for 77% of the total variance. The item facet (i) represents the lowest estimated variance, estimating only 0.6% of the total variance in MMI scores. Likewise, the scenario-item interaction (si) accounts for only 1.4% of the total variance. The low estimate of variance attributable to the item facet is reinforced by a high Cronbach’s alpha (0.97) for the seven items, which suggests very high internal consistency among the attributes measured by this MMI.
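
For readers unfamiliar with the internal-consistency statistic cited here, the following sketch computes Cronbach's alpha from a table of item scores. The excerpt does not say how rows were defined when the 0.97 was calculated, so treating each row as one set of seven item ratings is an assumption, as are the function name and the toy data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores[r][k] is the rating on item k in row r (one row per set of ratings)."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data: a rater who gives the same value to every item in a row.
uniform = [[1, 1, 1], [2, 2, 2], [4, 4, 4]]
alpha = cronbach_alpha(uniform)
```

When a rater assigns identical values across items, alpha reaches its maximum of 1, so a very high alpha is compatible both with a unidimensional construct and with raters not differentiating between items.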





Discussion

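The high internal consistency of the seven sub scores (items) suggests that the current MMI assesses a single, unidimensional attribute. The fact that only 2% of variance comes from the item facet supports this. If the items do measure one unidimensional attribute, the seven items could be condensed into one. The small variance from p and i implies that most of the variance is attributable to s.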

The high internal consistency of the seven sub scores (items) may support assumptions that the current MMI process is measuring one unidimensional attribute; this is further supported by only 2% of variance attributable to the item facet – (i), (pi), and (si). These findings either support the G theory assumption that conditions of the item facet can be considered interchangeable or suggest that raters do not understand how to use the items associated with the MMI evaluation tool and simply assign the same value for each item. If the items are capturing one unidimensional attribute, then a seven-item evaluation tool could be condensed into a single item tool. The low percentage of variance attributable to both items (i) and the true score – the applicant (p) – further suggests that the variation in MMI scores is mostly attributable to scenarios (s). 


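Given the differences in station scenario content, it is reasonable to think that scenarios increase the inconsistency of the attributes an applicant displays. For example, an applicant may show different traits in a scenario involving an ethical dilemma than in one involving a teamwork activity. So even if the MMI measures one unidimensional attribute at the item level, station content may change how that attribute is measured and introduce multidimensionality at the scenario level. The researchers expected this difference to appear as a large scenario-item interaction, but the results do not support that expectation. The likely reason is the low share of variance attributable to the item facet: the small scenario-item interaction may arise because item variance is subsumed by scenario variance. The interaction may therefore still account for a substantial part of the variance in MMI scores, even though this did not show up in the present sample.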

Given the variation among the content of station scenarios, it is plausible to believe that scenarios promote inconsistencies among attributes exhibited by an applicant. For instance, a scenario involving an ethical dilemma might highlight different attributes than a scenario requiring an applicant to engage in a teamwork activity. Therefore, even if the MMI is supposedly measuring one unidimensional attribute at the item level, the content of the stations may elicit different measurements of attributes, thereby introducing multidimensionality at the scenario level. While the researchers expected to find this disparity manifested as a large variance component associated with the scenario-item interaction, this initial analysis does not support the assumption. This potential effect may be obscured by the low percentage of variance attributable to the item facet. Therefore, it is possible that the small scenario-item interaction is a result of variance attributable to the item facet being subsumed by the variance attributable to the varying conditions of the scenario facet. Consequently, the interaction may indeed contribute substantial variance in MMI scores; but this was not identified within this study’s sample.


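The AERA, APA, and NCME Standards state that if a test developer indicates that conditions of administration may vary across test takers, the permissible variation should be identified and the rationale for permitting different conditions should be documented. The small variance attributable to the item facet may suggest that varying the conditions of the item facet (that is, the attributes) is permissible, but the same cannot be assumed for the scenario facet. The results call into question the interchangeability of the conditions of the scenario facet, so scenarios should be selected with more care.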

As outlined by the AERA, APA and NCME Standard 3.21, “If the test developer indicates that the conditions of administration are permitted to vary from one test taker or group to another, permissible variation in conditions for administration should be identified, and a rationale for permitting the different conditions should be documented” (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education 1999, p. 47). While the low variance attributable to the item facet may suggest permissibility for permitting different conditions of the item facet (i.e. attributes), this assumption may not hold for the conditions of the scenario facet. The results of this study suggest that interchangeability for the conditions of the scenario facet is questionable. Subsequently, more attention should be directed towards the selection of scenarios.


Despite these findings, the study has several limitations.

Despite these findings, this study has some limitations. Because this study uses empirically collected data intended for admissions purposes, the researchers did not have direct control over the data collection process. Consequently, this G study is limited by the need for a purposive sample which results in a very small sample size relative to the larger dataset. The researchers felt that the benefit of using a fully crossed design justified the small sample size. In G theory, the nested facet cannot be estimated separately from its interaction effects because nesting creates missing cells in the design; additionally, the scenario-item (si) interaction, a major focus of this study, could only be examined using a fully crossed design (Brennan 2001). Nonetheless, the sample size used in this study may not be representative of the larger population; subsequently, the variance components might be influenced by sampling error. Because estimated variance components can be very unstable when the number of conditions within a measurement is small, this study could be replicated using a larger sample size. In addition, only two facets of generalization are modeled and one of these facets, the scenario facet, is confounded with another potential source of variation – the rater facet. Therefore, other facets could be added to the model in order to expand the universe of admissible observations and the corresponding generalizability of the study. Consequently, this study’s external validity is limited to the extent that other MMIs mirror the one used in this analysis. Despite these limitations, this study offers a solid framework for future exploration into the impact that scenario content can have on the attributes assessed by the MMI.



Conclusions

Because of the variation in how and what an institution-specific MMI measures, psychometric properties must be examined for each medical school that chooses to adopt the MMI as a replacement for the MSPI. This study adds to the growing body of literature related to psychometric analyses of the MMI. Because the extant literature has primarily focused on predictive validity and largely ignored other aspects of validity, this study adds to the foundation for further exploration into construct validity. As the MMI continues to gain momentum as a replacement for the traditional MSPI, the measurement process deserves careful attention, especially in terms of how and what is measured. Future analysis should explore the potential that both items and scenarios have on subsequent MMI scores. Overall, the results of this study reinforce the need to examine all psychometric properties of a measurement process – especially one, such as the MMI, that is used for high-stakes admissions purposes.










 2014 Sep;36(9):794-8. doi: 10.3109/0142159X.2014.909587. Epub 2014 May 12.

Variance in attributes assessed by the multiple mini-interview.

Author information

  • 1University of Michigan Medical School, USA.

Abstract

INTRODUCTION:

While the extant literature has explored the impact of stations on multiple mini-interview (MMI) scores, the influence of station scenarios has been largely overlooked.

METHOD:

A subset of MMI scores was purposively sampled from admissions data at one US medical school. Generalizability (G) theory was used to estimate variance components attributable to applicants and two facets of generalization - scenarios, the content of the station, and items, the attributes assessed.

RESULTS:

G study suggests that the greatest amount of variance is attributable to the main effect of the scenario (s) facet and the interaction between applicant and scenario (ps), which account for 77% of the total variance. The item facet (i) accounts for only 0.6% of total variance; likewise, the scenario-item interaction (si) accounts for only 1.4% of the total variance.

DISCUSSION:

While the researchers expected to find a large variance component associated with the scenario-item interaction, this analysis does not support this assumption. The researchers interpret the small scenario-item interaction as a result of variance attributable to the item facet being subsumed by the variance attributable to the content of the scenarios.

CONCLUSIONS:

The results of this study reinforce the need to examine psychometric properties of the MMI.

PMID: 24820377 [PubMed - in process]


The ability to tolerate ambiguity: an ethics-based criterion for medical student selection (Academic Medicine, 2013)

Tolerance for Ambiguity: An Ethics-Based Criterion for Medical Student Selection

Gail Geller, ScD, MHS







Several years ago, the author observed with interest how students reacted to a medical ethics course. Students differed markedly in how they responded to uncertainty in medicine.

Several years ago, I coordinated the ethics course that was required for first year medical students at my institution, the Johns Hopkins School of Medicine. I was keenly aware of significant differences in students’ reactions to the course. (...) Intrigued by what I noted as variability in students’ tolerance for ambiguity, I searched for and discovered a substantial social science literature on this topic. (...) The way students respond to uncertainty in medicine deserves heightened attention in light of imminent changes to the medical student selection process, which motivated me to write this Perspective.



Impending Changes to the Medical School Admission Process

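Over the past decade the AAMC has worked to transform the medical school admission process. One goal is to assess humanistic characteristics so as to select students with the potential to communicate well, relate to patients, and make ethical decisions. It was decided to add social and behavioral science content to the MCAT exam and to adjust the prerequisite courses required for admission. The changes are scheduled for 2015.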

For the last decade, the Association of American Medical Colleges has been interested in and committed to transforming the medical school admission process. The goal is to enable the assessment of humanistic characteristics and, thus, to select students who are more likely to become physicians who can communicate and relate with patients and engage in ethical decision making. Recently, the decision was made to revise the MCAT exam to include more social and behavioral science and to adjust the prerequisite course requirements for admission to medical school.2 These changes will be implemented in 2015.




The Concept of Tolerance for Ambiguity

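In the past several years there has been extensive research on the impact of ambiguity and uncertainty on medical education and medical care. The two terms are often used interchangeably, but they are not the same concept. Ellsberg describes both as types of "risk" that differ in probability: under uncertainty the probability of an outcome is known, whereas under ambiguity it is unknown. Grenier et al. propose a time-based distinction, in which uncertainty concerns future events and ambiguity concerns present circumstances. Seen this way, ambiguous situations are either less clear-cut or more urgent, and may therefore demand more tolerance.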

In the past several years, there has been extensive scholarship on the impact of ambiguity and uncertainty on medical education and medical care. Although these concepts are related and have been used interchangeably, ambiguity and uncertainty are not equivalent.3,4 Ellsberg5 writes that both are types of “risk,” but they vary in probability: In a case of uncertainty, the probability of a particular outcome is known; with ambiguity, the probability is unknown. Grenier et al3 propose a time-oriented distinction, with uncertainty relating to an event in the future and ambiguity concerning circumstances in the present. In this light, “ambiguous” situations have either more shades of gray or greater urgency and may, thus, require more tolerance.


The recent literature has also examined ambiguity aversion and its adverse consequences in medical practice and clinical research.

It is also important to note the recent literature on ambiguity aversion and its adverse consequences in both medical practice6 and clinical research.7


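Intolerance of, or aversion to, ambiguity was first described more than 50 years ago. It was characterized as a personality trait in which 'novel, complex or insoluble' situations are perceived as 'sources of threat'. Because health care is by nature novel, complex, and sometimes insoluble, it is important to understand how clinicians react in such situations. Intolerance of ambiguity is associated with authoritarianism, dogmatism, rigidity, conformity, and ethnic prejudice, traits that clearly conflict with humanism, cultural competence, and patient-centeredness.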

Intolerance of ambiguity, or aversion to ambiguity, was first identified more than 50 years ago.8 It was described as a personality characteristic in which situations that are “novel, complex or insoluble” are perceived as “sources of threat.”9 To the degree that medicine and health care are characterized by novelty, complexity, and sometimes insolubility, it is extremely important to understand how clinicians react to such circumstances. 

  • In general, individuals with high ambiguity tolerance are drawn to or captivated by the unknown. 
  • By contrast, those with low tolerance tend to deny, avoid, or minimize ambiguity, and experience significant stress when faced with it.9 

Ambiguity intolerance has been associated with other personality traits such as authoritarianism, dogmatism, rigidity, conformity, and ethnic prejudice.9,10 Clearly, these traits contradict the humanistic, culturally competent, and patient-centered qualities underlying ethical medical practice.




Tolerance for Ambiguity in Medical Practice and Education

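Tolerance for ambiguity strongly influences medical students' attitudes and behaviors. Studies of medical students have linked it to a range of sociodemographic and behavioral characteristics. Higher tolerance is associated with leadership ability and willingness to practice in rural areas; lower tolerance is strongly related to fear of making mistakes, negative attitudes toward the underserved, and bias against people who abuse alcohol. Whether it is linked to specialty choice remains unclear: some studies found no association, while others did.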

Tolerance for ambiguity also exerts a powerful influence on the attitudes and behaviors of medical students. Numerous studies have measured students’ levels of ambiguity tolerance1,19–21 and correlated their scores with a range of sociodemographic and behavioral characteristics.19–26 This evidence suggests that higher tolerance for ambiguity is associated with students’ leadership ability25 and their willingness to practice in rural areas.26 Conversely, there is a strong relationship between students’ low tolerance for ambiguity and their fears of making mistakes,22 their negative attitudes toward the underserved,23,24 and bias against those who abuse alcohol.1 It remains unclear whether tolerance of ambiguity is linked to students’ specialty choices. In some studies, there was no association.20,21 In others, specialties that require high levels of precision, such as surgery, tended to attract individuals with low ambiguity tolerance. Conversely, specialties that are inherently ambiguous, such as psychiatry, appealed to individuals with higher tolerance.1


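Despite its importance, tolerance for ambiguity has been overlooked in both the selection and the training of medical students. Sociologists of medicine have long observed that medical education rewards certainty. In recognition that ambiguity and uncertainty have been neglected in the culture of medicine, there have been recent proposals to cultivate ambiguity tolerance explicitly in the curriculum, which assumes that it can be taught. Evidence among residents suggests that tolerance can improve over time and with experience, but the question has not been studied among medical students.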

Despite its importance, tolerance for ambiguity has been overlooked both in the selection and also the training of medical students.27–30 Sociologists of medicine have long observed that the medical education process rewards certainty.27,28 In recognition that ambiguity and uncertainty have been neglected in the culture of medicine, there have been recent proposals to acknowledge, embrace, and explicitly cultivate ambiguity tolerance in the medical curricula.29,30 This is undoubtedly a laudable goal, but it assumes that ambiguity tolerance can be taught. Although evidence among residents suggests that ambiguity tolerance can improve over time and with experience,31 this question has not been explored among medical students. Studies of ambiguity tolerance in medical education have been cross-sectional, not prospective.


A hypothesis worth testing: tolerance for ambiguity is at once a personality trait and a temporary state.

One testable hypothesis is that tolerance for ambiguity is both a personality trait and a temporal state.


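Based on the literature, it is reasonable to hypothesize that students who enter medical school with high tolerance for ambiguity are drawn to and stimulated by the uncertainties of medicine and patient care, whereas students who enter with low tolerance are more likely to avoid, minimize, or negate them.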

Based on the conceptual literature, a reasonable hypothesis is that students who enter medical school with high tolerance for ambiguity are drawn to, and stimulated by, the uncertainties that characterize medicine and patient care. (...) By contrast, students who enter medical school with low tolerance for ambiguity may be more likely to avoid, minimize, or negate the uncertainties that characterize medicine and patient care.




A Timely Proposal

The assessment could use quantitative strategies, qualitative strategies, or a combination of the two. Validated scales already exist.

The assessment plan could consist of quantitative strategies, qualitative strategies, or a combination of both. With respect to quantitative strategies, a number of validated scales exist1,3,6,9,10,32,33 that could be used or adapted for use in the medical admission process.


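Among applicants who meet the academic criteria, one option is to interview only those whose tolerance scores exceed a cutoff; another is to interview all of them and use the tolerance scores during the interview to explore their own assessments of their tolerance for ambiguity.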

Among students who otherwise meet the academic criteria for admission, one option would be to offer interviews only to those whose tolerance scores exceed a certain cutoff. An alternative strategy would be to offer interviews to all students who meet the academic standards for admission and, during the interview, use the tolerance scores to explore, qualitatively, students’ own assessments of their tolerance for ambiguity.


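Significant culture changes are already under way in medical education, such as the growing emphasis on team care and interprofessional education (IPE). These changes may require greater tolerance for ambiguity among students, because team members will occasionally disagree. Students are taught to respect their colleagues' opinions, but they may not be comfortable tolerating the ambiguity inherent in group decision making.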

There are already significant culture changes under way in medical education, such as the growing emphasis on team care and interprofessional education. These changes may require greater tolerance for ambiguity among students because, occasionally, members of the team will disagree. Although students are being taught to “respect” their colleagues’ opinions (i.e., listen openly and not criticize), they may not be comfortable tolerating the ambiguity inherent in group decision making.






Acad Med. 2013 May;88(5):581-4. doi: 10.1097/ACM.0b013e31828a4b8e.

Tolerance for ambiguity: an ethics-based criterion for medical student selection.

Author information

  • School of Medicine, Department of Medicine and Berman Institute of Bioethics, Johns Hopkins University, Baltimore, Maryland 21205, USA. ggeller@jhu.edu

Abstract

Planned changes to the MCAT exam and the premedical course requirements are intended to enable the assessment of humanistic characteristics and, thus, to select students who are more likely to become physicians who can communicate and relate with patients and engage in ethical decision making. Identifying students who possess humanistic and communication skills is an important goal, but the changes being implemented may not be sufficient to evaluate key personality traits that characterize well-rounded, thoughtful, empathic, and respectful physicians. The author argues that consideration should be given to assessing prospective students' tolerance for ambiguity as part of the admission process. Several strategies are proposed for implementing and evaluating such an assessment. Also included in this paper is an overview of the conceptual and empirical literature on tolerance for ambiguity among physicians and medical students, its impact on patient care, and the attention it is given in medical education. This evidence suggests that if medical schools admitted students who possess a high tolerance for ambiguity, quality of care in ambiguous conditions might improve, imbalances in physician supply and practice patterns might be reduced, the humility necessary for moral character formation might be enhanced, and the increasing ambiguity in medical practice might be better acknowledged and accepted.

PMID: 23524934 [PubMed - indexed for MEDLINE]


Selection of medical students according to their moral orientation (Medical Education, 2005)

Selection of medical students according to their moral orientation

Miles Bore,1,2 Don Munro,2 Ian Kerridge3 & David Powis1






INTRODUCTION:

Consideration has been given to the use of tests of moral reasoning in the selection procedure for medical students. We argue that moral orientation, rather than moral reasoning, might be more efficacious in minimising the likelihood of inappropriate ethical behaviour in medicine. A conceptualisation and measure of moral orientation are presented, together with findings from 11 samples of medical school applicants and students.

AIM:

To provide empirical evidence for the reliability and validity of a measure of moral orientation and to explore gender, age, cultural and educational influences on moral orientation.

METHODS:

A questionnaire designed to measure a libertarian-dual-communitarian dimension of moral orientation was completed by 7864 medical school applicants and students in Australia, Israel, Fiji, New Zealand, Scotland and England and by 84 Australian psychology students between 1997 and 2001.

RESULTS:

Older respondents produced marginally higher (more communitarian) moral orientation scores, as did women compared to men. Minor but significant (P <0.05) cultural differences were found. The Israeli samples produced higher mean moral orientation scores, while the Australian psychology student sample produced a lower (more libertarian) mean score relative to all other samples. No significant change in moral orientation score was observed after 1 year in a sample of Australian medical school students (n=59), although some differences observed between 5 cohorts of Australian medical students (Years 1-5; n=234) did reach significance. Moral orientation scores were found to be significantly correlated with a number of personality measures, providing evidence of construct validity. In all samples moral orientation significantly predicted the moral decisions made in response to the hypothetical dilemmas embedded in the measurement instrument.

DISCUSSION:

The results provide support for the conceptualisation of a libertarian-dual-communitarian dimension of moral orientation and demonstrate the psychometric properties of the measurement instrument. A number of questions concerning the use of such tests in selection procedures are considered.





Introduction

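Commenting in the Canadian Medical Association journal on findings that medical students' stage of moral reasoning appears to decline over the course of their education, Singer addressed some of the complexities of selecting and training ethical practitioners. In particular, Singer suggested that medical students be selected on the basis of their level of moral reasoning. Although this seems reasonable, we argue that tests of moral reasoning are inappropriate as selection instruments and that it may be more valuable to consider moral orientation.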

In a recent commentary in the Canadian Medical Association journal, written in response to findings that students' stages of moral reasoning appeared to decline over the course of medical education,1 Singer2 addressed some of the complexities of selecting and training ethical practitioners. In particular, Singer suggested the need for the selection of medical students to be based on level of moral reasoning. While this seems reasonable, we would argue that tests of moral reasoning are inappropriate for use as selection instruments and that it may be more valuable to consider individual differences in ‘moral orientation’.



Moral reasoning

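Much of the literature on moral psychology and moral reasoning builds on the work of Kohlberg, who proposed a cognitive developmental theory of moral development built on 6 stages of moral justice reasoning. The stages reflect an individual's progressive cognitive ability to apply abstract ethical principles to ethical issues. Kohlberg's theory underlies several tests, including the MJI, SROM and DIT.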

Much of the literature relating to moral reasoning and moral psychology is based upon the work of Lawrence Kohlberg,3,4 who outlined a cognitive developmental theory of moral development constructed upon 6 stages of moral justice reasoning (Table 1). These stages reflect an individual's progressive cognitive developmental ability to utilise abstract ethical principles in addressing ethical issues. Kohlberg's theory of moral reasoning forms the basis of a number of tests of morality, including the Moral Judgement Interview (MJI),5 the Sociomoral Reflective Objective Measure (SROM)6 and the Defining Issues Test (DIT).7




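Kohlberg's theory and measure are widely accepted, but questions have been raised about cultural (particularly Western liberal) bias, the lack of evidence for stage 6 and to some extent stage 5, possible gender bias, the scoring method of the MJI, and how well moral reasoning stage predicts behaviour. There is, however, a more fundamental problem with the design of the measures of moral reasoning (MJI, DIT, SROM). Respondents are given a hypothetical dilemma, make a moral decision, and then give reasons for it. The moral reasoning score is derived from those reasons, so what these tests measure is the ability to justify a decision after it has been made. Because Kohlberg's theory concerns the developing capacity to use abstract ethical principles rather than a person's moral values or beliefs, this design is appropriate and logical. As people mature and gain more education, they become better able to use abstract principles to justify their moral decisions.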

While Kohlberg's theory and measure are highly regarded, questions have been raised about cultural or Western liberal bias,8–11 a lack of evidence for stage 6 and to some extent stage 5,12 a possible gender bias,13 the scoring method of the MJI,14,15 and the ability of moral reasoning stages to predict behaviour.14,16,17 There is, however, a further, more fundamental problem with the design of major measures of moral reasoning (e.g. the MJI, DIT and SROM). In each of these instruments, respondents are presented with a hypothetical dilemma and asked to make a moral decision concerning the dilemma and to give reasons for their decision. Moral reasoning stage scores are determined from the reasons they provide. Thus, these tests appear to provide a measure of the ability to produce post-decisional justifications. Given that Kohlberg was not concerned with a person's moral values and beliefs (cognitive content), but with the development of the capacity to utilise abstract ethical principles (cognitive structure), the measurement design is appropriate and logical. As individuals mature and gain greater levels of education,18 they become increasingly able to utilise abstract principles to justify their moral decisions.


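However, the ability to justify a decision may have little to do with whether the decision or the resulting behaviour was actually moral. History offers many examples, and recent media reports have carried principled justifications both for and against invading Iraq. As Bandura notes, it is not uncommon for sophisticated moral justifications to serve inhumane ends.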

The ability to justify a decision, however, may have little to do with whether that decision or subsequent behaviour was actually ‘moral’. History is replete with examples of individuals who committed atrocities but were quite able to justify their actions by reference to normative ethical theories or to ethical principles. More recently, the media has reported principled justifications for invading Iraq and principled justifications for not invading Iraq. As Bandura19 notes, ‘It is not uncommon for sophisticated moral justifications to subserve inhumane endeavours.’


The independence of a moral decision from the ability to justify it is not limited to inhumane acts. The Kantian ethical principles that underpin Kohlberg's highest stage can be used to justify both continuing and discontinuing mechanical ventilation.

The independence of a moral decision and the ability to justify the decision are not limited to inhumane endeavours. For example, the decisions to discontinue or to continue mechanical ventilation can both be justified using the abstract Kantian ethical principles that underpin Kohlberg's highest stage of moral reasoning ability.


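Rather than focusing on the justifications a person gives for a decision, an alternative is to ask what it is about the individual that determines their decisions. In other words: which psychological variables produce individual differences in ethical sensitivity and moral decision making? We argue that 3 factors are highly relevant.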

Rather than focussing on the justifications that an individual might give for their decisions, an alternative view is to consider what it is about an individual that determines their opinions, their decisions and their actions. To put this in question form: what psychological variables lead to individual difference in ethical sensitivity (the recognition of an ethical situation), moral decision making, the decisions made, and the interpersonal behaviours displayed in making and enacting the moral decision? We would argue that 3 factors are highly relevant:


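1. The individual's moral orientation

2. Personality traits related to moral behaviour

3. The individual's knowledge and experience of moral norms, policies and professional principles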

1. an individual's moral orientation;

2. personality traits that may influence moral decision making and the performance of moral behaviour, and

3. an individual's knowledge and experience of moral norms, laws, policies, professional principles and the professional culture in which the person is operating.





Moral orientation

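The concept of moral orientation was proposed by Gilligan, who suggested that women are more care oriented and men more justice oriented. Although empirical studies have not clearly supported this hypothesis, the contention that individual differences in moral orientation may influence moral behaviour is noteworthy.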

The concept of moral orientation was proposed by Gilligan,13 who suggested that women are more care oriented while men are justice oriented. While this hypothesis has not been clearly supported by empirical studies,20,21 the contention that individual differences in moral orientation might be influential in moral behaviour is noteworthy.


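We have developed an alternative conceptualisation of moral orientation, based on studies using a questionnaire designed to measure moral orientation before a moral decision is made. This research indicates that individual differences in moral orientation are normally distributed, with libertarians at one extreme and communitarians at the other. Most respondents fall in between and give roughly equal weight to individual needs and the needs of society (a 'dual' orientation).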

We have developed an alternative conceptualisation of moral orientation based on insights arising from empirical studies using a questionnaire designed to measure moral orientation prior to the making of a moral decision.22 This research indicated that individual differences in moral orientation formed a normally distributed trait-like dimension with, 

    • at one extreme, respondents consistently placing greater importance on the needs, rights and well-being of individuals and relatively less importance on the rights, needs, norms and well-being of society and referent groups within society. We labelled this a ‘libertarian’ moral orientation. 
    • The opposite was apparent at the other extreme of the dimension, with respondents consistently placing greater importance on group/society needs and relatively less importance to the needs of individuals: a ‘communitarian’ orientation. 
    • A majority of respondents, occupying the central area of the score distribution, appeared to give approximately equal importance to individual needs and group/societal needs, indicating a ‘dual’ moral orientation.


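On this view, when a person faces a moral dilemma, moral orientation mediates how the context is perceived and processed and how the options and their consequences for individuals and groups are evaluated, and thereby determines or predicts the decision made.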

The articulation of this view of moral orientation is that, when presented with a moral dilemma, the moral orientation of the respondent mediates the perception and processing of the context, the evaluation of potential options and consequences for individuals and groups and determines/predicts the moral decisions the respondent makes. In short, when confronted by a moral dilemma, 

    • libertarians will ‘see’ and place greater value on the needs of and potential consequences for the individual/s in the context, 
    • communitarians will ‘see’ and place greater value on the needs of and potential consequences for society and important referent groups within that society, and 
    • the dual-oriented will ‘see’ and approximately equally value the needs of and consequences for both the individual/s and society.



The above conceptualisation emerged in concert with the development of a questionnaire-based measure of libertarian−communitarian moral orientations that we called the Mojac Scale.



Mojac Scale scores are related to values in a way that provides some evidence of construct validity, and they are unrelated to moral reasoning stage.

Scores from this scale have been found to be empirically related to the values of hedonism and social power (favoured by libertarians), beneficence and tradition (favoured by communitarians), thus providing some evidence of construct validity. Furthermore, the scores were found to be unrelated to moral reasoning stage.22,23



The study reported below aimed to examine the influence of education, age, gender and culture and the relationship of the libertarian−communitarian dimension to particular personality traits and the prediction of moral decisions.



Methods


Participants


From 1997 to 2001 data were collected from 11 samples of applicants to medical schools and medical school students in Australia, England, Scotland, New Zealand, Fiji and Israel. The samples were chosen for the purpose of determining test norms and examining the variables of education, age, gender and culture in conjunction with a broader research project reported elsewhere.24 Sample description, size, age and gender details are shown in Table 2.





Instruments

Two versions of the Mojac Scale were used: a short 24-item version and a long 45-item version.

All participants completed either the short (24 items) or long (45 items) measure of the Mojac Scale.22,23 The short measure (Mojac-24) consists of 3 hypothetical dilemmas (vignettes); Mojac-45 contains an additional dilemma. Respondents read each dilemma and then respond to a series of statements relevant to the needs of individuals or to the needs and moral expectations/norms of society using a 4-point Likert scale (strongly agree to strongly disagree). Respondents were also asked to make a forced choice 2-option ‘final decision’ for each dilemma. Responses to the 24 (or 45) statement items were used to derive a libertarian (low score) to communitarian (high score) moral orientation score (LibCom score). An example of the Mojac protocol, using a dilemma based on the ‘Heinz’ dilemma used in the MJI, the DIT and SROM is given in the Appendix.


Dilemma example

Mr D's wife is dying from cancer. A new but expensive treatment for this type of cancer is available. However, all of Mr D's savings and assets have been spent on previous treatments and hospitalisation. The only way to obtain the treatment for his wife is to embezzle a large amount of money from the bank where Mr D has worked as a valued and trustworthy employee for 28 years.

What is your opinion? How do you feel about each of the following statements?

There is never any excuse for theft (group item)

(a)Strongly agree

(b)Agree

(c)Disagree

(d)Strongly disagree

A husband should try to save his wife's life (individual item)

(a)Strongly agree

(b)Agree

(c)Disagree

(d)Strongly disagree

Even in this situation stealing is wrong (group item)

(a)Strongly agree

(b)Agree

(c)Disagree

(d)Strongly disagree

Mr D should maintain his trustworthy reputation (group item)

(a)Strongly agree

(b)Agree

(c)Disagree

(d)Strongly disagree

Saving a person's life is more important than upholding the law (individual item)

(a)Strongly agree

(b)Agree

(c)Disagree

(d)Strongly disagree

Final decision question example

You now have to make a decision about what Mr D should do. For the next question select either (a) or (b)

(a)Mr D should steal the money

(b)Mr D should not steal the money
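The scoring described in the Instruments section can be sketched in a few lines. This is a minimal illustration, not the authors' scoring key: the text states only that low LibCom scores are libertarian and high scores communitarian, so the numeric coding of the four options (4 to 1) and the reverse-coding of individual items are assumptions, and the function and variable names are hypothetical.

```python
from typing import List, Tuple

# ASSUMED numeric key: the source lists the options but not their values.
AGREEMENT = {"a": 4, "b": 3, "c": 2, "d": 1}  # strongly agree .. strongly disagree


def libcom_score(responses: List[Tuple[str, str]]) -> int:
    """Sum item points so that higher totals mean a more communitarian profile.

    `responses` holds (item_type, option) pairs, where item_type is
    "group" or "individual" as labelled in the dilemma example.
    """
    total = 0
    for item_type, option in responses:
        points = AGREEMENT[option]
        if item_type == "individual":
            # ASSUMED reverse-coding: agreeing with an individual item
            # should pull the score toward the libertarian (low) end.
            points = 5 - points
        elif item_type != "group":
            raise ValueError(f"unknown item type: {item_type!r}")
        total += points
    return total


# Hypothetical answers to the five Mr D statements, in the order shown above.
mr_d_answers = [
    ("group", "a"),
    ("individual", "d"),
    ("group", "a"),
    ("group", "b"),
    ("individual", "c"),
]
print(libcom_score(mr_d_answers))
```

Under this assumed key a 24-item form would range from 24 to 96 and a 45-item form from 45 to 180, which is at least compatible with the sample means reported in the Results.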


Samples H and I also completed other personality tests, as did sample J.

Samples H and I also completed the following personality tests: Right-wing Authoritarianism,25 Social Desirability,26 the International Personality Item Pool (IPIP) measure of the Big 5 Factors of Personality27 (extroversion, neuroticism, openness, agreeableness and conscientiousness), the NACE Scale24 (a measure of narcissism, aloofness, confidence and empathy), and the 16 Personality Factors28 (16PF) scale. Sample J completed the Sensitivity to Punishment and Sensitivity to Reward Scale29 (SPSRS), the IPIP and the Eysenck Personality Questionnaire30 (EPQ: Extroversion, Neuroticism and Psychoticism). These tests were chosen as the traits they measure (or specific traits within multi-trait scales) were expected to provide further evidence of the construct validity of the Mojac Scale. In samples A and G, participants also completed a battery of tests for the purpose of selection to medical schools; however, scores from the selection tests were not included in this study.



Procedure


For all samples, the tests were administered under supervision in pen and paper format, using either optical mark reading (OMR) response forms or hand-marked forms, to participants in either a large hall or room. The response sheets were then collected and the data either scanned or hand-entered into spreadsheets for statistical analysis.



Results


Reliability


Cronbach's α reliability coefficients of 0.82−0.87 for the 24-item short form and 0.83−0.92 for the 45-item version were found (Table 2), indicating a high and stable internal consistency for the measure.
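For readers unfamiliar with the statistic, Cronbach's α can be computed from a respondents-by-items table as k/(k−1) × (1 − Σ item variances / variance of total scores). The sketch below uses only the Python standard library; the function name and the tiny data set are made up for illustration and are not the study's data.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a table with one row per respondent, one column per item."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up responses: four respondents who answer all three items identically,
# so the items are perfectly consistent with one another.
toy_rows = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print(round(cronbach_alpha(toy_rows), 3))
```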



Age and gender

Age and gender: weak but significant associations.

Although the age distribution was greatly skewed in all samples, weak but significant (P < 0.05) positive correlations were found between age and combined Mojac-24 samples LibCom scores (r = 0.19) and combined Mojac-45 samples (r = 0.18). Weak but significant gender differences were also found. In the Mojac-24 samples the mean LibCom score for women (66.4, SD = 9.7) was significantly higher than the mean LibCom score for men (65.0, SD = 9.1; t = −4.88, P < 0.001). This difference was also observed in the Mojac-45 samples (women 114.9, SD = 14.3; men 110.7, SD = 15.4; t = 8.51, P < 0.001).



Differences between samples

Differences between samples: means were broadly similar, with a few samples differing significantly.

Generally, differences in the mean LibCom scores, standard deviation and range across samples (Table 2) were not large and a similar near-normal distribution was evident in all samples. Some differences did reach statistical significance as indicated by a 1-way analysis of variance (ANOVA) and Tukey's post hoc pairwise comparisons with a family error rate of P = 0.05. For the Mojac-24 samples, the means from both Israeli samples were significantly higher than all other Mojac-24 samples, while the mean for the psychology students was significantly lower than all other Mojac-24 samples [F(5, 4227) = 30.1, P < 0.001]. A 1-way ANOVA of the Mojac-45 samples also reached statistical significance [F(4, 3714) = 3.54, P = 0.007]; however, no significant difference between any pair of means was found.



Influence of medical education

Influence of education: in sample C, scores barely changed over 1 year.

A subsample of sample C completed the Mojac-24 again 12 months after the initial testing in 1999. If medical education does influence moral orientation, then a significant difference in the 1999 and 2000 sample mean scores would be expected. For this subsample of 59 students, the 2000 LibCom mean of 64.0 (SD = 7.6) was not significantly different from the 1999 mean of 62.9 (SD = 7.3). The correlation between scores produced in 1999 and 2000 was r = 0.77, indicating only minor changes in moral orientation after 1 year. While this finding also suggests acceptable test-retest reliability, a study with a more typical period of 3−4 weeks between test and retest has not yet been undertaken.


Comparing Years 1 to 5 within sample C, Years 3 and 5 scored higher than Year 1.

Sample C was of sufficient size to allow cross-sectional comparison of LibCom means between students from Years 1, 2, 3, 4 and 5 of the medicine programme, with 65, 43, 59, 30 and 37 students in each year cohort, respectively. ANOVA indicated that there were significant differences between the year groups [F(4, 229) = 4.72, P = 0.001]. A Tukey's pairwise comparison of the means with a family error rate of P = 0.05 indicated that Year 1 participants had significantly lower scores than Year 3 and Year 5 participants. This is also indicated in the plot of the means and 95% confidence intervals shown in Fig. 1. No other significant differences between year levels were found.





Year of study was a significant predictor of LibCom score; age was not.

A tendency for later-year students to produce higher LibCom scores (more communitarian) is apparent in Fig. 1. A regression analysis found that year of study was a significant predictor of LibCom scores (t = 2.85, P = 0.005), while age was not (P > 0.05). However, the variance of LibCom scores accounted for by the predictors was minimal (R-Sq = 4.2%; t = 3.18, P = 0.002). These findings indicate modest differences between the year cohorts tested. Observation of any change in moral orientation requires a longitudinal design and such a study is yet to be completed.



Construct validity

Construct validity was examined in 3 samples; LibCom scores correlated significantly with the marker tests.

The conceptualisation of Mojac scores as indicative of a continuum of libertarian to communitarian moral orientation was tested against several well validated personality measures in 3 samples: 508 Scottish medical school applicants (sample J) and 2 samples of New Zealand medical school students (samples H and I in which a total of 204 participants completed the same test battery). Table 3 shows highly significant (P < 0.001) correlations between Mojac-45 LibCom scores and scores from the ‘marker’ tests.





Predictive validity

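Predictive validity: when final decision scores from the items embedded in the Mojac Scale were compared with LibCom scores, LibCom scores accounted for about 30% to 40% of the variance.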

The relationship between the libertarian−communitarian dimension and the moral decisions individuals make was examined using the final decision items embedded in the Mojac Scale. Final decision scores for each respondent were determined by coding the response options for each of the 3 (Mojac-24) or 4 (Mojac-45) final decision items as 1 for a decision that favoured the individual in the dilemma and 2 for a decision that favoured the group. The final decision items were then summed to produce an overall final decision score.
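The final decision coding just described is simple enough to show directly. In the sketch below the 1/2 coding follows the text; the labels "individual" and "group" and the function name are hypothetical conveniences, and mapping a given answer (for example, "Mr D should steal the money") to one of those labels is left to the scorer.

```python
from typing import Sequence

# Coding taken from the text: 1 = decision favours the individual,
# 2 = decision favours the group.
DECISION_CODES = {"individual": 1, "group": 2}


def final_decision_score(decisions: Sequence[str]) -> int:
    """Sum the coded final decisions (3 items for Mojac-24, 4 for Mojac-45)."""
    return sum(DECISION_CODES[d] for d in decisions)


# Hypothetical Mojac-24 respondent: one decision for the individual, two for the group.
mojac24_decisions = ["individual", "group", "group"]
print(final_decision_score(mojac24_decisions))  # 1 + 2 + 2
```

With this coding the overall score runs from 3 to 6 on the short form and from 4 to 8 on the long form, higher values meaning more group-favouring decisions.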


Regression analysis was used to examine the relationship between LibCom scores and final decision scores. Across all samples, LibCom scores were found to account for approximately 30% (Mojac-24) to 40% (Mojac-45) of the variance in final decision scores. R-Sq values for all samples are given in Table 4. Additionally, in the sample of 2906 Australian medical school applicants, 3 moral orientation groups were created using a tri-median split of LibCom scores. Figure 2 indicates that libertarian-oriented respondents showed a strong tendency to make decisions that favoured the outcome for individuals in each of the Mojac dilemmas, while communitarians made decisions that favoured the maintenance of group norms, values and laws. Dual-oriented respondents sometimes favoured the individuals and sometimes favoured the group in their final decisions. Analysis by ANOVA found the differences between each group to be highly significant [F(2, 2901) = 897.39, P < 0.001].
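"Variance accounted for" here is the R-Sq of a regression with a single predictor, which equals the squared Pearson correlation between the two score lists. The following sketch shows that calculation with the standard library only; the function name and the numbers are invented for illustration and do not reproduce the study's data or software.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Share of variance in y explained by a straight-line fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # For one predictor, R^2 is the squared correlation coefficient.
    return sxy ** 2 / (sxx * syy)


# Invented scores for four respondents (predictor first, outcome second).
toy_libcom = [1, 2, 3, 4]
toy_decisions = [1, 3, 2, 4]
print(round(r_squared(toy_libcom, toy_decisions), 2))
```

An R-Sq of 0.30 to 0.40, as reported above, would correspond to a correlation of roughly 0.55 to 0.63 between LibCom and final decision scores.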






Discussion

Moral orientation significantly predicted responses to the final decision items.

This study has provided empirical evidence of the validity of the Mojac measure of moral orientation. In addition, the results of our research support the hypothesis that an individual's libertarian−communitarian moral orientation is a determinant of their moral decision making. In each of the samples tested, a person's moral orientation was found to be a significant predictor of their responses to the final moral decision questions embedded in the Mojac Scale.


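Differences by age, gender and culture were small but significant. Older respondents were more communitarian, although the age distribution of the samples is not representative of the general population.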

Age, gender and cultural differences, although not large, were significant. Older respondents tended to be more communitarian; however, the distribution of respondents' ages in the samples is not representative of the general population. Some differences in LibCom scores were observed across 5 year-of-study student cohorts and were found to be weakly predicted by exposure to medical education rather than by age. However, the cross-sectional design of the study does not allow any inference concerning change in moral orientation. Longitudinal research exploring moral orientation change is required.


Men were more libertarian and women more communitarian.

Men generally were more libertarian and women more communitarian, although a notable exception was the predominantly female psychology student sample, which produced a significantly lower mean LibCom score (more libertarian) compared to the medical school samples. Respondents from the Israeli samples, coming from a somewhat more collectivistic culture, were generally more communitarian. While a near normal distribution of scores was observed within each group (men versus women and within each cultural group), indicating that the differences within groups were much greater than the differences between groups, further research regarding the influence of age, gender and culture is required. If differences are consistently found then the establishment of separate norms might be warranted.


Moral orientation scores showed a pattern consistent with the scales of established personality tests.

Importantly, libertarian−communitarian moral orientation scores were found to be related to well validated personality scales in a conceptually coherent patterning. 

  • High Mojac scorers (indicating an extreme communitarian moral orientation) had tendencies (as identified by parallel test instruments) to be authoritarian, conscientious, perfectionistic and self-controlled, while 
  • low scorers (indicating an extreme libertarian orientation) tended to be disorderly, narcissistic, abstracted and unrestrained. 

Thus, when presented with an ethical dilemma in a medical situation, extreme communitarians might tend to be inflexible, reliant on procedures, rules and their perception of the ‘authority’ of medicine at the expense of the unique needs, rights and autonomy of their patients. Conversely, extreme libertarians might be overly flexible and ignore or bend the usual rules of procedure while being disproportionately concerned for the rights, well-being and liberty of patients and themselves as doctors.


Although the Mojac Scale uses dilemmas unrelated to medicine, a large body of research supports the assumption that traits generalise across contexts.

An important point concerns the use of non-medical dilemmas in the Mojac Scale (which is also the case with the well known tests of moral reasoning). The aim of such tests, and many others, is to measure individual differences in a particular psychological construct: in this case, moral orientation. The assumption is that these individual psychological differences influence a person's behaviour across situations. While this is arguable, the extensive literature on personality traits generally supports the assumption that traits generalise across different contexts. The correlations found between the Mojac moral orientation scores and the personality trait scores noted above empirically support the notion that, regardless of the stimulus used, the scale is measuring a psychological trait or tendency.


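Use in medical school selection: for cognitive measures, applicants above a certain score or rank are selected. For moral orientation, however, it is more reasonable to exclude applicants at the extremes; cut-points of about 2 SD are suggested.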

The use of tests in medical school selection procedures, be they tests of academic ability, cognitive skills, personality traits or moral orientation, requires that each test reliably measures the trait or ability it purports to. For ethical reasons, those charged with the responsibility for assessing and selecting medical school students clearly would need to consider the properties of any measure used. Additionally, considerable care needs to be taken in establishing how scores determine selection. Typically, as is the case with cognitive measures, test scores are ranked from highest to lowest and a ‘cut-point’ determined, above which applicants are retained in the selection pool. However, this might be an inappropriate procedure with tests that indicate individual differences in moral orientation, moral reasoning or moral values. To admit only high scoring applicants on such tests would require a test to produce a range of scores from the ‘most likely to be moral’ to the ‘least likely to be moral’ and the validity of such a test would be highly questionable. In view of the correlations found in the present study, an argument can be made that extreme high and low scorers, in this case extreme communitarians and extreme libertarians (perhaps defined by cut-points of + 2 SD and − 2 SD from the mean, respectively), could be considered for exclusion from the applicant pool on the grounds that their moral orientation is likely to be vocationally incongruent with the ethical standards and requirements of the medical context. The substantial majority who remain in the applicant pool would approximately equally value the needs, rights and well-being of individual patients and the needs, rights and well-being of others, the profession and society as a whole and so might be more likely to behave in an ethically appropriate way in the practice of medicine.
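The exclusion rule suggested above (cut-points at + 2 SD and − 2 SD from the mean) can be sketched as follows. This is only an illustration: the function name, the applicant IDs and the scores are hypothetical, and taking the mean and SD from the applicant pool itself (rather than from published norms) is an assumption, since the text does not say which reference distribution would be used.

```python
from statistics import mean, pstdev


def screen_extremes(scores, k=2.0):
    """Return (retained, excluded, (low, high)) using mean ± k SD cut-points.

    `scores` maps a hypothetical applicant ID to a moral orientation score.
    """
    values = list(scores.values())
    mu = mean(values)
    sd = pstdev(values)  # assumption: SD is taken over the applicant pool itself
    low, high = mu - k * sd, mu + k * sd
    retained = {a for a, s in scores.items() if low <= s <= high}
    excluded = set(scores) - retained
    return retained, excluded, (low, high)


# Hypothetical pool: most applicants near the middle, two at the extremes.
pool = {f"applicant_{i}": 50 for i in range(20)}
pool["extreme_libertarian"] = 10      # low score = more libertarian
pool["extreme_communitarian"] = 90    # high score = more communitarian
retained, excluded, cut_points = screen_extremes(pool)
```

Unlike a ranked cut-point, this keeps the middle of the distribution and removes both tails, which is the contrast the paragraph draws with cognitive measures.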



Screening out extreme scorers assumes, by definition, that a majority of applicants have the qualities to practise medicine ethically, particularly, as noted by Singer,2 if the medical education undertaken by successful applicants includes ethics training, evaluation of ethics in performance, and an ethical learning environment. Rather than select on the basis of high moral reasoning scores, it would seem more realistic and appropriate to screen out those few who indicate an extreme moral orientation. This would allow for moral development with time, education and experience and acknowledgement of the fact that most health professionals behave ethically. Most people are able to consider the needs and perspectives of both the individual and the group in their daily lives. If most did not, it is unlikely we humans would have survived and thrived as we have.









 2005 Mar;39(3):266-75.

Selection of medical students according to their moral orientation.

Author information

  • 1Faculty of Health, University of Newcastle, Callaghan, New South Wales, Australia. Miles.Bore@newcastle.edu.au

Abstract

INTRODUCTION:

Consideration has been given to the use of tests of moral reasoning in the selection procedure for medical students. We argue that moral orientation, rather than moral reasoning, might be more efficacious in minimising the likelihood of inappropriate ethical behaviour in medicine. A conceptualisation and measure of moral orientation are presented, together with findings from 11 samples of medical school applicants and students.

AIM:

To provide empirical evidence for the reliability and validity of a measure of moral orientation and to explore gender, age, cultural and educational influences on moral orientation.

METHODS:

A questionnaire designed to measure a libertarian-dual-communitarian dimension of moral orientation was completed by 7864 medical school applicants and students in Australia, Israel, Fiji, New Zealand, Scotland and England and by 84 Australian psychology students between 1997 and 2001.

RESULTS:

Older respondents produced marginally higher (more communitarian) moral orientation scores, as did women compared to men. Minor but significant (P < 0.05) cultural differences were found. The Israeli samples produced higher mean moral orientation scores, while the Australian psychology student sample produced a lower (more libertarian) mean score relative to all other samples. No significant change in moral orientation score was observed after 1 year in a sample of Australian medical school students (n=59), although some differences observed between 5 cohorts of Australian medical students (Years 1-5; n=234) did reach significance. Moral orientation scores were found to be significantly correlated with a number of personality measures, providing evidence of construct validity. In all samples moral orientation significantly predicted the moral decisions made in response to the hypothetical dilemmas embedded in the measurement instrument.

DISCUSSION:

The results provide support for the conceptualisation of a libertarian-dual-communitarian dimension of moral orientation and demonstrate the psychometric properties of the measurement instrument. A number of questions concerning the use of such tests in selection procedures are considered.

PMID: 15733162


Is it possible to assess the “ethics” of medical school applicants? (Journal of Medical Ethics, 2001)

Michael Lowe, Ian Kerridge, Miles Bore, Don Munro and David Powis Fiji School of Medicine, Fiji, and University of Newcastle, Australia






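Assessing applicants' morality in medical school selection is difficult but important. However, it is inappropriate to assess ethical knowledge, moral reasoning ability or ethical beliefs, because these can be developed through education. Attitudes towards ethical issues and ethical sensitivity might be tested in the context of testing for personal attributes. Any “ethics” test needs validation before it is used in admissions.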

Questions surrounding the assessment of medical school applicants’ morality are difficult but they are nevertheless important for medical schools to consider. It is probably inappropriate to attempt to assess medical school applicants’ ethical knowledge, moral reasoning, or beliefs about ethical issues as these all may be developed during the process of education. Attitudes towards ethical issues and ethical sensitivity, however, might be tested in the context of testing for personality attributes. Before any “ethics” testing is introduced as part of screening for admission to medical school it would require validation. We suggest a number of ways in which this might be achieved. (Journal of Medical Ethics 2001;27:404–408)





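Given the lack of evidence that undergraduate or postgraduate medical education can change ethical behaviour in the long term, the way to reduce the number of unethical doctors is to identify and exclude them before or during that education.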

Given the paucity of evidence that undergraduate or postgraduate education may change or shape ethical practice in the long term, it seems that the only way to prevent people like Dr Shipman from continuing in the profession might lie in attempting to identify unethical doctors prior to entering medical school or during their undergraduate or postgraduate education, and excluding them from the profession before they cause harm.


Ethics is the study of how we ought to act; an unethical doctor is one who does what he or she ought not to do, or fails to do what he or she should.

Ethics is the study of what we ought to do. An unethical doctor is therefore a doctor who does things that he or she ought not, or does not do the things that he or she should.


Several factors enable doctors to act ethically. From an educational standpoint they fall into two groups: those that can be taught and those that appear to be innate.

There are a number of factors that enable doctors to act in an ethical way, including a desire or motivation to do so, a knowledge of ethical issues, the development of communication skills and other skills required for medical competence, a capacity for moral or ethical reasoning (we will use the terms “moral” and “ethical” synonymously), and an individual’s beliefs, attitudes, and sensitivity to ethical issues. Educationally, these factors appear to fall into two main groups—those that can be taught, and those that appear to be innate.


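A further problem with assessing students' ethical reasoning is that it resembles any other form of reasoning: it is simply logic applied to moral matters. Students who do well on other tests of logic are likely to do well on this one too.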

A different problem occurs with the question as to whether we should examine students’ ethical reasoning. In many ways, ethical reasoning is like any other form of reasoning, it is simply the application of logic to matters of morality. Students who are selected for medicine on the basis of other tests of logic, are likely to do well at tests of moral reasoning as well.


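However, the term “moral reasoning” also carries a slightly different meaning: not only a process of logic but also a process of ethical maturation or development, as in Kohlberg's theory of moral development. Kohlberg concluded that moral development proceeds through six stages. Measures of moral reasoning such as the MJI, DIT and SRM were developed on the basis of this theory.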

However, the term “moral reasoning” also has a slightly different meaning, referring not only to a process of logic, but also to a process of ethical maturation or development, such as in Kohlberg’s theory of ethical development. Kohlberg came to the conclusion that moral development occurred in six defined stages, leading from a state of moral immaturity in which ethical decisions were taken ad hoc, to higher levels of moral development which involved individuals acting objectively, rationally, and impartially, following universal ethical principles of a higher morality. His theory has been studied extensively, and it underlies the development of measures of moral reasoning such as the Moral Judgement Interview (MJI),3 The Defining Issues Test (DIT),4 and the Sociomoral Reflection Measure (SRM).5


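Kohlberg's theory holds that the cognitive ability to reason morally moves to higher stages with age, but age is not the only variable in moral development. Studies suggest that richer experience of moral decision-making situations is associated with higher moral reasoning scores, and some studies have shown that education can raise these scores.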

Kohlberg’s theory suggests that as the individual ages, the cognitive ability to reason morally moves through a hierarchy of invariant stages.3 Age, however, is not the only variable in moral development, as studies suggest that the opportunity to experience an enriched moral decision making environment may also influence moral reasoning scores.6 Self, Baldwin, and Wolinsky provided an example of this effect when they demonstrated that medical students had a highly significant gain in the adoption of principled reasoning as measured by the DIT after a course in medical ethics.7 This has also been observed in other longitudinal studies,8 and in a meta-analysis by Schlaefli, Rest, and Thomas.9 These findings support the notion that educational experience can increase moral reasoning scores.


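However, it is unclear how moral reasoning relates to moral decision making, because tests of moral reasoning are based on the justifications produced after a moral decision has been made. Indeed, Kohlberg's theory itself presupposes that moral reasoning is independent of the moral decisions made.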

However, it is unclear how moral reasoning is related to moral decision making, since tests of moral reasoning tend to be based upon the justifications produced by an individual after a moral decision has been made. Indeed, Kohlberg’s theoretical premise in developing his theory of moral reasoning was that reasoning is independent of moral decisions made.


Because moral reasoning can change with educational experience and may be unrelated to moral decisions, measures of moral reasoning are unlikely to be suitable for medical school selection.

It appears therefore that, since moral reasoning has been shown to change with the educative experience and may be unrelated to the moral decisions individuals make, measures of moral reasoning are unlikely to be suitable for inclusion in the selection of applicants for medical education.




Ethical beliefs, attitudes and sensitivity


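Testing applicants for particular ethical beliefs would not be technically difficult, and some ethical beliefs may make medicine difficult to practise in some environments (for example, beliefs about female circumcision or about sabotaging animal experiments).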

It would not be technically difficult to examine applicants for particular ethical beliefs, and there are some ethical beliefs that may make medicine difficult to practise in some environments. For example, applicants from some ethnic groups may believe that it is reasonable to perform female circumcisions despite this being widely considered in Western society to be immoral; some applicants may be willing to sabotage animal experiments out of interest for the animals; yet others may believe it is a valuable aim of humanity to pharmacologically enhance sportsmen and women so they can perform better.


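However, we do not think applicants should be rejected because of ethical beliefs that may later be developed or discarded through education. Unsophisticated beliefs about ethical subjects are to be expected in junior students, and broadening their experience and knowledge is a role of medical education.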

However, we do not believe applicants to medical school should be rejected because of their individual ethical beliefs, as ideas can be developed or discarded by individuals throughout their medical training and later careers. Unsophisticated beliefs about ethical subjects should be expected in junior students, and one role of medical education is to broaden their experience and knowledge-base.


Ethical sensitivity might be tested by giving applicants a vignette and a task, such as listing the ethical issues that might arise or justifying the various courses of action.

It may be possible to test for ethical sensitivity by providing applicants with a vignette and giving them a task to perform, such as coming up with a list of ethical issues that might arise, or justifying the various courses of action.


This picture of the unethical doctor, as a narcissistic egotist concerned only with his or her own interests, will be familiar to anyone who knows the medical profession.

This picture of unethical doctors—as narcissistic egotists, unconcerned with anyone’s interests but their own—is familiar to anyone involved with the medical profession or its representations in the popular press; and these descriptions match the profiles described in some other studies of unethical doctors.




Ethics versus psychiatric diagnosis

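We have covered a range of personality characteristics that may be associated with unethical behaviour; in extreme cases, these characteristics may amount to personality disorders.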

We have described a number of personality characteristics that may be associated with unethical behaviour. In the extreme cases, some of these personality traits may even be described as personality disorders, although clearly this does not apply in all cases.


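Several authors have noted the links between “bad” and “mad”, but many commentators feel the two domains should be kept as far apart as possible, because psychiatric diagnosis should be separate from moral judgment. Yet psychiatric diagnosis, particularly of personality disorders, often encroaches on and overlaps with moral judgment.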

Several authors have commented upon the links between the ethical domain (“bad”) and the psychiatric domain (“mad”).13 Many commentators feel these domains should be kept apart as far as possible, and that psychiatric diagnosis should be kept separate from moral judgments.14 Yet psychiatry has always had a tendency to move beyond its brief, and moral judgments and psychiatric diagnoses often appear to overlap, particularly in the area of personality disorders.


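If applicants are to be tested for personality disorders, it must be recognised that most medical school applicants have already been selected on the basis of high academic intelligence. The literature on antisocial personalities and psychopaths emphasises how difficult the condition is to diagnose, because such people may present with “a normal and even a charming and ingratiating exterior”. People with antisocial personality disorder “do not tell the truth and cannot be trusted to carry out any task or adhere to any conventional standard of morality”.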

If we are going to test applicants for personality disorders, it is important to realise that most medical school applicants have already been picked on the basis of a high level of academic intelligence. The literature on antisocial personalities and psychopaths emphasises how difficult it is to diagnose this condition as people with these conditions may present with “a normal and even a charming and ingratiating exterior . . .. Antisocial personality disorder patients do not tell the truth and cannot be trusted to carry out any task or adhere to any conventional standard of morality.”15


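The narcissistic personality, by contrast, is relatively easy to detect, and many instruments exist for diagnosing it. The difficulty with screening such people out is that, although they are unpleasant to work with, narcissistic traits are common among leaders of the profession. If all narcissists are rejected, do we lose valuable future leaders who, through their own egotism, try new ideas and procedures that others do not dare?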

The narcissistic personality on the other hand, appears to be relatively easy to trap in his or her own conceits, and there are a number of instruments used for diagnosing this condition. The difficulty with excluding people with this type of personality is that, although it is widely agreed that they are unpleasant to work with, we are struck by the prevalence of narcissistic traits among leaders of the profession. If we reject the narcissists, do we lose valuable future leaders who through their own egotism try new ideas and procedures that others do not dare?



Conclusion

Medical school entry is based upon a number of factors. Cut off marks for academic performance are perhaps the most popular methods of excluding potential applicants, although there is no evidence to justify the extremely high marks required for many courses. Courses are now including tests of logical reasoning, tests of lateral thinking, and testing that is known to discriminate in favour of certain groups (eg women) to the disadvantage of others.


We are not aware of any medical school that tests explicitly for the moral or ethical attributes of applicants.

We are not aware of any medical schools which test explicitly for moral or ethical attributes of applicants for medicine, although these topics are frequently covered in interviews. The reason for this is probably concerns about the methodological issues involved in defining and testing ethical attributes, and fear of introducing new biases and new forms of unjustified discrimination into the selection process. Indeed, if we are to develop measures for assessing applicants’ attitudes and sensitivity to moral issues, it is important that these should not be based purely on theoretical structures, but that they also be validated empirically. The main difficulty with validation is how to define unethical behaviour well enough to test any measures developed.




 2001 Dec;27(6):404-8.

Is it possible to assess the "ethics" of medical school applicants?

Author information

  • 1Fiji School of Medicine, Fiji, and University of Newcastle, Australia.

Abstract

Questions surrounding the assessment of medical school applicants' morality are difficult but they are nevertheless important for medical schools to consider. It is probably inappropriate to attempt to assess medical school applicants' ethical knowledge, moral reasoning, or beliefs about ethical issues as these all may be developed during the process of education. Attitudes towards ethical issues and ethical sensitivity, however, might be tested in the context of testing for personality attributes. Before any "ethics" testing is introduced as part of screening for admission to medical school it would require validation. We suggest a number of ways in which this might be achieved.

PMID: 11731605

PMCID: PMC1733480

Selection and study performance: comparing three admission processes within one medical school (Medical Education, 2014)

Nienke R Schripsema,1,2 Anke M van Trigt,2 Jan C C Borleffs2 & Janke Cohen-Schotanus1,2



OBJECTIVES:

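This study aimed to (1) compare the study performance of students admitted by three routes (top pre-university grades, a voluntary multifaceted selection process, or lottery); (2) examine whether students accepted in the multifaceted selection process outperformed those rejected in it; and (3) examine whether participating in the multifaceted selection process was itself related to performance.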

This study was conducted to: (i) analyse whether students admitted to one medical school based on top pre-university grades, a voluntary multifaceted selection process, or lottery, respectively, differed in study performance; (ii) examine whether students who were accepted in the multifaceted selection process outperformed their rejected peers, and (iii) analyse whether participation in the multifaceted selection procedure was related to performance.


METHODS:

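Four outcomes were compared: knowledge test scores, professionalism scores, study progress and dropout. The lottery-admitted group was split into students who had been rejected in the multifaceted selection process and students who had not participated in it. ANCOVA, logistic regression and Bonferroni post hoc tests were used.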

We examined knowledge test and professionalism scores, study progress and dropout in three cohorts of medical students admitted to the University of Groningen, the Netherlands in 2009, 2010 and 2011 (n = 1055). We divided the lottery-admitted group into, respectively, students who had not participated and students who had been rejected in the multifaceted selection process. We used ANCOVA modelling, logistic regression and Bonferroni post hoc multiple-comparison tests and controlled for gender and cohort.


RESULTS:

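The top pre-university grade group achieved higher knowledge test scores and earned more Year 1 course credits. This group also received the highest possible professionalism score more often than lottery-admitted students who had not participated in the multifaceted selection process. Students accepted in the multifaceted selection process obtained higher written test scores than lottery-admitted non-participants and received the highest professionalism score more often than both lottery-admitted groups. Lottery-admitted non-participants earned fewer Year 1 and Year 2 course credits than all other groups. Dropout rates differed among the groups, but no pairwise difference remained significant after correction for multiple comparisons.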

The top pre-university grade group achieved higher knowledge test scores and more Year 1 course credits than all other groups (p < 0.05). This group received the highest possible professionalism score more often than the lottery-admitted group that had not participated in the multifaceted selection process (p < 0.05). The group of students accepted in the multifaceted selection process obtained higher written test scores than the lottery-admitted group that had not participated (p < 0.05) and achieved the highest possible professionalism score more often than both lottery-admitted groups. The lottery-admitted group that had not participated in the multifaceted selection process earned fewer Year 1 and 2 course credits than all other groups (p < 0.05). Dropout rates differed among the groups (p < 0.05), but correction for multiple comparisons rendered all pairwise differences non-significant.


CONCLUSIONS:

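A top pre-university GPA was the best predictor of study performance. For non-academic performance, the multifaceted selection process was efficient at identifying suitable applicants. Participation in the multifaceted selection process itself appeared to predict higher performance.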

A top pre-university grade point average was the best predictor of performance. For so-called non-academic performance, the multifaceted selection process was efficient in identifying applicants with suitable skills. Participation in the multifaceted selection procedure seems to be predictive of higher performance. Further research is needed to assess whether our results are generalisable to other medical schools.





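Studies of the relationship between admission scores and later performance have two limitations. First, restriction of range arises because only accepted applicants are studied. Second, because most schools admit all students through a single process, different admission processes are hard to compare.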

Studies on the relationships between admission scores and later performance, however, are characterised by two limitations. Firstly, most study samples have a restriction of range because performance data are only available for applicants who were accepted in the selection process. It remains unclear how rejected applicants would have performed if they had been admitted. Secondly, in most medical schools, all students are admitted through the same admission process based on a single set of criteria, which limits opportunities to compare different admission processes.


Comparisons of accepted and rejected applicants

In the literature, few studies investigating performance differences between accepted and rejected applicants have been reported.8,9


Comparisons between different admission processes

A few studies have addressed the effects of different admission processes on student performance within one medical school.


The three-step Dutch medical school admission system

The Netherlands has a three-step admission system. In the first step, applicants with a top pre-university GPA are offered admission. Participation in the second step is voluntary. In this step, students participate in a multifaceted selection process that is organised by each medical school separately. The third step is a national weighted lottery, in which applicants who were rejected in the second step as well as applicants who have not participated in the second step can enrol.



METHODS


Context

University of Groningen medical school, the Netherlands: a 3-year Bachelor's and a 3-year Master's programme. Overview of the curriculum.

This study was performed at the University of Groningen in the Netherlands. The problem-based curriculum consists of a 3-year pre-clinical Bachelor’s and a 3-year clinical Master’s programme. The first year of the Bachelor’s programme includes four 10-week blocks, a year-long professionalism course, and the inter-university progress test.15,16 Each part provides students with a fixed number of course credits under the European Credit Transfer System. One course credit equals 28 hours of study. The maximum number of course credits per year is 60.



Admission processes in the Netherlands

Steps of the Dutch admission process

The Netherlands has a national policy for medical school admissions according to which applicants can be admitted through one of three steps. 

  • In the first step, students with a pre-university GPA of ≥ 8 (on a scale ranging from 1 = poor to 10 = excellent) gain admission to the medical school of their choice without further assessment. This grade average is calculated by averaging each applicant’s mean grade on pre-university school examinations and mean grade on national final examinations. As only approximately 4% of all pre-university graduates achieve a GPA of ≥ 8, this grade indicates excellent achievement. 
  • In the second step, applicants can be accepted in a multifaceted selection process. Each medical school organises its own selection process in which participation is voluntary. Selection processes differ among medical schools, but they usually consist of two rounds of assessments in which various knowledge-based and behavioural variables are measured. 
  • In the third step, applicants are admitted through the national weighted lottery. In the lottery, applicants are categorised based on their pre-university GPA. Four categories of GPA are distinguished: 7.5–7.9; 7.0–7.4; 6.5–6.9, and < 6.5. The ratio of applicants admitted by category is 9 : 6 : 4 : 3.11,17 Applicants who were rejected in the multifaceted selection process can be admitted through the lottery. Until 2011, approximately 50% of the places available at each medical school were assigned through the national weighted lottery.
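The GPA calculation and the weighted lottery described in the three steps above can be sketched roughly as follows. The text gives the GPA categories and the 9 : 6 : 4 : 3 ratio but not the drawing mechanism, so treating the ratio as per-applicant sampling weights in a draw without replacement is an assumption, as are the function names, the category boundaries for GPAs that fall between the listed ranges, and the sample applicants.

```python
import random


def pre_university_gpa(school_exam_mean, national_exam_mean):
    """Average of the school-examination mean and the national final-examination mean."""
    return (school_exam_mean + national_exam_mean) / 2


# (lower bound of GPA category, lottery weight), following the 9 : 6 : 4 : 3 ratio.
# Applicants with GPA >= 8 are admitted in step 1 and are assumed not to enter the lottery.
LOTTERY_WEIGHTS = [(7.5, 9), (7.0, 6), (6.5, 4), (float("-inf"), 3)]


def lottery_weight(gpa):
    """Return the weight of the first category whose lower bound the GPA reaches."""
    for lower_bound, weight in LOTTERY_WEIGHTS:
        if gpa >= lower_bound:
            return weight


def weighted_lottery(applicants, places, rng):
    """Draw up to `places` applicant IDs without replacement, weighted by GPA category.

    `applicants` maps a hypothetical applicant ID to a pre-university GPA.
    """
    pool = dict(applicants)
    admitted = []
    while pool and len(admitted) < places:
        ids = list(pool)
        weights = [lottery_weight(pool[i]) for i in ids]
        chosen = rng.choices(ids, weights=weights, k=1)[0]
        admitted.append(chosen)
        del pool[chosen]
    return admitted


# Hypothetical applicants, for illustration only.
applicants = {"a": 7.8, "b": 7.2, "c": 6.7, "d": 6.1, "e": 7.6}
admitted = weighted_lottery(applicants, places=3, rng=random.Random(0))
```

Under this reading, an applicant in the 7.5–7.9 category is three times as likely to be drawn at any step as one with a GPA below 6.5, but no one in the lottery is excluded outright.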


Each year about 8500 applicants compete for 2780 places in the Netherlands, of which 410 are at the University of Groningen.

Every year, around 8500 applicants apply to medical school in the Netherlands. There are 2780 places available, 410 of which are at the University of Groningen. Approximately 60 of these places are allocated to the International Bachelor of Medicine programme. The remaining 350 are allocated to the Dutch Bachelor of Medicine programme. In the current study, we included only students enrolled in the Dutch Bachelor of Medicine programme.



Participants

Study participants

We included all 1055 students admitted to the Dutch Bachelor of Medicine programme at the University of Groningen in 2009, 2010 and 2011 (69% female; mean age at the start of Year 1: 18.6 years; mean pre-university GPA: 7.3). We defined four groups of students: 

  • (i) students who were admitted based on a pre-university GPA of ≥ 8 out of 10 (n = 143; 71% female; mean age at the start of Year 1: 18.0 years; mean pre-university GPA: 8.2); 
  • (ii) students who were accepted in the multifaceted selection process (n = 295; 74% female; mean age at the start of Year 1: 18.5 years; mean pre-university GPA: 7.1); 
  • (iii) lottery-admitted students who had been rejected previously in the multifaceted selection process (n = 315; 69% female; mean age at the start of Year 1: 18.5 years; mean pre-university GPA: 7.1), and 
  • (iv) lottery-admitted students who had not participated in the selection process (n = 302; 63% female; mean age at the start of Year 1: 19.1 years; mean pre-university GPA: 7.0). Descriptive statistics on percentages of women, mean age and pre-university GPA within the four groups are depicted in Table 1.



The privacy policy of the University of Groningen states that student records can be used for research purposes as long as reports cannot be traced back to individual students.18 In accordance with this privacy policy, anonymised data were derived from the university administration.



Multifaceted selection process

The process had a first and a second round.

In 2009, 2010 and 2011, the University of Groningen selection process consisted of two rounds.


Round 1

In the first round, applicants were asked to send in a pre-structured written portfolio, based on the procedure developed by Erasmus MC medical school,11 with an additional section on reflection. 

  • The portfolio contained three sections which covered, respectively, pre-university education, extracurricular activities and reflection. 
  • In the section on extracurricular activities, points were granted when applicants were able to show an ability to participate in multiple activities at the same time (i.e. to combine pre-university education with additional activities). Extracurricular activities yielded points if they met fixed criteria on total duration and the amount of time spent on the activity per week. Only activities that had been carried out in the preceding 1.5 years, for at least 5 months consecutively, for more than 3.5 hours per week, yielded points. (Note: an activity had to fall within the last 18 months, run for 5 or more consecutive months and take more than 3.5 hours per week to count.)
  • For the section on reflection, applicants were asked to carry out a number of reflective assignments. For example, applicants were required to ask three people in their network to give two reasons why the medical profession might suit them and one reason why they might be better off not practising medicine. Applicants were asked to reflect on these statements in a short essay. In the evaluation process, only the reflection was assessed. Evaluations of the selection process indicated that applicants required 40–60 hours to complete the first-round portfolio. (Note: in one assignment, applicants collected from three people reasons why medicine would and would not suit them, then wrote a short reflective essay on those reasons.)
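The fixed criteria for extracurricular activities in the portfolio can be written as a small check. This is a sketch only: the `Activity` record and its field names are hypothetical, and representing the "preceding 1.5 years" condition as the number of consecutive months that fall inside that window is an assumption about how the rule was applied.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    name: str
    consecutive_months_in_window: float  # consecutive months within the preceding 1.5 years
    hours_per_week: float


def yields_points(activity):
    """Apply the stated thresholds: at least 5 consecutive months, more than 3.5 h/week."""
    return (
        activity.consecutive_months_in_window >= 5
        and activity.hours_per_week > 3.5
    )


# Hypothetical portfolio entries.
sports = Activity("sports club", consecutive_months_in_window=6, hours_per_week=4)
tutoring = Activity("tutoring", consecutive_months_in_window=4, hours_per_week=10)
job = Activity("weekend job", consecutive_months_in_window=12, hours_per_week=3.5)
```

Here only the sports club entry would count: tutoring is too short, and the weekend job does not exceed 3.5 hours per week.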


Round 2: 225 applicants

The 225 applicants who scored highest in the first round were invited to participate in the second round of the multifaceted selection process, which lasted an entire day and took place at the University of Groningen Medical School. 

The day was divided into four blocks comprising, respectively, a writing assignment, a patient lecture with subsequent assignments, a scientific reasoning block, and a series of short interviews and role-plays. 

  • For the writing assignment, applicants were asked to write an essay about a societal problem for which ethical decision making was key. For example, in 2009, applicants were asked to write an essay on China’s one-child policy. To help them prepare for this assignment, applicants were sent a package with information about the subject a week before the second round. 
  • The lecture was focused on medical knowledge, ethical dilemmas and professional behaviour, and consisted of a presentation with integrated videos, similar to the format of a video-based situational judgement test.19,20 Applicants were required to answer questions about the content of the lecture, describe and analyse the presented ethical issues and recognise (un)professional behaviour. 
  • In the scientific reasoning block, applicants were asked to read a scientific article about which they were then required to answer questions that tested analytic, creative and practical skills.21,22 
  • The fourth block consisted of an MMI-like5,23 series of short assignments in which applicants were asked to reflect on the assignments in their first-round portfolio, carry out a role-play that focused on communication skills, and take part in a three-phase role-play in which they collaborated with two fellow applicants. In this scenario, three applicants first had to prepare a problematic interaction with an actor. They then held the actual conversation, after which they were asked to reflect on the course of the conversation.


Scores on the four blocks were calculated and applicants were ranked based on their total scores. The fourth block was given a double weighting. 
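The ranking rule described here can be sketched in a few lines. The source does not say how block scores were scaled or standardised, so the raw numbers, applicant labels and function names below are purely illustrative assumptions; only the double weight on the fourth block comes from the text.

```python
# Weights for the four second-round blocks; the fourth block counts double.
BLOCK_WEIGHTS = (1, 1, 1, 2)


def total_score(block_scores):
    """Weighted sum of the four block scores (scale is an assumption)."""
    return sum(w * s for w, s in zip(BLOCK_WEIGHTS, block_scores))


def rank_applicants(scores_by_applicant):
    """Return applicant ids ordered from highest to lowest total score."""
    return sorted(
        scores_by_applicant,
        key=lambda applicant: total_score(scores_by_applicant[applicant]),
        reverse=True,
    )


# Hypothetical applicants: A has the higher unweighted sum (14 vs 13),
# but B overtakes A once the fourth block is doubled (17 vs 16).
scores = {"A": (4, 4, 4, 2), "B": (3, 3, 3, 4)}
ranking = rank_applicants(scores)
```

The example shows why the weighting matters: a strong interview and role-play block can outweigh a modest deficit across the three other blocks.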



Statistical analysis


We performed analysis of covariance (ANCOVA) to examine group differences in written test grades and in the number of course credits students earned in their first, second and third years. To determine which groups differed, we performed Bonferroni post hoc multiple-comparison tests for the corrected means. We conducted logistic regression analyses with changing reference groups to examine group differences in the percentages of students who achieved the highest possible score in the professionalism course. Students who dropped out of medical school in the first half of each year were excluded from these analyses because they were unable to earn course credits in the entire period under analysis and their inclusion might have caused us to overestimate effects on study progress. To assess group differences in dropout rates, we conducted logistic regression analysis including the entire group (n = 1055). All analyses were performed using IBM SPSS Statistics for Windows Version 20.0 (IBM Corp., Armonk, NY, USA) and controlled for gender, age and cohort. After initial analyses, age was eliminated as a covariate as it was not a significant predictor in any of the models. We did not correct for pre-university GPA because the group of students who were admitted based on a top pre-university GPA was defined by this variable and the other groups did not differ significantly.
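To make the multiple-comparison step concrete, the sketch below applies the standard Bonferroni adjustment (multiply each raw p-value by the number of comparisons, capped at 1) to all pairwise comparisons among four admission groups. The group labels paraphrase the study's groups; the raw p-values are invented for illustration and are not results from the paper, and the code is not a reproduction of the SPSS procedure.

```python
from itertools import combinations

GROUPS = [
    "top pre-university GPA",
    "accepted in selection",
    "lottery, rejected in selection",
    "lottery, did not participate",
]


def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capping at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


pairs = list(combinations(GROUPS, 2))          # 4 groups give 6 pairwise comparisons
raw_p = [0.02, 0.04, 0.30, 0.45, 0.60, 0.90]   # hypothetical raw p-values
adjusted = bonferroni_adjust(raw_p)
significant = [pair for pair, p in zip(pairs, adjusted) if p < 0.05]
```

With six comparisons, a raw p-value of 0.02 becomes 0.12 after adjustment, so none of these hypothetical pairs remains significant. This is the kind of pattern the abstract describes for dropout, where an overall group difference did not survive pairwise correction.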










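The lower academic performance of lottery-admitted students reported in earlier studies (dropout rates, grades, clerkship grades) is likely attributable to group 4, the lottery-admitted students who did not take part in the selection process, rather than to the lottery-admitted students as a whole (groups 3 and 4). Our results suggest that, for knowledge test scores and study progress, participation in the multifaceted selection process is the somewhat stronger predictor. Taken together, these findings suggest that medical schools might benefit from admitting only applicants who have invested time and effort in the application process. The lower performance of lottery-admitted students who had never applied through any selection track can be explained by the voluntary nature of the multifaceted selection process, which also makes it a process of self-selection. More motivated students will use every available opportunity to gain admission to medical school, whereas less motivated students will simply wait for the lottery.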

The lower performance of non-participants might indicate that previously reported differences in dropout rates,11 test grades28 and clerkship grades12 between selected and lottery-admitted students in the Dutch system can be attributed to lottery-admitted applicants who have not participated in the selection process, rather than to the lottery-admitted group as a whole. Our results in fact indicate that for knowledge test scores and study progress, participation in the multifaceted selection process may be more predictive than acceptance in this process. As such, it might be profitable for medical schools to admit only applicants who have put time and effort into their applications. The lower performance of non-participants may be explained by the voluntary nature of the multifaceted selection process, which implies a mechanism of self-selection that is induced by the Dutch admissions system.29 Highly motivated students might use every opportunity to achieve admission into medical school, whereas less motivated students might choose to wait for the lottery.


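A possible explanation for this difference in motivation is that merely participating in the selection process requires 40–60 hours of work. Highly motivated students can readily invest that time and effort, whereas for less motivated students this in itself may be a barrier.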

A possible explanation for this difference in motivation is that participation in the selection process requires 40–60 hours of work. Highly motivated students might easily have invested this kind of time and effort, whereas less motivated students may have perceived this threshold as too high.





 2014 Dec;48(12):1201-10. doi: 10.1111/medu.12537.

Selection and study performance: comparing three admission processes within one medical school.

Author information

  • 1Center for Research and Innovation in Medical Education, University of Groningen and University Medical Center Groningen, University of Groningen, Groningen, the Netherlands; Institute for Medical Education, University of Groningen and University Medical Center Groningen, Groningen, the Netherlands.

Abstract

OBJECTIVES:

This study was conducted to: (i) analyse whether students admitted to one medical school based on top pre-university grades, a voluntary multifaceted selection process, or lottery, respectively, differed in study performance; (ii) examine whether students who were accepted in the multifaceted selection process outperformed their rejected peers, and (iii) analyse whether participation in the multifaceted selection procedure was related to performance.

METHODS:

We examined knowledge test and professionalism scores, study progress and dropout in three cohorts of medical students admitted to the University of Groningen, the Netherlands in 2009, 2010 and 2011 (n = 1055). We divided the lottery-admitted group into, respectively, students who had not participated and students who had been rejected in the multifaceted selection process. We used ANCOVA modelling, logistic regression and Bonferroni post hoc multiple-comparison tests and controlled for gender and cohort.

RESULTS:

The top pre-university grade group achieved higher knowledge test scores and more Year 1 course credits than all other groups (p < 0.05). This group received the highest possible professionalism score more often than the lottery-admitted group that had not participated in the multifaceted selection process (p < 0.05). The group of students accepted in the multifaceted selection process obtained higher written test scores than the lottery-admitted group that had not participated (p < 0.05) and achieved the highest possible professionalism score more often than both lottery-admitted groups. The lottery-admitted group that had not participated in the multifaceted selection process earned fewer Year 1 and 2 course credits than all other groups (p < 0.05). Dropout rates differed among the groups (p < 0.05), but correction for multiple comparisons rendered all pairwise differences non-significant.

CONCLUSIONS:

A top pre-university grade point average was the best predictor of performance. For so-called non-academic performance, the multifaceted selection process was efficient in identifying applicants with suitable skills. Participation in the multifaceted selection procedure seems to be predictive of higher performance. Further research is needed to assess whether our results are generalisable to other medical schools.

© 2014 John Wiley & Sons Ltd.

PMID: 25413913 [PubMed - in process]


How the data used in medical school admissions, and their importance, have changed: what has happened since the 1980s? (Academic Medicine, 2013)

An Overview of the Medical School Admission Process and Use of Applicant Data in Decision Making: What Has Changed Since the 1980s?

Alicia Monroe, MD, Erin Quinn, PhD, MEd, Wayne Samuelson, MD, Dana M. Dunleavy, PhD, and Keith W. Dowd, MA




PURPOSE:

To examine current admission processes at U.S. and Canadian medical schools and compare them with those of 1986.

To investigate current medical school admission processes and whether they differ from those in 1986 when they were last reviewed by the Association of American Medical Colleges (AAMC).


METHOD:

In 2008, all U.S. and Canadian medical schools that use MCAT scores were surveyed about their admission processes.

In spring 2008, admission deans from all MD-granting U.S. and Canadian medical schools using the Medical College Admission Test (MCAT) were invited to complete an online survey that asked participants to describe their institution's admission process and to report the use and rate the importance of applicant data in making decisions at each stage.


RESULTS:

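Respondents from 120 schools replied. They reported assessing personal characteristics through interviews and, compared with 1986, the use of academic data had increased in the first stage (screening applicants for interview). Whereas in 1986 GPA was the most important factor in decisions at nearly every stage, in 2008 the data that mattered differed from stage to stage. MCAT scores and undergraduate GPA were important in the early screening stages, whereas interviews and letters of recommendation were important in deciding who was finally accepted.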

The 120 responding admission officers reported using a variety of data to make decisions. Most indicated using interviews to assess applicants' personal characteristics. Compared with 1986, there was an increase in the emphasis placed on academic data during pre-interview screening. While GPA data were among the most important data in decision making at all stages in 1986, data use and importance varied by the stage of the process in 2008: MCAT scores and undergraduate GPAs were rated as the most important data for deciding whom to invite to submit secondary applications and interview, whereas interview recommendations and letters of recommendation were rated as the most important data in deciding whom to accept.


CONCLUSIONS:

The study shows the complexity of the medical school admission process and indicates increased use of a holistic approach that evaluates the many facets of an applicant as a whole.

This study underscores the complexity of the medical school admission process and suggests increased use of a holistic approach that considers the whole applicant when making admission decisions. Findings will inform AAMC initiatives focused on transforming admission processes.





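Changes in academic medicine and the admission process stand in a cyclical relationship: changes in medical schools affect admissions through the applicant pool, through the legal and social contexts in which admission decisions are made, and through the medical school curriculum.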

We suggest that they have a cyclical relationship with the admission process: Changes in academic medicine affect the admission process through their influence on the applicant pool, the legal and social contexts in which admission decisions are made, and the medical school curriculum.


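In the 1980s there were concerns that the applicant pool was shrinking and that the quality of applicants would fall. The proportions of minority and female applicants were also low. By 2008, however, the applicant pool had grown, driven largely by Asian and female applicants.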

Since the mid-1980s, the number and composition of medical school applicants have changed dramatically. At that time, there were concerns about a declining applicant pool and a potential decline in the academic quality of applicants.5 Additionally, the percentages of minority and female applicants were relatively low.6 By 2008, the applicant pool had grown and become more diverse with respect to Asian and female applicants.*


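These changes in the number and composition of applicants may also have affected admission policy: (1) more stages of evaluation were added, (2) more quantifiable data came into use, and (3) quantitative and qualitative (academic and nonacademic) data are now used together.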

We suggest that these changes in the size and composition of the applicant pool may have affected the admission process in several ways when compared with that of the mid-1980s. 

    • Given the increase in applicants, admission committees—especially those with large applicant pools—may add stages to the process to reduce the number of applicants remaining at each stage. 
    • Second, with more applicants in the pool, admission officers may rely more heavily on data that are quantifiable and easily incorporated into pre-interview screening tools. 
    • Third, in light of the changes in composition of the applicant pool, admission committees may use a combination of quantitative and qualitative (academic and nonacademic) data in order to achieve broad diversity in the student body.


The legal environment is also a major influence.

The legal context in which admission committees operate has changed substantially, however. For example, the Supreme Court’s 2003 decision in Grutter v. Bollinger13 affirmed the importance of mission-driven, evidence-based admission decisions and introduced the concept of educational benefits of diversity.† It also established that all applicants must be considered through the same admission process, which allowed schools to change their diversity and admission policies. In 1986, only 28% of U.S. medical schools included diversity as a primary goal of their admission process,‡ whereas 57% did in 2008.14,15



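Changes in the legal and social environment affect admissions in at least two ways: (1) more, and more varied, information about applicants is used in selection than in the 1980s; (2) applicants' race/ethnicity, gender, and socioeconomic status receive more consideration.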

We suggest that, together, these legal and social context changes may have affected the admission process in at least two ways. 

    • First, admission committees may now consider more and varied information about applicants in making admission decisions than they did in the mid-1980s. 
    • Second, with a slightly more diverse applicant pool and a more permissive legal environment, admission committees may now be more likely to consider information about applicants’ race/ethnicity, gender, and/or SES background in the admission process.


Medical schools changed considerably in the 1990s and 2000s.

In the 1990s and 2000s, a series of structural modifications to the medical school and residency accreditation processes, as well as new curricular resources, paved the way for fundamental changes in medical education. For example, the Liaison Committee on Medical Education18 (LCME) and the Accreditation Council for Graduate Medical Education19 revised their accreditation standards to require medical schools and residency programs to teach and assess professional attributes. LCME standard MS-31-A states, “A medical education program must ensure that its learning environment promotes the development of explicit and appropriate professional attributes in its medical students (attitudes, behaviors, and identity).”18 In addition, the AAMC Medical School Objectives Project series20 and the Institute of Medicine (IOM) report on behavioral and social sciences in medical school curricula21 identified— and, importantly, provided curriculum materials to help medical schools modify their curricula to teach—the broad knowledge, skills, and attitudes that graduating medical students should possess.




The admission process in 2008

Slightly more than half (57%, 68/120) of the respondents reported that their medical school’s admission decisions are made using a two-stage process that includes an initial application and an interview.


Slightly less than half (43%, 52/120) of the respondents reported that their schools use a three-stage process to make admission decisions, the same as reported by Mitchell4 in 1986 (43%, 49/113).


Overall, admission officers rated a wide range of data as important to admission committees’ decisions about which applicants to invite to submit secondary applications, interview, and accept into medical school (Table 1). However, the uses and importance of these data differed by the stage of the process.


Table 2 compares the relative importance of 15 types of data in making acceptance decisions in 1986 and 2008. Among the most important types of data, 64% (7/11) were nonacademic in 2008 compared with 50% (5/10) in 1986,4 suggesting that nonacademic data are more important to admission decisions today than in the past.







The admission interview in 2008

All responding admission officers reported that their medical schools conduct admission interviews.


Interviews were described as one-on-one by 83% (99/119) of the responding admission officers. Most respondents (87%, 104/120) reported that interviews are conducted by admission committee members, whereas 17% (20/120) reported that they are conducted by staff and 68% (81/120) indicated that, in some cases, they are conducted by medical students.



Results showed that the admission interview is somewhat structured. The majority of respondents (65%, 77/119) indicated that interviewers are given general guidance about the content of the questions they should ask. 



Admission officials indicated that interviews are most often used to assess nonacademic characteristics and skills: Over 85% (more than 100 of 119) reported that interviews include questions about applicants’ motivation for pursuing a medical career, compassion and empathy, personal maturity, oral communication skills, service orientation, and professionalism.








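This shows that undergraduate GPA and MCAT scores matter. They are not everything, however: even among the 1,233 applicants whose UGPA and MCAT scores were both in the top range, 9% were rejected by every medical school to which they applied.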

Chart 1 shows that although UGPAs and MCAT scores are important factors in admission processes, they are not the sole determinants of acceptance decisions. For example, 105 (9%) of the 1,233 applicants with UGPAs of 3.80 to 4.00 and MCAT total scores of 39 to 45 were not accepted by any of the medical schools to which they applied in 2008–2010. In contrast, 597 (18%) of the 3,324 applicants with UGPAs of 3.20 to 3.39 and MCAT scores of 24 to 26 were accepted by at least one medical school. These findings buttress the importance ratings data presented earlier, suggesting that a wide variety of data are important to admission decisions.
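The two percentages quoted from Chart 1 follow directly from the applicant counts, as the short check below shows. The helper name `pct` is only an illustrative choice; the counts are the ones given in the paragraph above.

```python
def pct(part: int, whole: int) -> int:
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100 * part / whole)


# Top band: UGPA 3.80-4.00 and MCAT 39-45, not accepted anywhere.
rejected_top_band = pct(105, 1233)

# Lower band: UGPA 3.20-3.39 and MCAT 24-26, accepted by at least one school.
accepted_lower_band = pct(597, 3324)

print(rejected_top_band, accepted_lower_band)  # prints: 9 18
```

Roughly one in eleven of the strongest applicants on paper was rejected everywhere, while about one in six of a much weaker academic band was accepted somewhere, which is the contrast the authors use to argue that grades and test scores are not the sole determinants.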





Aspects of the admission process that have not changed

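The following have not changed: (1) a variety of data are used; (2) the process consists of two or three stages; (3) MCAT scores are considered important, and MCAT scores and undergraduate GPA are interpreted in light of one another; (4) the number, length, and format of interviews are much the same; (5) demographic characteristics were rated as of low importance.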

Certain aspects of the admission process are largely unchanged since the mid- 1980s. 

  • First, as in 1986,4 admission officers today use a variety of data in making decisions, which suggests that they remain committed to evaluating both academic and nonacademic information.
  • Second, our data suggest that, as in 1986, schools’ admission processes are structured into two or three stages.
  • Third, in both 1986 and 2008, admission officers rated MCAT scores as important to each stage in the process and indicated that they use MCAT scores and UGPAs to provide an interpretive context for one another.
  • Fourth, the number, length, and format of admission interviews are the same as those described by Johnson and Edwards3 in 1991. Similarly, the admission interview continues to be the primary source of information about applicants’ personal characteristics.
  • Finally, as in the 1986 survey,4 admission officers in our survey rated the importance of demographic characteristics in the admission process as relatively low.


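Some medical school admission committees appear to feel that holistic review in itself lets underrepresented minority and rural applicants show their full potential, which reduces the need to consider demographic variables explicitly.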

We suggest, on the basis of the data presented in this article and comments made by focus group participants during site visits, that some medical school admission committees may feel that conducting holistic reviews allows URM and rural applicants to show their full potential and precludes the need to consider demographic variables explicitly.



Aspects of the admission process that have changed


As in the 1980s, the process has two or three stages, but what is emphasised now differs at each stage.

As was the case in the mid-1980s, most admission committees use a multistage process to make decisions. However, our data suggest that admission committees now place different emphasis on applicant data at each stage of the process. For example, in summarizing the results of the 1986 survey, Mitchell4 noted that although test scores decreased in importance as decision making proceeded, importance ratings did not differ appreciably across the stages of the admission process. In contrast, our data suggest that admission committees now consider slightly different data when deciding whom to invite to submit secondary applications, interview, and accept. Academic data seem to be slightly more important in deciding which applicants to invite to submit secondary applications and to interview than in deciding whom to accept.


This difference is likely due to the increasing size of applicant pools and the ease of incorporating academic data into automated screening processes.


However, we interpret the 2008 survey data reported in this article—and the qualitative data from the 2008 medical school site visits—as indicating that the inclusion of multiple stages does not preclude the use of a holistic admission process.


The increased importance of nonacademic data is another major change.

Arguably, the most notable change in the admission process is the increased importance placed on nonacademic data in making acceptance decisions. In 2008, admissions officers rated more nonacademic data as “of high importance” than did admission officers in 1986. Further, all types of academic data dropped in ratings of relative importance in the 2008 survey compared with the 1986 survey (except MCAT scores), whereas nonacademic data such as interview recommendations, letters of recommendation, and personal statements gained in importance.


Acceptance rate data also show that applicants with varying levels of academic achievement are being admitted to medical school.

Acceptance rate data provide additional support for this change, showing that applicants with different levels of academic preparedness (i.e., the combination of cumulative UGPA and MCAT total score) are accepted into medical school. Together, these data and the high importance ratings given to both academic and nonacademic data in the 2008 survey suggest that many medical schools are conducting a more holistic admissions process than they have in the past.


Grade inflation has been reported at many undergraduate institutions. It is unclear, however, whether knowledge of this inflation has affected how UGPAs are used.

Additionally, grade inflation occurs at many undergraduate institutions.26 It is unclear whether knowledge of such inflation affects admission officers’ use of UGPAs in their decision making.





 2013 May;88(5):672-81. doi: 10.1097/ACM.0b013e31828bf252.

An overview of the medical school admission process and use of applicant data in decision making: what has changed since the 1980s?

Author information

  • 1Educational Affairs, University of South Florida Health Morsani College of Medicine, Tampa, Florida, USA.


PMID: 23524917 [PubMed - indexed for MEDLINE]


Selecting medical students: An unresolved challenge*

DAVID POWIS

The University of Newcastle, Australia





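In almost every country, medical schools attract academically outstanding applicants, yet articles and papers suggesting that admissions committees have made poor judgements appear regularly and stir up debate. Is the assumption wrong that selecting the most academically able students will yield good doctors? Perhaps it is time to shift the focus. Instead of the established effort to differentiate among the very best applicants, effort should go into non-academic personal qualities so that unsuitable students do not gain entry. This article analyses the problems of medical student selection and proposes a solution. That solution was put forward 70 years ago but was never adopted. Unless such an approach is taken up, the debate will not end.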

Despite the abundant supply of academically outstanding applicants to medical schools in most countries the regularly recurring debate in the academic literature, and indeed sometimes in the popular media, implies that admissions committees are still getting it wrong in a significant number of instances. How can this be so when our procedures are directed unashamedly at selecting the most highly academically and intellectually qualified students in the expectation that they will make the best doctors? Perhaps it is time for a radical change in emphasis. Instead of endeavouring to differentiate among the top ranks of a pool of outstandingly qualified applicants, the selection effort might be better focused on identifying those potentially unsuitable in terms of their non-academic personal qualities to ensure they do not gain entry. The account that follows is an analysis of the problems of medical student selection and offers a potential solution - a solution that was first suggested in the medical literature 70 years ago, but not adopted. It is the present author's contention that the cycle of debate will continue to recur unless such an approach is pursued.






Seventy years ago, in 1944, the UK government-appointed Goodenough Committee of review into medical education reported that: [some medical students] ‘‘who, though able to pass examinations, have not the necessary aptitude, character or staying power for a medical career’’. With the Goodenough Committee report to hand, Smyth, writing in the British Medical Journal in 1946, observed that ‘‘Existing methods of selection [of medical students] which worked well in the past may no longer be the best possible in changing conditions’’. He added: ‘‘The recent reports . . . have drawn attention to some of the problems connected with the selection of medical students . . . [and] point out the problems, without discussing ways and means of solving them’’ (Smyth 1946).



How were, and are, medical students selected?


The first hurdle to clear to enter medical school in most countries, both in the past and the present, is an academic achievement barrier.


There is a reasonable basis for requiring the academic hurdle: McManus et al. (2013) have described eloquently the predictive link between past academic achievement and academic progress through medical school and beyond. Their ‘‘academic backbone’’ model elegantly demonstrates the link (Figure 1).




Putting the prior academic achievement criterion into perspective, a systematic review by Ferguson et al. (2002) showed that academic scores account for 23% of the variance of progress measures at medical school, but only 6% beyond medical school.
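For readers more used to correlation coefficients, the two variance figures can be converted to approximate correlations. The sketch assumes that "variance accounted for" is the square of a single correlation coefficient, which is a simplification; the review itself may have pooled estimates differently.

```python
import math


def implied_correlation(variance_explained: float) -> float:
    """Correlation implied by a proportion of variance explained (r = sqrt(R^2)).

    Assumes a single predictor, so that R^2 equals r squared.
    """
    return math.sqrt(variance_explained)


r_medical_school = implied_correlation(0.23)   # about 0.48
r_after_graduation = implied_correlation(0.06)  # about 0.24
```

On that reading, prior academic scores correlate at roughly 0.48 with progress in medical school but only at roughly 0.24 with measures taken after graduation, which leaves most of the variation in later performance unexplained by grades.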



Unsatisfactory doctors


Although everyone would freely acknowledge that many doctors are excellent, and most are entirely capable and competent, it is evident that some are not!


  • Some doctors have been convicted of criminal acts.

  • Some doctors are deficient in communication skills, failing to communicate adequately or appropriately with peers, mentors, patients or patients’ families. 

  • Some doctors are unprofessional.

  • Data from the GMC annual report in 2005 showed that 1 in 15 doctors in the UK were dependent on alcohol or drugs at some stage of their professional lifetime.

  • Lastly, the professional competence of some doctors is seriously compromised by mental health issues.



Possible causes include the following.

Willcock et al. (2004) suggested that the high incidences of distress and burnout can be attributed inter alia to a doctor’s stressful work environment, their long working hours, possible conflict between work and personal life tasks and their individual psychological vulnerability, a view echoed by Wallace et al. (2009) in their literature review.







Unsatisfactory medical students


Most medical educators have had experience of students who cause concern. They comprise a small proportion of any cohort, and may even be progressing adequately academically through medical school, but they exhibit attitudes and/or behaviours that many would consider unacceptable in an aspiring doctor.



Some students fail a year, but the reason is not always academic.

Notwithstanding the exceptionally high academic standard required to gain entry to medical school it is clear that some students do fail academically during their course. (...) Perhaps the reason for their academic failure lies beyond their academic ability.



Suicidal thoughts

And in Australia journalist Amanda Davey, quoting Jessica Dean, President of the Australian Medical Students’ Association (Davey 2014), stated ‘‘that mental illness [is] rife among med students’’ – ‘‘one in five admitting suicidal thoughts in the last year’’ [of their studies].



Inappropriate or unprofessional behaviour

Besides distress to the students concerned, ‘‘burnout’’ has been shown to be associated with unprofessional behaviour (Dyrbye et al. 2010), manifested by poor reliability and responsibility, poor initiative and motivation and a severely diminished capacity for self-improvement. Unprofessional behaviour at medical school has been shown to have a strong association with subsequent disciplinary action by a medical board (Papadakis et al. 2005).




What do we know about Medical School applicants?


The interview matters most for the final offer, but undergraduate GPA matters most in deciding who is invited to interview.

In the USA and Canada an overview of medical school admission processes (based on data from 120 informed respondents) by Monroe et al. (2013) reported that although an interview recommendation was the most important determinant (4.5/5) of a candidate receiving an offer, the cumulative undergraduate GPA determined who were invited to interview.


With so many academically outstanding applicants, how can they be told apart?

Given the surfeit of academically well-qualified applicants how do medical schools currently differentiate between them?


So far the focus has been mainly on academic excellence, but doubts about whether this is right persist.

The focus is clearly still on identifying academic excellence and cognitive (reasoning) skills with only a fairly superficial and subjective investigation of the applicants’ personal qualities. The quest for the presumed ‘‘best’’ applicants remains the prime focus. Contemporary commentators have continued to ask the question: Is this the right way? (Hughes 2002; Powis 2003, 2008; James et al. 2010; Mercer & Puddey 2011; Wilson et al. 2012; Eskander et al. 2013; Leinster 2013).



Barr (2010), writing in The Lancet, ‘‘found no scientific evidence that supported the power of performance in undergraduate science courses as a way to predict clinical or professional quality as a physician’’ and ‘‘found . . . consistent evidence that performance in the premedical sciences is inversely associated with many of the personal, non-cognitive qualities so central to the art of medicine’’.




Time for a paradigm shift?


The vast majority of medical school applicants are more than adequately equipped academically for their studies and their later professional careers.



Can existing selection tools screen out unsuitable students?

The practical question now becomes: Can we identify the potentially unsuitable at the outset using tools currently or potentially available?

    • academic record
    • cognitive skills tests
    • personal statement
    • referees’ reports
    • interview – panel, MMI
    • non-cognitive tests (personality measures)



The academic record can identify applicants who seem unlikely to cope with the academic demands of medical school.

The academic record will identify those potentially academically inadequate to undertake medical studies.



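Once applicants who meet a threshold level of academic achievement have been identified, academic marks should have no further influence. This would widen access for applicants from lower socio-economic groups.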

It could be argued that academic marks above the threshold should play no further part in student selection. This would have positive effects on widening access to applicants from lower socio-economic groups (Powis et al. 2007).



Cognitive skills

In general employment selection, intelligence is regarded as the best predictor of job performance, but in medical school selection cognitive tests add little to the prediction of later performance.

In the general field of employment selection, it has long been recognised that ‘‘Intelligence is the best predictor of job performance’’ (Ree & Earles, 1992). However, the cognitive skills tests used in medical school selection procedures have usually been found to add little to academic scores in predicting outcomes.


On the other hand, survey respondents considered that MCAT scores add value to GPA.

On the other hand, in the opinion of the respondents to the survey of Monroe et al. (2013), MCAT does offer added value to GPA scores.



Personal statements

Personal statements have many problems.

There are many problems that undermine the usefulness of personal statements for informing medical school selectors.



Referees’ reports


Structuring the reports can raise reliability, but validity remains low.

Munro et al. (2012) have asserted that referees’ reports have low validity even when structured to increase reliability.


Internal consistency within each referee is high, but agreement between referees is very low.

The study found that while the internal consistency of individual referees’ responses was high, agreement between the referees in terms of individual candidates’ strengths and weaknesses was very low, almost as if they were evaluating different people.


However, referees agreed sufficiently on the low scores, so such reports may be useful for screening out weak candidates.

Interestingly, there was sufficient agreement on the small number of low scores to suggest that such referees’ reports may be better suited for deselecting weak candidates.



Interviews

Reliability depends on how the interview is structured.

An interview is a commonly included component of medical student selection procedures. However, the reliability of interviews can be low, depending in part on how they are structured.


Panel interviews are less reliable than the MMI.

Panel interviews have been shown to be rather less reliable than the ‘‘multiple mini-interview’’ (MMI) procedure, first described in the medical student selection context in articles from McMaster University, Canada (see Eva et al. 2004, and subsequently).



Non-cognitive tests


The Newcastle, Australia medical school has focused on the following qualities to develop a battery of tests that has become known as the Personal Qualities Assessment (PQA, www.pqa.net.au):

      • Moral orientation (on a continuum ranging from ‘‘libertarian’’, the rights and needs of individuals versus ‘‘communitarian’’, the expectations and needs of society as a whole).
      • Resilience (vs. inability to cope with stress; emotionally volatile, ‘‘neurotic’’).
      • Self-control (conscientious, orderly, restrained, industrious vs. disorderly, unrestrained, unreliable, impulsive, permissive, anti-social).
      • Involvement (empathic, confident in dealing with others, co-operative, agreeable vs. aloof, narcissistic, disagreeable, manipulative, uncomfortable with others).






Applicants in the top and bottom 2.5% of the PQA score distribution could be considered for exclusion from selection.

The working hypothesis is that those individuals represented in the extreme region of the trait score distribution, the top and bottom 2.5%, should be excluded from consideration for entering medical school.
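As a rough illustration of the working hypothesis above, the sketch below flags applicants whose score on a single trait falls in the lowest or highest 2.5% of the cohort. The function name `flag_extremes`, the `tail` parameter and the example scores are hypothetical, and the actual PQA scoring rules are not described here, so this shows only the cut-off arithmetic.

```python
def flag_extremes(scores, tail=0.025):
    """Return the indices of scores in the bottom or top `tail` fraction.

    Hypothetical helper: the source does not specify how ties or small
    cohorts are handled, so tied scores at a cut-off are all flagged.
    """
    ranked = sorted(scores)
    n = len(ranked)
    k = int(n * tail)  # number of applicants in each tail
    if k == 0:
        return []  # cohort too small for a 2.5% tail
    low_cut = ranked[k - 1]   # highest score still in the bottom tail
    high_cut = ranked[n - k]  # lowest score still in the top tail
    return [i for i, s in enumerate(scores) if s <= low_cut or s >= high_cut]


# Example with 200 made-up trait scores (0..199): five are flagged per tail.
flagged = flag_extremes(list(range(200)))
print(flagged)
```

Under this reading, flagged applicants would be reviewed for exclusion on that trait, while everyone in the middle 95% proceeds on the other criteria.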



A model for medical school selection

In conclusion I present a model for medical student selection proposed by Bore et al. (2009), a model which Smyth (1946) would surely have approved as evidenced by his remark:


‘‘We want . . . two independent tests, or sets of tests – the one for ability, the other for character’’.


  • Besides selecting in for
    • academic ability (academic record)
    • cognitive skills (‘‘aptitude’’ tests)
    • ability to communicate appropriately (interview)
    • good interpersonal skills (interview)
  • Select out (non-cognitive tests) those applicants who
    • demonstrate traits of psychological vulnerability (inability to handle stress appropriately; low resilience)
    • high levels of neuroticism
    • low levels of conscientiousness
    • extreme detachment, extreme emotional involvement
    • high levels of impulsiveness and permissiveness







 2015 Mar;37(3):252-60. doi: 10.3109/0142159X.2014.993600. Epub 2014 Dec 23.

Selecting medical students: An unresolved challenge.

Author information

  • 1The University of Newcastle , Australia.

Abstract

Despite the abundant supply of academically outstanding applicants to medical schools in most countries, the regularly recurring debate in the academic literature, and indeed sometimes in the popular media, implies that admissions committees are still getting it wrong in a significant number of instances. How can this be so when our procedures are directed unashamedly at selecting the most highly academically and intellectually qualified students in the expectation that they will make the best doctors? Perhaps it is time for a radical change in emphasis. Instead of endeavouring to differentiate among the top ranks of a pool of outstandingly qualified applicants, the selection effort might be better focused on identifying those potentially unsuitable in terms of their non-academic personal qualities to ensure they do not gain entry. The account that follows is an analysis of the problems of medical student selection and offers a potential solution - a solution that was first suggested in the medical literature 70 years ago, but not adopted. It is the present author's contention that the cycle of debate will continue to recur unless such an approach is pursued.

PMID: 25532428


Interviewer bias in medical student selection

Barbara N Griffin and Ian G Wilson





OBJECTIVE:

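The study examined how far interviewer personality, interviewer sex, being of the same sex as the interviewee, and interviewer training affect ratings in medical student selection interviews.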

To investigate whether interviewer personality, sex or being of the same sex as the interviewee, and training account for variance between interviewers' ratings in a medical student selection interview.


DESIGN, SETTING AND PARTICIPANTS:

Applicants and interviewers who took part in the MMI in 2006 and 2007 were analysed.

In 2006 and 2007, data were collected from cohorts of each year's interviewers (by survey) and interviewees (by interview) participating in a multiple mini-interview (MMI) process to select students for an undergraduate medical degree in Australia. MMI scores were analysed and, to account for the nested nature of the data, multilevel modelling was used.


MAIN OUTCOME MEASURES:

Ratings given by interviewers; variance in applicants' scores.

Interviewer ratings; variance in interviewee scores.


RESULTS:

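In 2006, 153 interviewers and 268 applicants took part; in 2007, 139 interviewers and 238 applicants. Interviewers high in agreeableness gave significantly higher ratings, and interviewers high in neuroticism gave lower ratings. In 2006, female interviewers gave higher ratings. The share of variance in applicants' scores attributable to differences between interviewers ranged from 3.1% to 24.8%, and the mean share fell markedly after skills-based training.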

In 2006, 153 interviewers (94% response rate) and 268 interviewees (78%) participated in the study. In 2007, 139 interviewers (86%) and 238 interviewees (74%) participated. Interviewers with high levels of agreeableness gave higher interview ratings (correlation coefficient [r] = 0.26 in 2006; r = 0.24 in 2007) and, in 2007, those with high levels of neuroticism gave lower ratings (r = -0.25). In 2006 but not 2007, female interviewers gave higher overall ratings to male and female interviewees (t = 2.99, P = 0.003 in 2006; t = 2.16, P = 0.03 in 2007) but interviewer and interviewee being of the same sex did not affect ratings in either year. The amount of variance in interviewee scores attributable to differences between interviewers ranged from 3.1% to 24.8%, with the mean variance reducing after skills-based training (20.2% to 7.0%; t = 4.42, P = 0.004).


CONCLUSION:

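Rating leniency varied with interviewers' personality and sex, but the effect was small. Random allocation of interviewers, similar proportions of male and female interviewers, use of the MMI format and skills-based training should reduce it further.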

This study indicates that rating leniency is associated with personality and sex of interviewers, but the effect is small. Random allocation of interviewers, similar proportions of male and female interviewers across applicant interview groups, use of the MMI format, and skills-based interviewer training are all likely to reduce the effect of variance between interviewers.





Interviewer personality

Interviewers completed the 20-item version[7] of the International Personality Item Pool, measuring agreeableness, extraversion, neuroticism, conscientiousness and openness to experience. They were asked how accurately each item (eg, “sympathise with other’s feelings”) described them, using a scale from 1 for very inaccurate to 5 for very accurate.



Procedure

Applicants completed a 10-station MMI, which included one rest station. Each station lasted for 8 minutes and assessed a different quality. For example, Station 1 assessed applicants’ motivation to study medicine and Station 9 assessed communication skills. Interview format also varied: some stations involved sets of questions about past behaviour and experience (behavioural interviews), others presented scenarios or film clips for comment, and at Station 9 applicants were required to explain something to a “patient” (roleplayed by an actor). There was one interviewer per station. Ten applicants attended each MMI session and each interviewer worked for two sessions (ie, each interviewed 20 applicants).
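The rotation described above can be sketched as a simple cyclic schedule. This is an assumption about how such a circuit might be organised, not the authors' documented procedure; the function name `mmi_rotation` and the 0-based station numbering are hypothetical.

```python
def mmi_rotation(n_applicants=10, n_stations=10):
    """Build a cyclic rotation: in round r, applicant a is at station (a + r) % n_stations.

    Returns a list of rounds; each round lists the station index of every applicant.
    One of the station indices would be the rest station (which one is not specified).
    """
    return [
        [(a + r) % n_stations for a in range(n_applicants)]
        for r in range(n_stations)
    ]


schedule = mmi_rotation()
for round_no, stations in enumerate(schedule, start=1):
    print(f"Round {round_no}: {stations}")
```

In this layout no two applicants share a station in any round, and every applicant passes through every station exactly once. With 10 stations of 8 minutes each, one circuit takes at least 80 minutes, excluding changeover time.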


All interviewers attended a 3-hour training session a month before the MMI. In 2006, the training was predominantly information-based, involving 2 hours of lecture about the rationale for including interviews in medical school student selection, information about the practical details of the MMI and how to score an applicant, the basics of behavioural interviewing, and instruction on avoiding bias. After a short break, the interviewers spent the remaining time in small groups practising using the rating scale and being given information about two MMI stations, with each small group studying different stations. 


Feedback from interviewers indicated that they wanted more skills training. Therefore, the 2007 training sessions were restructured to be predominantly skills-based training. Interviewers practised rating “simulated” interviewees, comparing outcomes and discussing examples of good and bad responses, and they interviewed trainers and each other to learn to probe appropriately. Notably, this training used the actual content of four of the nine stations (Stations 1, 3, 5 and 6). In addition, interviewers attended a half-hour briefing immediately before interviewing at the 2007 MMI sessions, when they were given individual training on the content of the specific station they would be attending.


Analysis

It is essential to use multilevel modelling to account for the nested nature of the interview datasets on which studies such as ours are based.[9] When interviewees are rated by a subset of interviewers, they are “nested” under that subset. Analyses that disregard this multilevel component ignore dependencies between variables, artificially reduce standard errors and introduce correlated prediction errors. Not only does this violate statistical assumptions (eg, independence), but it increases the chance of finding significant results related to interviewer variables and decreases the chance of finding significant results related to individual (applicant) differences. Hierarchical linear modelling was therefore used (HLM 6.6 [SSI Scientific Software International, Lincolnwood, Ill, USA]), in addition to correlations and t tests for comparison of means. The threshold of significance was set at P = 0.05. The research was approved by the institution’s Human Research Ethics Committee.
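To make the idea of variance "attributable to differences between interviewers" more concrete, here is a minimal sketch that estimates that share with a one-way random-effects ANOVA. This is a deliberately simplified stand-in, not the hierarchical linear model the authors fitted in HLM: it assumes a balanced design (every interviewer gives the same number of ratings) and ignores station and applicant effects. The function name and the example ratings are hypothetical.

```python
from statistics import mean


def interviewer_variance_share(ratings_by_interviewer):
    """Estimate the proportion of rating variance lying between interviewers.

    ratings_by_interviewer: dict mapping interviewer id -> list of ratings.
    Uses one-way random-effects ANOVA estimators for a balanced design.
    """
    groups = list(ratings_by_interviewer.values())
    k = len(groups[0])  # ratings per interviewer
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    m = len(groups)  # number of interviewers
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    ms_between = k * sum((gm - grand_mean) ** 2 for gm in group_means) / (m - 1)
    ms_within = sum(
        (x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g
    ) / (m * (k - 1))
    # Negative estimates are truncated at zero, a common convention.
    var_between = max((ms_between - ms_within) / k, 0.0)
    return var_between / (var_between + ms_within)


# Made-up ratings: interviewer "B" is systematically more lenient than "A".
print(interviewer_variance_share({"A": [3, 4, 3, 4], "B": [5, 6, 5, 6]}))
```

A share near 0 means interviewers are interchangeable; a share near 1 means a candidate's score depends mostly on who happened to rate them.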









 2010 Sep 20;193(6):343-6.

Interviewer bias in medical student selection.

Author information

  • 1Psychology, Macquarie University, Sydney, NSW, Australia. barbara.griffin@mq.edu.au

Erratum in

  • Med J Aust. 2010 Oct 18;193(8):486.


PMID: 20854239


Multiple mini-interviews: same concept, different approaches

Mirjana Knorr & Johanna Hissbach



The comprehensive literature on the reliability of different types of MMI and their efficiency in the use of interviewing time strongly supports the superiority of the MMI method over conventional interview methods.


Because of the many factors that can be varied, results are not easily transferable from one MMI to another. Based on the published research, we can give some recommendations for aspects of design with regard to reliability values and costs. It is important to note that measures to increase reliability are often accompanied by an increase in costs. To date, only vague statements can be made concerning validity.


Recommendations for reliability

  • Increasing the number of stations, the number of interviewers per station or the number of items will enhance an MMI's reliability. Raising the number of stations is the most advisable of these three options.
  • A station time of 5–6 minutes is sufficient.
  • The use of skills-based rater training that includes mock interviews can improve rater agreement.
  • The use of normative anchored rating scales rather than descriptive adjectives (i.e. ‘poor’ or ‘outstanding’) will encourage raters to make use of the full rating scale.
  • Stations that are too easy or too difficult should be excluded because they do not allow for the differentiation of candidates according to ability.
  • A pleasant atmosphere for candidates should be ensured.
  • With reference to the variation in station type, research so far suggests there are no differences between one-to-one and interactive stations in reliability.
  • The addition of written tasks (i.e. questionnaires or writing stations) does not guarantee an increase in reliability.
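One common way to reason about the first recommendation is the Spearman-Brown prophecy formula. The review does not cite this formula, so it is brought in here as a standard psychometric approximation that treats stations as parallel measurements. The function names and the example reliability of 0.2 per station are hypothetical.

```python
import math


def spearman_brown(single_station_reliability, n_stations):
    """Projected reliability of the mean score over n parallel stations."""
    r = single_station_reliability
    return n_stations * r / (1 + (n_stations - 1) * r)


def stations_needed(single_station_reliability, target):
    """Smallest number of parallel stations whose projected reliability reaches target."""
    r = single_station_reliability
    return math.ceil(target * (1 - r) / (r * (1 - target)))


# Illustrative values only: a single station with reliability 0.2.
for n in (6, 8, 10, 12):
    print(n, round(spearman_brown(0.2, n), 3))
print("stations needed for 0.7:", stations_needed(0.2, 0.7))
```

Real stations are rarely strictly parallel, so a projection like this is best read as a planning aid rather than a prediction.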


Recommendations for validity

  • Users should be aware that an MMI is best suited to the assessment of factors that are not captured by established admissions criteria such as GPA and admission tests.
  • The breadth and narrowness of constructs for MMI attributes and external criteria should be considered.


Recommendations for costs

  • The extra costs implied by station development and the use of actors should be considered in any change from a conventional interview format to an MMI format.
  • Written tasks can be used to save costs of interviewers or actors.
  • The use of an internet-based MMI (iMMI) to save facility- and travel-related costs should be considered.


OBJECTIVES:

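Many institutions are replacing conventional interviews with the MMI, which is more reliable and less affected by interviewer bias. Because the MMI can be adapted to each institution's circumstances, the question of under which conditions it works best remains open.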

Increasing numbers of educational institutions in the medical field choose to replace their conventional admissions interviews with a multiple mini-interview (MMI) format because the latter has superior reliability values and reduces interviewer bias. As the MMI format can be adapted to the conditions of each institution, the question of under which circumstances an MMI is most expedient remains unresolved. This article systematically reviews the existing MMI literature to identify the aspects of MMI design that have impact on the reliability, validity and cost-efficiency of the format.


METHODS:

Three electronic databases (OVID, PubMed, Web of Science) were searched for any publications in which MMIs and related approaches were discussed. Sixty-six publications were included in the analysis.


RESULTS:

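Forty studies reported on reliability. Increasing the number of stations is generally more effective for raising reliability than increasing the number of raters per station. Other helpful measures include excluding stations that are too easy, using normative anchored rating scales and providing skills-based rater training. Thirty-one studies addressed validity; irrespective of study design, the relationship between MMI results and academic measures was small to zero. The McMaster University MMI predicted in-programme and licensing examination performance. Results on construct validity are still inconclusive. The most important additional cost factors were station development and actor payments.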

Forty studies reported reliability values. Generally, raising the number of stations has more impact on reliability than raising the number of raters per station. Other factors with positive influence include the exclusion of stations that are too easy, and the use of normative anchored rating scales or skills-based rater training. Data on criterion-related validities and analyses of dimensionality were found in 31 studies. Irrespective of design differences, the relationship between MMI results and academic measures is small to zero. The McMaster University MMI predicts in-programme and licensing examination performance. Construct validity analyses are mostly exploratory and their results are inconclusive. Seven publications gave information on required resources or provided suggestions on how to save costs. The most relevant cost factors that are additional to those of conventional interviews are the costs of station development and actor payments.


CONCLUSIONS:

Analysis of the MMI literature supports recommendations for reliable and cost-efficient MMI designs, but not all important aspects have been explored. Further research is needed on dimensionality and construct validity and on the predictive validity of MMIs.

The MMI literature provides useful recommendations for reliable and cost-efficient MMI designs, but some important aspects have not yet been fully explored. More theory-driven research is needed concerning dimensionality and construct validity, the predictive validity of MMIs other than those of McMaster University, the comparison of station types, and a cost-efficient station development process.




Brief history and current status (in use since 2002; USA, Canada, Australia, UK, Europe, Middle East)

Over the past 10 years a specific form of admission interview, the multiple mini-interview (MMI), has enjoyed increasing popularity in the health sciences field following the criticism of conventional admission interviews for their unsatisfactory reliability.[1, 2] Originally introduced at McMaster University, in Hamilton, Ontario, Canada, in 2002,[3] the MMI has found widespread application at different medical schools and in other health sciences programmes in the USA, Canada, Australia and the UK, as well as in other European and Middle Eastern countries.



Core characteristics

The core characteristic of the MMI is a multiple independent sampling methodology.[4] In a manner similar to that of an objective structured clinical examination (OSCE), each candidate rotates through several short standardised interview stations.[5] Thereby, a candidate has several independent encounters with different interviewers instead of one single panel interview.


The aim is to improve reliability. The format can be adapted to the constructs being measured; that is, the MMI is an assessment method rather than a measure of any specific construct.

The MMI aims to enhance reliability by taking into account the problem of interviewer bias, as well as the context specificity of a candidate's performance.[5, 6] The number of stations and interviewers, the station content and the scoring system are flexible and vary considerably among different institutions. Most notably, an MMI is adjustable to the constructs being measured, although most authors design their MMIs to capture a set of dimensions predominantly described in the literature as ‘non-cognitive’ attributes. Consequently, the MMI is an assessment method or process rather than a clearly defined measure.[7-10]


MMIs vary widely between institutions, and this review asks the following research questions.

From a practical point of view, especially for an institution that is considering implementing an MMI into its admissions procedure, a highly relevant issue concerns which aspects of the format should be considered in order to design a successful (i.e. reliable, valid and cost-efficient) MMI. In the context of the wide range of approaches to the MMI, our goal is to shed light on responses to the following questions:

    • Which factors can be varied in an MMI design?
    • Which variations of these factors contribute towards an MMI that is successful in terms of reliability, validity and cost-efficiency?


We address these questions in a systematic review of the MMI literature. Based on our findings, we provide recommendations for the design of an MMI and an outlook on directions for future research. Therefore, this article takes a perspective that differs from that of another recently published systematic review which concentrated on the common features of MMIs.[11] By contrast, this review aims to add to the existing literature by explicitly focusing on the differences among approaches to the MMI and the effects of variations




Design of the MMI

The attributes that are to be measured, the types of station used, the details of the MMI process, and finally the scoring system specify an MMI design. All of these factors provide possibilities for variation. In order to answer our first research question (‘Which factors can be varied in an MMI design?’), we looked at the different variations that were reported in the literature.


Attributes (number and type of personal qualities assessed)

The development of a programme-specific MMI starts with the definition of the characteristics that are to be measured. Only a few authors have described their approach to identify these core characteristics through literature research and stakeholder analysis.[14, 15] In the reviewed literature, lists of core characteristics range in length between three[16, 17] and 19[18] attributes. Some attributes are more commonly used (e.g. communication skills) and others are more programme-specific (e.g. leadership potential[19]).


Stations (station tasks)

Station development is based on the selected set of attributes. Usually, candidates are asked to discuss a topic or dilemma with an interviewer, to answer standardised questions, to interact with a trained actor or to collaborate with one or several other candidates. Some MMIs also include problem-solving tasks,[15] presentations,[20-22] prioritising tasks,[22] creative tasks,[23] film clips,[24, 25] writing samples[26] or debriefing stations in which the candidate's performance at a previous station is discussed.[12, 13]


MMI process (number of stations, total interview days, simultaneous circuits, preparation time before entering, time in station)

The usual number of stations varies between six[23, 27] and 12.[8, 9, 19, 28-34] There have also been reports of MMIs with only three stations. All of these MMIs were selecting candidates at a later point in their careers as they applied for residency or junior doctor posts.[16, 17, 20, 35] The number of interview days ranges between one[33, 36] and 11[37] with up to four sets per day[9, 38, 39] and up to seven simultaneous circuits per set.[9] Candidates are given between 30 seconds[15] and 3 minutes[20] to read the scenario and prepare for the task. Station duration is between 5 minutes[16, 17, 22, 39, 40] and 15 minutes.[21]


Scoring system

A candidate's performance is usually rated on 4-[41-46] to 10-point[21, 32-34, 47-49] anchored Likert scales by one[3-5, 7, 12, 13, 15, 18, 25, 29, 30, 32-34, 36, 37, 39, 41-48, 50-57] or two[5, 12, 13, 15-18, 20, 21, 24, 30, 33, 35, 38, 39, 53, 55, 58-60] interviewers at each station. The station score is either measured on one single scale[3, 5, 16, 17, 19, 22, 26, 27, 30, 34, 45, 48-50, 52, 61] or is formed by the summation or aggregation of several subscales.[4, 5, 7, 12, 13, 20, 21, 23-25, 32, 33, 37-39, 41-44, 46, 47, 51, 54, 55, 57, 58, 60, 62, 63] In some cases, these additional subscales are meant to help raters make a decision on the overall station score, but are not considered in the total score.[15, 29] Subscales can be applied at every station[4, 12, 21, 23, 24, 33, 38, 39, 47, 55, 58] or they can be station-specific.[4, 15, 37, 38, 41-44, 46, 57, 58, 60, 62]
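The scoring variants above can be sketched in a few lines. The example assumes one particular combination (subscales summed per rater, then averaged across raters at a station, then summed across stations); the reviewed MMIs differ on each of these choices, and the function names and ratings are hypothetical.

```python
from statistics import mean


def station_score(rater_subscales):
    """Sum each rater's subscale ratings, then average across the station's raters."""
    return mean(sum(subscales) for subscales in rater_subscales)


def total_score(ratings_by_station):
    """Add up station scores; ratings_by_station maps station id -> list of raters' subscale lists."""
    return sum(station_score(raters) for raters in ratings_by_station.values())


# Made-up ratings: station s1 has two raters with two subscales each,
# station s2 has a single rater using one global scale.
ratings = {"s1": [[3, 4], [5, 4]], "s2": [[7]]}
print(total_score(ratings))
```

Because stations with more subscales contribute a wider score range under plain summation, a real scheme would normally rescale station scores before adding them.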


The list of variations leads to the second question (‘Which variations of these factors contribute towards an MMI that is successful in terms of reliability, validity and cost-efficiency?’). The following sections give an overview of the possible ranges of reliability, validity and costs as reported in the literature, as well as findings related to the impact of design differences.









Reliability

Methods for studying reliability. Forty of the studies reviewed reported an estimation of reliability. In 22 of these, calculations were based on generalisability theory. Generalisability theory is a framework that combines assumptions of classical test theory with analysis of variance procedures.[64] Within this framework, generalisability studies (G studies) provide estimates of variance components and measurement error which form the basis for the calculation of the overall reliability (generalisability [G] coefficient). Based on G study results, decision studies (D studies) estimate the impact of different hypothetical MMI designs on reliability. Additional measures reported in the reviewed literature were correlations, intraclass correlations (ICCs) and Cronbach's alpha. Table 1 gives a structured overview of reported reliability values sorted by different types of reliability.


G study results. A closer look at the reported G study results shows that the proportion of variance attributable to candidate differences varies between 10%[19] and 74%,[59] although in most studies candidates accounted for < 30% of the variance. As an MMI aims to detect systematic differences between candidates, the goal for an ideal MMI would be to increase this proportion of intended variance and to reduce unwanted variance (i.e. rater, station, error).


Ways to improve reliability. The influence of MMI design variations on reliability was analysed in various studies. D studies show that overall generalisability can be increased by adding more stations or more raters to each station.[41] More specifically, increasing the number of stations appears to have greater impact on reliability than increasing the number of interviewers within each station.[5, 20, 33, 60, 65] Additionally, Hanson et al.[4] found that reducing the number of items nested within raters had less impact on reliability than reducing the number of stations. One way to raise the number of stations without increasing the overall interviewing time is by reducing station time.[41] To date, two studies have shown that 5–6 minutes can be sufficient to reliably assess a candidate's performance.[49, 52] Another possible method of increasing reliability without raising the number of active stations is to include questionnaires such as the Judgement and Decision-making Questionnaire and the Biographical Questionnaire used in MOR.[12, 13] However, reliability could not be increased by the addition of a writing station.[26]
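A small D-study style calculation shows why adding stations tends to help more than adding raters. The sketch assumes a design in which candidates are crossed with stations and raters are nested within stations, so the relative error variance is the candidate-by-station component divided by the number of stations plus the residual component divided by stations times raters. The variance components below are invented for illustration and are not taken from any of the reviewed studies; the function name is hypothetical.

```python
def g_coefficient(var_person, var_person_station, var_residual, n_stations, n_raters):
    """Relative G coefficient for candidates x stations with raters nested in stations.

    rel_error = var_person_station / n_stations
              + var_residual / (n_stations * n_raters)
    G = var_person / (var_person + rel_error)
    """
    rel_error = var_person_station / n_stations + var_residual / (n_stations * n_raters)
    return var_person / (var_person + rel_error)


# Invented variance components: candidates 20%, candidate-by-station 40%, residual 40%.
components = (0.2, 0.4, 0.4)
print("6 stations, 1 rater :", round(g_coefficient(*components, 6, 1), 3))
print("6 stations, 2 raters:", round(g_coefficient(*components, 6, 2), 3))
print("12 stations, 1 rater:", round(g_coefficient(*components, 12, 1), 3))
```

Both designs with extra resources use 12 rater slots, but doubling the stations shrinks both error terms, whereas doubling the raters shrinks only the residual term.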


Further ways to improve reliability and other influencing factors. As Uijtdehaage et al.[19] point out, factors other than the number of stations and raters per station contribute to the differences in reliability estimations. For example, the exclusion of very easy stations, the change to a normative anchored rating scale, and a less intimidating atmosphere may enhance overall generalisability.[19] Interviewer bias can be significantly reduced by changing from an information-based to a more skills-based form of rater training.[25] A comparison of reliability values for two different station types (one-to-one versus interactive) showed that both achieved similar reliabilities if each station type was represented by the same number of stations.[37] Finally, by comparing a fixed-effects design (G = 0.90) with a random-effects design (G = 0.68) Sebok et al.[60] demonstrated that differences in reliability estimations can also stem from different assumptions in the statistical model.






Validity

Of 31 studies that reported validity measures, 27 provided indicators for the criterion-related validity of the respective MMI. Dimensionality was analysed in five studies. Content and face validity were usually established by blueprinting processes and evaluation surveys.


Criterion-related validity. All reported criterion-related validities are summarised in Table 2. The criteria can be structured into three main clusters: psychological constructs, other measures relevant for admission, and performance measures (in-programme or post-graduation). Only McMaster University provides results for all three of these clusters (see Table 2). Siu and Reiter explained their observation that correlations between MMI results and different performance measures tended to increase over time by the fact that later assessments put a greater emphasis on ‘non-cognitive’ domains.[66]


Exploratory factor analysis. The literature reviewed included two studies that described exploratory factor analyses (EFA) based on MMI subscores. Lemay et al. reported a 10-factor solution in which each factor represented one of 10 stations.[47] Hecker et al. performed three separate factor analyses, all of which included age and grade point average (GPA).[58] They found a three-factor solution (moral and ethical values, interpersonal ability, academic ability) based on station-specific subscales, a single-factor solution based on communication skill scores assessed at all stations, and a two-factor solution (economics, interpersonal ability) based on critical thinking skills also assessed at all stations.[58] Although MMI design and methods differed, both studies suggest a multidimensional structure for their MMI.


Analyses using IRT. A second branch of research focused on item response theory (IRT) to examine the dimensionality of different MMIs. These studies report a good fit of questions or items to an assumed unidimensional construct. This construct was defined as ‘pre-professionalism’ and ‘entry-level reasoning skills in professionalism’,[41, 43, 44] ‘latent professional potential’[18] or simply ‘professionalism’.[60] However, the exact nature of the underlying attributes remains unknown.[60]











Cost-efficiency

The MMI is cost-efficient. Required costs and resources were mentioned in seven of the 66 publications reviewed. Cost analyses suggest that, compared with traditional interview formats, the MMI format is especially efficient in reducing interviewing time (i.e. the number of hours required to interview all candidates). It thus allows a larger number of candidates to be interviewed in a shorter time period.[5, 21, 22, 32, 67]


Sources of additional cost. Additional costs most notably arise from the development of the blueprint and the MMI stations.[32] Researchers at McMaster University estimated 3 hours and costs of US$50 for the development of a single station.[32] In a recently published study, Hissbach et al.[68] reported much higher costs associated with their procedure. Station development represented a significant cost factor and implied a cost of approximately US$2000 per station. The high expenses reflect the station development time of 40 hours per station including test runs.[68]


Ways to reduce costs

Measures to cut costs include the reduction of station time,[5] the reduction of the number of stations,[5] and the implementation of an internet-based version of the MMI (iMMI) to save costs for international applicants.[46] The inclusion of writing stations allows more candidates to be interviewed without increasing interviewer hours.[26] However, costs will increase if an MMI includes stations with simulated patients.[5, 22]




Discussion

From its introduction, the MMI procedure was intended to be adjustable to the requirements and conditions of different institutions and programmes. As with every assessment method, it is interesting to learn more about the conditions under which it works best. Consequently, our goal was to take a closer look at the impact of design changes on reliability, validity and cost-efficiency based on 66 studies.


With regard to our first question, the analysis revealed great variability in MMI designs in terms of attributes, station types, process details (number of stations, sets, circuits and days, time to prepare, station duration), and scoring system (type and usage of scales and subscales, scale range, number of raters) among institutions and also between subsequent years at the same institution. The wide range demonstrates that the MMI is indeed adjustable in many aspects.


Our second question was concerned with the impact of design changes on reliability, validity and cost-efficiency. Based on numbers of studies, reliability is the most studied of these three criteria so far, followed by validity and costs. In consequence, most conclusions can be drawn with regard to reliability, but there are further aspects that could be explored. Analyses of validity and costs provide some insight, but also raise further questions. In the subsequent paragraphs we will discuss each of the three aspects.


Reliability

Reliability is a strong point of the MMI procedure. Despite considerable differences in design, most values of internal consistency as well as overall generalisability are satisfactory, although a possible publication bias must be kept in mind. The multi-station approach, which is the core element of all MMI formats, allows a satisfactory level of reliability to be achieved by raising the number of stations. In addition, the MMI literature already provides useful information on the impacts of other aspects of MMI design on reliability, such as the number of items, station time, station type, station difficulty and type of rating scale.
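The link between the number of stations and reliability can be illustrated with the Spearman-Brown prophecy formula. This is a simplified stand-in, not the generalisability analysis used in the reviewed studies: it assumes that every station is a parallel, equally reliable measurement, and the single-station reliability of 0.20 is a hypothetical value chosen for illustration, not a figure from the review.

```python
import math


def spearman_brown(single_station_reliability: float, n_stations: float) -> float:
    """Predicted reliability of a composite of n parallel stations."""
    r = single_station_reliability
    return n_stations * r / (1 + (n_stations - 1) * r)


def stations_needed(single_station_reliability: float, target: float) -> int:
    """Smallest whole number of parallel stations that reaches the target."""
    r = single_station_reliability
    return math.ceil(target * (1 - r) / (r * (1 - target)))


if __name__ == "__main__":
    r_single = 0.20  # hypothetical single-station reliability (assumption)
    for n in (4, 8, 12):
        print(n, "stations:", round(spearman_brown(r_single, n), 2))
    print("stations for 0.70:", stations_needed(r_single, 0.70))
```

Under these assumptions each added station raises the predicted reliability by less than the one before it, which is one way to read the trade-off between reliability and cost.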


It is difficult to draw a conclusion about the appropriate number of stations.

Nevertheless, as a result of the limited comparability of reliability values between studies, a general recommendation for a minimum number of stations cannot be derived from the MMI literature. For instance, it is difficult to give the exact reasons why a four-station MMI[4] with one rater per station yielded a higher generalisability value than an eight-station MMI.[41] Given the large impact of different model assumptions on reliability estimations, as demonstrated by Sebok et al.,[60] one must be very cautious in interpreting reliability values from different studies. Similarly, the MMI designs lead to differences in the separation of variance components. Therefore, it is also difficult to compare proportions of variance between studies. For these reasons, reports that concentrate on a specific MMI design and analyse how systematic changes influence reliability, as does the study by Uijtdehaage et al.,[19] are of high value.


Inter-rater reliability

Results concerning other types of reliability show a mixed picture. Inter-rater reliabilities for MMI designs that include two raters at each station are moderate to satisfactory. The selection of raters and the quality of their training may be the most relevant factors in terms of a positive influence on inter-rater reliability and a reduction in systematic and unsystematic rater variance.


Inter-station reliability and inter-item reliability within attributes

Low inter-station reliabilities, as well as low inter-item reliabilities within attributes, are typically explained by the content and context specificity of a performance.[5, 41, 47] However, estimates differ between studies and attributes. The higher reliability values for communication skills than for teamwork reported by Dowell et al.[37] may be explained by the fact that communication skills were measured at more stations than teamwork.


Based on high inter-item reliabilities for items within a station, some authors suggested that the overall station score might be sufficient and subscores might not be needed.[24, 55] This raises the question of whether it is possible to measure several distinct constructs at one station after all.



Validity

MMI results correlate more strongly with less academic measures than with academic ones, which suggests that the MMI measures something different.

The tendency towards weak to non-relationships with predominantly academic measures (e.g. GPA, science subtests of the Medical College Admission Test [MCAT]) and weak to moderate correlations with less academic measures (e.g. the MCAT verbal subtest, other admission tools) suggests that MMIs cannot replace conventional admission tools, but, rather, measure something different.


Apart from the McMaster MMI, little is known about predictive validity, and some studies report differing results.

While several publications from McMaster University support the incremental validity of the McMaster MMI in predicting in-programme and licensing examination performance,[3, 7, 8, 50] little is known about the predictive validity of other MMIs. Higher correlations between the MMI and the CLEO (Considerations of the Legal, Ethical and Organisational Aspects of Medicine)/PHELO (Population Health and Ethical, Legal and Organisational Aspects of Medicine) area of the Medical Council of Canada Qualifying Examination (MCCQE) do not seem surprising given that the McMaster MMI puts an emphasis on ethical decision making.[7] The non-significant correlations between MMI results and licensing examinations reported by Hofmeister et al.[33] seem to differ from McMaster findings. These results may reflect differences in attributes measured by the specific MMI, but differences in sample types must also be considered.


Construct validity and dimensionality: CFA may be more suitable than EFA.

Analyses that aimed to investigate the construct validity and dimensionality of different MMIs are characterised by their explorative nature. Broadness of constructs is a relevant point that needs to be considered, as Griffin and Wilson demonstrated that several Big Five sub-facets showed significant correlations with MMI performance even when the superordinate factor did not.[63] Because of the possibility of higher- and lower-order factors, the multidimensional structure found in EFAs and the unidimensional construct described in IRT studies should not be interpreted as contradictory. However, both EFA studies have methodological weaknesses (e.g. data that would allow checking for cross-loadings are missing; age and GPA have been included in the analysis) and confirmatory factor analysis (CFA) would be a more suitable approach to the testing of prior assumptions about dimensionality.


Overall, there is no definitive answer on validity yet.

Overall, there is no comprehensive picture of MMI validity as yet. Researchers need to draw on definitions of the measured attributes and their theoretical assumptions to explain why an MMI result is related to an external measure. Therefore, to derive further general recommendations, more theory-driven research on the construct and predictive validity of different MMIs is needed.



Cost-efficiency

Station development is a major cost.

Station development was identified as an important additional resource requirement for an MMI.[32] The comparison of station development costs reported by McMaster University and by Hamburg Medical School reveals a large gap. Given that McMaster University focuses on one attribute (ethical decision making) and additionally considers communication skills and collaborative ability,[7] it may be that institutions that intend to measure a broader construct face much higher costs for station development. However, the issue of how reliable and valid stations can be developed efficiently should be further explored.


Adding stations to improve reliability raises costs accordingly. Writing stations may reduce costs but do not seem to raise reliability.

Reliability, validity and costs are interdependent factors. If the number of stations is increased to enhance reliability or to measure a broader set of attributes, the costs of station development and staff will rise accordingly. The inclusion of writing stations might save the costs incurred by the use of interviewers for an additional interview station, but does not seem to increase reliability.[26] Given that the use of simulated patients represents another additional cost factor,[32] it would be interesting to learn more about the value of simulation-based stations in terms of reliability and validity in comparison with interview stations.



Directions for future research

There is no clear criterion for separating ‘cognitive’ from ‘non-cognitive’ attributes; a distinction between academic and non-academic attributes may be more adequate.

The MMI literature is still vague in the terminology it uses to describe what it is that MMIs measure. Originally, Eva et al. labelled these competencies ‘non-cognitive attributes’[5] in order to draw a line between an MMI and conventional admissions criteria that measure ‘cognitive attributes’.[7] However, there is no definitive method of assigning specific attributes to the ‘cognitive’ or the ‘non-cognitive’ domain and the use of these terms has already been questioned by others.[43, 69] The distinction between academic and non-academic attributes may be more adequate, although it would still imply a strict dichotomy. Alternatively, as IRT studies suggest, a very broad comprehensive construct of ‘potential for professionalism’, embracing various specific MMI-tested characteristics, might be assumed.[18, 43, 60]


What is being measured?

Another question relates to the nature of the attributes being measured and of what rating scales assess. Eva et al.[7] point out that these competencies should not be thought of as traits because of the context specificity[6] of a behaviour. As latent state–trait theory suggests, the behaviour of a person in a given situation depends on the characteristics of the person, the characteristics of the situation and the interaction between these two sets of characteristics.[70] Stations may be understood as representing different situations. Alternatively, they may be seen as representing different methods if the MMI represents a multitrait–multimethod approach.[71] The variety in the application of measurement scales and different approaches to the analysis of dimensionality (e.g. one station measuring only one distinct attribute versus several stations measuring several attributes) indicates uncertainty about what information can be derived from the behaviours shown in MMI stations.


Dowell et al. state that ‘if overall MMI scores do predict a significant portion of the variance in medical school performance, the issue of construct validity will become less important’.[37] Nevertheless, both predictive and construct validity benefit from prior theoretical assumptions that can explain high or disappointing values. If a priori assumptions about the nature and the theoretical connection between attributes exist, structural equation modelling (SEM) might provide a useful statistical method with which to test these assumptions.


The ‘construct validity problem’ which is discussed in the assessment center (AC) literature raises similar questions. Given the similarities between ACs and MMIs (multi-station approach, behavioural ratings), the AC literature might provide a good starting point for the further analysis of MMI construct validity (see Bowler and Woehr[72] for a recent meta-analysis and Lance[73] for a recent review on AC construct validity).




 2014 Dec;48(12):1157-75. doi: 10.1111/medu.12535.

Multiple mini-interviews: same concept, different approaches.

Author information

  • 1University Medical Centre Hamburg-Eppendorf, Hamburg, Germany.

Abstract

OBJECTIVES:

Increasing numbers of educational institutions in the medical field choose to replace their conventional admissions interviews with a multiple mini-interview (MMI) format because the latter has superior reliability values and reduces interviewer bias. As the MMI format can be adapted to the conditions of each institution, the question of under which circumstances an MMI is most expedient remains unresolved. This article systematically reviews the existing MMI literature to identify the aspects of MMI design that have impact on the reliability, validity and cost-efficiency of the format.

METHODS:

Three electronic databases (OVID, PubMed, Web of Science) were searched for any publications in which MMIs and related approaches were discussed. Sixty-six publications were included in the analysis.

RESULTS:

Forty studies reported reliability values. Generally, raising the number of stations has more impact on reliability than raising the number of raters per station. Other factors with positive influence include the exclusion of stations that are too easy, and the use of normative anchored rating scales or skills-based rater training. Data on criterion-related validities and analyses of dimensionality were found in 31 studies. Irrespective of design differences, the relationship between MMI results and academic measures is small to zero. The McMaster University MMI predicts in-programme and licensing examination performance. Construct validity analyses are mostly exploratory and their results are inconclusive. Seven publications gave information on required resources or provided suggestions on how to save costs. The most relevant cost factors that are additional to those of conventional interviews are the costs of station development and actor payments.

CONCLUSIONS:

The MMI literature provides useful recommendations for reliable and cost-efficient MMI designs, but some important aspects have not yet been fully explored. More theory-driven research is needed concerning dimensionality and construct validity, the predictive validity of MMIs other than those of McMaster University, the comparison of station types, and a cost-efficient station development process.

© 2014 John Wiley & Sons Ltd.


Factors associated with success in medical school: systematic review of the literature

Eamonn Ferguson, David James, Laura Madeley





Summary points

Previous academic performance is a good, but not perfect, predictor of achievement in medical training

It accounts for 23% of the variance in performance in undergraduate medical training and 6% of that in postgraduate competency

Long term prospective cohort studies or case-control studies are needed to examine predictors of success after qualification, and reliable, valid, and fair models of medical job competence need to be developed

Relatively little research has been done into the importance of learning styles, interviews, ethnicity, sex, personal statements, and references, but a strategic learning style, white ethnicity, and female sex are associated with success in medical training


Selection of medical students in the United Kingdom has attracted much attention in recent years. Some authors claim a bias in favour of white applicants, female applicants and applicants from independent schools. Cases such as that of Laura Spence show the public questioning of how doctors are selected, trained and validated. The process is also unsatisfactory logistically: about 10 000 students submit some 40 000 applications for 5000 places, so chance plays a large part.

Selection of medical students in the United Kingdom has come under intense scrutiny in recent years. Some authors have claimed that discrimination occurs in favour of white applicants, female applicants, and applicants from independent schools.1 2 3 4 5,w1,w2 High profile cases, such as that of Laura Spence, have led to a public questioning of the selection, training, and validation of doctors. The process of selecting medical students is unsatisfactory from a logistical point of view (approximately 40 000 applications are allowed from 10 000 students for just 5000 places) and leads to chance playing a big part and to apparent unfairness.


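The criteria that medical schools use to select students are broadly similar across the country: academic ability, insight into medicine (including work experience), extracurricular activities and interests, personality, motivation, and linguistic and communication skills. But what is the evidence for using them?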

The criteria medical schools use to select future doctors are similar across the country.4 They include academic ability, insight into medicine (including work experience), extracurricular activities and interests, personality, motivation, and linguistic and communication skills. But what is the evidence base for using these criteria?


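The Committee of Deans and Heads of Medical Schools commissioned a systematic review to identify predictors of success in medicine, and its results are reported here. The review examined data on the predictive validity of eight criteria that have been used in selecting medical students.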

The Committee of Deans and Heads of Medical Schools commissioned a systematic review of factors believed to be significant predictors of success in medicine. We report the results of that systematic review, which was carried out from June to August 2000. The review examines data on the predictive validity of the eight criteria that have been studied in relation to the selection of medical students: 

  • cognitive factors (previous academic ability), 
  • non-cognitive factors (personality, learning styles, interviews, references, personal statements), and 
  • demographic factors (sex, ethnicity). 

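Previous academic ability, personal statements, references and interviews have traditionally been used in selection, but how well do they predict future performance? Personality and learning styles have not traditionally been used, but would they be worth using?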

Previous academic ability, personal statements, references, and interviews are all traditionally used in selection, but how good are they at predicting future performance? Personality and learning styles are not traditionally used, but should they be?



Methods

Search criteria

We used three databases to conduct literature searches: Medline OVID citations, Web of Science, and PsycLIT. We used the search criteria “medical school” or “student admissions” or “selection” and “medical school student performance” and “career outcome.” We initially used combinations of the key words or phrases “medical school,” “admissions,” “selection,” “medical education,” “predictors,” and “medical student.” We conducted additional searches using combinations of the above key words with the key words “personality,” “interviews,” “learning styles,” “gender,” “references,” “resumes,” “personal statements,” and “ethnicity.”


On the basis of their propensity to generate hits, we examined three journals—Medical Education, Journal of Medical Education, and Academic Medicine—for further relevant articles. Finally, we scrutinised the reference sections of relevant articles identified by these search strategies for further relevant publications. We aimed to identify papers on the predictive validity of as many aspects as possible of the process of selecting medical students.


For the systematic review we used a mixture of traditional techniques of qualitative review and more quantitative methods of meta-analysis. We included studies in the review if they had a clear description of the predictors used and their quantification, a clear description of the outcome measures, and an acceptable statistical method of analysis of the relation between predictors and outcome measures. For indicators of previous academic performance, we examined only studies that used nationally or internationally accepted academic indicators (for example, GCSE grades, A level grades, grade point average (GPA) scores, medical college admission test (MCAT)). For other predictor measures, such as personality profiles, we explored only studies reporting data based on validated indices. From the studies thus identified, we selected only those directly relevant to medicine; we excluded studies relating to nursing and physiotherapy training, for example. Finally, we used meta-analysis only when a sufficient quantity of systematic data was available.


Medline produced 157 hits, Web of Science produced 550 hits, and PsycLIT produced 413 hits. Of the articles on Medline, 19% also appeared on Web of Science and 5% appeared on PsycLIT. Sixty two papers reported studies of previous academic performance,w3-w64 and 31 papers contained information on personality.w10,w13,w17,w18,w20,w24,w30,w38,w40,w48,w63,w65-w84 We found 16 papers on sex,w1,w2,w10,w27,w42,w59,w85-w94 and 14 papers related to ethnicity.w1,w34,w39,w42,w45,w46,w55,w66,w92,w94-w98 Eleven papers described studies on motivation or study habits,w1,w28,w91,w99-w106 and 16 papers examined the predictive validity of interviews.w27,w30,w72,w76,w88,w107-w117 We identified two papers on the predictive validity of personal statementsw10,w27 and one paper on the predictive validity of references.w110


Sufficient data were available on measures of previous academic performance for us to be able to perform a meta-analysis and to examine two broad areas of achievement in medical training (undergraduate and postgraduate). Studies relating admission criteria to undergraduate assessments included all the years of undergraduate training, whereas the studies of postgraduate performance mainly focused on internship ratings (that is, the first year after qualification). For the other predictors, either insufficient data were available for meta-analysis (ethnicity, sex, learning styles, personal statements) or a variety of different assessment tools were used (personality), making a systematic comparison across studies difficult.


The indicators of previous academic performance ranged widely in the types of assessment and the response formats used. However, it seemed reasonable to examine these assessments as a whole for three reasons. 

    • Firstly, all are used in the selection of medical students, and some assessment of their overall predictive power is important. 
    • Secondly, the meta-analysis examining undergraduate medical training was to be general, combining preclinical and clinical assessments. Different aspects of previous academic performance might be differentially predictive at different stages of training,w26 so combining all the indices seemed more appropriate.
    • Finally, good evidence exists that diverse measures of cognitive ability are all statistically related to general intelligence.6


Statistical analysis

We conducted the quantitative analyses by using hierarchical linear modelling (see bmj.com).7 

    • Level 1 variables were the correlation coefficients between predictors and outcomes, and 
    • level 2 variables were sample sizes within the individual studies.


Measures of previous academic performance and assessments in medical school are associated with some degree of unreliability for a variety of reasons related to the candidate and the assessor (for example, illness, tiredness, environmental factors). In addition, students entering medical school are likely to be at the top end of the potential range of scores for previous academic performance and are also likely to do well in their medical school training. Both these factors (unreliability and restriction of range) statistically limit the size of the correlations between predictors and outcomes.8 We therefore corrected the effect sizes reported in this paper, calculated using HLM-5 software, 7 9 for error due to unreliability and range restriction. We used conventional methods to compare the corrected effect size estimates with the uncorrected ones to determine the contribution of error to the effect size estimates. 8 10
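The two corrections described above can be sketched with the textbook formulas for attenuation (Spearman) and direct range restriction (Thorndike Case 2). The paper refers only to "conventional methods", so the choice of these formulas is an assumption, and the reliabilities and the standard deviation ratio below are hypothetical. The output is therefore not expected to reproduce the corrected coefficients reported in the Results.

```python
import math


def correct_for_unreliability(r: float, predictor_reliability: float,
                              outcome_reliability: float) -> float:
    """Spearman's correction for attenuation in both variables."""
    return r / math.sqrt(predictor_reliability * outcome_reliability)


def correct_for_range_restriction(r: float, sd_ratio: float) -> float:
    """Thorndike Case 2 correction.

    sd_ratio is the predictor's standard deviation among all applicants
    divided by its standard deviation among the selected students.
    """
    return r * sd_ratio / math.sqrt(1 + r * r * (sd_ratio * sd_ratio - 1))


if __name__ == "__main__":
    observed = 0.30  # uncorrected effect size reported in the Results
    # Hypothetical inputs (assumptions, not values from the paper):
    step1 = correct_for_unreliability(observed, 0.80, 0.80)
    step2 = correct_for_range_restriction(step1, 1.5)
    print(round(step1, 3), round(step2, 3))
```

Both corrections can only raise the magnitude of the coefficient when reliabilities are below 1 and the selected group is less variable than the applicant pool, which is why the corrected estimates exceed the observed one.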


We converted the level 1 variables (the correlation coefficients) by using Fisher's r to Z transform before entering them into the meta-analysis. We entered all level 1 variables described in the papers into the analysis. Several papers examined the relation between multiple predictors and multiple outcomes.w7,w15,w21,w23 Although neither the predictors nor the outcomes are likely to be statistically independent, complete independence is not necessary for the meta-analysis to be valid.11
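A minimal sketch of Fisher's r to Z transform follows. The pooling step is a simplification: it uses a fixed-effect average weighted by n − 3 rather than the hierarchical linear model fitted in HLM-5, and the function names and the example correlations and sample sizes are hypothetical.

```python
import math
from typing import Sequence


def fisher_z(r: float) -> float:
    """Fisher's r to Z transform: z = atanh(r)."""
    return math.atanh(r)


def inverse_fisher_z(z: float) -> float:
    """Back-transform a Z value to the correlation scale."""
    return math.tanh(z)


def pooled_correlation(rs: Sequence[float], ns: Sequence[int]) -> float:
    """Average the Z values with weights n - 3, then back-transform.

    This is a simple fixed-effect stand-in, not hierarchical linear modelling.
    """
    weights = [n - 3 for n in ns]
    z_bar = sum(w * fisher_z(r) for w, r in zip(weights, rs)) / sum(weights)
    return inverse_fisher_z(z_bar)


if __name__ == "__main__":
    # Hypothetical study-level correlations and sample sizes.
    print(round(pooled_correlation([0.22, 0.35, 0.41], [120, 300, 85]), 3))
```

The transform is used because the sampling distribution of Z is close to normal with a variance of roughly 1/(n − 3), which makes averaging across studies better behaved than averaging raw correlations.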


We used Cohen's calibration for effect size to guide interpretation of the results reported here.12 Cohen argues that an effect size of 0.10 should be classed as “small,” 0.30 as “moderate,” and 0.50 or greater as “large.”
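Cohen's thresholds as quoted above can be written as a small helper. The function name and the "negligible" label for values below 0.10 are assumptions added for illustration; the paper names only the three bands.

```python
def cohen_label(effect_size: float) -> str:
    """Classify a correlation-type effect size by the thresholds quoted above."""
    size = abs(effect_size)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "moderate"
    if size >= 0.10:
        return "small"
    return "negligible"  # label not used in the paper (assumption)


if __name__ == "__main__":
    for r in (0.14, 0.24, 0.30, 0.48):
        print(r, cohen_label(r))
```

A strict threshold rule places 0.48 in the moderate band, which matches the paper's wording that the final corrected coefficient "approaches" a large effect rather than reaching it.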




Results

Tests of previous academic performance

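Tests of prior learning or previous academic performance included the MCAT, A levels and GPA. For undergraduate performance, 753 usable correlation coefficients were entered, with a total sample of 21 905 participants. Five studies examined the relation with postgraduate training in 2487 participants, yielding 32 usable coefficients.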

Tests measuring prior learning or previous academic performance included the medical college admission test, A levels, and grade point average. We entered 753 usable correlation coefficients into the meta-analyses for undergraduate performance, with a total sample size of 21 905 participants (mean 248.9, SD 265.06). Five studies explored admissions criteria in relation to postgraduate training, giving rise to 32 usable coefficients, with a total sample size of 2487 participants (mean 355.3, SD 566.8).w47,w50-w52,w64


In predicting undergraduate success the average effect size was 0.30, meaning that, on average, previous academic performance explains 9% of the variance in medical school performance. Correcting for unreliability in the predictor and the outcome raised the effect size to 0.36, and further correction for restriction of range raised it to 0.48, which means that 23% of the variance in medical school performance can be explained by previous academic performance. The uncorrected coefficient is a moderate effect, and the final corrected coefficient approaches a large effect.

In the prediction of undergraduate medical success, the average effect size was 0.30 (SE 0.016, range −0.22 to 0.74, 95% confidence interval 0.27 to 0.33, P<0.00001). This means that, on average, previous academic performance accounts for 9% of the variance in overall performance at medical school. Correction for unreliability in both the predictor (previous academic ability) and outcome (medical training success) variables increased the effect size correlation from 0.30 to 0.36 (95% confidence interval 0.31 to 0.39). Further correction for restriction of range increased the coefficient to 0.48 (0.40 to 0.51). This corrected coefficient indicates that 23% of variance in medical school performance can be explained by previous academic performance. The uncorrected correlation coefficient would be classed as moderate in size according to Cohen's calibration, and the final corrected coefficient approaches a large effect.12


In predicting postgraduate competence the average effect size was 0.14, so previous academic performance explains, on average, less than 3% of the variance in postgraduate performance. Correction for unreliability raised the correlation to 0.17, and further correction for restriction of range raised it to 0.24, which corresponds to 6% of the variance explained. Both the uncorrected and the corrected coefficients are small effects by Cohen's calibration.

In the prediction of postgraduate medical competence the average effect size was 0.14 (SE 0.05, range −0.34 to 0.41, 95% confidence interval 0.05 to 0.23, P<0.05). Thus, on average, previous academic performance accounts for less than 3% of the variance in postgraduate medical performance. Correction for unreliability increased the effect size correlation to 0.17 (95% confidence interval 0.06 to 0.27), and further correction for restriction of range increased it to 0.24 (0.08 to 0.37). This corrected coefficient indicates that 6% of variance in postgraduate performance can be explained by previous academic performance. Both the uncorrected and corrected coefficients are classed as small according to Cohen's calibration.12
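The percentages of variance quoted for undergraduate and postgraduate performance follow from squaring the correlation coefficient. The short check below uses only the coefficients reported in the text; the function and dictionary names are illustrative.

```python
def variance_explained(r: float) -> float:
    """Proportion of outcome variance accounted for by a correlation r."""
    return r * r


# Effect sizes as reported in the two paragraphs above.
REPORTED = {
    "undergraduate, uncorrected": 0.30,
    "undergraduate, fully corrected": 0.48,
    "postgraduate, uncorrected": 0.14,
    "postgraduate, fully corrected": 0.24,
}

if __name__ == "__main__":
    for label, r in REPORTED.items():
        print(f"{label}: r={r:.2f}, variance explained={variance_explained(r):.1%}")
```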


Effect sizes varied widely across studies, and this variability was not significantly associated with sample size in either the undergraduate or the postgraduate analysis.

The 95% confidence intervals and ranges indicate a wide variability in effect sizes across the studies. This variability was not significantly associated with sample size for either the undergraduate analysis or the postgraduate analysis.


Personality tests

A meta-analysis of the personality measures was not feasible because the measures used were too varied.

A meta-analysis of the personality measures was not possible owing to the wide variety of measures used, which included 

      • the California personality inventory, 
      • Rotter's “locus of control” scale, 
      • Cattell's 16PF, 
      • Eysenck's personality index, 
      • Minnesota multi-phasic personality inventory, 
      • Myers Briggs type indicator, 
      • state-trait anxiety inventory, and 
      • psychiatric interviews. 

The more consistent descriptive findings are summarised below.


The most commonly used test is the California personality inventory, from which eight subscales have consistently emerged as predictors. The reported correlations are listed below.

The most commonly used test has been the California personality inventory. With this measure, eight subscales have emerged consistently as predictors of success in medical training: 

      1. “dominance,” 
      2. “tolerance,” 
      3. “sociability,” 
      4. “self acceptance,” 
      5. “well being,” 
      6. “responsibility,” 
      7. “achievement via conformance,” and 
      8. “achievement via independence.”w69,w79 
          • Dominance has been shown to be correlated with undergraduate multiple choice question scores (uncorrected r Embedded Image0.26), 
          • tolerance with the ability to use numerical data and make calculations (Embedded Image0.25), and 
          • well being and achievement via conformance with success in oral examinations (0.22 and 0.32).w79


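Rotter's locus of control assesses whether people attribute the outcomes in their lives to internal or external causes. Surprisingly, medical students with high grades were more likely to express an external orientation, and there is some evidence that students become more external as they progress through medical school. This contrasts with other studies that link internal beliefs to high academic achievement. One point to check is whether the studies of medical students were in fact tapping “defensive external” beliefs.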

Rotter's locus of control is a personality test that assesses the extent to which people feel that outcomes in their lives are contingent on their own behaviour (“internals”) in comparison with the influence of factors such as “fate” and “chance” (“externals”). Medical students with high preclinical and clinical grade point averages were, surprisingly, more likely to express an external orientation (0.51 and 0.31).w74 There is also some evidence that medical students express more external beliefs as they progress through medical school.w48 This seems to be at variance with studies showing that higher levels of internal beliefs are associated with academic success.13 One area deserving further examination is that in these studies the researchers may be tapping into what is referred to as “defensive external” beliefs.14 Defensive externals act much like internals but endorse an external orientation as a verbal defence against failure.


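State-trait anxiety studies show that state anxiety has a significant but weak negative association with performance, whereas trait anxiety is not significantly related to performance. Academic anxiety may also show an inverted U shaped relation with first year performance: students at either extreme of anxiety do worse than those with moderate anxiety. This is consistent with arousal theory, which holds that people perform best at an optimal level of arousal.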

Results of state-trait anxiety studies have shown that state anxiety (anxiety in relation to a specific event, in this case examinations) is significantly, but weakly (3% of the variance), negatively associated with aspects of medical performance, but that trait anxiety (non-specific anxiety) is not significantly related to performance.w63,w84 Furthermore, levels of academic anxiety may show an inverted U shaped association with first year performance, in that students with extremes of anxiety tend to do worse than those in the mid-range.w48 This is consistent with arousal theory, which postulates that people perform best at an optimal level of arousal.15


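Recent personality theory proposes that five factors underlie normal personality and that these factors can be found in the previously reported measures. They are known as the “Big 5.”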

Recent developments in personality theory have suggested that five factors underlie normal personality and that these can be found in previously reported measures of personality. 16 17 These factors, known as the “Big 5” or five factor model of personality, are 

      • “emotional stability-neuroticism” (high scores relate to anxiety, depression), 
      • “extroversion” (high scores relate to being outgoing, sociable), 
      • “openness to experience” (high scores relate to being creative, artistic), 
      • “agreeableness” (high scores relate to being cooperative, trusting), and 
      • “conscientiousness” (high scores relate to being methodical, organised, motivated by achievement). 

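Some subscales of the California personality inventory, achievement in particular, appear to be related to conscientiousness in the Big 5. The Big 5 offers a theoretical framework for studying personality in medical school selection and training.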

Some of the subscales of the California personality inventory, especially the achievement subscales, may relate to conscientiousness in the Big 5. The Big 5 offers a theoretical framework for the study of personality in medical selection and training. 

      • Conscientiousness has been shown in previous research to be related to success in a variety of occupational settings, and 
      • extraversion has been correlated with success in jobs that involve a social dimension (for example, sales).18 
      • Within medicine, extraversion predicted success in paediatric objective examinations (0.51).w83 
      • A recent study using the Big 5 has shown that conscientiousness is a positive predictor of preclinical achievement (standardised regression coefficient, β=0.58), even with control for previous academic performance (A level grades).w10


Sex

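The literature consistently reports that women outperform men in medical school, and women also do better in clinical assessments. Studies that found a slight advantage for men found it only on early assessments (NBME part I), and the difference disappeared over time. These sex differences are small, however, and significant only when samples are large, which calls their practical relevance into question.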

A consistent finding in the literature is that women tend to perform better than men in their medical trainingw1,w10,w27,w85,w91 and are more likely to attain an honours degree.w2 Women also tend to perform better in clinical assessments.w86,w87 Two studies suggested that men slightly outperformed women on early assessments (for example, National Board of Medical Examiners (NBME) part I) but that these differences disappeared later (NBME part II).w85,w86 However, these differences were small and reached significance only when the sample sizes were large. This raises the question of the practical relevance of these sex differences. For example, a significant difference was reported between men and women in NBME part II paediatrics scores, with men scoring 82.13 and women 82.70.w86


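Does previous academic performance predict equally well for men and women? Among studies of the predictive accuracy of measures such as the MCAT, some report that women's performance is underpredicted (that is, actual performance is higher than predicted).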

Are tests of previous academic performance equally accurate predictors for men and women? When the accuracy of a predictor such as the medical college admission test is examined, the difference between predicted outcome scores (for example, NBME part I) and the actual outcome scores can be calculated. If the actual score is higher than the predicted score the test underpredicts; if the converse is found then the test overpredicts. Some evidence indicates that the admission test underpredicts for women.w94
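The under- and overprediction logic described above can be sketched in a few lines: fit one pooled regression line, then average the difference between actual and predicted scores within each group. This is a minimal illustration, not the method of any cited study; the scores and the group labels "A" and "B" are made up, and the function names are assumptions.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def mean_prediction_error(x, y, groups):
    """Mean of (actual - predicted) per group, using one pooled regression line.

    A positive value means the predictor underpredicts for that group;
    a negative value means it overpredicts.
    """
    intercept, slope = fit_line(x, y)
    errors = {}
    for xi, yi, g in zip(x, y, groups):
        errors.setdefault(g, []).append(yi - (intercept + slope * xi))
    return {g: mean(e) for g, e in errors.items()}

# Made-up admission test and outcome scores, for illustration only.
admission = [8, 9, 10, 11, 8, 9, 10, 11]
outcome = [72, 76, 80, 84, 70, 74, 78, 82]
group = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(mean_prediction_error(admission, outcome, group))  # {'A': 1.0, 'B': -1.0}
```

In this toy data the pooled line underpredicts for group A by one point and overpredicts for group B by one point, which is the pattern the text describes for different groups of students.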


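A growing number of studies ask whether motivational, academic, and demographic factors affect the performance of men and women differently. One study reports that service quality variables such as “helping others” predict women's clinical grades, whereas individual mastery variables such as “intellectual growth” predict men's clinical grades.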

A growing body of research explores whether different motivational, academic, and demographic factors influence the performance of men and women. Motivation seems to be important. For example, in one study, “service quality variables” (such as “helping others”) predicted women's clinical grades and “individual mastery variables” (such as “intellectual growth”) predicted men's clinical grades.w89


Ethnicity

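Studies in the United Kingdom and the United States report that students from ethnic minority groups are more likely than white students to fail examinations. Non-UK ethnic minority students in the United Kingdom, however, may perform better than white UK students.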

Some evidence indicates that in the United Kingdom, as well as in the United States, students from ethnic minority groups are more likely to fail a medical examination than are white students.w1,w55 However, non-UK ethnic minority students in the United Kingdom may perform better than white UK students.w1


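A common conclusion across several studies is that traditional cognitive measures such as the MCAT and GPA have significant predictive power for ethnic minority groups. Previous academic performance, however, tends to overpredict medical school performance for ethnic minority students and to underpredict it for white students.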

A common finding across several studies is that traditional cognitive selection measures (medical college admission test, grade point average) show significant predictive power for ethnic minority groups.w34,w45,w46,w55,w96,w97 However, measures of previous academic performance tend to overpredict for ethnic minorities but to underpredict for white students.w94,w95 No studies have examined whether differential experiences of training in medical schools contribute to this difference.


Learning styles

Learning style covers both the motivation for learning and the approach taken to learning tasks. Two models, described below, are mainly used.

Learning style covers both motivations for learning and the processes by which the student approaches the task of learning. Two general models of learning styles have been used (box).




Models of learning styles

Tripartite model

The first model is based on three learning approaches: “deep,” “strategic,” and “surface.”19,w28 

  • Deep learning is based on three motivational factors (intrinsic motivation, vocational interest, and personal understanding) and three learning processes (making links across material, searching for a deeper understanding of the material, and looking for general principles). 

  • Strategic learning is motivated by a desire to be successful and leads to patchy and variable understanding.

  • Surface learning is motivated by fear of failure and a desire to complete a course, with students tending to rely on learning “by rote” and focusing on particular tasks.

Kolb model

The second model is based on Kolb's description of four approaches to learning—

  • concrete experience (experiential learning), abstract conceptualisation (development of analytic strategies and theories), active experimentation (learning through action and risk taking), and reflective observation (viewing problems from multiple perspectives before deciding how to proceed).w100 

  • These four approaches combine to produce four types of learner: “convergers” (emphasise the deductive method), “divergers” (use creative problem solving and view a problem from many perspectives before acting), “assimilators” (prefer an inductive approach), and “accommodators” (prefer hands-on experience as a way of learning).

Studies of the tripartite model consistently report a significant positive correlation between strategic learning and final marks. Some studies report that deep learning is associated with examination performance, but others do not. Similarly, a negative correlation between surface learning and examination marks has been reported, but several studies do not show this effect.

The studies examining the tripartite model in medical students have shown a relatively consistent finding of a significant positive association between the use of strategic learning and final marks (uncorrected r 0.178 to 0.26)w28,w99,w103-w105; only one study failed to replicate this effect.w101 However, although some evidence shows that deep learning has a positive association with performance in examinations (0.157 to 0.262),w28,w104 other studies have failed to replicate this finding.w101,w103 Similarly, although a significant negative association has been reported between surface learning and examination performance (for example, −0.204),w28 several studies have failed to replicate this effect.w91,w101,w103


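Studies using the Kolb model suggest that students with a “converger” learning style outperform those with other styles. Adopting a strategic or converger learning style therefore appears advantageous. Surface, deep, and strategic learning styles show some degree of trait stability, but the effect is only moderate. In other words, learning styles can change, and it may be useful for medical schools to teach students which study skills work best.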

Results from studies using the Kolb model suggest that students with a “convergers” learning style tend to perform better than those with any other style.w99,w100 Adopting a strategic or converger learning style seems to be a useful strategy for students who wish to succeed. Surface, deep, and strategic learning styles seem to show some degree of trait stability (0.33 to 0.42). However, this is only a moderate effect, suggesting that learning styles can change.w28 It may therefore be useful for medical educational programmes to teach students how to use the more successful study skills. 20 21



Interviews

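Three types of study have examined the predictive power of interviews. The first compares students admitted after an interview with students admitted without one (or students rejected by one school but accepted by another). These studies found almost no difference attributable to the interview and concluded that it contributes little to selection. They have methodological limitations, however: small samples, selection bias that was not fully eliminated, and a limited range of outcome measures.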

Three types of study have explored the predictive power of interviews. The first type compared the performance of medical students who were interviewed and accepted with that of students who were accepted without intervieww113,w114 or those rejected by one medical school (Yale) but accepted at another, both on the basis of an interview, with those accepted by Yale but who chose to go to another medical school.w107 These studies showed no differences and concluded that the interview added little to the selection process. However, the studies had methodological limitations, including the use of small numbers (cohort range 23-113), a failure to eliminate selection biases, and a limited range of outcome measures.


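The second type relates interviewers' ratings to early preclinical achievement, withdrawal, and drop out, or to overall ratings of competency as a doctor. These studies report that interviews can predict future success. For example, the overall interview rating correlates with the Dean's letter of recommendation and with GPA.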

The second type of study related interviewers' ratings (for example, overall suitability for medicine) to the interviewees' early preclinical success, withdrawal, and drop out ratesw27,w30,w72,w76,w88,w111,w112,w115,w116 and overall rating of the graduate physicians' potential competency as doctors.w111 These studies reported evidence that interview scores were able to predict future success. For example, overall interview rating correlated with a Dean's letter of recommendation (0.33)w111 and grade point average (0.08 to 0.14).w117


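The third type compares interview ratings with other pre-admission criteria. Interview ratings were independently associated with success in early training even after controlling for GPA.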

Thirdly, one study compared the interview with other pre-admission criteria.w117 Interview ratings were independently associated with success in early training after controlling for grade point average (for example, 0.11).


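In short, an interview can yield additional information with predictive power, but little research exists on inter-interviewer variation, on whether systematic biases exist, or on the effect of interviewer training.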

Thus useful additional information that has predictive power for outcome can probably be collected from an interview. However, little is known about factors such as the impact of inter-interviewer variation, whether any systematic biases exist, and the effect of training for interviewers.w117



Personal statements and references

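Two studies examined the predictive power of personal statements. One found no relation with early preclinical success, and the other found a small negative correlation. The research so far is insufficient to support conclusions.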

Two studies examined the predictive value of personal statements provided by candidates on their suitability to study medicine. One study analysed the content of candidates' actual statements and found no evidence that they predicted early preclinical success.w10 The other study used weighted proforma information about cultural skills (not candidates' actual statements) and found a small negative association with outcome (β=−0.184).w27 Thus too few data on personal statements are available to allow definitive conclusions to be drawn. More work is needed, especially into the relation between statements and clinical and postgraduate performance.


For references, the one available study found no evidence of predictive value, which is consistent with findings in other occupations.

The only study on the value of references suggested that the academic reference had no predictive value in subsequent achievement.w110 This is consistent with the conclusions from studies of the value of references in other occupations.


Prediction of postgraduate clinical competence

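Most studies examine prediction of undergraduate achievement, and fewer address postgraduate competence. Some work does exist: several studies show that cognitive and non-cognitive abilities assessed during medical school predict postgraduate clinical competence, with cognitive factors accounting for up to 51% of the variance in NBME part III scores. Other studies show that pre-admission scores are only weakly related to competence during internship. In one study, about 60% of the associations with undergraduate achievement were significant, but only about 10% of those with intern performance were. Our meta-analysis shows the same pattern.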

Most studies of the predictive power of pre-admission cognitive and non-cognitive factors have focused on predicting success in undergraduate medical training. Fewer studies have examined pre-admission criteria as predictors of postgraduate medical competence. Several papers do, however, explore how cognitive factors (such as data gathering and analysis skills, knowledge, first to fourth year grade point average, and NBME parts I and II) and non-cognitive factors (such as interpersonal skills and attitudes) assessed during medical student training predict postgraduate clinical competence.22 23 24 25 26 27 These studies show that cognitive factors can account for up to 51% of the variance in NBME part III grade.26 Only two studies have compared the predictive power of both admissions criteria (grade point average and medical college admission test) and scores in medical school examinations in relation to postgraduate competence.w47,w64 The evidence from these comparative studies indicates that the pre-medical scores show a weak relation to internship competence. For example, Richards et al showed that 60% (9/15) of the associations between previous academic ability and undergraduate success were significant (r range 0.17 to 0.34) but that only 10% (one) of the associations between previous academic performance and intern performance rating were significant (0.20).w47 This pattern of findings is confirmed by our meta-analysis. More detailed longitudinal studies exploring the complex relations between admissions criteria (cognitive, non-cognitive, and demographic), medical school performance, and postgraduate medical competence are needed.


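A major obstacle in studying postgraduate competence is building a scoring system that allows comparison across specialties. This is known as the “criterion problem”, and it affects other fields as well as medicine. One solution is to develop competency based models of core and specific skills through detailed job analyses of each specialty.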

One of the main problems with studying postgraduate clinical performance is establishing a comparable scoring system for assessing competency in different specialties. This is known as the “criterion problem” and confronts the prediction of success in all jobs, not just medicine. 28 29 One solution to this problem has been to develop competency based models of core and specific skills, through detailed job analyses of individual medical specialties.30


Discussion and conclusions

Relatively few studies provide comparative analyses of the predictive power of the wide variety of factors used in combination for selecting medical students (interview, grade point average, learning styles, personality). The research that has been undertaken has mainly concentrated on measures of previous academic ability as a predictor of undergraduate achievement. More work is needed to identify selection criteria that predict postgraduate performance.


Consistent with reviews in other occupational areas, academic or cognitive ability was a moderate predictor of success in undergraduate medical training.29 The strength of this association before corrections was moderate (0.30) in terms of Cohen's calibration, becoming large (0.48) after correction.12 Previous academic performance, however, would be classified as a predictor with a small effect (0.14 uncorrected, 0.24 corrected) for postgraduate medical competence.
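One way to read the correlations quoted above is as shares of outcome variance, obtained by squaring each coefficient. The short sketch below does only that arithmetic on the four values given in the text; the function and dictionary names are assumptions made for illustration, and the snippet does not reproduce the review's correction procedure.

```python
def variance_explained(r: float) -> float:
    """Share of outcome variance accounted for by a correlation r."""
    return r ** 2

# Correlations quoted in the text: (uncorrected, corrected).
reported = {
    "undergraduate achievement": (0.30, 0.48),
    "postgraduate competence": (0.14, 0.24),
}

for outcome, (raw, corrected) in reported.items():
    print(
        f"{outcome}: {variance_explained(raw):.1%} uncorrected, "
        f"{variance_explained(corrected):.1%} corrected"
    )
```

Even the corrected coefficient for postgraduate competence corresponds to well under a tenth of the variance, which is consistent with the text's classification of it as a small effect.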


Few studies have examined the effects of learning styles, interviews, personal statements, and references in relation to achievement in medical training. These factors need to be explored in future studies. The evidence indicates that work on learning styles is likely to be fruitful. The academic reference seems to have no predictive power. Virtually no research has examined the predictive power of personal statements. This is an important area for future research, as the personal statement forms an important part of the current selection process in the United Kingdom. More sophisticated research into the value of the interview is also needed—to explore the structure of interviews, how they are conducted, the effects of training, whether different interviewers (for example, psychiatrists or surgeons) focus on different factors, and how the predictive power can be enhanced.


Sufficient preliminary data indicating an impact of personality on medical school progression exist to warrant further research. However, the research needs to be conducted in a more prospective and systematic fashion.w10 “Achievement striving,” “state anxiety,” and “conscientiousness” should be the focus in future studies.


Future research needs to take a more multivariate approach to studying predictors of success in medical training. Predictors are likely to be intercorrelated,31,w10 as are outcome measures. Furthermore, learning across the medical degree (and indeed postgraduate learning) occurs over time, and time series analyses and models that allow for prediction of change over time would also be a useful approach. The use of structural modelling procedures,5 as well as hierarchical structural models using structural and time series components, would be beneficial to developing our understanding of the prediction of performance.




BMJ. 2002 Apr 20;324(7343):952-7.

Factors associated with success in medical school: systematic review of the literature.

Author information

  • 1School of Psychology, University of Nottingham, Nottingham NG7 2RD, UK. eamonn.ferguson@nottingham.ac.uk







Assessment of personal qualities (personal characteristics) in medical school student selection

Assessing Personal Qualities in Medical School Admissions

Mark A. Albanese, PhD, Mikel H. Snow, PhD, Susan E. Skochelak, MD, MPH, Kathryn N. Huggett, MA, and Philip M. Farrell, MD, PhD





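Decisions about whom to admit to medical school are under increasing scrutiny. Jordan Cohen criticized the use of undergraduate GPAs and MCAT scores as the primary means of selecting students. He argued instead that MCAT scores and GPAs should be used “only as threshold measures,” or that screening should begin with an assessment of personal characteristics and leave GPAs and MCAT scores until later. He added that with this approach admission committees “might well find many instances in which truly compelling personal characteristics would trump one or two isolated blemishes in the academic record.”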

Making decisions about whom to admit to medical school has come under increasing scrutiny. Jordan Cohen, MD, in his address to the 112th annual meeting of the Association of American Medical Colleges, decried the use of undergraduate grade-point averages (GPAs) and Medical College Admission Test (MCAT) scores as the primary means of selecting medical students. Instead of using these indicators, he suggested using “MCAT scores and GPAs only as threshold measures,” or “beginning the screening with an assessment of personal characteristics and leave the GPAs and MCAT scores 'til later.” He argued that using this approach “admission committees might well find many instances in which truly compelling personal characteristics would trump one or two isolated blemishes in the academic record.”1


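Accepting Cohen's argument makes effective measurement of personal qualities all the more important. This article reviews the literature on the medical school interview and on other approaches to assessing personal qualities, and then suggests ways to address the issue.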

With Dr. Cohen's challenge to look first to personal qualities in the admission process, the need to effectively measure personal qualities has assumed greater importance. In this article, we review the literature on the medical school interview as a mechanism for assessing personal qualities, discuss the challenges in using the interview and other approaches to assessing personal qualities, and then suggest approaches that might be taken to address this important issue.



BACKGROUND

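Traditionally the interview has been a primary means of assessing personal qualities, and all but a few medical schools use admission interviews. Edwards et al. describe four purposes of the admission interview: information gathering, decision making, verification, and recruitment. The most important is to obtain non-academic information about applicants that is hard to collect by other means. The method they advocate, SAMS, bases interview content on a job analysis, standardizes the questions asked of all applicants, gives interviewers sample answers so that ratings are consistent, and uses a panel of interviewers. Interviews can be classified as structured, semi-structured, or unstructured.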

Historically, the interview has been one of the primary methods of assessing personal qualities. Interviews for admission to medical school are conducted by all but a few U.S. medical schools.2,3 Edwards et al.3 cite four purposes for the admission interview: information gathering, decision making, verification, and recruitment. They argue that the most important purpose of the interview is to gather non-academic information about candidates that would be difficult or impossible to obtain by other means. The method they advocate for obtaining this information is a Success Analysis of Medical Students (SAMS), which includes selecting interview content based upon a job analysis (the critical-incidents technique is advocated for this purpose), standardizing the questions asked of all applicants, providing interviewers with sample answers to questions to help them give consistent ratings, and conducting each interview with a board or panel of interviewers. Interviews have been classified as being “structured” (like the SAMS model), semi-structured (having some but not all elements of a structured model), and “unstructured.”


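There is ample evidence that admission committees place great emphasis on interviews. Puryear and Lewis reported that 61% of the 107 medical schools responding to their survey said interview data were the most important variables in selection. Empirical evidence supports this: Nowacek et al. found that committee members changed their ratings of candidates after interviewing them, by as much as .47 of a standard deviation. Patrick et al. reported the effect of introducing the SAMS model into admission decisions: adding interview ratings raised the percentage of variance in acceptance decisions accounted for from 21% to 37%. Information from the admission interview clearly has a large influence on admission decisions. Which personal qualities should be assessed, then, and how?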

Substantial evidence exists that admission committees place great emphasis on the information gleaned from interviews. Puryear and Lewis2 reported that 61% of 107 medical schools responding to their survey stated that the admission interview data were the most important variables used in selection. Empirical data support this result as well. Nowacek et al.4 found that, after interviewing candidates, admission committee members changed their ratings of the candidates, with mean values for various assessed qualities changing by as much as .47 of a standard deviation (effect size or ES). Patrick et al.5 reported the impact of introducing interview data obtained using the SAMS model on admission decisions. After adding the interview ratings to information from the written application, the percentage of variance in acceptance decisions accounted for by the regression model increased from 21% to 37%. Data obtained from the admission interview clearly can have a significant effect on admission decisions, but what are the non-academic qualities being assessed in the interview and in what ways are they being assessed?


Several researchers have reported on this.

  • Meridith et al.6 reported rating an applicant's maturity, individual achievement, motivation/interest in medicine, ability, and interpersonal skills
  • Nowacek et al.4 evaluated communication and interpersonal skills, commitment to serve others, familiarity with issues in medicine, leadership ability, motivation for medicine, and overall impression
  • Murden et al.7 assessed applicants' levels of maturity, nonacademic achievement, motivation, and rapport. 
  • Powis et al.8 assessed perseverance, tolerance of ambiguity, supportive and encouraging behavior, motivation to become a doctor, self-confidence, compatibility with the school's study styles, and an overall judgment. 
  • Taylor9 reported drawing traits assessed in a written form from 87 positive qualities of successful physicians. 
  • Collins et al.10 assessed communication, maturity, caring qualities/friendliness, awareness of community, political, social and medical issues, certainty of career choice, involvement in school activities, and involvement in community activities
  • Shaw et al.11 assessed 20 “non-cognitive, non-teachable traits,” such as being honest, energetic, confidence-inspiring, and conscientious. These authors are not alone in their beliefs that certain noncognitive traits are non-teachable. 

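Shaw et al. are not alone in believing that some noncognitive traits are non-teachable. Bullimore argues that personality is set by age 18, which makes assessing noncognitive traits in the admission interview important. The concepts of personality and non-teachable traits imply traits that are stable across time and situations. But is honesty really non-teachable? Is an energetic person always energetic? And even if some traits are non-teachable, could applicants not be coached to display them in a one- to two-hour interview?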

  • Bullimore12 argues that personality is set by age 18, making assessment of noncognitive variables in the medical school admission interview critical. The concept of personality and non-teachable traits implies traits that are stable across time and situations. Is honesty really non-teachable? Is an energetic person always energetic? Even if one accepts that there are some non-teachable traits, might they not be coachable for display in a one- to two-hour interview?


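Admission to medical school is clearly high-stakes for most applicants, and many spend large sums on MCAT preparation services. Could similar coaching services arise for a standardized interview that claims to assess non-teachable qualities? And is the interview really the best way to assess noncognitive traits? Taylor argues that these traits can be assessed by having applicants give evaluation forms to individuals of their choice as part of the application. He further reports that students selected this way at the University of Iowa College of Medicine did not differ from those selected with a traditional interview. At the University of Wisconsin Medical School, the personal statement is a key indicator of noncognitive traits. Our literature review, however, found no study showing how assessment of the personal statement differs from the interview in evaluating noncognitive traits.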

Admission to medical school is a high-stakes proposition for almost all applicants. Many applicants spend significant sums of money for test preparation services for the MCAT. Might a standardized interview purporting to assess non-teachable skills find itself susceptible to coaching from such a service? Further, is an interview the only way or even the best way to assess these noncognitive traits? Taylor9 argues that such traits can be assessed by having candidates distribute evaluation forms to individuals of their choice as part of the application process. He further reports that the students selected at the University of Iowa College of Medicine using such an approach did not differ from those selected when a traditional interview was conducted. In the application process at the University of Wisconsin Medical School, the personal statement has served as a key indicator of noncognitive traits. Our literature review, however, found no study that examined to what extent admission committees' assessments of the personal statement yielded different assessments of applicants' noncognitive qualities than an interview.


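Ultimately, whether the interview is worth its cost and time depends on whether it provides something that cannot be obtained by other means, in particular from the written application. Evidence on the value of the interview comes from studies of its reliability and validity, and the results are equivocal. Studies of interviewer reliability have produced widely varying estimates.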

Ultimately, the decision whether an interview is worth the time and expense must be based on whether the interview yields something that cannot be obtained by other means, and, in particular, something that cannot be obtained from a review of written application materials. Evidence for the value of the interview has been sought in studies assessing the reliability and validity of the interview. The results have been equivocal. Studies of the reliability of interviewers have produced quite variable estimates. 

  • Meridith et al.6 found inter-rater correlations ranging from .55 to .91 for five qualities assessed in a sample of 14 applicants, each evaluated by two raters. 
  • Powis et al.8 report kappa (chance corrected inter-rater agreement) statistics ranging from .23 to .63 for seven qualities independently assessed by two raters. 
  • Edwards et al.3 report results from several meta-analyses showing inter-rater reliabilities ranging from .52 to .96, with a median of .83. Reliabilities for studies using structured interviews ranged between .82 and .84, and for those using unstructured interviews, reliabilities ranged from .61 to .75. 
  • Nowacek et al.4 reported inter-rater reliabilities for overall impressions of applicants that were .57 before the interview and .55 after the interview. 
  • Richards et al.13 reported an inter-rater reliability of .67 for panels of 13 interviewers. 
  • Van Susteren et al.14 reported an inter-rater kappa reliability of .79 for interviewers providing ratings scored within one point of each other on a five-point scale. Inter-rater reliabilities appear to be quite variable, but generally were higher (>.8) for structured interviews.
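Several of the figures above are kappa statistics, that is, agreement between two raters corrected for the agreement expected by chance. A minimal sketch of Cohen's kappa for two raters and categorical ratings follows. The ratings are made up for illustration, the function name is an assumption, and the cited studies may have used different variants (for example, weighted kappa or agreement within one scale point).

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of applicants on whom the two raters agree exactly.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)

# Made-up ratings of ten applicants by two interviewers.
a = ["high", "high", "low", "low", "high", "low", "high", "low", "high", "low"]
b = ["high", "high", "low", "high", "high", "low", "low", "low", "high", "low"]
print(round(cohens_kappa(a, b), 2))  # 0.6
```

Here the raters agree on 8 of 10 applicants, but half of that agreement would be expected by chance, so kappa is 0.6 rather than 0.8. This is why kappa values tend to look lower than raw agreement rates or simple inter-rater correlations.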


Evidence on the validity of interviews is equivocal as well.

The validity of interviews has also proven equivocal. 

  • Litton-Hawes et al.,15 analyzing 15 interviews using stimulated recall procedures from videotapes, found interviewers made inefficient use of time and focused on written materials to the detriment of exploring what they were intended to do. They advocated improved training of interviewers. 
  • Smith et al.16 compared first-year medical students' grades for two classes that had been interviewed with those of two classes that had not been interviewed (n = 44 and 79, respectively). Results showed no difference in grades.
  • Perhaps the study producing the most compelling results in support of interview data for admissions comes from Powis et al.8 In a case–control study designed to retrospectively analyze differences between students who left medical school due to failure or withdrawal over a nine-year period and students who received honors, 56 paired cases (who left medical school) and controls (who completed medical school and who were matched according to gender, age, and entry cohort—all had excelled in their academic performances) were analyzed. 
    • Those who left had uniformly been rated more poorly in the interview, with effect sizes of..
      • −4.17 for supportive and encouraging behavior, 
      • −3.46 for assessments of self-confidence and motivation to become a doctor, 
      • −3.11 for the overall rating, 
      • −2.76 for compatibility with study style of school, 
      • −1.98 for perseverance, and 
      • −.97 for tolerance of ambiguity. 
    • For differences between 58 pairs of students who graduated with honors and matched controls, honors graduates were rated more positively for...
      • perseverance, ES = 2.98; 
      • self confidence, ES = 2.59; 
      • overall rating, ES = 2.17; 
      • tolerance of ambiguity, ES = 1.04; and 
      • supportive and encouraging behavior, ES = .86. 
    • For the remaining qualities, honors recipients received more positive evaluations (with ES < .40). The ES values in this study are meaningful and strongly suggest that interview ratings can discriminate between students who fail to complete medical school and those who complete medical school, as well as between those who graduate with honors and those who do not.
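The effect sizes (ES) listed above are standardized differences between two groups' mean ratings. The text does not say which formula Powis et al. used, so the sketch below assumes the common pooled-standard-deviation form (Cohen's d). The ratings are made up for illustration and the function name is an assumption.

```python
import math
from statistics import mean, variance

def cohens_d(group_1, group_2):
    """Standardized mean difference between two groups, using the pooled SD."""
    n1, n2 = len(group_1), len(group_2)
    pooled_var = (
        (n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)
    ) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / math.sqrt(pooled_var)

# Made-up interview ratings: students who left versus matched controls.
left = [2, 3, 3, 4]
completed = [4, 5, 5, 6]
print(round(cohens_d(left, completed), 2))  # -2.45
```

A negative value means the first group was rated lower than the second, matching the sign convention of the figures reported for students who left medical school.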


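Meridith et al. provide further evidence strongly supporting admission interviews. Taken together, the studies indicate that the interview provides information related to student performance in the clinical stage of medical education.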

  • Meridith et al.6 also provided compelling evidence to support conducting admission interviews. They correlated data collected from the admission interview, as well as the MCAT score and undergraduate GPA, with National Board of Medical Examiners (NBME) Part II scores and subjective clinical assessments in pediatrics and internal medicine clerkships for third-year medical students. Admission interview data did not significantly correlate with NBME Part II scores,  but did correlate with the subjective clinical assessments, accounting for over twice the variance as the next most potent predictor (interview = 10.4%, MCAT Science—Quantitative = 5.0%). 
  • Similar correlations of interview assessments with clinical assessments but not academic performances have been found in several studies involving non-medical health sciences programs.17–19 

Thus, evidence exists that the interview provides information for admission related to students' performances in the clinical portions of medical education.


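Among the many reasons given for conducting admission interviews, it is remarkable that we found no mention of one: that a medical school values interaction between people, and that admission is not merely a matter of scoring an applicant's documents and achievements but a judgment of the applicant as a human being and future colleague, particularly because the physician–patient relationship is so personal. The interview is also a means of showing compassion toward applicants whose grades or performance dipped temporarily because of a death in the family, personal illness, or other problems. And it is a chance for the institution to extend a human touch to everyone involved in a highly stressful, high-stakes decision process.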

It is extraordinary that, among all the reasons given for conducting an admission interview, we found no mention of its use to demonstrate that a school values the personal interaction between human beings, that the admission process is not just a mechanical analysis of paper credentials and accomplishments but a judgment of one's qualities as a human being and a future colleague, particularly because the physician–patient relationship can be so intensely personal. The interview can also be a means of demonstrating compassion for applicants whose records may have temporary performance deficits that may be related to deaths in the family, illness, or other problems. The interview is a chance for an institution to place a human touch on what is a highly stressful, high-stakes decision process for all involved.


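In summary, the AAMC calls for greater emphasis on personal characteristics in selecting medical students. These characteristics are the focus of the admission interviews held by all but a few U.S. medical schools, and admission committees weight the interview heavily, although which characteristics count as essential differs from school to school. Evidence for the interview's validity is equivocal, but there is evidence that interview ratings predict clinical assessments and that low interview scores predict failure or withdrawal. Reliability can be improved through structuring. Some researchers argue that more cost-effective methods than the interview exist for assessing personal characteristics.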

To summarize, the AAMC has called for a greater emphasis on compelling personal characteristics in the selection of medical students. These compelling personal characteristics are the focal point of admission interviews that are conducted by all but a few medical schools in the United States. Evidence exists that admission committees give substantial weight to interview data in the selection of applicants, but what constitutes a compelling personal characteristic varies among institutions, with as many as 87 different qualities being considered for assessment. Even though the evidence for the validity of the interview has been equivocal, there is evidence that interview ratings are predictive of subjective clinical assessments, and low interview assessments are predictive of failure or withdrawal from medical school. The reliability of the interview can be improved using structured approaches.3 Some have argued there are more cost-effective methods than the interview for assessing compelling personal characteristics.


CONSIDERING ALTERNATIVES


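Because interviews are costly for both applicants and institutions, the rest of this article discusses alternatives. The goal is to analyze the difficulties of measuring key personal characteristics and to propose alternatives.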

Because interviews are expensive for both the applicant and the institution, it is this issue we address in the remainder of this article. Our goal is to analyze the challenges in measuring compelling personal characteristics and then offer some practical and some perhaps less practical alternatives.



Challenges to Reducing Reliance on MCATs and Undergraduate GPAs


A school that tries to follow Cohen's advice faces obstacles of self-interest, inertia, and philosophical and historical factors.

The major challenges facing any school adopting Dr. Cohen's recommendation to use a minimum GPA and MCAT score as a threshold and measures of compelling personal characteristics for admission are self-interest, inertia, and philosophical and historical factors.



> Self-interest

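The biggest barrier a medical school faces when it tries to reduce the weight of academic credentials in admissions is the effect on how others perceive it. Mean MCAT scores and GPAs figure prominently in rankings of the “best” medical schools. Complicating matters, several studies report that MCAT scores correlate substantially with USMLE Step 1 scores. Correlation does not imply causation and may reflect a third factor, but giving MCAT scores less weight could still lower Step 1 scores. Even if failure rates do not rise, a lower Step 1 mean is itself damaging to a school. In our experience, top applicants often ask about our mean Step 1 score and, at least ostensibly, weigh it when choosing where to apply, so a decline could hurt recruitment. Given the various outcomes used to judge a school's quality, lowering the cutoff for prior academic performance can therefore run against the school's interest. Conversely, a high cutoff can defeat efforts to find students with outstanding personal characteristics.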

Perhaps the biggest challenge a medical school faces in reducing reliance on academic credentials in admissions is the impact such a reduction may have on the perceptions of others. Mean MCAT scores and undergraduate GPAs are used as part of the formula in determining the “best” medical schools by U.S. News and World Report. Further complicating the situation, several studies20,21 have reported that MCAT scores correlate fairly strongly with United States Medical Licensing Examination (USMLE) Step 1 scores (multiple correlations of MCAT scores with NBME Part I, predecessor of Step 1, between .39 and .63, median = .58 [ref. 20]; multiple correlation = .59 [ref. 21]). Since a correlation can reflect either a cause-and-effect relationship (unlikely in this case) or the influence of a third variable (say academic or test-taking aptitude), a reduction in MCAT scores may put USMLE Step 1 scores at risk. Even if failure rates do not rise, lowered mean USMLE Step 1 scores can have substantial damaging effects on an institution. In our experience, top applicants commonly ask us for our mean USMLE Step 1 score, ostensibly a factor they are considering in making their medical school selections. Thus, a lower USMLE Step 1 mean score has the potential to damage recruitment efforts. Compounding the problem, some of the most competitive residency programs consider USMLE Step 1 scores in their decisions. Thus, from the standpoint of various outcomes used to assess the quality of medical schools, ignoring academic credentials beyond a low threshold will bump up against self-interest. On the other hand, setting a high threshold may cripple efforts to identify students with the compelling personal characteristics that may be most prized.


> Inertia

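Admission is a high-stakes process and a very large operation involving many different people. Change is never easy, and it is harder still under these conditions. Merely agreeing on what counts as a “reasonable threshold” takes considerable argument. To reach thresholds our admission committee could accept, we tested various cutoffs against 12 years of data and examined their effects on USMLE Step 1, Step 2, and graduation. The thresholds were set at the point beyond which higher GPAs and MCAT scores no longer improved Step 1 and Step 2 passage or graduation. A further source of inertia is the admission process itself: admission staff have a demanding job and take pride in what they accomplish. Asking staff who already feel overworked to do things differently can be a heavy burden. It is therefore important to win the support of the staff as well as the agreement of the faculty.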

Admission is a high stakes, big business operation involving a large number of very eclectic individuals. Although change never comes easily, it is especially difficult under these circumstances. Coming to agreement about what constitutes a reasonable threshold will take a substantial and compelling argument. To arrive at threshold values that were acceptable to our admission committee, we analyzed performance data over a 12-year period in which we simulated various thresholds and the resulting impacts on the likelihood of first-time USMLE Step 1 and Step 2 passage and medical school graduation. Thresholds were adopted for which the likelihood of USMLE Step 1 and Step 2 passage and graduation did not improve with higher GPAs and MCAT scores.22 Other inertia lies with changing the process by which admission occurs. Admission staff have a tough job and take pride in their accomplishments. For staff feeling overworked under the current admission system, doing things differently could seem overwhelming. They may also feel personally threatened by change. Thus, it takes a concerted effort not only to get faculty buy-in but also to ensure that the administrative staff supports the changes.
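The threshold search described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the `Student` record, the `plateau_threshold` helper, the 2-point tolerance, and the toy cohort are all invented, and the sketch looks only at MCAT cutoffs and Step 1 passage, whereas the analysis in the article also covered GPA, Step 2, and graduation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    mcat: int      # total MCAT score (made-up value)
    passed: bool   # first-attempt Step 1 pass (made-up outcome)


def pass_rate(students):
    """Fraction of students in the group who passed."""
    return sum(s.passed for s in students) / len(students)


def plateau_threshold(students, cutoffs, tolerance=0.02):
    """Return the lowest cutoff beyond which a stricter cutoff no longer
    raises the pass rate by more than `tolerance`, plus all pass rates."""
    rates = {}
    for c in sorted(cutoffs):
        admitted = [s for s in students if s.mcat >= c]
        if admitted:  # skip cutoffs that admit nobody
            rates[c] = pass_rate(admitted)
    ordered = sorted(rates)
    for i, c in enumerate(ordered):
        stricter = [rates[k] for k in ordered[i + 1:]]
        if not stricter or max(stricter) - rates[c] <= tolerance:
            return c, rates
    raise ValueError("no cutoff admits any student")


# Hypothetical cohort: passage is clearly lower in the lowest score band
# and flat in the bands above it.
cohort = (
    [Student(20, True)] * 6 + [Student(20, False)] * 4
    + [Student(24, True)] * 9 + [Student(24, False)]
    + [Student(28, True)] * 9 + [Student(28, False)]
    + [Student(32, True)] * 9 + [Student(32, False)]
)

threshold, rates = plateau_threshold(cohort, [20, 24, 28, 32])
print(threshold, rates)
```

With this invented cohort the search stops at the second cutoff, because raising the bar further does not change the pass rate. Real data would be noisier, so the tolerance and the choice of outcome would need to be argued for rather than assumed.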


> Philosophical and historical factors

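Finally, philosophical and historical factors will be major obstacles. Some faculty believe that only the academically “best and brightest” should be admitted, and from that viewpoint relying on non-academic measures above a low threshold is unthinkable. A history of problems with a few academically weak students also works against change: one or two such students admitted under the new system are enough for faculty to see additional risk and resist. The only way to overcome these differences is to accumulate data on student performance by risk category, so that one or two poor outcomes can be set against 30 or 40 good ones. If adequate data exist but the results are not clearly positive, one must be ready to weigh the risks against the benefits.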

Finally, philosophical and historical factors are likely to be major obstacles. Some faculty believe that we should admit only the “best and the brightest” by academic measures. Reliance on non-academic measures beyond a low threshold would be anathema from this perspective. The related issue is that a history of encountering problems with students who have low academic credentials can come back to haunt any effort to change. All it takes is one or two such students admitted under the new system encountering major academic problems for faculty to develop resistance to assuming additional risk. The only way to counter these historical and philosophical differences is to collect data on the performances of students in various risk categories (if there are such data available). In this way, the poor outcomes with one or two students can be put in perspective if there have been good outcomes with 30 or 40. If there are appropriate data and the outcomes have not been compellingly positive, one must be prepared to assess the risk and the benefits.


Challenges in Measuring Compelling Personal Characteristics


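Once a faculty decides to use a threshold approach and admit applicants on the basis of personal characteristics, the next problem is measuring those characteristics reliably and validly. The following must be determined.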

If a faculty decides that it is willing to take the risk of using a threshold approach for screening applicants and then admitting them on the basis of compelling personal characteristics, it still faces the daunting task of reliably and validly measuring these qualities. Among the challenges are determining: 

  • What constitutes a compelling personal characteristic, and which is/are most compelling? 
  • What is/are the best method(s) of measuring these qualities? 
  • To what extent are these qualities influenced by nature, nurture, or maturation? 
  • What are the costs of measuring these qualities? 
  • What are the ways of overcoming cunning adversaries?


What constitutes a compelling personal characteristic, and which is/are most compelling?


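Which personal characteristics matter more than others? The literature covers characteristics that have been assessed by interview and other means, but which are most important remains open to research. The 87 characteristics of successful physicians identified by Price et al. could be a starting point, but there are too many of them to measure in practice. An AAMC effort from the early 1970s may help. Building on Price et al., the Non-Cognitive Working Group, led by Jack Collwell, proposed adding measures of seven personal qualities to the MCAT: compassion, coping capabilities, decision making, interprofessional relations, realistic self-appraisal, sensitivity in interpersonal relations, and staying power (physical and motivational). The recommendations were never acted on, but the set of qualities can inform discussion of which characteristics matter most in selection. Much has happened in the 25 years since, and it cannot be said whether the same qualities would be chosen today. The recommendations could be updated in various ways, but each institution would still need to reassess them for local relevance. Even so, a nationally defined set would be a better starting point than having each institution develop its own.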

What are these compelling personal characteristics that might trump other indicators? Although the literature offers insights into some qualities that have been assessed by interview and other means, there clearly is room for research into what are the most salient qualities. The 87 positive qualities of successful physicians identified by Price et al.23 might be a good starting point, but that number of qualities makes it a daunting starting point and, most likely, impractical for measuring. An effort to improve assessment methods for prospective medical students by the AAMC in the early 1970s might be of some help in this regard. Based upon the work of Price et al.,23 the Non-Cognitive Working Group, under the leadership of Jack Collwell, proposed specific objective measures of seven personal qualities be incorporated into the MCAT: compassion, coping capabilities, decision making, interprofessional relations, realistic self-appraisal, sensitivity in interpersonal relations, and staying power—physical and motivational.24 Although the recommendations from this working group were never acted upon, the set of personal qualities they identified might contribute usefully to the dialogue about the most salient personal characteristics to assess during the medical school selection process. A lot has transpired in the quarter century since these recommendations. Whether the same personal qualities would be identified today cannot be determined. It might be worth convening a similar working group to update the work or, perhaps, use the nationally directed multi-institutional process employed for the Medical School Objectives Project (MSOP) to update the recommendations. Whatever the recommendations would be, they would still need to be assessed by each institution for local relevance. However, it would be of substantial help to have a nationally defined set as a starting point rather than having each institution develop its own.



What is/are the best method(s) of measuring these qualities?


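There are two difficulties. First, measuring a personal characteristic requires defining it in measurable terms: it must be defined as behavior, and as behavior that most reasonable people would recognize as reflecting that characteristic when they see it. This is not easy. Take altruism. The MSOP specifies seven qualities of altruism, one of which is “the capacity to recognize and accept limitations in one's knowledge and clinical skills, and a commitment to continuously improve one's knowledge and ability.” Not everyone would place this under altruism. And even if it were accepted as altruism, an assessment method would still have to be developed by which most people could recognize it.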

Measuring compelling personal characteristics is challenging for at least two reasons. First, measuring a personal quality requires the difficult step of defining the personal quality in measurable terms. This involves defining the personal characteristic not only in behavioral terms but also in behavioral terms that most reasonable people would recognize as reflecting the personal quality if they were to see it. This is not an easy thing to do. Take altruism as an example. The MSOP delineated seven qualities of altruism that medical students must demonstrate before graduation to the satisfaction of the faculty. The last of these qualities is “The capacity to recognize and accept limitations in one's knowledge and clinical skills, and a commitment to continuously improve one's knowledge and ability.”25 This is not one of those components of altruism that might be in everyone's definition of altruism. Even if it were a generally agreed upon element of altruism, developing an assessment method whereby most people would recognize it if they saw it offers multiple challenges.


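The second difficulty is finding personal characteristics that appear consistently across an almost unlimited variety of situations. Take altruism again. Most admission committees would not want an applicant who shows altruism only to score well in a high-stakes interview. Yet most committees have little to go on beyond the applicant's record and behavior in the interview. Letters of recommendation may add some information, but not necessarily, and because the applicant chooses the writers, it is unclear how far their content can be trusted.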

The second reason is that one is most interested in stable qualities that have a high probability of occurrence in an almost infinite number of different situations. Take the example of altruism again. Most admission committees would not be particularly interested in an applicant who showed signs of altruism only in a high-stakes interview situation or in the period when the incentive to get into medical school provides the motivation for volunteering in various ways. However, most admission committees have little more to go on than an applicant's record of volunteering and his or her performance in an interview. Letters of recommendation may provide some additional information, but not necessarily. Further, because the letter writers are chosen by the applicant, it is difficult to know how much confidence to place in the information contained in the letters.


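Working from limited information, the admission committee tries to predict whether the applicant will put patients and the community first even after becoming a physician. In a physician, altruism shows itself in many ways: business relationships with referring colleagues, conflicts of interest from investments, incentives from pharmaceutical companies, volunteer work, and so on. The list is endless and the potential for conflicts of interest is everywhere. The admission process is concerned not with one specific temptation but with whether the applicant will remain altruistic in most situations. Can one distinguish an applicant who says what the interviewer wants to hear, or who answers as coached, from one who is genuinely altruistic?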

The admission committee must work within the confines of the information it has, but it is trying to project whether the applicant will always and foremost put the patients' and community's needs above his or her own when he or she becomes a physician. This selflessness manifests itself in the practicing physician in many different ways, from business relationships with referral partners, conflicts of interest from investments, responses to incentives provided by drug and equipment manufacturers, willingness to volunteer for free medical clinics, etc. The list is endless and the potential for conflicts of interest is similarly pervasive. In the admission process, the interest is not just in whether the applicant will succumb to one specific temptation, but whether he or she will choose altruism as a guiding principle in most situations, if not all. During an interview, for example, how does one separate an honest portrayal of an applicant's response to a hypothetical situation from a carefully crafted response the applicant thinks the interviewer wants to hear, or perhaps that the applicant has been coached to provide by a preparation service?


To what extent are these qualities influenced by nature, nurture, or maturation?


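Another issue is whether personal characteristics are really stable. Are traits such as honesty fully fixed by age 18, as Bullimore argues? Are they truly “non-teachable”? If they cannot be taught, can they still change within a community that values them? Many studies show that medical students change during medical school. Rezler, reviewing studies of medical students, argued that the medical school environment is largely responsible for declining humanism and that curricular change cannot produce more humane doctors unless teachers' attitudes, skills, and dedication change. Bland et al., in a review of primary care specialty choice, echoed Rezler and discussed the influence of medical education on the decline in humanism during medical school. If medical school can have so negative an effect, it is arguable that changing its culture could foster positive characteristics. With faculty who value and demonstrate altruism, students might also become altruistic physicians. And if people mature at different rates, those at earlier stages will be influenced more by such a culture.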

Another issue pertains to whether personal qualities are truly stable. Do personality traits such as honesty truly solidify by age 18 as argued by Bullimore?12 Are these truly “non-teachable” traits? If they are not teachable, are they malleable through immersion in a culture that values these traits? There seems to be considerable evidence that medical students change during medical school. Rezler,26 in a literature review on medical students' attitude changes during medical school, stated that the medical school environment was largely responsible for the decreasing humanism in medical students and that curricular innovations are unlikely to result in more caring doctors until a majority of medical teachers model these necessary attitudes, skills, and dedication. Bland et al.,27 in a comprehensive review of the literature on the determinants of primary care specialty choice, echoed Rezler's views on the decline in humanism during medical school and the negative influence of medical education. If medical school can have such a profoundly negative effect on students' humanism, it does not seem too far-fetched to suggest that a properly focused medical school culture could promote positive personality characteristics. With a culture that values altruism and faculty who demonstrate altruism always and foremost, it is conceivable that students might be nurtured into becoming altruistic physicians. Further, if people mature at different rates, those at earlier stages of development may be even more likely to be affected by an altruistic culture.


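The issue is quite complex. If personal characteristics mature during medical school, the task for admissions is to select students likely to mature in desirable ways. If they do not change, the task is to measure them reliably and validly and give them due weight. The difficulty is that neither view is wholly right: some characteristics are malleable while others hardly change. Further complicating matters, which characteristics are malleable and which are stable differs from person to person. Resolving this may be one of the hardest problems.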

The implications of this issue are particularly complex. If individuals do mature in their personal qualities such as altruism as they progress through medical school, the challenge for the admission process is to identify those who are most likely to mature in desirable ways. If personal qualities are stable, then the challenge for the admission process is to develop reliable and valid measures of these qualities and then to give them appropriate consideration. The problem may be that it is not an either/or proposition. Some personal qualities may be relatively malleable while others may be relatively stable by the time students enter medical school. To make matters even more complex, qualities that are malleable and those that are stable may vary among individuals. Sorting out this issue may be one of the greatest challenges to developing effective measures of personal qualities for medical school admissions.


What are the costs of measuring these qualities?


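Admission is expensive for both the applicant and the institution, and additional measurement brings additional cost. Whether one derives new measures from information already collected or measures more rigorously, cost is the issue. The more characteristics measured, the greater the cost, and the new information must offer value beyond what was available before the new measures were introduced.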

The admission process is expensive for both the applicant and the institution, and adding measurements of new qualities in a rigorous manner will add more costs. Even if one uses information that is currently collected and derives new measures from it, or measures the qualities in a more rigorous manner, the change will add costs. The larger the number of personal qualities measured, the greater the costs incurred in their measurement. The new information must provide something of value beyond what was available before the new measures were added.



What are the ways of overcoming cunning adversaries?


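One of the greatest problems is deception by applicants and preparation services. Some applicants are like chameleons and can become Mother Teresa for the moment, and some professional services coach applicants to display good characteristics for that one day. The challenge is to find the truth within well-crafted fiction.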

One of the greatest challenges facing any effort to systematically measure personal qualities will be the cunning ability of applicants and preparation services. Some applicants to medical school seem to have a chameleon-like ability to adopt the short-term personality of “Mother Teresa” and the career interest du jour. Further, the survival of some professional services depends on their ability to help applicants with repelling personal characteristics display compelling ones for a day. Sorting out fact from carefully crafted fiction will make developing a standardized measure of compelling personal characteristics a difficult challenge.


Some Possible Measurement Approaches


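If schools want the least costly approach, making new use of existing information is the place to start. AMCAS requires a personal statement and essay, from which personal characteristics can be assessed. Most schools already interview, so the interview is also available. Finally, transcripts, parents' education, and financial information can shed light on personal characteristics. For example, applicants who are the first in their family to attend college are viewed favorably by many medical schools.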

If schools attempt to take a least-cost approach to measuring compelling personal characteristics, making new uses of old information (i.e., information already available) will be helpful. The American Medical College Application Service (AMCAS) application's Personal Statement and Essay could reflect compelling personal characteristics. Because all but a few medical schools already interview applicants, the interview itself might be a readily available source of information regarding compelling personal characteristics. Letters of recommendation are also commonly required to supplement the AMCAS application. Finally, there are elements of the transcript, parent(s)' education, and financial data that might give insights into personal qualities. For example, an applicant who is the first in his or her family to go to college, let alone go to medical school, is often given positive consideration in many medical schools. Other such insights might be gained from a careful and thoughtful consideration of such information.


Personal statement and essay

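The personal statement is largely free-form and is written as part of the AMCAS application. Reviewing a sample shows that statements vary widely. The statement is an untapped resource for assessing personal characteristics. To interpret it well, one needs to know whether applicants believe it lets them show the personal characteristics medicine requires, and whether it is a “group project” involving other people. To examine this, we surveyed first-year medical students. Over three years, 53–84% of respondents thought the statement represented their personal characteristics well; 41–44% said they had received input from others, with 15–51% reporting input on content development and 2–6% input from professional services. Although most students see the statement as a good reflection of their characteristics, help received in preparing it limits how far its accuracy can be assumed for every applicant. Moreover, because the form is free, one statement may emphasize different characteristics from another, and making valid comparisons from such non-standardized information is difficult.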

The personal statement is a relatively free-form essay that the applicant produces as part of the AMCAS application. A review of a sampling of such statements leads us to say that these statements are extremely variable. A literature search yielded no citation of its being used or evaluated to assess personal characteristics of the applicant. This would seem to be an untapped resource. To properly interpret the personal statement, it is important to know whether applicants believe the personal statement allows them to accurately represent their personal characteristics that qualify them for the profession of medicine, and whether the personal statement is a “group project” involving input from various others. To assess these issues, we surveyed matriculating first-year medical students for three years.28 Across the three years, 53–84% of the respondents indicated that the personal statement adequately represented some element of their personal characteristics; 41–44% reported the personal statement involved input from others, with 15–51% reporting input in content development and 2–6% receiving input from professional services. Although the personal statement was considered by the large majority of matriculating medical students to adequately represent their personal characteristics, questions about help received in its preparation limit the confidence that admission committees can place in its accuracy for all applicants. Further, because of its free-form nature, any given personal statement will highlight a set of personal characteristics potentially different from the set highlighted in another applicant's personal statement. Making valid comparisons of applicants' personal characteristics from such non-standardized information offers significant challenges.


Interview

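The interview is one of the few occasions on which the applicant is physically on campus. How this time is used, and for what purpose, matters. The potential for assessing personal characteristics extends beyond the two- or three-hour interview: the whole visit lasts at least twice as long and includes orientation, tours, interaction with current students, and meals. Observations made throughout can provide useful information.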

The interview is one of the few times, if any, prior to the admission decision when the applicant is physically present on the campus. Deciding how to use that precious time and to what purpose is a critical decision. The potential of the interview for assessing personal qualities extends beyond the two- or three-hour interview; the visit usually lasts at least twice that long and includes orientation activities, tours, interactions with current students, luncheons, etc. Observations of applicants during these other times might also provide useful information.


Research indicates that structured interviews give the most reliable estimates of personal characteristics. Features of a structured interview include the following.

Research suggests that structured interviews yield the most reliable estimates of personal qualities. A structured interview's features can include 

    • selecting interview content based upon a job analysis (e.g., the critical-incidents technique), 
    • standardizing the questions asked of all applicants, 
    • providing interviewers with sample answers to questions to help them give consistent ratings, and 
    • having the interview conducted by a board or panel of interviewers. 
    • One critical additional feature is providing training for the interviewers. Structured interviews require at least a minimal introduction to interview protocol and a rating system. 

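Another important issue is interviewers' qualifications. Patrick et al. hired and trained people from outside medicine as interviewers. Collins et al. included representatives of consumer groups and education experts on interview panels alongside medical school faculty. Non-faculty interviewers have thus been used, but whether they rate differently has not been studied. Such research is needed, as is research on the interaction between interviewer characteristics and applicant characteristics.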

Collins et al.10 provided a half day of training for interviewers who were to be part of a 20-minute panel interview and then observed applicants interact in a group problem-solving session. Another potentially important issue concerns interviewers' qualifications. Patrick et al.5 hired non-medical people to serve as interviewers and trained them in the protocol. Collins et al.10 included representatives of consumer groups and experts in education along with medical school faculty on interview panels. Although non-faculty were used in these studies as interviewers, there was no effort to determine whether interviewers with different characteristics (consumer group representatives, education experts, medical school faculty) had distinguishable rating tendencies. Future research will need to determine to what degree interviewer characteristics produce detectable differences in interview results. There may also be an interaction between interviewer characteristics and applicant characteristics at play in the results.


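Edwards et al. treated a panel of interviewers as one of the defining features of a structured interview. Having more than one interviewer improves the reliability of the ratings, but multiple interviewers can also make the dynamics of the interview unpredictable. Several faculty members questioning a single applicant can be intimidating: even when the interview is structured and collegial, the imbalance in numbers can feel threatening. When we asked our own students, they were very critical of schools that use panel interviews, and women and students of color were especially critical. Interview format, interviewer characteristics, and applicant characteristics may combine to have a considerable effect.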

Edwards et al.3 considered a panel of interviewers to be part of the defining characteristics of a structured interview process. Having more than one interviewer enhances the reliability of the resulting ratings, but multiple interviewers can impact the dynamic of the interview in potentially unpredictable ways. Having multiple faculty interrogate an applicant can seem threatening because, even if the interview is structured to be collegial, the imbalance in numbers can be intimidating. In interviews conducted at our institution about our students' experiences in the admission interviews they had experienced at various schools, the students have been very critical of schools using a panel approach. Women students and students of color have been especially critical of schools using panel interviews. The interview format, interviewer characteristics, and applicant characteristics may represent a complex mix of factors that could have a major impact on the admission interview process.


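The last concern is the nature of the interview. Collins et al. reported two types: a 20-minute structured interview of one applicant by two panelists, and a 50-minute group discussion in which two panelists observed six applicants. Across 141 applicants, the correlation between ratings from the two types was .62. Although this is a sizable and statistically significant correlation, it accounts for less than 40% of the rating variance. More than 50% of the variance remains unexplained even after correcting for imperfect rating reliability, which means the nature of the interview and how it is structured substantially affect the results. The key in structuring an interview is whether the personal characteristics of interest are assessed in a meaningful way. Some are easy to assess in this setting and others are not. Research is needed on how to measure various personal characteristics in an interview.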

A final concern relates to the nature of the interview. Collins et al.10 reported two types of interviews being conducted: a 20-minute structured interview of one applicant by two panelists and two panelists observing six applicants as they participated in a 50-minute group exercise designed to stimulate debate. Separate panelists were used for the two types of interviews. Over 141 applicants, the correlation between ratings of the two interview types was .62. Although this is a relatively large correlation and was statistically significant beyond the .0001 level, it accounts for less than 40% of the rating variance. Disattenuating the correlation for the less-than-perfect reliability of the panel ratings (.67) still left over 50% of the variance in the two ratings unexplained, which clearly indicates that the nature of the activities and how they are structured for applicants may have a substantial effect on the results of the interview. The key is to structure the interview such that the personal qualities of interest can be assessed in a meaningful manner. Even if some personal qualities such as resourcefulness may be amenable to assessment in this type of situation, others such as altruism are likely to suffer from the artificiality of the conditions. Research related to how to measure various personal characteristics in an interview situation is clearly needed.
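The step from a correlation of .62 to “less than 40% of the rating variance” rests on the standard result that two measures share a proportion of variance equal to the square of their Pearson correlation. A minimal sketch of that arithmetic follows; the function name `shared_variance` is ours, and the sketch does not attempt to reproduce the disattenuated figure, because the text does not say which form of the correction was applied.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance two ratings share, given their Pearson correlation r."""
    return r ** 2


r_interviews = 0.62  # correlation between the two interview types, as reported in the text
explained = shared_variance(r_interviews)
unexplained = 1 - explained
print(f"shared: {explained:.1%}, not shared: {unexplained:.1%}")
```

The squared correlation comes to about 0.38, which is the sense in which the two interview formats, although clearly related, largely measure different things.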


Much can also be learned from the non-interview part of the campus visit, for example through a 360-degree evaluation.

Much may be gained from the non-interview portion of the campus visit. The time applicants spend interacting with each other, participating in the orientation activities, meals, tours, etc. potentially can offer much insight into applicants' personal characteristics. One approach might be to adopt an element of the 360-degree evaluation model being explored for resident and physician evaluation by the Accreditation Council for Graduate Medical Education (ACGME) and American Board of Medical Specialties (ABMS). In this approach, almost everyone who comes into contact with the individual being rated provides a rating. In the admission interview case, one could do the same for the applicants during the non-structured interview time. Medical students, receptionists, food-service workers, tour guides, bus drivers, dean's staff, and others who interact with the applicants during the non-interview activities could be asked to rate the applicants. Because a relatively large number of applicants appear at the same time, such an evaluation would have to be picture-coded. It would probably be unreasonable for all of these different types of individuals to rate all applicants on all of the desired personal characteristics. Their contact would be so variable and transitory that it would probably be mostly wasted effort. If, however, these different individuals reported only memorable interactions of both positive and negative kinds, the strategy might provide useful information. At the very least, such an approach might be worth exploring.


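Tests of various personal qualities could also be built into the structured and unstructured portions of the visit. Measuring stable personal qualities reliably and validly takes care and creativity. Unobtrusive tests are worth considering, but they must be humane and must not induce paranoia. Any attempt to measure personal qualities should stay alert to its side effects.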

One could also build tests of various personal qualities into the structured and unstructured portions of the visit. To reliably and validly measure personal qualities that are stable across time and situation will take care and creativity. The potential for building unobtrusive tests into the interview visit might be worth exploring, but care must be taken that it is done humanely and does not create a climate conducive to paranoia in which the applicant feels under the microscope at every moment. As attempts are made to measure personal characteristics, one needs to be mindful of the potential side effects that these measurements might produce.



Letters of support

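Letters of recommendation are commonly requested in the AMCAS application process. Schools differ in how recommenders are to be chosen and in what the letters should cover. Letters are often hard to interpret because applicants choose their own writers and the format is free-form. It is not clear how well a letter represents the applicant, nor, even when it is accurate, how to compare one applicant's letters with another's.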

Letters of recommendation are commonly requested as supplements to AMCAS application materials. Medical schools vary in how they instruct applicants to select letter writers and the degrees of structure imposed on what is to be written. The consequence is that letters of support are often difficult to interpret. Because the writers are chosen by the applicants and the formats are often relatively free-form, it is never clear how representative a given letter is of the applicant, and, if it is an accurate portrayal, how to evaluate the quality of one applicant's letters of support against those of another applicant.


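More troubling still are letters containing false information. This is not merely a theoretical possibility: we have dismissed an admitted student whose letters turned out to be fabricated. In focus-group meetings on the admission process, some students also admitted to taking part in writing their own letters.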

Even more problematic is the risk of fraudulent letters of support. This is not just a theoretical possibility. We had to dismiss a student we had admitted to our medical school when it came to light that the letters of support were self-fabricated. Further, in focus-group meetings about the admission process, some students have admitted to participating in preparing some of the “letters of support.”


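Another problem is that how a letter is read varies by institution. A bland letter may be read as very positive, and an effusive one as very negative. A more systematic approach to interpreting letters is needed. National agreement on what letters should contain would make them easier to evaluate, and content-analysis procedures could also help. To guard against fraudulent letters, a standard format could be adopted together with an electronic system like the one the NIH currently uses.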

Another challenge is that interpreting letters of support varies and often depends upon previous institutional experience with individual letter writers. A relatively bland letter from one writer might be considered extremely positive, and a comparable letter from a more effusive letter writer could be considered extremely negative. A more systematic approach to interpreting letters of support would be helpful. If national standards were to be developed about what should be included in letters of support for medical school, at least the content of the letters might be easier to evaluate. Further, there may be ways of applying content analysis procedures to letters of support that could aid in their interpretation. To address the concern about fraudulent letters, if a standard format could be adopted, perhaps an electronic system like that currently required by the National Institutes of Health (which has built-in security factors) could be used. These issues deserve further consideration and research.


Elements of the transcript, parent(s)' education, and financial data


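Extracting key personal characteristics from the AMCAS file, the supplemental application, and demographic data is a complex task. Certain markers deserve close attention: ethnicity, first-generation college attendance, rural or inner-city residence, and low income. They allow applicants from disadvantaged backgrounds to be considered in the context of those backgrounds. Such analyses often become political issues. For example, an applicant who overcame great adversity to reach medical school may be barely keeping afloat, and the added demands of medical school can make life very hard. In our experience, what separates those who make it from those who do not is often not academic ability but the weight of life's obligations. These issues are often discussed in the interview, but comparing academic and demographic profiles may offer a way to separate high risk from lower risk.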

Deducing compelling personal characteristics from the AMCAS file, supplemental application form, and various demographics is a complex task. Certain key markers are sometimes used to consider applicants for scrutiny, such as ethnicity, first generation to attend college, rural/inner city residence, low income, etc. The general purpose of interpreting these markers is to give applicants from underrepresented and disadvantaged backgrounds consideration for admission in the context of their backgrounds. Such analyses are challenging and often become politically charged issues. For example, an applicant who has overcome substantial adversity to make it to medical school may be barely keeping his or her head above water under the weight of life's demands, obligations, or lingering effects of earlier obstacles (single parenthood, emotional trauma from loss of relatives/friends, surmounting a poor early educational system, etc.). The additional demands that medical school imposes can often push him or her under. Determining the difference between an applicant who can make it and one who cannot is difficult. In our experience, it often is not academic ability but the crush of life's obligations that makes the difference. This kind of issue is often discussed in the interview, but there may be ways to separate high risk from lesser risk through comparing the academic and demographic profiles of students who have made it through medical school with those who have not. More research in this area could yield valuable information.



A PROPOSAL FOR A UNIFIED SYSTEM OF ASSESSMENT


There are larger forces in the universe of medical education that might be usefully applied to assessment in the admission process. MSOP has identified four major objectives for medical education, each of which has six to 11 subobjectives. The ACGME and the ABMS have jointly identified six competencies for the practice of medicine. The MSOP competencies map fairly closely onto these six competencies. Discussions of extending the ABMS/ACGME competencies into undergraduate medical education were held at the 2002 meeting of the Central Group on Educational Affairs, as was discussion of the possibility of integrating student-evaluation methods across undergraduate medical education (UME) and graduate medical education (GME).29


The potential benefits of integrating assessment methods across UME and GME would seem to be many. The ABMS and ACGME are beginning work on operationally defining how to measure the six competencies, beginning with communication skills. One of many tools that comprise the toolbox being developed for this purpose is the 360-degree evaluation. For a resident physician who is being evaluated, for example, one could have supervising physicians, nurses, patients, and administrative staff complete evaluations. Cost-effective methods of obtaining these evaluations are being developed.


Clearly, competencies appropriate for physician recertification would be more advanced than those for resident certification to practice, and, similarly, medical student competencies would be less advanced than those for residents. If we extend the concept of integration into the admission process, the competencies identified for applicants to medical school (pre-medical competencies) would be more rudimentary than those established for medical students. However, if one considers competencies to be a continuum from cradle to grave, the natural progression could serve as a means for assessing individuals at specific defining points. The evaluation methods used could build upon one another for continuity so that students feel a sense of progression and are better able to self-regulate their learning (monitor progress, identify learning needs, and adjust study accordingly). This Unified System of Assessment would enable all parties interested in measuring the competencies of physicians and physicians-in-training to pool their resources and adopt a developmental approach to the measurement process.


Even now, some of the methods being developed by the ACGME/ABMS collaboration might have potential application to the admission process. As it progresses, the work of these groups may help to narrow the field of personal qualities that are of the highest priority for assessment. For example, the ACGME/ABMS collaboration has adopted the American Board of Internal Medicine (ABIM) process for peer and patient assessment. The ABIM recertification process involves having diplomates arrange for ten professional colleagues and 25 patients to answer ten questions about their overall medical care and communication skills.30 They use a computer-administered telephone survey to collect the information. Using a similar method, it might be that a reasonably small number of applicant-nominated individuals completing a telephone survey can provide reliable and valid assessments of an applicant's personal qualities. If it is found that other personal characteristics can be better assessed via interview, the interview could be better focused to provide more reliable and valid measures of these other characteristics. It might even be possible to add a SAMS-type interview to the MCAT administration. This would potentially make assessing personal qualities during the campus visit optional or it could emphasize the elements unique to the particular school.


The segregation of UME, GME, CME and recertification has gone on for far too long. We need to consider the entire process as a continuum that includes even the selection of students for medical school and the pre-medical requirements. Pooling the resources of the entire system devoted to the education, testing, certification, and recertification of physicians would contribute to making much more headway than can be done with the current fragmented and separate small-scale efforts. Developing a continuum of competencies is a first step; developing a unified system for assessment would be the next.










 2003 Mar;78(3):313-21.

Assessing personal qualities in medical school admissions.

Abstract

The authors analyze the challenges to using academic measures (MCAT scores and GPAs) as thresholds for admissions and, for applicants exceeding the threshold, using personal qualities for admission decisions; review the literature on using the medical school interview and other admission data to assess personal qualities of applicants; identify challenges of developing better methods of assessing personal qualities; and propose a unified system for assessment. The authors discuss three challenges to using the threshold approach: institutional self-interest, inertia, and philosophical and historical factors. Institutional self-interest arises from the potential for admitting students with lower academic credentials, which could negatively influence indicators used to rank medical schools. Inertia can make introducing a new system complex. Philosophical and historical factors are those that tend to value maximizing academic measures. The literature identifies up to 87 different personal qualities relevant to the practice of medicine, and selecting the most salient of these that can be practically measured is a challenging task. The challenges to developing better personal quality measures include selecting and operationally defining the most important qualities, measuring the qualities in a cost-effective manner, and overcoming "cunning" adversaries who, with the incentive and resourcefulness, can potentially invalidate such measures. The authors discuss potential methods of measuring personal qualities and propose a unified system of assessment that would pool resources from certification and recertification efforts to develop competencies across the continuum with a dynamic, integrated approach to assessment.

PMID: 12634215 [PubMed - indexed for MEDLINE]


The Admission Interview in Medical Schools

HyeRin Roh, MD, PhD

Department of Surgery, School of Medicine, Kangwon National University, Chuncheon, Korea





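Introduction

Through the admissions process, medical schools seek applicants who will both handle the academic work of medical school and possess the qualities of a good doctor (Basco et al., 2000). To judge cognitive abilities such as academic performance, schools look at grades and standardized test scores; to judge non-cognitive abilities such as the qualities of a doctor, they use interviews and similar methods (Stansfield & Kreiter, 2007). Typically the first stage is a document review and the second stage is an interview. An admission interview evaluates an applicant's qualities through face-to-face communication (Brownell et al., 2007). The interview complements the document review of the materials the applicant has submitted.


Although interviews are widely used in admissions, there are many doubts about their effectiveness. The objections are that a person's qualities cannot be assessed in a brief meeting; that people form different subjective views depending on their background and life experience, so objective assessment is difficult; that interview results are poor predictors of academic excellence; and that predicting later performance as a doctor is even less feasible. There are also fears about interviews. Schools worry whether students with excellent grades or from prestigious schools will be rejected because of poor interview results; how to assess applicants who, having prepared for the interview, give model answers that differ from what they actually think; whether interviewers themselves can assess objectively; how to keep subjective scoring from distorting scores so that they diverge from the applicant's actual qualities; and whether complaints about unfairness will arise afterwards. There are practical difficulties as well: many faculty members must be mobilized, items must be developed in advance, and there are costs.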


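Because of these doubts and fears, interviews in Korea have taken on characteristics that differ from the original interview format.

    • All applicants are interviewed within a single day, and the interview items are disclosed on the morning of the interview (Roh et al., 2009). This is because confidentiality is central to the fairness of the interview.
    • Scoring is usually by global rating. Interviewers, afraid that their score might cause an applicant to fail, tend to score generously; to keep uniformly generous scores from erasing discrimination between applicants, schools fix approximate proportions of high, middle, and low ratings.
    • When total scores are computed in the second stage, the second-stage scores are not simply summed; a substantial share of the first-stage score is added back in, so that academically strong students are not eliminated by the interview.
    • Interview content is not confined to the non-cognitive domain; the cognitive domain is assessed again, so that students with excellent grades do not fail.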


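However, this gives rise to several drawbacks.

    • Because all interviews must finish within one day, the time per applicant shrinks. An interviewer typically meets an applicant for about 10 minutes.
    • Not all the prepared questions can therefore be asked; the interviewer picks a few and asks them in arbitrary order.
    • Interviewers see the items only just before the interview and cannot become fully familiar with them, so the interview may proceed in a direction that departs from the purpose or intent of the item.
    • Scoring against fixed proportions of high, middle, and low distorts the assessment of a student's actual qualities and raises fairness problems.
    • Students can prepare answers for a 10-minute interview, and which questions happen to be asked can help or hurt them, but no solution to this has yet been found.
    • As a result, interview reliability falls, and the interview is caught in a vicious cycle in which its fairness and validity are doubted.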


The author therefore reviews prior studies on the qualities that interviews should assess and on ways to raise interview reliability, and proposes what we should do.



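Main Body


A. Which qualities need not be assessed in the interview?


  • First, qualities already assessed in the first stage need not be assessed again in the second. For example, if College Scholastic Ability Test or standardized English test scores were already included in the first stage, there is no need to conduct the second-stage interview in English. If standardized test scores measuring knowledge and reasoning in biology or chemistry were already counted in the first stage, there is no need to ask about biology or chemistry again in the second-stage interview.
  • Second, qualities that can be assessed without an interview need not be assessed in one. Interviews consume much cost, time, and staff, so if a cheaper, quicker, less labor-intensive method can do the job, that method is preferable.
    • For example, the question "Why do you want to become a doctor?" can be assessed through a personal statement or essay.
    • The ability to analyze, critique, and reason logically about an event or phenomenon can be assessed through a written test or essay; an interview is not required.
  • Third, abilities that can be acquired through medical education after admission need not be assessed in a pre-admission interview. For example, knowledge of medical ethics, desirable conduct in an ethical dilemma, and how to maintain good relationships with patients are learned in medical school, so there is no need to interview for them (Lowe et al., 2001).
  • Fourth, the admission interview must be distinguished from the selection interview in the workplace. Because selection interviews are more actively used in companies, their content is sometimes applied to graduate medical school interviews. Hiring employees is similar in some respects of context and purpose, but the two are clearly different. Students are not employees: in general, students are learning a body of knowledge, whereas employees are applying one (Goho & Blackman, 2006). The qualities companies prefer also differ from those a medical professional should have.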


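B. Which qualities can be assessed in the interview?


  • First, qualities that are hard to assess with other tools, or that are best assessed by interview, should be assessed in the interview. Communication skills, for example, are hard to judge from written tests, personal statements, or grades. They are best judged by meeting and talking face to face.
  • Second, qualities that are hard to change through medical education should be assessed beforehand in the interview (Albanese et al., 2003). In particular, it has been argued that honesty and ethical sensitivity are innate or formed early in life and should therefore weigh heavily in interviews that select medical students (Bullimore, 1992; Lowe et al., 2001). However, no study has yet clearly shown which qualities are stably fixed and which change (Albanese et al., 2003). More research is needed in this area.
  • Third, the qualities of a good doctor that society demands should be considered. Lee and Ahn (2007) surveyed medical students, medical school faculty, patients, and private practitioners and reported that non-cognitive qualities were chosen as the qualities of a good doctor: 'empathizing with pain from the patient's standpoint', 'judging and acting ethically in patient care', 'being kind rather than authoritarian toward patients', and 'contributing to and serving the community'. Before deciding which qualities to assess in the interview, the image of the doctor that society expects should be surveyed, and it must be reflected in that decision.
  • Fourth, the medical profession needs to establish the qualities of a good doctor as medical experts see them. The qualities most often assessed in prior studies were communication skills, empathy, honesty, ethics, conscientiousness, motivation for choosing the profession, flexibility, decision-making ability, problem-solving ability, teamwork, critical thinking, logical thinking, perseverance, self-confidence, and leadership (Roh et al., 2009). The AAMC (Association of American Medical Colleges) (2004) has proposed cooperation, leadership, communication, motivation for medicine, maturity, and self-understanding as qualities to be assessed in the interview. The medical profession in Korea likewise needs to propose the qualities of a good doctor to be assessed in interviews.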


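C. Which factors influence interview results?


  • McManus (1998) reported that applicant characteristics affect whether an applicant passes the interview: older students and students from ethnic minorities were rated less favorably. Contrary to the common expectation that older, more mature students will have the character of a doctor, younger applicants received better ratings. Non-verbal communication affected ratings as much as verbal communication did (Edwards et al., 1990). The influence of gender has also been noted: female applicants are said to be rated unfairly compared with male applicants (Marquart et al., 1990; Stiffman & Blake, 1991). Interviewers tended to ask male students about their motivation for choosing the profession while asking female students about plans for marriage or childbirth. One study reported that female students received lower scores not only from male faculty but also from female faculty (Edwards et al., 1990). On the other hand, there is also a report that being female was an advantage in the interview (McManus, 1998).
  • Applicants' grades in particular are reported to have a considerable effect on interview results. Shaw et al. (1995) found that when medical admission test scores and undergraduate grades were disclosed to interviewers, their ratings of personal qualities were affected. Good grades favored applicants in the interview, although good grades in non-science subjects were reported to be a disadvantage (McManus, 1998). Harasym et al. (1996) developed standardized simulated applicants, had interviewers interview them, and compared the ratings; only about half were rated appropriately. For example, interviewers who met a simulated applicant scripted as academically excellent but lacking leadership rated the applicant highly on leadership even though they had asked no questions about it. Elam et al. (1994) likewise reported that although interviewers were asked to assess motivation for the profession, interpersonal skills, and responsibility, many assessed the applicant's academic quality instead. Objective and fair interviewing therefore requires that interviewers not know the applicant's grades.
  • Interviewer characteristics also distort ratings. Common scoring errors include the halo effect, in which one strongly liked or disliked trait unduly sways judgments of other traits; leniency or severity, in which a rater scores too generously or too harshly; and central tendency, in which all ratings cluster in the middle (Edwards et al., 1990).
  • The interview process itself also affects results. A typical case is the contrast effect, in which how the preceding applicant performed influences the rating of the next (Goho & Blackman, 2006).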



D. How can interview reliability be raised?


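  • The factor that most affects interview reliability is the interviewer (Lee, 2002). Reducing interviewer error is therefore the most important step in raising reliability. Lee (2002) proposes that when interviews are used for major educational decisions, one should
    • 1) write and use specific scoring criteria,
    • 2) train interviewers,
    • 3) use analytic scoring rather than holistic scoring, and
    • 4) evaluate reliability periodically.


  • Structured interviews are also reported to yield higher inter-interviewer agreement than unstructured ones (Goho & Blackman, 2006). A structured interview has been defined by the following four conditions (Edwards et al., 1990).
    • First, the interview content is developed from a job analysis.
    • Second, the same questions are put to every applicant.
    • Third, sample answers to the questions are provided to interviewers.
    • Fourth, the interview is conducted by a committee or panel of interviewers.
  • Eva et al. (2004a), analyzing data with generalizability theory, found that reliability was much higher when 12 interview rooms each had one interviewer asking a different question than when 12 interviewers rated together in one room, and concluded that raising reliability requires increasing the number of rooms with different questions. On this basis the Multiple Mini-Interview (MMI), which assesses applicants across many rooms each posing its own scenario, was first developed at McMaster medical school (Eva et al., 2004a). The MMI consists of short interviews of about 8 minutes held consecutively in about 10 rooms, in a format similar to the objective structured clinical examination. The MMI was introduced in Korea in 2008 (Roh et al., 2009). Its reliability is reported to be about 0.6 to 0.8 (Eva et al., 2004a; Lemay et al., 2007; Roh et al., 2009).
  • The type of question asked also affects interview reliability. Questions fall broadly into situational questions and accomplishment questions (Edwards et al., 1990).
    • A situational question asks how the applicant would act if placed in a given situation.
    • An accomplishment question asks how the applicant acted in a similar situation in the past and what resulted. It rests on the belief that past behavior is the best gauge of future behavior.
    • Accomplishment questions are recommended over situational questions for the sake of interview reliability.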
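
The contrast Eva et al. (2004a) draw between one room with 12 interviewers and 12 rooms with one interviewer each can be illustrated with a small decision-study calculation. The sketch assumes a simplified model in which a score is the sum of an applicant effect, an applicant-by-station effect, and residual rater error. The variance components are hypothetical values chosen for illustration, not figures from any cited study, and the function name is likewise made up.

```python
def g_coefficient(var_applicant: float,
                  var_applicant_station: float,
                  var_residual: float,
                  n_stations: int,
                  n_raters_per_station: int) -> float:
    """Generalizability coefficient for raters nested within stations.

    Station-specific error is averaged over stations only; residual
    rater error is averaged over every rating collected.
    """
    error = (var_applicant_station / n_stations
             + var_residual / (n_stations * n_raters_per_station))
    return var_applicant / (var_applicant + error)


# Hypothetical components: performance varies more from station to
# station (2.0) than between applicants overall (1.0).
components = dict(var_applicant=1.0, var_applicant_station=2.0, var_residual=1.0)

one_room = g_coefficient(**components, n_stations=1, n_raters_per_station=12)
twelve_rooms = g_coefficient(**components, n_stations=12, n_raters_per_station=1)
print(f"1 room x 12 interviewers: {one_room:.2f}")
print(f"12 rooms x 1 interviewer: {twelve_rooms:.2f}")
```

With these made-up components, the same 12 ratings yield a much higher coefficient when they are spread across stations, because adding raters to a single room does nothing to average out station-specific error.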


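E. How can the interview be made fair?


For an interview to be fair to all applicants, the interview assessment must first be reliable and valid. Downing (2004) argued that assessments on which important pass/fail decisions rest require a reliability of 0.90 or higher. Given how much the admission interview outcome matters to the applicant, very high reliability is clearly needed.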
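
To get a feel for the gap between the MMI reliabilities reported earlier (about 0.6 to 0.8) and the 0.90 threshold, one can apply the Spearman-Brown prophecy formula. This formula is not used in the source; it is brought in here as an assumption, and it holds only if added stations behave like the existing ones. The function names are illustrative.

```python
def spearman_brown(reliability: float, factor: float) -> float:
    """Predicted reliability when test length is multiplied by `factor`."""
    return factor * reliability / (1 + (factor - 1) * reliability)


def lengthening_factor(current: float, target: float) -> float:
    """How many times longer the test must be to reach `target`."""
    return target * (1 - current) / (current * (1 - target))


for current in (0.6, 0.7, 0.8):
    k = lengthening_factor(current, 0.90)
    print(f"reliability {current:.1f} -> lengthen by factor {k:.2f}")
```

Under these assumptions, an interview with a reliability of 0.6 would need about six times as many stations to reach 0.90, which shows how demanding the threshold is.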


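Every applicant has the right to be assessed without prejudice based on home region, alma mater, or grades. This can be achieved by interviewing without giving interviewers information about the applicant and by asking all applicants the same questions. But this alone is not enough.


Eva et al. (2004b) reported that agreement between faculty and community members rating the same applicant was lower than agreement among faculty or among community members. This implies that if a medical school interviews with a faculty-only panel, students who have the qualities that the community associates with a good doctor may be rated low. Using many interviewers is one good way to dilute each interviewer's bias, but this finding shows that interviewers from diverse backgrounds are just as necessary. The Calgary medical school, in running its MMI, allocated one third of interviewer places to faculty, one third to students, and one third to community members (Brownell et al., 2007).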



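Conclusion


The literature review showed that research on admission interviews abroad has a very long history. On several interview dilemmas that Korea is still only deliberating without actively seeking solutions, papers have been appearing in Academic Medicine since the 1970s. For example, Litton-Hawes et al. (1976) analyzed communication in medical school admission interviews and found that interviewers used time ineffectively and tended to rely early on the applicant's file.


In Korea, serious examination of the admission interview is lacking. Even a comprehensive survey of the interview formats in use in Korea has not yet been properly carried out. On the image of the doctor that society expects and that doctors themselves hold, there are only a few scattered studies, and no overall consensus has been reached. Interviewing well may be the beginning, and the most important stage, of producing good doctors. It is therefore necessary to begin, even now, actively seeking ways to raise the reliability and validity of admission interviews in Korean medical schools. Korea should also gather opinion on the good doctor that society demands, and authoritative bodies in medical education, such as the association of medical college deans or the medical education society, should study and propose the qualities of a good doctor to be assessed in interviews.


Structured interviews, and the MMI in particular, emerged as a way to raise interview reliability. The MMI interviews each applicant for more than an hour without requiring much of any one interviewer's time, and because it assesses with varied cases and interviewers, it allowed objective and fair assessment (Eva et al., 2004a). The MMI has begun to be tried in Korea, but many obstacles remain (Roh et al., 2009).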


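In Korea it is difficult to mobilize as many faculty members as McMaster or Calgary did.

  • McMaster medical school ran the test over 4 days for a group of 117 applicants. With 10 interviewers per day, a total of 40 interviewers took part.
  • Calgary medical school mobilized 81 interviewers to assess a total of 281 applicants over two days (Brownell et al., 2007).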
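
The two staffing figures above can be put on a common scale. The sketch below computes applicants per interviewer from the numbers quoted in the text; the class and field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MmiRun:
    school: str
    applicants: int
    interviewers: int
    days: int

    @property
    def applicants_per_interviewer(self) -> float:
        return self.applicants / self.interviewers


runs = [
    MmiRun("McMaster", applicants=117, interviewers=40, days=4),
    MmiRun("Calgary", applicants=281, interviewers=81, days=2),
]
for run in runs:
    ratio = run.applicants_per_interviewer
    print(f"{run.school}: {ratio:.1f} applicants per interviewer over {run.days} days")
```

Both schools needed roughly one interviewer for every three applicants, which is the scale of faculty commitment the text describes as hard to match.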

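It is also difficult to run interviews over several days, because of widespread concern that applicants assessed later will already have obtained information and prepared for the interview. Even if results do not differ, fairness may be disputed. In short, interviews in Korea must be done in a short period with few faculty. Creative ideas for raising interview reliability under these conditions are urgently needed.


Given that the interview is the representative method for assessing non-cognitive characteristics, its biggest obstacle is a philosophical factor: faculty members' desire to select academically outstanding students (Albanese et al., 2003). Simply insisting that interviews must assess non-cognitive characteristics will not easily change views that faculty have formed over a long history. The best strategy is to keep collecting and analyzing student data and to study systematically which interview content and formats are desirable.


In conclusion, interviews should assess the qualities of a good doctor that are hard to change through medical education. Applicant characteristics such as age, gender, grades, and non-verbal communication influenced interview outcomes, and interviewer scoring error had the greatest influence. Structured interviews conducted across multiple rooms, of which the MMI is the leading example, made interviews with high reliability and validity possible. For a fair interview, applicant information should not be given to interviewers, and interviewers from diverse backgrounds should be assigned. In Korea the history of research on admission interviews is still short. In particular, a more active research stance is needed on the image of the good doctor and on efficient yet reliable interview methods. The professors' desire to select excellent students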






This study aims to review the non-cognitive traits that should be assessed in admissions interviews for medical school applicants, with the goal of increasing the reliability of the admissions interview. The admissions interview is valued for its ability to assess noncognitive and nonteachable attributes of good doctors, especially those that cannot be evaluated with other admission assessment tools. Various characteristics of applicants, including age, gender, exam scores, and nonverbal communication, were found to have influenced the interview results. Bias from interviewers was a significant factor in the results of the interview. A structured interview in multiple stations, such as the Multiple Mini-Interview, showed the highest reliability and validity. To make the interview fair, no information about the applicants was provided to the interviewers and interviewers were recruited from different backgrounds. There have been few research papers on admission interviews in Korea. Active research on the qualities of good doctors and on effective and reliable admission interview methods should be encouraged. A strategy should be developed to overcome the philosophical obstacle that medical school professors want to admit academically excellent applicants.


Key Words: Reliability, Selection interview, Structured interview, Multiple mini-interview



Qualitative variables in medical school admission.

McGaghie WC.





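Personal qualities, character traits, life experience, and adaptive capacities are associated with effective professional life and work. Although it is widely acknowledged that qualitative factors are crucial to success as a medical student and as a physician, they are rarely considered when admission decisions are made. This essay covers the qualitative variables that should be considered in medical school admission and recommends using qualitative data in admission decisions. It closes with an agenda for research on medical school admission.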

Personal qualities, character traits, life experience, and adaptive capacities are all associated with effective professional life and work. Despite widespread acknowledgement that qualitative factors are crucial for success as a medical student and physician, the variables are rarely measured or considered when medical schools reach decisions about student admission. This essay examines the qualitative variables that medical school admission committees might consider when filling their classes, and it offers recommendations about using qualitative data for admission decisions. It concludes with an agenda for research on medical school admission.



Important Qualitative Variables

1. Character and integrity have been identified as the most sought-after attributes among candidates

2. There is consensus that breadth of knowledge, rather than narrow baccalaureate specialization, is an important selection criterion, as expressed forcefully in the 1984 GPEP report

3. Evidence of leadership is often sought among candidates for admission to medical school

4. Many medical schools, especially state-supported schools, have definite geographic preferences for student selection.

5. Applicant gender, race, and religious preference are qualitative variables now receiving explicit attention in ways that differ from many former admission policies.

6. Medical schools frequently consider applicants' work habits and motivation to study, although few schools are confident of their ability to evaluate these traits.

7. Students' personality and attitude have long been considered key qualitative variables, especially for screening applicants.

8. A personal orientation toward service is often mentioned as an important characteristic of medical students and practicing physicians. 

9. Closely allied to a service orientation is the frequently stated value of altruism, benevolent attention to the needs of others.

10. A broad yet significant category of qualitative variables concerns the physician's personal effectiveness: social competence, adaptability, personal manner, appropriate use of humor, and relations with others.







 1990 Mar;65(3):145-9.

Qualitative variables in medical school admission.



Perspectives on medical school admission.

McGaghie WC.






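This article sets out several issues the author considers important in medical school admission.

(1) Educational consequences 

In recent years, almost every student admitted to medical school has received the M.D. degree and obtained a license to practice.

(2) Economic consequences 

Given the high success rate noted in (1), students admitted to medical school can be regarded as having very secure economic prospects.

(3) Social consequences 

The admission decision is directly linked to the formation of a highly paid, high-status professional elite.

(4) Myth of the academic aptitude-achievement link

The link between students' academic aptitude for medical education (MCAT, GPA) and their achievement in medical school is weak.

(5) Class composition vs stated intention

Many medical schools pay lip service to the importance of students' character, motivation, and other personal qualities, but when it comes to selection they choose students with high grades in science courses and high MCAT scores.

(6) Selection =/= prediction

Admission officers and committees confuse selecting students with predicting their achievement.

(7) American core values

Two core values of American culture (self-reliance and competition) lead to norm-referenced measurement in all phases of education.

(8) Alternative definitions of merit

Beyond the traditional definition of what makes someone eligible for professional education, various alternatives are possible.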


This article is the author's formulation of important issues concerning medical school admission: that (1) in recent years, almost all applicants who have been admitted to medical school have obtained the M.D. degree and been licensed to practice; (2) given this high success rate, an accepted applicant's economic security is virtually guaranteed; (3) the admission decision contributes directly to the formation of a highly paid, high-status professional elite; (4) the link between students' academic aptitude for medical education and their achievement in medical school is weak; (5) schools pay lip-service to the importance of students' character, motivation, and other personal qualities but continue to select students with high grades in science courses and high MCAT scores; (6) admission officers and committees often confuse selecting students with predicting their achievement in medical school; (7) two core values in American culture (self-reliance and competition) encourage the use of norm-referenced measurement in all phases of education; and (8) there are alternatives to the traditional approach to defining eligibility for professional education.








 1990 Mar;65(3):136-9.

Perspectives on medical school admission.



Factors affecting the utility of the multiple mini-interview in selecting candidates for graduate-entry medical school

Chris Roberts,1 Merrilyn Walton,1 Imogene Rothnie,1 Jim Crossley,2 Patricia Lyon,1 Koshila Kumar1 & David Tiller3




Introduction

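Medical schools worldwide aim to select the best students and turn them into good doctors. Low attrition rates mean, however, that once admitted, most students become doctors regardless of their personal or professional characteristics. Selection is arguably the most high-stakes, highly stressful, and resource-intensive of all medical school assessments. It generally includes some measure of academic ability and an interview or letter to assess the candidate's personal qualities. Student marks reflect performance in previous education and are consistently reported as the best predictor of future performance in medical school.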

Worldwide, medical schools aim to select the best students into their programmes and consequently expect to produce good doctors. Low attrition rates mean, however, that, once admitted, most students graduate as doctors regardless of their personal and professional characteristics.1 Selection procedures are arguably the most high-stakes, highly stressful and resource-intensive of all medical school assessments. They generally include some measure of academic ability (the ‘marks’) and some measure of a candidate’s personability as assessed in an interview or letter.2 Student ‘marks’ reflect past performance over a number of years of previous education and are consistently the best predictor of future performance, whether at medical school, or, for example, in North American licensing examinations.2


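Good marks alone do not make a good doctor, and most medical schools try to assess candidates' values, commitment, and non-cognitive characteristics through some form of interview. However, the interview has shown little value in predicting future performance as a student or as a doctor, which undermines its fairness as an important part of admissions. Psychometric studies of interviews report highly variable results because definitions of reliability and research methodologies differ, and only a few studies have used a generalisability approach.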

It takes more than good marks to make a good doctor and most schools attempt to assess values and commitment and other important non-cognitive characteristics of candidates in some form of interview. However, the interview is of limited value in predicting anything about future performance, either as a student or as a doctor,2,3 which undermines its fairness as an important part of admissions procedures.3 Psychometric studies of interviews have produced highly variable results, largely because of differing definitions of reliability and differing research methodologies, and only a few studies have used a generalisability approach.3–6


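The multiple mini-interview (MMI) is a relatively new assessment tool that grew out of concerns about interview reliability. It borrows the OSCE format to avoid the problem of the long interview, in which a candidate's score is heavily influenced by bias arising from the limited interview content and the interviewer panel. The MMI challenges the notion that an interview can test 'stable qualities within candidates that have a high probability of occurring in an infinite range of contexts'. Because it samples more content and more independent interviewers than a single interview can, it allows more reliable generalisations about a candidate's behaviour.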

The multiple mini-interview (MMI)5 is a relatively new assessment tool which addresses concerns about interview reliability. It uses the objective structured clinical examination (OSCE) format, and so avoids the issues of the long interview (cf. the long case in clinical competence), where much of the observed mark of the candidate relates to biases from the limited interview content and the interviewer panel.3 It challenges the notion that the interview can test ‘stable qualities within candidates that have a high probability of occurring in an infinite range of contexts’1 by confirming the issue of context specificity.5,6 Because the MMI tests a larger sample of both content and independent interviewers than a single interview can, more reliable generalisations about a candidate’s behaviour can be made.


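Pilots of the MMI run outside the centre where it was developed have built their own frameworks of the non-cognitive characteristics preferred in entry-level students. In this study we assumed that we were measuring pre-professionalism, the entry-level counterpart of the professionalism that runs across the medical education continuum. The McMaster University MMI also claims predictive validity, reporting good correlations with later performance as a medical student and as a doctor.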

Pilots of MMIs conducted outside the original centre have developed their own frameworks to establish the preferred non-cognitive characteristics of entry-level students.7 In our study we assumed that we were measuring the behaviours of candidates that have been variously linked to frameworks of professionalism which cut across the medical education continuum,8 and have been called pre-professionalism, to reflect the potential of entry-level students for professionalism. The McMaster University (Hamilton, ON, Canada) MMI also claims predictive validity in that it correlates well with future performance as a medical student9 and as a doctor.10


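The aim of our study was to establish whether interviewers can make reliable and valid decisions when selecting candidates for a graduate-entry medical programme using a pre-professionalism framework and the MMI format. We also wanted to identify which features of the MMI are most useful.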

The aim of our study was to establish whether interviewers can make reliable and valid decisions about applicants when selecting candidates for entry to a graduate-entry medical programme, using a pre-professionalism framework and the MMI format. Secondly, we wanted to know which features of the MMI were most useful in guiding admissions committees to focus their resources in making robust decisions about candidates.



Methods  

Data came from a high-stakes admissions procedure. Content validity was assured by using a framework based on international criteria for sampling the behaviours expected of entry-level students. A variance components analysis was used to estimate the reliability and sources of measurement error. Further modelling was used to estimate the optimal configurations for future MMI iterations.



Results  

This study refers to 485 candidates, 155 interviewers and 21 questions taken from a pre-prepared bank. For a single MMI question and 1 assessor, 22% of the variance between scores reflected candidate-to-candidate variation. The reliability for an 8-question MMI was 0.7; to achieve 0.8 would require 14 questions. Typical inter-question correlations ranged from 0.08 to 0.38. A disattenuated correlation with the Graduate Australian Medical School Admissions Test (GAMSAT) subsection ‘Reasoning in Humanities and Social Sciences’ was 0.26.
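
The link between the 22% figure and the reported reliabilities can be checked with a Spearman-Brown style projection. The sketch below is an illustration, not the study's own variance components model: it assumes that all variance other than the candidate component behaves as error that averages out across questions, and the function names are hypothetical.

```python
def projected_reliability(single_question_share: float, n_questions: int) -> float:
    """Reliability of a score averaged over n questions (Spearman-Brown style).

    single_question_share is the proportion of score variance due to true
    candidate-to-candidate differences for one question and one assessor.
    ASSUMPTION: everything else is error that averages out over questions.
    """
    signal = single_question_share * n_questions
    noise = 1.0 - single_question_share
    return signal / (signal + noise)


def questions_needed(single_question_share: float, target: float) -> float:
    """Unrounded number of questions needed to reach a target reliability."""
    return target * (1.0 - single_question_share) / (
        single_question_share * (1.0 - target)
    )


share = 0.22  # from the Results: 22% candidate variance, 1 question, 1 assessor

for n in (1, 8, 14):
    print(n, round(projected_reliability(share, n), 2))

print(round(questions_needed(share, 0.8), 1))
```

Under these assumptions the projection gives roughly 0.69 for 8 questions and 0.80 for 14, in line with the figures reported above.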










Conclusions

In a high-stakes admissions procedure performed outside the original centre, on a large sample, using generalisability theory:

    • We confirmed that the MMI is a moderately reliable method of assessment.
    • We established the construct validity of the MMI by showing a small positive correlation with GAMSAT section scores for ‘Reasoning in Humanities and Social Sciences’ and ‘Written Communication’.
    • The largest source of identifiable measurement error relates to aspects of interviewer subjectivity, suggesting that further training of interviewers would be beneficial.
    • Applicant performance on one question did not correlate strongly with performance on another question, demonstrating the importance of context specificity when testing professional behaviours.
    • Multiple mini-interviews must have a sufficient number of questions for precise comparison for ranking purposes because of the size of the measurement error.
    • We demonstrated that a significant proportion of students with high GPAs and GAMSAT scores can fail an MMI.
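
A disattenuated correlation, such as the 0.26 reported against the GAMSAT ‘Reasoning in Humanities and Social Sciences’ section, is an observed correlation corrected for the unreliability of both measures. A minimal sketch of the standard correction follows. The GAMSAT section reliability used here is a hypothetical placeholder (the source does not report it), and the function names are illustrative.

```python
import math


def disattenuate(r_observed: float, rel_x: float, rel_y: float) -> float:
    """Correct an observed correlation for unreliability in both measures."""
    return r_observed / math.sqrt(rel_x * rel_y)


def attenuate(r_true: float, rel_x: float, rel_y: float) -> float:
    """Inverse of disattenuate: the correlation expected between noisy scores."""
    return r_true * math.sqrt(rel_x * rel_y)


mmi_reliability = 0.7     # 8-question MMI reliability, from the Results
gamsat_reliability = 0.8  # HYPOTHETICAL value, chosen only for illustration

# Observed correlation that would correspond to the reported 0.26
r_observed = attenuate(0.26, mmi_reliability, gamsat_reliability)
print(round(r_observed, 3))
print(round(disattenuate(r_observed, mmi_reliability, gamsat_reliability), 2))
```

With these placeholder values, an observed correlation of about 0.19 would correspond to the disattenuated 0.26; the actual observed value depends on reliabilities that are not given here.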


Further research is required into the construct and predictive validity of the MMI in order to justify its long-term use, and to establish the impact of training on measurement error through careful experimental design.




Med Educ. 2008 Apr;42(4):396-404. doi: 10.1111/j.1365-2923.2008.03018.x.

Factors affecting the utility of the multiple mini-interview in selecting candidates for graduate-entry medical school.

Abstract

CONTEXT:

We wished to determine which factors are important in ensuring interviewers are able to make reliable and valid decisions about the non-cognitive characteristics of candidates when selecting candidates for entry into a graduate-entry medical programme using the multiple mini-interview(MMI).

CONCLUSIONS:

The MMI is a moderately reliable method of assessment. The largest source of error relates to aspects of interviewer subjectivity, suggesting interviewer training would be beneficial. Candidate performance on 1 question does not correlate strongly with performance on another question, demonstrating the importance of context specificity. The MMI needs to be sufficiently long for precise comparison for ranking purposes. We supported the validity of the MMI by showing a small positive correlation with GAMSAT section scores.

PMID: 18338992


Can we improve on how we select medical students?

Patricia Hughes, MSc FRCPsych

Admissions Office, Hunter Wing, St George's Hospital Medical School, London SW17 0RE, UK

E-mail: p.hughes@shgms.ac.uk

 





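A sound admissions policy must be fair to society, by selecting people with the potential to become good doctors, and fair to applicants. Selection is not an exact science, but the available evidence should be used to do the best by everyone concerned. There is broad agreement that criteria wider than academic achievement should be used, yet in practice many medical schools weight pre-admission academic scores above all other considerations.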


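Using school grades as a measure of intellectual ability has drawbacks. One study found that a major determinant of A level results is social class, independent of ability. Some prospective medical students also concentrate on science subjects because very high marks are easier to obtain in the physical sciences than in the humanities. The belief that only exam results yield valid and reliable data is 'seductive but fallacious'. All selection instruments rest on subjective judgment, and each must be accountable to the rules of reason, fairness and public scrutiny. Once non-cognitive criteria are considered, a legitimate concern is that the many specialties of medicine need diverse skills, so the criteria must not be too narrow. If non-cognitive characteristics are assessed, we also want assurance that the assessment is reliable and still predicts personal character years later.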


Getting the right policy for admission to medical school is a balancing act: be fair to society by choosing people with the potential to be good doctors; and be fair to the applicants—that diverse group of people who for many reasons want to set out on the long road to a medical career. Selection is not an exact science but we must use what evidence we have to ensure that we do our best by all concerned. There is widespread agreement that we should select future doctors on wider criteria than scores of academic success1, 2, though in practice many medical schools have valued pre-admission academic scores at the expense of other considerations3. There are recognized drawbacks to the use of school exam performance even as a measure of intellectual competence. One study has shown that a major causal determinant of A level results is social class, independent of ability4, and some would-be medical students elect to focus on sciences for their school leaving exams because very high marks are more easily achieved in the physical sciences than in the humanities5. The conviction that only exam results give valid and reliable data has been trenchantly dismissed as a ‘seductive but fallacious’ belief in the precision of quantitative tests6. We are reminded that all selective instruments depend on subjective judgments and each must be accountable to the rules of reason, fairness and public scrutiny7. However, if we decide to consider non-cognitive criteria, a legitimate concern is that the many specialties of medicine need diverse skills and they must not be too narrow. We also want to be reassured, if we include noncognitive characteristics, that we can assess them reliably and that such evaluation can predict personal character over years of practice.


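The range of skills and personalities needed in medicine is wide, but some characteristics are required of every doctor. Besides sufficient intellectual ability, honesty, integrity and conscientiousness are at the heart of good practice. Helpfulness and willingness to cooperate also matter, and patients value interpersonal skills and empathy. The personal welfare of the profession is another consideration: doctors are more vulnerable than comparable professional groups to alcoholism, drug abuse and suicide. Burnout is common and carries a high cost for the individual, for colleagues and for the quality of service patients receive. Better support for psychologically vulnerable doctors may be one answer, but perhaps the ability to cope with stress should be evaluated at selection.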


While we need to maintain diversity of skills and personality, there are some characteristics which we demand in any doctor. Enough intellectual ability to do the job, plus honesty, integrity and conscientiousness, must be at the heart of good practice8. Helpfulness and willingness to cooperate come close behind8, while patients give high priority to interpersonal skills and empathy2. The personal welfare of the profession is another consideration9. Doctors are more vulnerable than comparable professional groups to alcoholism, drug abuse and suicide10, 11. Burnout is well recognized, and has a high cost for the individual, for colleagues and for the quality of service that patients get12. One answer may be better support for psychologically vulnerable doctors12, 13 (together with improved working conditions for all doctors), but perhaps we should try to evaluate ability to deal with stress right from the start.



ARE PERSONALITY CHARACTERISTICS STABLE OVER ADULT LIFE?


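If we look for desirable personal characteristics in medical students, can we be confident that they tell us anything about personality later in life? Studies that followed medical students for 15 to 30 years found that doctors who were psychologically well in middle age had, as students, good self-esteem, an open and flexible approach to life, a warm relationship with their parents, little anxiety or depression, and low anger under stress.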

If we seek to identify the personal characteristics we want in a medical student, can we have any confidence that they tell us anything about future personality or adjustment? Studies that assessed medical undergraduates and followed them up for between 15-30 years12, 14, 15 indicate that doctors who are psychologically well in middle age had good self esteem as students, had an open, flexible approach to life, enjoyed a warm relationship with their parents, and had little anxiety and depression and low anger under stress. 


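In contrast, doctors who were later vulnerable to substance abuse, suicide and burnout in middle age had significantly poorer psychological health as students. Other long-term studies report high test-retest correlations for personality traits over intervals of 6 to 45 years. These results point to substantial continuity of personality in adulthood: a stable tendency to be happy or unhappy, well or poorly adjusted.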

In contrast, doctors vulnerable to later substance abuse, to suicide and to burnout in middle age had significantly poorer measures of psychological health as undergraduates. Other long-term studies of stability of personality characteristics have shown that personality traits exhibit high test-retest correlations over intervals of 6 to 45 years16, 17, 18, 19. These findings signify a substantial continuity of personality disposition in adulthood, suggesting a stable tendency to be either happy or unhappy, well or poorly adjusted.



WHAT FACTORS GENERALLY PREDICT FUTURE JOB PERFORMANCE?


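Relevant research exists both within and outside medicine. Industry, which has invested heavily in this question, offers useful information. Measured in money, the most productive people are about 40% better than average and the least productive about 40% worse. Is medicine so different? Looking around, we all know who gets the work done and keeps up to date, and who does the minimum. These are not the only criteria for a good doctor, but they matter.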

There is relevant research both within medicine and outside it. Useful information comes from industry, where serious money has gone into finding out what makes a good professional20. They measure outcome in hard cash and find that the most productive people are about 40% better than average, while the least are 40% worse than average21. Is this too different from medicine to be relevant? Look around: we all know who gets the work done and keeps up to date, and who slips through life doing the minimum. These are not the only criteria for a decent doctor but they matter. 


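For complex work, the evidence consistently shows that the best predictor of effectiveness is some measure of mental ability or IQ, and that IQ matters more the higher one goes up the professional scale. At the highest managerial level it accounts for almost 70% of the variability in performance. Requiring evidence of high IQ, even in the form of exam results, is therefore justified. Prediction improves further when some other factors are added.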

There is consistent evidence that, for work involving complex tasks, the best predictor of effectiveness is some measure of mental ability or IQ, and the higher you go up the professional scale the more IQ matters. At the highest managerial level it accounts for almost 70% of performance variability22. So in demanding evidence of high IQ (even in the form of exam results) we have got something right. Predictability can be improved by including some measure of other factors. 


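The additional factors consistently found to improve prediction are integrity and conscientiousness, which do not correlate with IQ. The number or nature of outside interests adds nothing, years of education add little to predictive validity, and the number of courses a person has taken has no value. Previous job performance helps for those already in the profession but adds nothing at entry. Some of these findings are counter-intuitive because IQ overlaps with other factors: a quick learner will have performed well in a previous job, but that performance correlates so highly with IQ that it adds little to predictive validity.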

Further factors consistently found to add to prediction of performance are integrity and conscientiousness: these do not correlate with IQ23. No additional predictability comes from the number or nature of outside interests; years of education adds little to predictive validity; and the number of courses a person has been on is of no value (so much for how we measure ‘continuing professional development’). Previous job performance adds to prediction for those already in the profession, but adds nothing at entry. Some of these results are counter-intuitive: this is because IQ overlaps with other things. So a quick learner will have good performance in a previous job which will correlate so highly with IQ that it adds little to predictive validity20.



WHAT FACTORS PREDICT ACADEMIC FAILURE IN MEDICINE?


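Research on predictors in medicine is preoccupied with exam results: by far the largest number of papers examine predictors of passing exams. This may be justified given the economic and personal cost of students who start a medical degree but do not complete it, a loss generally reported at 8% to 10%. Most studies, however, define 'failure' broadly to include students who re-take an examination as well as those excluded from the course, so the predictors should be interpreted with caution. Although virtually all students were high achievers at school, from the top 0.4% to the top 10%, school and medical school exam scores do correlate. Some UK studies report that certain science A levels predict exam success, and studies outside the UK report similar associations. The association generally weakens later in the course and makes no difference to longer-term success or failure.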

The first thing that strikes anyone exploring the work on predictors in medicine is that we are obsessed with exam results: by far the largest number of papers examines predictors of passing exams. This may be justified because of the economic and personal waste of losing students who begin a medical degree but fail to complete, with loss from schools that select at entry, both in the UK and elsewhere, generally reported between 8% and 10%24, 25, 26, 27. However, most studies assess ‘failure’ in broad terms to include all students who re-take an examination, as well as those who are excluded from the course, so predictors should be treated with caution. Although virtually all students are high academic achievers at school, from the top 0.4%8 to the top 10%29, school and medical exam scores do correlate, with contribution to variability reported between 16%29 and 58%30. Some UK studies show that certain science A levels predict exam success, variously putting biology, chemistry or physics in prime place31, 32, 33, and research from outside the UK reports associations between performance in physical sciences and in medical exams34, 35, 36. Generally this association falls later in the course, with no difference to longer term success or failure37, 38, 39, 40.


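Non-academic factors also predict exam success or failure. Some researchers report that older students are more likely to fail exams, while others find no such difference. Several US studies found higher failure rates among women and ethnic minority students, and one school reported that students admitted through affirmative action were as likely to graduate as those admitted on traditional criteria. Proficiency in English matters for students whose first language is not English, and in the US the reading skills of disadvantaged minority students predicted academic success. Non-cognitive factors were stronger predictors for women and ethnic minority students than for white men: for women, interview ratings and previous relevant experience were more predictive than previous exam scores, and for ethnic minority students, locus of control and ability to self-evaluate were predictors.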

Non-academic factors also predict exam success or failure. Some researchers report that older students are more likely to fail exams36, 38, 41, but others have not found this42. Several US studies found higher failure rates among women and ethnic minority students, although most eventually graduate36, 38, 41, and one school reported that students admitted through affirmative action were as likely to graduate as those admitted by use of traditional criteria43. Proficiency in English is important for students for whom English is not their first language44, 45, and in the US, reading skills of disadvantaged minority students have been shown to predict academic success46. Non-cognitive factors are stronger predictors for women and ethnic minority students than for white men in the US. For women, interview ratings and previous relevant experience were more predictive than previous exam scores47, while for ethnic minority students, locus of control and ability to self-evaluate were predictors48, 49. One US study showed that different cognitive and non-cognitive factors correlate with academic success in different schools, so different cultures and teaching styles influence outcome50.


Some failure is inevitable, and a few students will always want to change career. Two medical schools have nonetheless shown that careful selection and good support can have a positive effect.

It has been argued that we cannot reduce loss further51, because some failure is inevitable and we cannot avoid a few students' wanting to change career. However, two medical schools have shown that careful selection and good support can have a positive impact. 

In the Newcastle study, low interview scores correlated significantly with later drop-out, whereas academic score at entry did not.

In Newcastle, New South Wales, for five years 50% of students were selected on academic marks alone but underwent a lengthy structured interview which was not used for selection. As a result, some students were admitted with very low interview scores. The remaining 50% were selected from a wider band of academic performance but scored high in interview. Analysis after ten years showed a significant correlation between low interview score and later drop-out but no correlation between academic score at entry and drop-out. Reasons for dropping out were academic failure or a variety of personal reasons, including lack of motivation for study or for medicine28. 


At McMaster, which also offers remediation, over one five-year period in a class of 100 students only one was excluded for academic failure, 3 changed careers and 8% received remedial help.

Another example of low drop-out comes from McMaster University in Ontario, which also invests heavily in selection and in addition offers ‘remediation’ for students having academic difficulty. In one five-year period in a class of 100 students, only one student was excluded because of academic failure, 3 changed careers, while 8% had remedial help52.



WHAT PREDICTS GOOD CLINICAL PERFORMANCE?


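  • Clinical performance is not generally predicted by pre-entry academic scores.
  • Neither age nor gender predicts clinical performance, and neither does previous study of the physical sciences.
  • There is evidence, however, that previous study of English and humanities correlates with better clinical performance. Some reports show an association with admission interview scores, although others find none.
  • In a school that evaluates applicants very carefully, empathy and motivation to be a doctor were found to be particularly important.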

Investigators looking for early predictors of what makes a good clinician generally use reports from clinical clerkships and from the house officer or intern year. However, we should note that drop-out will mean that some unsatisfactory students will have left before the house officer year. Clinical performance is not generally predicted by pre-entry academic scores1, 35, 53, 54, 55, 56, 57: the one report of correlation between matriculation scores and clinical performance noted that matriculation scores included 50% contribution from school teacher assessment58. Neither age nor gender predicts clinical performance, nor does previous study of physical sciences, but there is evidence that previous study of English and humanities correlates with better clinical performance5, 34, 59. There are some reports of association between clinical performance and admission interviews55, 56, 60, 61, although others reveal no correlation54, 58. In a school that carefully evaluates applicants, empathy and motivation to be a doctor were found particularly important in predicting both clinical and academic success62.


WHAT ARE THE MOST RELIABLE PROCEDURES TO ASSESS PREDICTORS OF FUTURE PERFORMANCE?


The best instrument currently available is the structured interview.

If we can agree that there are certain characteristics that we want to select in prospective doctors, what is the best way of doing this? Research shows that, if we want to add usefully to a measure of intellectual ability in predicting later job performance, our best instrument is the structured interview. While an unstructured interview adds about 8% to prediction of subsequent performance, the structured interview adds around 24%63.


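Psychometric tests of desirable personal characteristics do predict future performance, but their validity may be compromised when they are used for selection, because the 'right' answer is usually not hard to identify.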

Psychometric tests to measure desirable personal characteristics do predict future performance, but their validity may be compromised if they are used as a selection tool: the desired answer is not usually difficult to identify, and applicants who lack integrity are the most likely to manipulate the results64.

However, some schools have applied psychometric tests at the point of entry rather than using them to select, and have found correlation between these tests and scores given in interview65, 66. This suggests that a well conducted interview may give similar information and that, if constructed to assess desired characteristics such as conscientiousness or helpfulness, it will give a reasonably reliable evaluation20.


References from a previous tutor or employer can add to prediction. However, where the law allows an employee to sue over an adverse report, their value falls, and the referee's motivation is uncertain.

Character references from a previous employer or tutor have potential to add to prediction. However, legislative changes in the US in the 1980s meant that an employer giving an adverse report could be sued by the employee: as a result, the predictive validity of personal references in the US has fallen to almost zero20. 

The reliability of UCAS references in the UK may be similarly threatened. The motivation of the referee is uncertain: some tutors may feel their first loyalty to their student, others may feel compromised by recent data protection legislation that removes the confidentiality of previous years. 


One medical school in New Zealand has adapted the traditional reference system by writing to head teachers with specific questions, and requesting a rating of the candidate's qualities against the level the head teacher believes to be desirable in a doctor. The long-term predictive validity of this method has not been published, but the school believes it provides valid information and correlates well with other non-cognitive indices (and not at all with academic scores)35. 


Some schools assess interpersonal skills 'live' by observing performance in small groups.

Some schools, particularly those which do a lot of small-group work in the course, use an assessment of performance in small groups as a ‘live’ way to assess interpersonal skills29, 52. Evaluation of students in this setting correlates highly with interview scores, and is reported to predict both problem-solving ability and group interaction52.


WHAT CONSTITUTES CURRENT BEST PRACTICE?


In summary, the evidence is that we need to select students with good intellectual ability and that examinations, despite limitations, have some validity. For some candidates—e.g. older applicants, or those from disadvantaged social backgrounds—we may want to look for reliable measures of intellectual ability other than the traditional A levels. We seek individuals who are conscientious and have integrity, who are empathic and motivated to become doctors, and who are psychologically robust enough to enjoy a successful medical career. Some medical schools, mainly outside the UK, have already recognized best practice and have put great care and resource into their selection procedures, with well-planned structured interviews, focused reports from schools and evaluation of interpersonal behaviour. As detailed above, there is evidence that this investment is worthwhile in terms of the suitability of students selected, and economically in terms of student loss during the course.


AND WHERE ARE WE IN THE UK?


The greatest single barrier to a more careful selection process in the UK is the amount of resource that each school has to invest. At present, would-be medical students apply to up to four medical schools. All but four of the UK's present twenty-four medical schools interview about 500 to 1000 applicants for their five or six year MB BS courses. Many interviews are still unstructured, and not all schools require their interviewers to be trained. It is unusual for the interview to be more than 15 or 20 minutes, and while brief interviews may be reliable67 the validity of a 15-minute interview is doubtful68. The fact that many candidates are interviewed four times underlines the wastefulness of our present national procedure, but the cost to individual schools to improve radically would be prohibitive. Our present system does not offer society the best practice available: at present we almost certainly turn away people who would make good doctors and accept some who will be mediocre or poor. We could probably reduce loss from the medical course, and so save money and save personal distress among those who were allowed to make an unwise choice. We could also be more just to applicants, and begin the process of education by showing that we are very serious about the kind of personal qualities that we want in a doctor.


The Civil Service, the Armed Forces, and many business corporations have had selection boards for many years: the Civil Service believe these to be money well spent, and industry has gone further and demonstrated their cost effectiveness20. Those medical schools which invest heavily in their selection procedures admit that it is not cheap: on the other hand, it is not cheap to lose students unnecessarily or to employ a poorly motivated or unhappy doctor. There is a strong argument for pooling resources so that applicants get one good assessment instead of four poor ones. This does not preclude medical schools' maintaining individuality and some degree of choice, and candidates will continue to visit schools and attend open days. However, it is time that UK medical schools got together to collaborate in setting up a first-class selection process that is fair to society and fair to all those people who hope to be the doctors of tomorrow.







Hughes, P. (2002). Can we improve on how we select medical students? Journal of the Royal Society of Medicine, 95(1), 18-22.


The Acceptability of the Multiple Mini Interview for Resident Selection

Marianna Hofmeister, PhD; Jocelyn Lockyer, PhD; Rod Crutcher, MD




This study concerns an MMI used by Alberta's International Medical Graduate Program (AIMGP) in Alberta, Canada, to interview international medical graduates (IMGs) applying for family medicine residency.


Background and Objectives: This study describes and assesses the acceptability of the multiple mini interview (MMI) to both international medical graduate (IMG) applicants to family medicine residency training in Alberta, Canada, and also interviewers for Alberta’s International Medical Graduate Program (AIMGP), an Alberta Health and Wellness government initiative designed to help integrate IMGs into Canadian residency training. IMGs are physicians who completed undergraduate medical education outside of Canada and the United States. IMGs who live in the Canadian province of Alberta may obtain a limited number of government-funded positions for residency training by applying to AIMGP. 


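A 12-station MMI was designed to identify non-cognitive characteristics associated with professionalism. Family physicians and medical educators developed the scenarios, and applicants and interviewers were surveyed after the interview.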


Methods: A literature review and faculty and medical community consultation informed the development of a 12-station MMI designed to identify non-cognitive characteristics associated with professionalism potential. Clinical scenarios were developed by family physicians and medical educators. Applicant and interviewer posttest acceptability was assessed using surveys. Quantitative data were analyzed using descriptive statistics, and qualitative data were analyzed using content analysis and thematic description. 


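Interviewers reported high satisfaction and judged the scenarios pertinent to the Canadian family medicine context. Applicants and interviewers both felt that 8 minutes per station was enough, and applicants reported that the process was free from gender and cultural bias. Interviewers agreed that the MMI was a fair assessment.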


Results: Our research demonstrates evidence for applicant and interviewer acceptability of the MMI. Interviewers reported high levels of satisfaction with the time-restricted process that addressed multiple situations pertinent to the Canadian family medicine context. Applicants and interviewers were each satisfied that 8 minutes was enough time at each station. Applicants reported that they felt the process was free from gender and cultural bias. Interviewers agreed that this MMI was a fair assessment of potential for family medicine. 


Conclusions: Standardized residency selection interviews can be adapted to measure professionalism potential characteristics important to family medicine in ways that are acceptable to IMG applicants and interviewers.






Background

MMI

History of the MMI's development, and the evidence for its validity, generalizability and acceptability

The MMI is a multi-station interview with one interviewer rating candidates’ performance at each station. The MMI was developed at the Michael G. DeGroot School of Medicine at McMaster University in Hamilton, Ontario, Canada, and has been validated there,5,14,15 at the University of Calgary,6,16 in Australia,17 and in the UK.18 This interview instrument has demonstrated evidence for generalizability and validity in relation to future clinical and licensing examination performance as compared to traditional interview methods.5,7.9 Further, the MMI has established acceptability with members of applicant and interviewer stakeholder groups at the admissions level.5,16,17


The MMI is flexible enough to assess professionalism competencies; we developed stations in accordance with the four principles defined by the CFPC.

The flexibility of the MMI allows programs to select applicants whose behaviors best align with professionalism competency expectations. In this study, we developed an assessment in accordance with the College of Family Physicians of Canada’s four principles.6,19,20 The four family medicine principles are: 

      • the family physician is a skilled clinician, 
      • family medicine is a community-based discipline, 
      • the family physician is a resource for a defined practice population, and 
      • the patient-physician relationship is central to the role.20 

These principles provide a framework with which competencies, such as those outlined by the Accreditation Council for Graduate Medical Education (ACGME) (ie, professionalism, interpersonal and communication skills, and systems-based practice), can be addressed.


Professionalism: definition, developmental, context specific

“Professionalism potential” is derived from medical professionalism theory.21 The attributes associated with medical professionalism and professional behavior are of universal concern.22 Medical professionalism is conceptualized as developmental.5, 6 Professional behavior is context specific or situation dependent.23 This means that evidence of aspects of professional behavior in one challenging situation does not predict different aspects of professional behavior in another. It follows that professionalism potential for family medicine may be examined in multiple situations critical to best family practice using this new interview methodology.



Objectives

The acceptability of the MMI in IMG groups and the acceptability of the MMI for professionalism potential measurement in IMG individuals for family medicine residency selection have not been previously investigated. The objective of this research was to investigate the acceptability to family medicine interviewers and to IMG applicants themselves of an MMI designed to measure professionalism potential of IMG applicants.



Methods

The process of preparing for the MMI and constructing stations began in March 2006 with the formation of the AIMG MMI Committee. The committee included a family physician chair who oversaw the process and made final decisions relating to what characteristics would be examined in which scenarios. Ethics approval for the study was provided by the University of Calgary Conjoint Health Research Ethics Board.


MMI Development

Station construction was guided by a table of specifications based on previously examined characteristics, characteristics important to family medicine,24 and those associated with medical professionalism.21 The characteristics examined in previous interviews included ...

      • relationship-building skills, 
      • team skills, 
      • recognition of professional limitations, 
      • integrity, 
      • decision-making skills, 
      • problem-solving skills, and 
      • communication in caring relationships. 


The medical education literature pertaining to professionalism and desirable personal traits in medical practitioners was reviewed.24-26 A comprehensive list of characteristics that might be assessed using the MMI was constructed and circulated to decision makers. A structured formal inquiry through e-mail correspondence, meetings, and discussion was used to gather input from the family medicine residency program directors at the University of Calgary and the University of Alberta and from other community-based and academic family physicians. This information was used to construct content-specific situations that would enable each characteristic to be assessed. The AIMG MMI Committee ultimately developed station content, question probes, and background information.


MMI Development

Ten stations, each designed to measure one characteristic, presented situations the applicant might face in a family medicine residency.27 The characteristics tested were ...

      • teamwork, 
      • honesty, 
      • ability to accept feedback about one’s self, 
      • ability to accept self-limitations, 
      • caring and compassion, 
      • responsibility taking, 
      • time management, 
      • the ability to accept professional limitations, 
      • cultural sensitivity, 
      • motivation for family medicine, and 
      • goal setting. 

A sample station is shown in Table 1.


IMG Applicants

To qualify for the MMI, applicants were required to have completed the AIMGP’s entry requirements. These criteria include...

      • a passing score on the Medical Council of Canada Equivalency Examination, 
      • a passing score on the Medical Council of Canada Qualifying Examination Part 1, and 
      • proof of successful completion of undergraduate medical education in a medical school listed in the Foundation for Advancement of International Medical Education and Research (FAIMER) directory. 


In addition, applicants were required to pass all components of the AIMGP objective structured clinical examination (OSCE). 

Specifically they had to pass the minimum number of stations for the clinical skills component and exceed the benchmark scores on the communication, oral, and written English proficiency tests. 

Alberta International Medical Graduate applicants for family medicine residency training positions who exceeded the minimal pass level on the clinical skills OSCE were e-mailed an invitation to the family medicine MMI following notification of their success on the OSCE. After each MMI session, applicants were asked to complete the acceptability survey.


Interviewers

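      • Family medicine faculty and senior residents served as interviewers.
      • Interviewers had to attend a mandatory 2-hour training session 2 weeks beforehand.
      • Station information was provided 48 hours before the interviews.
      • All interviews took place at the University of Calgary.
      • Because research shows that interview scores may vary with interviewer characteristics, interviewers were assigned according to professional status and gender.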


Interviewers were family medicine faculty and senior family medicine residents at the University of Alberta and the University of Calgary, community physicians from both urban centers, and stakeholders from other medical community-related groups (ie, medical education, language education, and human resources). All of the interviewers participated in a mandatory 2-hour training session 2 weeks before the MMI. Interviewers then received their station information 48 hours before the interviews took place. All of the interviews took place at the University of Calgary. Because previous research has shown that applicant scores may be related to interviewer characteristics, interviewers were organized in tracks and stations according to their professional status and gender to minimize these effects on applicant scores.14,28


Interview Procedure

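      • One interviewer per station.
      • Two sets of stations were used, run over three sessions.
      • Applicants read the information posted at the door for 2 minutes, then interviewed for 8 minutes.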

The MMI uses a multi-station format that is similar to an OSCE. Each applicant moves through the same set of stations and is evaluated by a single interviewer at each station. More than one set of stations can be run at the same time. We used two sets of stations per session and ran three sessions in a single day. At each station, the applicant read the information posted on the door for 2 minutes and discussed his/her response with the interviewer for 8 minutes. After each session, applicants were invited to complete the applicant survey and at the end of interview day, interviewers were invited to complete the interviewer acceptability survey.
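The rotation described above can be sketched in a few lines of Python. This is only an illustration, not the study's actual scheduling procedure: the function names, the station count passed in, and the assumption that every circuit runs with a full set of applicants are all hypothetical.

```python
def rotation_schedule(n_stations, read_min=2, interview_min=8):
    """Return a list of (start_minute, {applicant: station}) per rotation round.

    Applicant k starts at station k and moves one station forward each round,
    so every applicant visits every station exactly once (hypothetical scheme).
    """
    slot = read_min + interview_min  # minutes spent per station visit
    schedule = []
    for rnd in range(n_stations):
        visits = {applicant: (applicant + rnd) % n_stations
                  for applicant in range(n_stations)}
        schedule.append((rnd * slot, visits))
    return schedule


def daily_capacity(n_stations, circuits_per_session=2, sessions_per_day=3):
    """Applicants seen per day if every circuit is full (an assumption)."""
    return n_stations * circuits_per_session * sessions_per_day


# Illustrative call with 10 stations; the station count is a parameter here.
schedule = rotation_schedule(10)
```

With one applicant per station, a circuit of n stations takes n × 10 minutes, and two parallel circuits over three sessions give the daily throughput returned by `daily_capacity`.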








Fam Med. 2008 Nov-Dec;40(10):734-40.

The acceptability of the multiple mini interview for resident selection.

Abstract

BACKGROUND AND OBJECTIVES:

This study describes and assesses the acceptability of the multiple mini interview (MMI) to both international medical graduate (IMG) applicants to family medicine residency training in Alberta, Canada, and also interviewers for Alberta's International Medical Graduate Program (AIMGP), an Alberta Health and Wellness government initiative designed to help integrate IMGs into Canadian residency training. IMGs are physicians who completed undergraduate medical education outside of Canada and the United States. IMGs who live in the Canadian province of Alberta may obtain a limited number of government-funded positions for residency training by applying to AIMGP.


Dropout Rates in Medical Students at One School Before and After the Installation of Admission Tests in Austria

Gilbert Reibnegger, DSc, Hans-Christian Caluba, Daniel Ithaler, Simone Manhal, Heide Maria Neges, and Josef Smolle, MD





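In academic year 2002–2003, medical education in Austria changed fundamentally: the traditional, discipline-oriented program became a modern, theme-based, diploma-granting curriculum. All three public medical universities in Austria adopted the reform, but each was free to settle the details according to its own strengths and preferences.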

In academic year 2002–2003, medical education in Austria changed in a fundamental way. The traditional, discipline-oriented study program was transformed into a modern, theme-based, diploma-granting curriculum with a timely, module-track structure. Although all three public medical universities in Austria (Medical University of Vienna, Innsbruck Medical University, and Medical University of Graz) adopted this reform in general, each university was free in establishing the details of its curriculum according to its specific strengths and preferences.



Background

The Medical University of Graz curriculum


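From the outset, the Graz curriculum integrates preclinical and clinical topics; early patient contact strengthens physical examination skills as well as social and communication skills. Training in scientific research has also been strengthened, and a newly designed "clinical year" is a further hallmark of the curriculum.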

The reformed curriculum at the Medical University of Graz 1 integrates preclinical and clinical topics from the beginning. Early patient contact strongly enhances training in physical examination skills as well as social and communication skills. In addition, better education in scientific research matters and a newly designed “clinical year” are the hallmarks of the new program. The curriculum is designed to be completed in six years.


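The first two semesters form the "first part of study" and cover mainly the basic natural sciences in a medical context. The "second part of study," years 2 through 5, covers the fundamentals of medical knowledge, normal and pathological function and morphology, and the various medical and clinical disciplines. Both parts are organized in theme-based modules of five weeks each. Of the 30 required modules, 25 are obligatory and 5 are elective.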

The initial two study semesters, the “first part of study,” are dominated by the basics of natural sciences in a medical context. The “second part of study,” years 2 through 5, is devoted to the fundamentals of medical knowledge, including normal as well as pathological function and morphology and the various medical and clinical disciplines. The first and second parts of study are organized in theme-centered modules lasting five weeks each. The modules are accompanied by vertical “tracks.” In tracks, specific knowledge and skills are taught during consecutive study years. Students choose 5 out of the required 30 modules from a broad offering of elective modules; 25 modules are obligatory for all students.


In year 6, students take part in the daily clinical routine at various training sites under the supervision of expert clinical teachers. They also spend five weeks of year 6 in a general practitioner's office.

In year 6, students participate in the daily clinical routine at different training sites and are constantly guided and supervised by expert clinical teachers. Additionally, during the course of year 6, students also spend five weeks in a general practitioner's office.


 

Medical school admissions in Austria


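Austrian universities have generally followed "open admission": anyone who successfully completes secondary school may enroll in any university program of their choice. In medicine, however, open admission produced quite unsatisfactory results. At the Medical University of Graz, for example, the number of new medical students varied between 600 and 800 per year, exceeding the school's capacity in both staff and facilities. Study conditions were therefore poor: frustrated students and faculty had little or no small-group teaching, most classes were mass lectures, and there was little bedside teaching. On average, students took 50% (three years) or more beyond the scheduled six years, and about half dropped out before graduating.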

In Austria, open admission to university studies has been the rule: Everyone successfully finishing secondary school education is generally entitled to be admitted to whatever university study she or he wants. In medicine, open admission led to particularly unsatisfactory consequences. For example, at the Medical University of Graz, the average number of new medical students varied between 600 and 800 per year, substantially exceeding capacities in terms of staff as well as infrastructure. Thus, study conditions were poor. Frustrated students and faculty made do with little or no small-group lecturing, a predominance of mass lectures, and little bedside teaching, among other limitations. On average, students exceeded the scheduled study time of six years by 50% or more, and approximately half of the students dropped out before reaching graduation.


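Austrian medical universities also admitted students from outside Austria. Historically, students from other countries, including EU member states, had to prove that they had been admitted to the same course of study in their own country. Under European law, however, citizens of all EU member states must be treated the same as Austrians when applying to Austrian universities, and in July 2005 the European Court ruled that Austria's policy on foreign students violated that law.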

Austrian medical universities also admitted students from outside Austria. Historically, students from other countries—including member states of the European Union (EU)—were admitted to an Austrian university only after they proved they had also been admitted to the same course of study in their country of origin. According to European law, however, citizens from all EU member states must be treated in the same way as Austrians when applying to Austrian universities. In July 2005, the European Court ruled that Austria's policy of foreign student admission to university studies violated European law.2 


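This ruling mattered especially for medical universities. In Germany, Austria's neighbor and a country sharing the same language, only 8,000 to 10,000 of about 30,000 applicants are admitted to medical school each year, so there was considerable concern after the ruling that the three Austrian medical universities would be flooded with German students. In response, Austrian law was amended immediately: most university programs kept open admission, but admission tests were introduced for selected programs, including the diploma programs in medicine and dentistry. The European Commission also allowed Austria, for five years from 2007, to regulate student quotas so that most places go to Austrians: 75% of places are reserved for applicants who completed secondary education at an Austrian school, 20% for citizens of other EU states, and 5% for applicants of other nationalities.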

This decision was particularly important for medical universities because of circumstances in Austria's neighboring country, Germany, which shares the same language as Austria. In Germany, only 8,000 to 10,000 of the approximately 30,000 applicants for the study of medicine are admitted each year. Therefore, after the court's decision, it was feared that the three Austrian medical universities would be overwhelmed by German students. To avoid this, Austrian law was changed immediately: While admission to most university study programs remained open for all applicants having completed secondary education, admission tests were introduced to regulate access for selected studies. Among the regulated studies were the diploma programs in human medicine and dentistry. Additionally, the European Commission issued a five-year moratorium in 2007,3 entitling Austria to regulate quotas of students until 2012 to ensure that the majority of openings are reserved for Austrian citizens. Seventy-five percent of openings are reserved for applicants who completed their secondary education at an Austrian school, 20% for citizens from other EU states, and 5% for applicants of other nationalities.


 

Medical University of Graz admissions

 

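In 2005, the Medical University of Graz faced a difficult situation: open admission in previous years had brought in far too many students, and the new curriculum introduced in 2002–2003 required more resources than the previous one. As a result, even students who had successfully completed the first part of study could not proceed immediately to the second part.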

In 2005, the Medical University of Graz faced an unfortunate state of affairs. Because of the open admission policy of previous years, there was an inordinate number of students enrolled in the diploma of human medicine program. Further, the new curriculum implemented in 2002–2003 required significantly more resources than the previous program. Under these circumstances, students who had successfully completed the first part of study could not immediately proceed with the second part because of a lack of resources.


 

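To resolve this, the university used the new legal situation to control the number of newly admitted students. Only 107 students were admitted in academic year 2005–2006, rising to 154 the following year and 282 the year after. In this way the Medical University of Graz cleared the backlog of students. Since 2008–2009, 340 to 350 students have been admitted per year, which is close to the upper limit of capacity.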

To resolve this situation, the university used the new legal situation to manage the numbers of new students entering the university very efficiently. Thus, in academic year 2005–2006, only 107 new students were admitted. In the two following years, the numbers were raised incrementally (154 in 2006–2007, and 282 in 2007–2008). By this measure, we successfully eliminated the backlog of students waiting to continue their studies. Since 2008–2009, 340 to 350 students have been admitted per year, representing the upper limit of capacity. This upper limit was consensually defined with the Federal Ministry of Science and Research on the basis of previous experience.



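Two procedures were used to reform the admission process. In the first, in academic year 2005–2006, all applicants (more than 1,000) were provisionally accepted for the first semester, which was delivered exclusively by distance learning over the Internet. The three first-semester modules were converted into electronic documents and offered through the Virtual Medical Campus Graz, a comprehensive Web-based learning platform developed earlier at the university. In January 2006, all provisionally accepted applicants had to pass a two-day selection procedure: a multiple-choice written test on the three modules on day 1, and on day 2 a further multiple-choice test of secondary-school-level knowledge of biology, chemistry, physics, and mathematics. The 107 highest-ranking applicants were fully admitted to continue their studies; all others were excluded.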

Two different procedures were applied in our efforts to reform the admission process. First, in academic year 2005–2006, all applicants (more than 1,000) were preliminarily accepted for an initial semester, which entailed exclusively distance learning via the Internet. The contents of the three modules of the first study semester were transformed into electronic documents and were offered to students online by means of the Virtual Medical Campus Graz. This is a comprehensive, Web-based learning platform which had been developed previously at the Medical University of Graz 4–6 to support teaching and learning. In January 2006, all preliminarily accepted students had to pass a two-day selection procedure. On day 1, there was a written assessment in multiple-choice (MC) format based on the students' knowledge of the three modules. On day 2, the students took an additional MC test further assessing their knowledge of biology, chemistry, physics, and mathematics on the secondary school level. The available admission openings were awarded to the 107 applicants ranking highest after both assessments. These applicants then were fully admitted to further study. All other applicants were excluded from continuing their study.
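The rank-and-select step described above can be illustrated with a short Python sketch. The source does not say how the two test results were combined or how ties were handled, so the plain sum and the tie-break by applicant ID below are assumptions, and the function name and sample scores are hypothetical.

```python
def select_top(scores_day1, scores_day2, n_places):
    """Rank applicants by combined score and return the admitted IDs.

    ASSUMPTION: combined score = day 1 score + day 2 score, and ties are
    broken by applicant ID. The source specifies neither rule.
    """
    combined = {a: scores_day1[a] + scores_day2[a] for a in scores_day1}
    ranked = sorted(combined, key=lambda a: (-combined[a], a))
    return ranked[:n_places]


# Hypothetical scores for four applicants competing for two places.
day1 = {"A": 50, "B": 70, "C": 60, "D": 40}
day2 = {"A": 45, "B": 20, "C": 30, "D": 60}
admitted = select_top(day1, day2, n_places=2)
```

In the 2005–2006 procedure, `n_places` would correspond to the 107 available openings.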


 

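The second procedure was introduced for academic year 2006–2007 and is still in use: the Medical University of Graz selects applicants on the basis of their performance on an admission test. The test was built on the day 2 test described above; it mainly covers secondary-school-level biology, chemistry, physics, and mathematics, and also assesses applicants' comprehension of scientific texts. The main rationale for a test focused on the natural sciences was the long-standing observation that, because Austrian secondary school education is so heterogeneous, many students admitted to medical school struggled.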

The second process was implemented for academic year 2006–2007, and it continues today. The Medical University of Graz employs a selection procedure based on an applicant's performance on a required MC test prior to admission. This test was built on the basis of the test used on day 2 of the previous admission test. It is based mainly on secondary-school-level knowledge of biology, chemistry, physics, and mathematics and further includes assessment of the applicant's comprehension of scientific texts. A major rationale for using an admission test focusing mainly on the natural sciences was the long-standing observation that, because of strong heterogeneities in Austrian secondary school education, many medical students faced massive difficulties—and hence, the largest risk to fail and to drop out of study—during the initial study semesters, which are dominated by these scientific disciplines.



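Experienced university faculty write the test items. The test is held every July, and only the best-ranked applicants are admitted. The admission test currently used at Graz resembles the admission procedures of some German medical faculties, and closer cooperation with those faculties is being considered.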

Experienced university faculty produce the test items. The admission test takes place in July each year during the holiday season of schools and universities. Those applicants who rank best on the admission test are admitted to study. Presently, the admission test is used only at the Medical University of Graz, but there are similar admission procedures at some German medical faculties (e.g., University Medical Center Hamburg–Eppendorf), and we are considering cooperating more closely with these faculties in the future.


 

Studying the effects

 

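Two effects were expected: shorter study times (students had been taking about nine years to complete the six-year curriculum) and a lower dropout rate (previously 50% or more).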

In summary, starting with academic year 2005–2006, a fundamental change in Austria's admission practice for medical studies caused leaders at the Medical University of Graz to implement sweeping reforms to their own admissions practices. Not only was the threat of becoming overwhelmed by German students removed, the university was for the first time able to adjust the number of fresh medical students according to the capacities available. Two major research hypotheses—and indeed hopes—accompanied the introduction of selective admission procedures: We expected that students' overlong study times (approximately nine years instead of six years as scheduled) as well as the absurdly high study dropout rates (50% or more) would be efficiently reduced.


We addressed the first of these research questions, namely, the effect of the change in admission practice on study progress rates, in a previous analysis.7 In the present investigation, we investigate the second important question mentioned above: Is there a measurable effect on dropout rate of the change in admission practice from open admission to active selection of students? How large is the putative effect? Do demographic variables such as students' nationality, age, and sex significantly modulate the putative effect?



Method

 

Participants

We included in the study all new students routinely enrolled in the new diploma human medicine program during the academic years 2002–2003 to 2008–2009. We excluded from the investigation students being admitted by any other route (e.g., students with prior credits from medical studies at the Medical University of Graz or elsewhere).


A total of 2,860 students

In total, we included 2,860 students for statistical analyses. Of these, 1,971 (68.9%) were openly admitted during academic years 2002–2003 to 2004–2005; 889 (31.1%) were admitted after passing an admission procedure during years 2005–2006 to 2008–2009.


The observation period differs by cohort

Data on study progress were accumulated from academic year 2002–2003 until the end of the winter semester in academic year 2009–2010 (February 28, 2010). Thus, the observation period varies among cohorts from the investigated academic years. Whereas students who were enrolled in 2002 and 2003 were observed for more than six years and thus were able to reach graduation during the observation time, the observation period for students who were enrolled in 2004 and later was shorter than the scheduled six years of the curriculum.


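Sex and age. Age was dichotomized at the third quartile, 20.89 years. The first, second, and third quartiles lay very close together, so the third was used as the cut point and everyone above it formed the "older" group.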

The study included 1,230 men (43.0%) and 1,630 women (57.0%). Age range was from 17.51 to 50.03 years (median: 19.69 years; first quartile: 18.92 years; third quartile: 20.89 years). As in our previous investigation,7 for subsequent analysis we arbitrarily dichotomized the variable “age at study entry” at the third quartile of 20.89 years. The only motivation for choosing this cut point was to compare younger and older participants; because the first, second, and third quartiles are very close, the third was taken to ensure a reasonable number of participants in the “older” group. Finally, 2,481 of the students (86.7%) were Austrians, 226 (7.9%) were Germans, and 153 (5.4%) came from other nations.
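A dichotomization at the third quartile can be sketched as follows. The ages are made up, the function name is hypothetical, and two details are assumptions because the source does not state them: the quartile interpolation rule and whether a student exactly at the cut point counts as "older".

```python
import statistics


def dichotomize_at_q3(ages):
    """Split ages into "younger"/"older" at the third quartile.

    ASSUMPTIONS: quartiles use the "inclusive" interpolation rule, and only
    ages strictly above the third quartile are labelled "older".
    """
    _q1, _median, q3 = statistics.quantiles(ages, n=4, method="inclusive")
    labels = ["older" if age > q3 else "younger" for age in ages]
    return q3, labels


# Hypothetical ages at study entry (years).
ages = [18, 19, 19, 20, 20, 21, 22, 25, 30]
q3, labels = dichotomize_at_q3(ages)
```

Cutting at the third quartile rather than the median puts roughly a quarter of the students in the "older" group, which is the effect the authors describe.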


Data were collected so that individual students could not be identified.

We gathered the deidentified data from information that is routinely collected about medical students' admission, dropout, and graduation dates and examination history, as required by the Austrian Federal Ministry of Science and Research. Because the data were anonymous and no data beyond those required by law were collected for this study, the Medical University of Graz's ethical approval committee did not require approval for this study.


 

Statistical methods

For dropout, what matters is not only whether a student drops out but also at what stage.

Phenomena such as students prematurely dropping out of a program are intrinsically time-dependent: Besides the question of whether or not a student drops out, it also matters when in the course of study this event occurs. Proper analysis of dropout, therefore, must include the time elapsing between a defined starting event (in our analysis, this is the date of enrollment) and the terminating event under consideration (the date of dropout) as a central variable. 


Methods such as ANOVA or regression are not appropriate. Students progress at different rates, and one cannot wait until every student has either dropped out or graduated.

Application of ordinary statistical methods, such as analyses of variance or regression techniques, frequently are not suitable in investigations of this type. First, study progress of participants may vary considerably, and one might be interested in drawing sound conclusions without waiting until all participants have either dropped out or reached graduation. Under reasonable circumstances, only a fraction of participants will experience the terminating event “dropout” within a given observation time, and—at least in principle—other participants may get lost from the observation for reasons other than dropout (e.g., graduation). This latter phenomenon is called censoring. Participants experiencing the defined termination event during the observation period carry full information for statistical analysis (“they have experienced the terminating event after a well-defined time interval”). Participants who do not drop out of study during the observation period nevertheless contribute important information, at least for the time period under observation (“they have not experienced the terminating event during a well-defined time interval”) but not thereafter.
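The distinction between observed dropouts and censored observations can be made concrete with a small Python sketch that turns one student's dates into a (time, event) pair. The end of observation (February 28, 2010) comes from the text; the function name, the record layout, and the example dates are hypothetical.

```python
from datetime import date

OBSERVATION_END = date(2010, 2, 28)  # end of the observation period per the text


def to_survival_record(enrolled, dropped_out=None, graduated=None):
    """Return (time_in_years, event) for one student.

    event = 1: dropout observed within the observation period.
    event = 0: censored, i.e. the student graduated or was still enrolled
               when observation ended.
    """
    if dropped_out is not None and dropped_out <= OBSERVATION_END:
        end, event = dropped_out, 1
    elif graduated is not None and graduated <= OBSERVATION_END:
        end, event = graduated, 0
    else:
        end, event = OBSERVATION_END, 0
    return (end - enrolled).days / 365.25, event


# Hypothetical students: one dropout, one still enrolled at the end.
dropout = to_survival_record(date(2002, 10, 1), dropped_out=date(2004, 10, 1))
still_enrolled = to_survival_record(date(2006, 10, 1))
```

Both records enter the analysis: the first contributes an event after about two years, the second contributes only the information that no dropout occurred during the time observed.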


In medicine, survival analysis is used in such cases. Differences in dropout rates were analyzed by admission procedure and by demographic characteristics.

In medicine, we meet situations of this type very commonly in survival studies. In these cases, the starting point very frequently is the date of diagnosis of, for example, a malignant tumor, and the terminating event might be the date of detection of tumor recurrence or metastasis or even death. Consequently, we analyzed the effects of open admission versus active admission procedure as well as of some selected demographic variables on dropout rates by statistical methods from the field of survival analysis.8


 

Here, we distinguish between nonparametric, semiparametric, and parametric methods. The product-limit approach by Kaplan and Meier 9 does not make any assumption concerning the underlying hazard function (“baseline hazard”) for the terminating event under scrutiny but estimates the cumulative probabilities of “survival” (for our purpose, this corresponds to “retention in study”) merely from the empirical data at hand. Thus, it is a nonparametric method. The proportional hazards method by Cox 10 also does not make any assumption about the baseline hazard; the effect of covariates, however, is modeled by a parameterized analytic expression. The model parameters are estimated from the data and allow, in a multivariate fashion, quantification of the relative predictive strengths of the variables included with regard to the terminating event. The Cox method is thus a semiparametric one. Finally, there are a host of parametric models which provide explicit mathematical models for the baseline hazard as well as covariate effects. These models assume one of several possible distribution models for the baseline hazard (e.g., exponential distribution, Weibull distribution, Gompertz distribution, and others) with adjustable parameters. If appropriate, such models allow the estimation of cumulative probabilities as a function of time by means of an explicit analytic expression.
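For orientation, the three families of methods can be written in standard textbook notation. The symbols below (S, h, h_0, beta, lambda, k, d_i, n_i) are the usual conventions and are assumed here; they do not appear in the excerpt itself.

```latex
% S(t): cumulative probability of retention in study up to time t
% h(t): hazard, i.e. the instantaneous dropout rate at time t
h(t) = -\frac{d}{dt}\,\ln S(t) = -\frac{S'(t)}{S(t)}, \qquad
S(t) = \exp\!\left(-\int_0^t h(u)\,du\right)

% Nonparametric (Kaplan-Meier): d_i dropouts among the n_i students
% still at risk at the i-th observed dropout time t_i
\hat{S}(t) = \prod_{t_i \le t} \left(1 - \frac{d_i}{n_i}\right)

% Semiparametric (Cox): baseline hazard h_0(t) left unspecified,
% covariate effects x_1, ..., x_p enter through a parameterized term
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \dots + \beta_p x_p\right)

% Parametric: an explicit form is assumed for the baseline hazard
h_0(t) = \lambda \quad \text{(exponential)}, \qquad
h_0(t) = \lambda k\, t^{k-1} \quad \text{(Weibull)}
```

In the Cox model, exp(beta_j) is the hazard ratio associated with a one-unit change in covariate x_j, which is the quantity reported in the results.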


 

We used the nonparametric product limit technique by Kaplan and Meier to compute the cumulative probabilities for retention in the course of study for student categories defined on the basis of several variables: mode of admission (open admission versus selection), sex, age, and nationality. Such cumulative probabilities are usually represented graphically by typical step functions decreasing from 1.0 to smaller values, as observation time progresses. We tested differences of cumulative retention probabilities among different categories by the generalized likelihood ratio method (Breslow χ² statistic).11 To visualize the time-dependent risk of experiencing dropout for students in defined categories, we computed smoothed hazard functions for dropout according to Müller and Wang.12 These smoothed hazard functions give the instantaneous probabilities that a participant will experience a terminating event at time “t.” Roughly, they represent the negative first derivative with respect to time of the cumulative retention probabilities, scaled by the retention probability itself (that is, the negative derivative of its logarithm). We employed the semiparametric proportional hazards model by Cox in order to study the combined effects of potential predictor variables in a multivariate manner and to identify the relative strength of each individual predictor variable in the context of all other variables.
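The product-limit estimate itself is simple enough to sketch in plain Python. The authors used Stata; the function below is only an illustrative re-implementation of the textbook estimator, with a hypothetical name and made-up data, and it follows the usual convention that students censored at a given time are still counted as at risk for dropouts at that same time.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the retention probability.

    times  : observation time of each student
    events : 1 if dropout was observed at that time, 0 if censored
    Returns a list of (time, retention_probability), one per dropout time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    retention = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for (u, e) in data if u == t]  # everyone leaving at time t
        dropouts = sum(tied)
        if dropouts:
            retention *= 1 - dropouts / at_risk
            curve.append((t, retention))
        at_risk -= len(tied)  # dropouts and censored both leave the risk set
        i += len(tied)
    return curve


# Hypothetical data: six students, three observed dropouts, three censored.
curve = kaplan_meier([1, 2, 2, 3, 4, 5], [1, 1, 0, 1, 0, 0])
```

Plotting the returned pairs as a step function gives the kind of decreasing retention curve described above.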



All statistical evaluations, including basic statistics for comparison of mean values and frequencies among different groups of students, were done using commercially available software (Stata Statistical Software: Release 11; StataCorp, 2009, College Station, Texas).










Results

Cumulative probability of dropout was significantly reduced in students selected by active admission procedure versus those admitted openly (P < .0001). Relative hazard ratio of selected versus openly admitted students was only 0.145 (95% CI, 0.106–0.198). 
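To see what the reported hazard ratio implies, the following Python sketch recovers the underlying log-scale coefficient and its standard error from the published numbers. It assumes the 95% confidence interval is a Wald interval, symmetric on the log scale, which is the usual form but is not stated in the source; the function name is hypothetical.

```python
import math


def log_hr_and_se(hr, ci_low, ci_high, z=1.96):
    """Recover the log hazard ratio and its standard error.

    ASSUMPTION: the interval was computed as exp(beta +/- z * se).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Values reported in the results: HR 0.145 (95% CI, 0.106-0.198).
beta, se = log_hr_and_se(0.145, 0.106, 0.198)

# Rebuild the interval from beta and se as a consistency check.
rebuilt_low = math.exp(beta - 1.96 * se)
rebuilt_high = math.exp(beta + 1.96 * se)

# Relative reduction in dropout hazard for selected students.
relative_reduction = 1 - 0.145
```

Under that assumption the rebuilt bounds agree with the published ones to three decimals, and a hazard ratio of 0.145 corresponds to a dropout hazard about 85% lower among selected students than among openly admitted ones.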


Among openly admitted students, but not for selected ones, the cumulative probabilities for dropout were higher for females (P < .0001) and for older students (P < .0001). Generally, dropout hazard is highest during the second year of study.



Conclusions

The introduction of admission testing significantly decreased the cumulative probability for dropout. In openly admitted students a significantly higher risk for dropout was found in female students and in older students, whereas no such effects can be detected after admission testing. Future research should focus on the sex dependence, with the aim of improving success rates among female applicants on the admission tests.





 2011 Aug;86(8):1040-8. doi: 10.1097/ACM.0b013e3182223a1b.

Dropout rates in medical students at one school before and after the installation of admission tests in Austria.

Abstract

PURPOSE:

Admission to medical studies in Austria since academic year 2005-2006 has been regulated by admission tests. At the Medical University of Graz, an admission test focusing on secondary-school-level knowledge in natural sciences has been used for this purpose. The impact of this important change on dropout rates of female versus male students and older versus younger students is reported.

METHOD:

All 2,860 students admitted to the human medicine diploma program at the Medical University of Graz from academic years 2002-2003 to 2008-2009 were included. Nonparametric and semiparametric survival analysis techniques were employed to compare cumulative probability of dropout between demographic groups.

RESULTS:

Cumulative probability of dropout was significantly reduced in students selected by active admission procedure versus those admitted openly (P < .0001). Relative hazard ratio of selected versus openly admitted students was only 0.145 (95% CI, 0.106-0.198). Among openly admitted students, but not for selected ones, the cumulative probabilities for dropout were higher for females (P < .0001) and for older students (P < .0001). Generally, dropout hazard is highest during the second year of study.

CONCLUSIONS:

The introduction of admission testing significantly decreased the cumulative probability for dropout. In openly admitted students a significantly higher risk for dropout was found in female students and in older students, whereas no such effects can be detected after admission testing. Future research should focus on the sex dependence, with the aim of improving success rates among female applicants on the admission tests.

PMID: 21694561 [PubMed - indexed for MEDLINE]







An analysis of the German university admissions system

Alexander Westkamp







According to German legislation, every student who obtains the Abitur (i.e., successfully finishes secondary school) or some equivalent qualification is entitled to study any subject at any public university. Given capacity constraints at educational institutions and the ensuing need to reject some applicants, this principle has long been reinterpreted as meaning that everyone should have a chance of being admitted into the program of his or her choice. In order to implement this requirement, places in those fields of study that are most prone to overdemand have been allocated by a centralized nationwide assignment procedure for over 25 years. 


In the first part of this paper, I analyze the most recent version of this procedure that is currently used to allocate places for medicine and three specialities (dentistry, pharmacy, and veterinary medicine). In the winter term 2010/2011, more than 56,000 students applied for one of the fewer than 13,000 places available in these four subjects, meaning that ultimately three in four applicants had to be rejected. What sets this part of my study apart from previous investigations of real-life centralized clearinghouses is the sequential nature of the German admissions procedure: In the first step, the well-known Boston mechanism is used to allocate up to 40% of the total capacity of each university among special applicant groups, consisting of applicants who have either obtained excellent school grades or have had to wait a long time since finishing school.


About one month later, all remaining places—this includes in particular all places that could have been but were not allocated to special student groups—are assigned among remaining applicants according to criteria chosen by the universities using the college (university) proposing deferred acceptance algorithm (CDA). Applicants belonging to special student groups, who were not assigned one of the seats initially reserved for them, have another chance of obtaining a seat in this part of the procedure. 
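The college-proposing deferred acceptance step can be sketched as follows. This is a textbook version under simplifying assumptions, not the German clearinghouse's implementation: the function name `college_proposing_da`, the two universities, and the three applicants are hypothetical, and the reserved quotas and university-specific criteria of the real procedure are not modeled.

```python
def college_proposing_da(college_prefs, student_prefs, capacity):
    """College-proposing deferred acceptance.

    college_prefs -- {college: [students in order of priority]}
    student_prefs -- {student: [acceptable colleges in order of preference]}
    capacity      -- {college: number of places}
    Returns {student: college} for every matched student.
    """
    rank = {s: {c: i for i, c in enumerate(p)} for s, p in student_prefs.items()}
    next_offer = {c: 0 for c in college_prefs}  # index of next student to ask
    filled = {c: 0 for c in college_prefs}      # offers currently held
    held = {}                                   # student -> college tentatively held
    active = True
    while active:
        active = False
        for c, prefs in college_prefs.items():
            # A college with free places offers them to its next-best students.
            while filled[c] < capacity[c] and next_offer[c] < len(prefs):
                s = prefs[next_offer[c]]
                next_offer[c] += 1
                active = True
                if c not in rank.get(s, {}):
                    continue                    # student finds c unacceptable
                current = held.get(s)
                if current is None or rank[s][c] < rank[s][current]:
                    if current is not None:
                        filled[current] -= 1    # earlier college loses the student
                    held[s] = c
                    filled[c] += 1
    return held


# Hypothetical example: two universities with one place each, three applicants.
matching = college_proposing_da(
    college_prefs={"A": ["s1", "s2", "s3"], "B": ["s1", "s3", "s2"]},
    student_prefs={"s1": ["B", "A"], "s2": ["A", "B"], "s3": ["A", "B"]},
    capacity={"A": 1, "B": 1},
)
print(matching)
```

Offers are only tentative until no university has both a free place and an applicant left to ask, which is what makes acceptance "deferred".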





Westkamp, A. (2013). An analysis of the German university admissions system. Economic Theory, 53(3), 561-589.


Abstract This paper analyzes the sequential admissions procedure for medical subjects at public universities in Germany. Complete information equilibrium outcomes are shown to be characterized by a stability condition that is adapted to the institutional constraints of the German system. I introduce matching problems with complex constraints and the notion of procedural stability. Two simple assumptions guarantee existence of a student optimal procedurally stable matching mechanism that is strategyproof for students. In the context of the German admissions problem, this mechanism weakly Pareto dominates all equilibrium outcomes of the currently employed procedure. Applications to school choice with affirmative action are also discussed.


Keywords University admissions · Matching · Stability · Strategyproofness · Complex constraints


Cutting costs of multiple mini-interviews – changes in reliability and efficiency of the Hamburg medical school admission test between two applications

Johanna C Hissbach1, Susanne Sehner2, Sigrid Harendza3 and Wolfgang Hampe1*





Results

The overall reliability of the initial 2009 HAM-Int procedure with twelve stations and an average of 2.33 raters per station was ICC=0.75. Following the improvement actions, in 2010 the ICC remained stable at 0.76, despite the reduction of the process to nine stations and 2.17 raters per station. Moreover, costs were cut down from $915 to $495 per candidate. With the 2010 modalities, we could have reached an ICC of 0.80 with 16 single rater stations ($570 per candidate).


Conclusions

With respect to reliability and cost-efficiency, it is generally worthwhile to invest in scoring, rater training and scenario development. Moreover, it is more beneficial to increase the number of stations than the number of raters within stations. However, if we want to achieve more than 80% reliability, a minor improvement is paid for with skyrocketing costs.

Keywords: 

Multiple mini interview; Cost-effectiveness analysis; Reliability; Optimization


Background

Admission to medical school is a field of feisty debate. Usually, measures of academic achievement and interview performance are used for admission decisions. Assets and drawbacks of these different approaches allude to psychometric properties and costs. School grades such as grade point average (GPA) and high stakes ability tests are usually easily administered, cost efficient and psychometrically sound but they disregard personality factors that might be crucial for a medical career (e.g. [1-3]). On the other hand, interviews have high face validity [4], but evidence for the reliability and validity of panel interviews is scarce.


The multiple mini-interview (MMI) with its multiple sampling approach is widely accepted by raters and candidates [5-7], and it is regarded as a comparatively reliable measure of non-cognitive skills [8]. However, reliability coefficients vary substantially depending on the target population, setting variables, study design, and methods used, which impedes the comparison of results. In undergraduate medical school selection, reliability measures obtained on the basis of generalizability method [9] ranged from 0.63 to 0.79 [10-13]. Most coefficients for nine station procedures with one or two observers per station lie around G=0.75.


Another concern specifically addresses the cost-effectiveness of MMI. The costs and the effort of faculty are essential for officials to refrain from introducing MMIs [10]. The expenses associated with such a procedure depend mainly on varying modalities of the process. Even though there is evidence that MMIs are more cost-effective than traditional panel interviews [6,14,15], costs are still high as compared to paper and pencil tests. Eva et al. report the costs of the actual process on the interview day (about $35 per candidate) but do not include the costs generated in the framework of project preparation and organization [6]. Rosenfeld et al. provided an overview of the time requirements for mounting multiple mini-interviews and traditional interviews [14]. To interview 400 candidates with the MMI procedure they calculated a maximum of 1,078 staff hours (278 staff hours for the organization and 800 observer hours). Additional costs of $5,440 arose from the creation of stations ($50 per station for three hours creation time), infrastructure, and miscellaneous expenses. If we assume an average hourly rate of $50 for their staff, then the total costs would be approximately $150 per candidate.
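The per-candidate estimate for the Rosenfeld et al. figures can be checked with a few lines of arithmetic. The numbers are those quoted in the paragraph above; the $50 hourly rate is the assumption stated there, and the variable names are illustrative only.

```python
organization_hours = 278
observer_hours = 800
hourly_rate = 50          # USD per staff hour, the rate assumed in the text
additional_costs = 5440   # USD: station creation, infrastructure, miscellaneous
candidates = 400

total_hours = organization_hours + observer_hours
total_cost = total_hours * hourly_rate + additional_costs
cost_per_candidate = total_cost / candidates

print(total_hours, total_cost, round(cost_per_candidate, 2))
# prints: 1078 59340 148.35
```

The result of about $148 per candidate is what the text rounds to approximately $150.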


In Tel-Aviv, Ziv et al. developed a medical school admission tool with MMI concepts (MOR) and found the inter-rater reliability of the behavioral interview stations was moderate [16]. The total cost of MOR process was approximately $300 per candidate but further information on the existing costs has not been provided.


In another study, costs of an Australian MMI procedure from 2009 were roughly AU $450 per candidate [17] – the costs reported, however, were mostly on candidates’ side, with airfares being the major factor.


Student selection at Hamburg medical school

In the 1990s, Hamburg Medical School conducted unstructured interviews for admission. Many faculty members were dissatisfied with this procedure, and the interviews were stopped within the scope of a change in federal law. With the introduction of a test in natural sciences for student admission in 2008 [18,19], the significance of psychosocial skills came to the fore. In March 2009, the faculty board decided to adopt the MMI format for a pilot test with a small number of candidates, aiming for a stepwise selection procedure in 2010: The GPA and HAM-Nat scores were applied to preselect candidates whose psychosocial skills were then assessed by the HAM-Int (“Hamburg Assessment Test for Medicine - Interview”).


The HAM-Int pilot (2009)

In a survey among the heads of clinical departments and members of the curriculum committees the following eight psychosocial characteristics received the highest ratings: integrity, self-reflection, empathy, self-regulation, stress resistance, decision-making abilities, respect, and motivation to study medicine. The participants of a faculty development workshop wrote the MMI scenarios, keeping the specified psychosocial skills in mind. These drafts were later discussed with psychologists and educational researchers and thereupon modified or rejected. Some of the defined skills were wide ranging or could not be validly tested (e.g. integrity). Therefore, it was impossible to achieve a word-for-word translation of scenario characteristics. In total, twelve five-minute stations were assembled for the 2009 circuit.


We found a relatively low overall reliability coefficient (ICC=0.75 for twelve stations and a mean of 2.3 raters per station) as compared to those reported in other studies [20]. This raised the question as to which actions would enhance the reliability of the multiple mini-interview. Uijtdehaage et al. [21] found that a few changes in the procedure improved the reliability from G=0.59 to G=0.71. The increase in reliability was mainly due to a rise in candidate variation. The authors argue that the change of venue (the interviews were conducted in a different building) may have made the procedure less intimidating and therefore less stressful for candidates.


The feedback of raters and candidates drew our attention to the parameters, i.e. scenarios, score sheets, and rater training, aimed at improving reliability. We compare the results from the 2009 pilot test and the 2010 procedure.


This paper focuses on two aspects of MMI improvement: fine-tuning and cost-effectiveness. Our research questions were: Did our actions to improve the procedure enhance overall reliability? Which is the most efficient and practicable way to reach satisfactory reliability?

Methods

Candidates

In 2009, applicants for Hamburg Medical School were asked to state if they preferred to take the HAM-Nat test or the HAM-Int. We used the HAM-Int pilot to award 30 university places on the basis of interview results (in combination with GPA). The remaining places were allocated by HAM-Nat results (in combination with GPA). Among the 215 applicants who preferred the interviews to the HAM-Nat test, those 80 with the highest GPA were invited. The others were assigned to the HAM-Nat test. In 2010, we felt prepared to test 200 candidates who were preselected by the HAM-Nat test and GPA. All candidates took the HAM-Nat test, and those with excellent GPA and HAM-Nat scores (rank 1–100) were admitted without further testing, while the next 200 were invited to take the interviews. One hundred and fifteen further places were available. All candidates gave written informed consent.


Procedure

All interviews of one year took place on a single day in parallel circuits and consecutive rounds. Interviewers remained at their station during the day. Candidates were randomly assigned to circuit and round. In 2010, the number of circuits was increased from two to four and the number of rounds from three to five. To preclude a leak of scenario contents, all candidates checked in at the same time in the morning in 2009. As candidates perceived the waiting period before the start of the interviews as being quite stressful, in 2010 all candidates checked in just before they started their interview cycle. We also provided the raters with personalized score sheets in order of appearance of candidates, which substantially improved the interview cycle. An overview of the changes made to the procedure is given in Table 1.

Table 1. Changes made to the procedure (2009 – 2010)


Stations

In 2009, twelve five-minute stations with 1.5 minutes change-over time were assembled. Actors experienced with objective structured clinical examinations (OSCEs) from the in-house simulated patients program were trained for six scenarios. We provided prompting questions for the interviewers for the other six stations.

As it had turned out to be challenging to write scenarios which reflected the eight different target variables, the steering committee decided to focus on a core set of three in 2010: empathy, communication skills, and self-regulation. In 2010, nine five-minute stations were assembled. Those four stations that appeared to have worked best in 2009 were refined and reused, and five new stations were developed with more time and effort spent into testing and revision. In total, five stations involved actors.


Score sheets

The 2009 scoring sheets comprised three specific items and one global rating on a 6-point Likert scale. The numerically anchored scale ranged from 0–5 points. The specific items reflected e.g. communication skills, the formal presentation of a problem, empathy or respect in a social interaction, depending on the main focus of the station. The global rating was meant to reflect overall performance, including aspects not covered by the specific items. As the two lowest categories were only used in less than 5% of the global ratings, we changed the scale to a verbally anchored, 5 point-Likert scale in 2010. The scale ranged from 1 (very poor) to 5 (very good). In a thorough revision of all score sheets, we included detailed descriptions of unwanted and desired candidate behavior as anchors at three points along the scale (very poor performance, mediocre performance and very good performance). Raters were encouraged to use the full range of scores.


Raters and rater training

Hospital staff volunteered to take part in the interviews. Raters were released from work for the interview day within the scope of their regular contracts to be involved in the process. Mixed-gender rater teams of at least one professional from the psychosocial department and one experienced clinician were randomly assigned to stations to include a broad spectrum of judgments. The rationale to do so originated from the fact that not all candidates encountered the same set of interviewers. We aimed to ensure that all candidates saw an equal number of men and women as well as of psychologists and physicians.

All raters received a general instruction to familiarize them with the MMI procedure. They were then grouped within their specific stations, discussed their scenario, and had several practice runs with simulated candidates (students) to standardize scoring between the parallel circuits. While in 2009 the rater training session of two hours was held just before interviews started, the training was extended to a four-hour session on the day preceding the interviews in 2010. While in 2009 interviewers rated the candidates' performance, we refrained from this practice in the following year as a result of the interviewers' feedback. They stated that it was too demanding to interview and to give a reliable rating at the same time.


Statistical analysis

Due to the naturalistic setting we have a partially crossed and nested design. Different sources of variability were estimated by means of a random intercept model with restricted maximum likelihood (REML) method. All analyses were conducted using IBM SPSS Statistics, Version 19.0.0 (2010).

As each candidate encountered all twelve or nine stations, respectively, candidates were fully crossed with stations but nested within circuit. Raters were nested within station and circuit as each rater was trained for one specific station. We constructed two different models. In the first model we examined the different sources of variability (random intercepts): candidate, station, rater, and candidate*station. The candidate effect reflects systematic differences in performance between candidates. The station effect represents systematic differences in station difficulty, while the candidate*station effect accounts for differences in the way candidates coped with the different stations. This effect is non-systematic and reflects a candidate specific profile of strengths and weaknesses with regard to stations. As raters remained at their station throughout the test, systematic differences in stringency (rater effect) could be estimated, while the rater*candidate effect (rater candidate taste) could not be separated from error. We apportioned all remaining variance to this term.

Corresponding to Generalizability Theory [22] we determined sources of measurement error by means of a multilevel random intercept model [23]. We took the ICCs as a G-coefficient for relative decisions as we included only those terms that affect the rank ordering of candidates. The reliability of the procedure is the proportion of variance attributable to candidates to total variance. As candidates were assigned to different sets of raters, systematic differences in rater stringency can have an effect on the ranking of candidates. Therefore, we adjusted for rater stringency as proposed by Roberts et al. [24] by including a fixed rater effect.

Unwanted sources of variability are due to the candidate specific station differences (Vcand*stat), namely candidate station taste, while systematic differences in station difficulty have no effect on the rank order, as all candidates encountered the same stations. All remaining residual variance was attributed to rater candidate taste (Vcand*rater). The following formula was used for the calculation of the overall reliability:

[Formula not reproduced here; see the MathML at http://www.biomedcentral.com/1472-6920/14/54/mathml/M1]
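Since the formula itself is not reproduced in this text, the sketch below assumes the standard generalizability-theory form for a relative decision, G = V_cand / (V_cand + V_cand*stat / n_stations + V_cand*rater / (n_stations × n_raters)), which matches the variance terms named above but may differ in detail from the authors' expression. The function name and the variance components are hypothetical; they are not the values estimated in the study.

```python
def g_coefficient(v_cand, v_cand_stat, v_cand_rater, n_stations, n_raters):
    """Reliability of the mean score over stations and raters.

    v_cand       -- variance between candidates (the wanted signal)
    v_cand_stat  -- candidate-by-station variance ("candidate station taste")
    v_cand_rater -- residual variance ("rater candidate taste" plus error)
    """
    error = v_cand_stat / n_stations + v_cand_rater / (n_stations * n_raters)
    return v_cand / (v_cand + error)


# Hypothetical variance components, for illustration only.
V_CAND, V_CAND_STAT, V_CAND_RATER = 0.10, 0.20, 0.30

nine_by_two = g_coefficient(V_CAND, V_CAND_STAT, V_CAND_RATER, 9, 2)
eighteen_by_one = g_coefficient(V_CAND, V_CAND_STAT, V_CAND_RATER, 18, 1)
print(round(nine_by_two, 3), round(eighteen_by_one, 3))
```

Both designs use 18 ratings per candidate, yet the design with more stations scores higher under this form, because additional stations shrink both error terms while additional raters shrink only the residual term.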

As a measure of inter-rater reliabilities (IRR) in the different stations we report intraclass correlations (ICC) for average measures (consistency) with two-way random effects.
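For readers unfamiliar with this coefficient, the following sketch computes the average-measures consistency ICC from a complete candidates-by-raters table using the usual mean-square formula (MS_rows − MS_error) / MS_rows. It is an illustration in plain Python rather than the SPSS routine the authors used, and the function name and ratings are hypothetical.

```python
def icc_consistency_average(scores):
    """Average-measures consistency ICC for a complete two-way table.

    scores[i][j] is the rating given to candidate i by rater j.
    """
    n = len(scores)          # candidates (rows)
    k = len(scores[0])       # raters (columns)
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows


# Hypothetical ratings of five candidates by two raters at one station.
ratings = [[4, 5], [2, 3], [5, 5], [1, 2], [3, 3]]
print(round(icc_consistency_average(ratings), 3))
```

Because this is a consistency coefficient, a rater who is uniformly stricter than a colleague by a constant amount does not lower it; only disagreement about the ordering or spacing of candidates does.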












 2014 Mar 19;14:54. doi: 10.1186/1472-6920-14-54.

Cutting costs of multiple mini-interviews - changes in reliability and efficiency of the Hamburg medical school admission test between two applications.

Abstract

BACKGROUND:

Multiple mini-interviews (MMIs) are a valuable tool in medical school selection due to their broad acceptance and promising psychometric properties. With respect to the high expenses associated with this procedure, the discussion about its feasibility should be extended to cost-effectiveness issues.

METHODS:

Following a pilot test of MMIs for medical school admission at Hamburg University in 2009 (HAM-Int), we took several actions to improve reliability and to reduce costs of the subsequent procedure in 2010. For both years, we assessed overall and inter-rater reliabilities based on multilevel analyses. Moreover, we provide a detailed specification of costs, as well as an extrapolation of the interrelation of costs, reliability, and the setup of the procedure.

RESULTS:

The overall reliability of the initial 2009 HAM-Int procedure with twelve stations and an average of 2.33 raters per station was ICC=0.75. Following the improvement actions, in 2010 the ICC remained stable at 0.76, despite the reduction of the process to nine stations and 2.17 raters per station. Moreover, costs were cut down from $915 to $495 per candidate. With the 2010 modalities, we could have reached an ICC of 0.80 with 16 single rater stations ($570 per candidate).

CONCLUSIONS:

With respect to reliability and cost-efficiency, it is generally worthwhile to invest in scoring, rater training and scenario development. Moreover, it is more beneficial to increase the number of stations instead of raters within stations. However, if we want to achieve more than 80% reliability, a minor improvement is paid with skyrocketing costs.

PMID: 24645665 [PubMed - in process]; PMCID: PMC3995077


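Academic Achievement of Medical Students by Subject According to Type of High School Attended

Department of Obstetrics and Gynecology, Eulji University School of Medicine; 1Center for Educational Development and Research, Eulji University School of Medicine; 2Department of Education, Korea National Open University

박원일, 전수경1, 정민승2

Korean Journal of Medical Education, Vol. 19, No. 2, 2007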



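Introduction

Predicting how students' characteristics at admission affect their later academic achievement is an important part of student guidance. With the recent shift to graduate-entry medical education, many Korean medical schools are converting to that system, but many others retain the existing premedical-medical course and admit high school graduates as freshmen. Korean high schools can be broadly divided into general high schools and special-purpose high schools, the latter represented by science high schools and foreign language high schools. Because graduates of special-purpose high schools make up a large share of entrants to Korean medical schools, analyzing how their academic achievement in medical school differs from that of general high school graduates is of considerable significance. However, no Korean study has been published that compares, subject by subject, medical school achievement by type of high school attended.


Among special-purpose high schools, science high schools in particular teach science subjects at a level that amounts to advance learning for high school, and this in turn amounts to advance learning for parts of the medical curriculum, especially some premedical and preclinical subjects. Unlike in primary and secondary education, the effect of advance learning in medical education is still unknown.

Because the United States has a graduate-entry system, many studies have examined how medical school achievement relates to whether students majored in the natural sciences or in the humanities and social sciences as undergraduates (Dickman et al., 1980; Koenig, 1992).

Studies have also compared the grades, in the corresponding subjects, of students who had already taken medical school subjects such as biochemistry and genetics as undergraduates with those of students who had not (Canaday & Lancaster, 1985).

Several studies have likewise examined the medical school achievement of students who received advance instruction in medical school subjects in preparatory programs taken before admission (postbaccalaureate courses) (Hojat et al., 1990; Smith, 1998).


Europe and Australia have systems similar to Korea's, in which students attend medical school for six years after high school. Because special-purpose high schools hardly exist there, no study has examined the effect of advance learning of medical school subjects in high school, but a small number of studies have examined how high school subject grades and university entrance examination scores relate to medical school achievement (Lipton et al., 1988; Frischenschlager et al., 2005).


Taken together, the foreign studies published to date show that before the 1980s, reports of a correlation between undergraduate advance learning and medical school achievement were mixed with reports of no correlation, which made it difficult to draw a conclusion (Hamberg et al., 1971; Thomae-Forgues et al., 1980). In the literature since 1980, however, the prevailing conclusion has been that prior undergraduate study of medical school subjects, and grades in those subjects, show little or no correlation with achievement early in medical school and no correlation at all in the later years (Canaday & Lancaster, 1985; Hojat et al., 1990; Koenig, 1992; Smith, 1998).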



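Subjects and Methods

Academic achievement was quantified as the final grade earned, on a 4.3-point scale. Because the medical school operates on a year-based system in which an F in a single subject means repeating the year, the grades from the year in which a student was held back were used for such students. Mean scores are given as mean ± standard deviation. To compare the three groups (general, science, and foreign language high schools), analysis of variance (ANOVA) was applied together with the Duncan post hoc test for multiple comparisons. Before analysis, normality was checked with the K-S test, and all variables were normally distributed.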
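The three-group comparison can be illustrated with a one-way ANOVA F statistic computed by hand. This is a minimal sketch in plain Python and not the authors' analysis: the function name and the grades are hypothetical, and neither the Duncan post hoc test nor the K-S normality check is included.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical grades on a 4.3-point scale for the three school types.
general = [3.1, 2.8, 3.4, 3.0]
science = [3.0, 3.3, 2.9, 3.2]
foreign_language = [3.2, 3.5, 3.1, 3.3]
f_stat, df_b, df_w = one_way_anova_f([general, science, foreign_language])
print(round(f_stat, 3), df_b, df_w)
```

The F statistic is then compared with the F distribution on (df_between, df_within) degrees of freedom to obtain a p-value.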


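Results

A. General characteristics

B. Comparison of academic achievement by subject in the premedical and preclinical courses

C. Comparison of academic achievement in lectures and clerkships in clinical subjects

D. Effects of sex, age, and type of high school on grades in genetics and the psychiatry clerkship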





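Discussion

A. Learning in high school and academic achievement in medical school

No foreign study has compared academic achievement by type of high school, but several have examined the correlation between high school grades and medical school grades (Frischenschlager et al., 2005; Lipton et al., 1988). Some of these argue that high school and medical school grades are closely related and others that they are not. Such differing results are to be expected, because admission methods, the ethnic and academic diversity of entrants, and the standing of medical schools differ greatly from country to country.


B. Comparison of academic achievement by subject in the premedical and preclinical courses

In the United States, many medical schools formerly preferred entrants who had majored in the natural sciences, on the vague assumption, unsupported by clear research findings, that students who had majored in the natural sciences, especially biology or chemistry, would have an advantage in medical school coursework; this tendency continues today. In 1968, Korman et al. were the first to argue that it is undesirable for medical schools to select only students who majored in the natural sciences as undergraduates and that students from diverse backgrounds should be selected (Korman et al., 1968). Gough later published a study comparing entrants' undergraduate grades in science subjects and their scores on the medical school admission test with their medical school grades (Gough, 1978). Unexpectedly, science grades correlated to some extent with grades in the first and second years of medical school but were unrelated to grades in the third and fourth years. Furthermore, according to several studies since 1980, students who as undergraduates took subjects very similar to preclinical medical school subjects, such as biochemistry, anatomy, embryology, and histology, showed no difference in academic achievement from students who had not (Canaday & Lancaster, 1985). The explanation offered was that students often chose natural science subjects in order to get into medical school rather than out of genuine interest or aptitude, and that once in medical school they neglected subjects they had already encountered as undergraduates. We believe this explanation applies equally to the results of the present study.


According to one study, high school grades accounted for only 12-14% of academic achievement in medical school (Lipton et al., 1988). For graduate-entry medical schools, undergraduate grades have been reported to account for 23% (Ferguson et al., 2002). This means that most of academic achievement depends on learning in medical school rather than on academic background. In addition, a student with excellent grades in one medical school subject was not especially likely to have excellent grades in other subjects; in other words, the research indicates that medical school subjects share little common core. It is therefore reasonable to think that ability in science or English in high school can help little with the many subjects studied in medical school.


In the present study as well, there was almost no correlation between type of high school and subject grades in premedical years 1 and 2, which come relatively soon after high school graduation. For example, in information and computing, a natural science subject, foreign language high school graduates scored slightly higher, although the difference was very small. In the preclinical course, too, grades in preventive medicine, which has a strong social science character, were higher among science high school graduates. Genetics, the only one of the seven premedical and preclinical subjects to show a statistically significant difference, is a subject entirely unrelated to advance learning at foreign language high schools, and we believe this difference can only be explained as a chance result. Overall, we conclude that differences in the subjects or hours completed at high school are not correlated with academic achievement in similar subjects in medical school.


One Korean study has reported results contrary to ours. It analyzed premedical subjects only and found that in premedical year 1 science high school students had higher grades than general high school students, particularly in science subjects, but that in premedical year 2 there was no difference by high school (Kim et al., 2002). Because both that study and the present one report results from a single medical school, differences in students' entrance scores, curricula, and assessment methods may account for the differing results. Indeed, the earlier Korean study included subjects such as mathematics, physics, chemistry, and biology, whereas our curriculum has no mathematics and teaches biophysics in place of physics, so we presume that curricular differences are the main cause.


C. Comparison of academic achievement in lectures and clerkships in clinical subjects

A result that differed from the study hypothesis was that achievement in the psychiatry clerkship was higher among foreign language high school graduates than among science high school graduates. The mean across the 13 subjects analyzed did not differ between science and foreign language high schools (2.99 vs. 2.98), but in genetics and the psychiatry clerkship, where the differences were statistically significant, foreign language high school graduates scored higher. One explanation is that most science high school graduates are male. The analysis showed that for genetics, sex as well as high school affected grades, so this explanation is plausible. In Korean medical schools, female students generally earn higher grades, and most students with poor records, such as academic warnings or repeated years, are male (Ahn et al., 2000; Kim et al., 2002). For the psychiatry clerkship, however, sex and age were unrelated to grades, and only high school had an effect. The most important unexpected result of this study is that in clinical subjects foreign language high school graduates earned lower grades in lectures but showed higher achievement in the clerkships for the same subjects. This can be said to conflict with the many recent studies finding that extensive study of a particular field in high school is unrelated to academic achievement in medical school.


In conclusion, having studied a particular field (science or foreign languages) more than ordinary students in high school does not raise academic achievement in similar subjects after entering medical school. Moreover, in clerkships, the part of the medical curriculum most similar to a physician's actual work, foreign language high school graduates showed better achievement than science high school graduates. The claim made in some parts of society that foreign language high school graduates should go on to humanities and social science programs is therefore weakly grounded. Because the number of subjects in this study was very small and the results come from a single medical school, it is difficult to draw firm conclusions. Similar studies that include larger numbers of special-purpose high school graduates are needed.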






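The Influence of Motives for Applying on Adaptation to Medical School

Office of Medical Education, Sungkyunkwan University School of Medicine

김지영, 손희정, 김태진, 최윤호, 김호중, 기창원, 김주희, 홍경표

Korean Journal of Medical Education, Vol. 16, No. 2, 2004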



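Introduction


Preference for particular occupations or fields of study is a social phenomenon that affects the allocation of human resources. For this reason, the studies of motives for applying to medical school conducted in Korea to date have focused mainly on what patterns these motives show and how they change over time.


The first systematic study of motives for applying to medical school in Korea was the 1983 survey of medical students nationwide by 이근태 et al. (1985). Subsequent surveys mostly targeted medical students in a particular region (강복수 et al., 1994) or at a particular school (유희정 et al., 1998), and in 1997 another nationwide survey was conducted (박정한 et al., 1999). More recently, studies have examined motives for applying to medical school as one of the factors influencing career choice after graduation (권성준, 2001; 김형준 et al., 2003).


These studies mostly presented a few motives, such as the humane practice of medicine, social status, and income, and asked students to choose the most important one. Their results show that over time the proportion of students choosing medical school of their own volition has increased, and with it the proportion choosing medicine for economic reasons (박정한 et al., 1999).


Since 2000, no nationwide study has been conducted that would show the general trend in Korean medical students' motives for applying. Given Korea's economic recession and rising youth unemployment, however, it can be presumed that the tendency for more students to enter medicine for income or social security has intensified.


In this study we seek to address these problems by classifying motives for applying on the basis of Marcia's identity status theory. Ego identity is the psychological state that tells a person who they are, what they can do, and how they should live (박아청, 1984). Erikson (1963), who first proposed the concept, defined ego identity as 'the capacity to experience one's self as having continuity and sameness, and to act accordingly.'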


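Subjects and Methods


B. Survey method

Individual interviews were conducted with all 51 students over three weeks, from March 16 to April 3, 2004. Two researchers divided the students between them, and each interview took the form of a one-to-one conversation between researcher and student during which a verbatim transcript was made. The interviews used a pre-prepared interview form and centered on personal background variables, motives for applying, satisfaction with school life, academic and everyday difficulties perceived by the student, and expectations and worries about the remaining years at university.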


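Results

A. Analysis of motives for applying

Analysis of the verbatim transcripts identified seven motives for applying: accidental choice, social security, sense of challenge, family pressure, sense of calling, modeling, and academic inquiry. The definition of each and typical responses are as follows.


B. Classification of motive types

These seven motives were grouped into four types on the basis of Marcia's (1966, 1980) identity status theory (Table Ⅱ). Interview forms and questionnaires based on Marcia's theory have mostly been developed for general American college students; they therefore focus on career exploration, a major developmental task of the college years, and include content on values and religious beliefs that reflects the American context. For this reason, doubts have been raised about whether it is appropriate to apply them to other cultures or groups (김정규, 1983; 박아청, 1994).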




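Fisher's exact test of differences in motive type by student characteristics showed that only the sex difference was significant; there were no differences by high school track or by type of admission (Table Ⅲ).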
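Fisher's exact test suits small samples such as this cohort of 51 students because it uses exact hypergeometric probabilities rather than a large-sample approximation. The sketch below handles only a 2×2 table, whereas the study's tables cross four motive types with each characteristic; the function name and the counts are hypothetical and are not the study's data.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = male/female, columns = two motive types.
p_value = fisher_exact_2x2(8, 2, 1, 5)
print(round(p_value, 4))
```

Tables larger than 2×2 require enumerating all tables with the same margins, which statistical packages do internally.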


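C. Classification of adaptation level

Adaptation level was classified according to the following three criteria, derived from the meaning of adaptation.

The first criterion is whether the student perceives a problem with adaptation. Students who, when asked 'How do you feel about your school life over the past two years?', indicated that they themselves saw a problem, with answers such as 'I can't say I've done well' or 'I wonder what I've been doing with my life', were classified as having an adaptation problem.


The second criterion is whether the student is suffering psychological distress because of such a problem. Students who, when asked 'Has anything been especially hard or difficult for you in school life?', expressed psychological difficulty, with answers such as 'Honestly, it's hard' or 'It depresses me to think I have to go on living like this', were classified as experiencing psychological distress.


The third criterion is whether the student expresses satisfaction with and expectations for medical school life. Students who, when asked 'What do you think of life in medical school?', expressed satisfaction with their present life and expectations for the future, with answers such as 'I think students at our school really have a lot of opportunities' or 'I'm really excited when I think about what I'll experience', were classified as showing satisfaction and expectation.


D. Relationship between motives and adaptation

Looking at the distribution of adaptation problem types by motive type, 3 of the 4 students with problems in the heteronomous type showed academic maladjustment, as did 8 of the 13 students with problems in the goal-confused type (Fig. 2).


E. Relationship between motives and academic achievement

To examine the effect of motives on academic achievement, we looked at whether mean grades in Introduction to Life Science, taken in the preceding year, differed by motive type.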




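Discussion

This study set out to identify medical students' motives for applying and to explore how those motives affect adaptation to medical school, including academic achievement. The seven motives identified through transcript analysis were grouped into four types on the basis of Marcia's identity status theory. Adaptation was divided into four levels according to the presence and severity of problems, and adaptation problems were classified as academic maladjustment or social maladjustment.


The implications of this study for medical education are as follows.


First, medical school admission policy needs to change. Admission policy has so far focused on securing students with excellent high school achievement. As a result, the admissions information that medical schools provide has concentrated on promotional content such as scholarships, facilities, and guaranteed careers after graduation, and information on the qualities needed in medicine and on the learning process of becoming a doctor has been hard to obtain.


A related point to consider is how changes in the admission system affect students' motives for applying. As Korea's university admission system has changed, medical school admission has also changed considerably. Changes within the six-year framework, such as cross-track application and early admission, have already begun, and an entirely new system, the graduate-entry medical school, is about to be introduced. These changes will in turn change the characteristics of applicants. More in-depth discussion is needed to anticipate and respond to them.


Second, career education for medical students needs to begin in earnest. Even now many medical students wonder whether they made the wrong choice. A few leave medical school before graduation and choose another path, but many stay on and continue to drift because of the expectations of those around them or fear of striking out on a new path.


The results of this study show that when students are clear about what they want from medical school, that is, in the goal-oriented and achievement-oriented types, neither identity confusion nor the adaptation problems that follow from it occur. Goal-oriented and achievement-oriented students are sufficiently motivated to learn and have strong expectations for medical school life. By contrast, students of the heteronomous and goal-confused types are unclear about what they want from medical school, so their motivation to learn is weak and they express no expectations for medical school life.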





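The Multiple Mini-Interview in the Selection of Entrants to a Graduate-Entry Medical School

Departments of 1Surgery, 2Pharmacology, 3Family Medicine, 4Anatomy, 5Microbiology, and 6Internal Medicine, and 7Center for Clinical Competence Development, Kangwon National University School of Medicine

노혜린1,7, 이희제2, 박승배1, 양정희3,7, 김대중4, 김상현5, 이승준6, 채기봉1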



서론

좋은 학생을 선발하는 것이 좋은 의사를 선발하는 시이기에 의학전문대학원의 선발면접은 매우 중요하다[1].

의학과 학생 선발면접에서는 불명예스러운 전문인이 될 가능성이 있는 사람을 걸러내는 작용도 크다[2]. 이를 통해 대학은 사회에 대한 공정성의 책임을 다하게 된다[3]. 선발면접은 의사가 되기를 원하는 다양한 그룹의 응시자에게도 모두 공정한 평가의 기회를 제공해야 한다[3].


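The form of admission interview mainly used in Korea is an oral interview in which several interviewers interview one or several applicants for a short time of about 10 minutes.


The drawback of this form of interview is that reliable assessment is difficult to secure, for several reasons.

  • The assessment takes place in a short time, and interviewers do not have enough time to review the interview items in advance.
  • Interviewers may begin the interview without properly understanding the assessment purpose of each item.
  • Without clear scoring criteria, the score for the same applicant may differ between interviewers.
  • The meaning of a question may change depending on how the interviewer asks it.
  • Interviewers or items may differ from applicant to applicant, which raises concerns about equity.

Consequently, in such interviews it is difficult to reduce score differences between interviewers, and the reliability of the assessment, and even the need for the interview itself, is called into question.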


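One of the main sources of error in interview assessment is the interviewer.

In particular, when interview results are used for important educational decisions such as selection and placement, rater-related reliability must be taken into account [4].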


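Recently, the Multiple Mini-Interview (MMI), an interview conducted in a format similar to the objective structured clinical examination (OSCE), was developed [5]. In an OSCE, students rotate through several examination rooms in turn within the allotted time and perform an assigned clinical task in each [6]. In the MMI, applicants likewise rotate through several interview rooms in turn and are interviewed by an interviewer on an assigned task in each [7,8]. Eva et al. [5] reported that they were able to achieve a reliability of 0.65 with the MMI.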


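Subjects and Methods

1. Subjects

2. Arrangement of interview rooms and interviewers

The most important principle set by our school's admissions committee was confidentiality. In line with this principle, the following detailed rules were adopted: complete all interviews within a single day; have the case-development professor and the interview item development committee lead case development while maintaining confidentiality; to keep confidentiality strict, avoid convening advisory panels for selecting or refining case domains; decide the interviewing professors and the interview cases on the day before the interview; conduct the pilot study on the evening before the interview; use only faculty members as interviewers; and do not use standardized patients.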

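3. Selection of assessment domains and topics

4. Development of cases and scoring sheets

Considering that the students were unfamiliar with medical situations, the cases were presented as simple medical situations or non-medical situations. The cases and questions were constructed so that the applicants' own genuine thoughts and character would emerge, rather than prepared answers. The emphasis was on drawing out active knowledge through logical thinking rather than on recalling memorized specialist knowledge. The situations were designed so that no applicant would be advantaged or disadvantaged by sex, age, or undergraduate major.


In addition, structured probing questions for interviewers to ask as follow-ups were developed for each case. Whereas the core questions were situational questions, the probing questions were written mainly as accomplishment questions, which ask about concrete outcomes from the applicant's own experience [9].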


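5. Interview procedure

6. Interview training

7. Pilot study

One case from each domain was selected from the developed cases on the day before the interview. A pilot study with these cases was conducted on the day before the interview, with eight second-year students who volunteered in 2007.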

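8. Data processing and analysis

Reliability was obtained as a generalizability coefficient using g string 4.1.0 (http://www.mcmaster.ca/perd/download), a Windows program for urGENOVA developed at McMaster University. Four sources of error were specified: interview room (Station, S), interviewer (Rater, R), type of scoring item (dichotomous scoring or global scoring) (Group, G), and item (Item, I).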
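
To make the idea of a generalizability coefficient concrete, the following is a minimal Python sketch for a much simpler design than the one analysed in the study: applicants crossed with stations only, one score per cell. It is not the urGENOVA analysis and it ignores the rater, item-type, and item facets. The function name `g_coefficient` and the layout of `scores` are assumptions made for illustration.

```python
from statistics import mean


def g_coefficient(scores):
    """Relative G coefficient for a crossed applicants x stations design.

    scores[p][s] is the score of applicant p at station s (hypothetical layout,
    one score per cell).
    """
    n_p, n_s = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[s] for row in scores) for s in range(n_s)]

    # Mean squares from a two-way ANOVA without replication.
    ms_p = n_s * sum((m - grand) ** 2 for m in row_means) / (n_p - 1)
    ss_res = sum(
        (scores[p][s] - row_means[p] - col_means[s] + grand) ** 2
        for p in range(n_p)
        for s in range(n_s)
    )
    ms_res = ss_res / ((n_p - 1) * (n_s - 1))

    # Variance components; a negative estimate is truncated to zero.
    var_p = max((ms_p - ms_res) / n_s, 0.0)
    var_res = ms_res  # applicant-by-station interaction confounded with error

    # Relative error shrinks as scores are averaged over more stations.
    denom = var_p + var_res / n_s
    return var_p / denom if denom > 0 else 0.0
```

In this single-facet case the coefficient coincides with Cronbach's alpha computed over stations. A multi-facet analysis such as the one described above partitions the error term further, so its result need not match this simplified value.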


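Results

1. Overall and domain scores

2. Correlation with other scores used in admissions

3. Correlations between interview domains

The total interview score showed moderate correlations (0.6 to 0.7) with the scores of every interview domain.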

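4. Reliability of the MMI

The reliability (generalizability coefficient) of the MMI was 0.791. The variance components for the interviewer, for the interviewer-by-item interaction, and for the interaction among interviewer, applicant, and item type were 0.0000.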

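5. Students' responses

1) Results of the 5-point scale survey

Students answered that they were satisfied with the MMI format (4.21±0.69). They found this method of assessment slightly unfamiliar (3.35±1.03) and were slightly more nervous than in other interviews (3.35±1.03).

2) Free-text responses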


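6. Interviewers' responses

1) Results of the interviewers' 5-point scale survey on the MMI

Interviewers answered that the interview time was appropriate (4.50±0.52) and that the cases were appropriate for the applicants' level (4.14±0.66).

2) Interviewers' free-text responses on the MMI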


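Discussion

An interview is a method of assessment in which the interviewer and the applicant converse in order to collect and evaluate the desired data or information [4]. Assessment by interview can therefore be swayed by the interviewer's preconceptions and by the applicant's characteristics and communication skills [5,10]. Separate effort is needed to make the assessment reliable rather than dependent on subjective judgment. Several studies have reported that structured interviews, which ask every applicant the same standardized questions, provide sample answers to the questions, train the interviewers, and use multiple interviewers, improve reliability [8,10,11].


The MMI is a type of structured interview that gives every applicant the same standardized cases, provides sample answers to the questions, and uses multiple interviewers. It goes beyond the conventional structured interview by using many cases, each for a short time, which raises reliability further.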


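At the Calgary medical school, the MMI was conducted with nine 8-minute interview rooms and scored on five assessment items using a 10-point scale [12]. According to a study based on generalizability theory, reliability was only 0.55 when 12 interviewers sat together in one interview room, but 0.85 when one interviewer sat in each of 12 interview rooms. In other words, using many cases and limiting the number of raters per case yields better reliability than using few cases with many raters. Eva et al. [5] proposed increasing the number of interview rooms with different cases in order to raise reliability.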
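
The reasoning behind "more rooms rather than more raters per room" can be sketched as a small decision-study calculation. The Python snippet below assumes a design in which raters are nested within stations, and the three variance components are illustrative values chosen for this example, not estimates from any of the cited studies. The function name `d_study` and its parameters are likewise hypothetical.

```python
def d_study(var_p, var_ps, var_prs, n_stations, n_raters):
    """Projected relative G coefficient, assuming raters nested within stations.

    var_p   : applicant (true-score) variance
    var_ps  : applicant-by-station interaction variance
    var_prs : applicant-by-rater-within-station variance plus residual error
    """
    # Station-specific variance is averaged only over stations; rater-level
    # variance is averaged over every rater-station pairing.
    rel_error = var_ps / n_stations + var_prs / (n_stations * n_raters)
    return var_p / (var_p + rel_error)


# Illustrative variance components (assumed, not taken from any study).
components = dict(var_p=1.0, var_ps=0.8, var_prs=1.2)

one_room_12_raters = d_study(**components, n_stations=1, n_raters=12)
twelve_rooms_1_rater = d_study(**components, n_stations=12, n_raters=1)
print(round(one_room_12_raters, 3), round(twelve_rooms_1_rater, 3))
```

Under this model, adding raters to a single room reduces only the rater-level term, while the applicant-by-station term is left untouched. Adding rooms divides both terms, which is why the same 12 ratings project a higher coefficient when spread over 12 rooms whenever the applicant-by-station variance is greater than zero.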


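The advantages of the MMI include that the applicant's abilities can be assessed from multiple angles and that an applicant who makes a mistake in front of one interviewer has a chance to recover in another room [5]. Further advantages are that the time each applicant spends being interviewed increases, whereas the time each interviewer spends interviewing does not; that interviewers can assess independently and on their own initiative, without being influenced by other interviewers; and that interviewers can thereby make reliable and fair assessments while their time burden is reduced.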


High-stakes assessments that determine pass or fail require a reliability of 0.90 or higher [13].
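
As a rough worked example of what this threshold implies, the Spearman-Brown prophecy formula can project how many stations would be needed to move from the reported reliability of 0.791, obtained with the three stations described in the abstract, to 0.90. This is only an approximation: it treats stations as interchangeable parallel measurements and ignores the other facets of the generalizability analysis. The function names below are assumptions for illustration.

```python
import math


def spearman_brown(reliability, length_factor):
    """Projected reliability when the test is lengthened by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


def stations_needed(current_rel, current_n, target_rel):
    """Smallest whole number of stations projected to reach target_rel."""
    # Spearman-Brown solved for the required lengthening factor.
    factor = target_rel * (1 - current_rel) / (current_rel * (1 - target_rel))
    return math.ceil(factor * current_n)


n = stations_needed(current_rel=0.791, current_n=3, target_rel=0.90)
print(n, round(spearman_brown(0.791, n / 3), 3))
```

Under these assumptions the projection calls for roughly two to three times as many stations as were used. This is in line with the interviewers' own suggestion that more stations are needed, although a formal decision study would be required to confirm the number.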


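It has been reported that applicant characteristics such as age and race can affect the outcome of an interview [14]. In this study, the total MMI score did not differ by age, major, or place of residence.


The total MMI score was slightly higher among female applicants.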


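MMI scores were not correlated with undergraduate grades, certified English test scores, or Medical Education Eligibility Test (MEET) scores. This result indicates that the MMI helped to exclude a halo effect from academic scores. It has been reported that when applicants' grades were disclosed to interviewers, the interviewers' ratings of personal qualities were affected [14,15]. For a fair assessment, interviewers should be given no information about the applicants [12].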


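For an interview to be highly reliable, interviewers need to score consistently against standardized scoring criteria. Previous studies have reported high inter-interviewer agreement for structured interviews [1,10,11].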


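It is not easy for raters in an interview to feel confident in their own ratings. According to one study, subjective confidence in a rating was proportional to the total interview time [10]. Raters were not confident in their ratings for interviews of 30 minutes or less. Pendleton & Wakeford [10] stated that an interview needs at least 1 hour and that the hour may be divided among several raters.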


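In other countries, interviews are conducted using interviewers from diverse backgrounds, such as professors, physicians, health care workers, community members, and students [7,11].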


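Until now, despite the importance of the interview, belief in its necessity has not been strong [17]. In previous studies, interview scores showed low or moderate correlations with scores in other admissions domains [5]. In this study, MMI scores were not correlated with undergraduate grades, certified English test scores, or Medical Education Eligibility Test (MEET) scores, which means that the MMI assesses domains that the other admissions components cannot.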


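It is widely held that the interview should assess domains that cannot be taught in graduate medical school [15].

  • Bullimore [18] stated that personal qualities such as honesty are formed by the age of 18 and should therefore be given weight in graduate medical school admission interviews.
  • Lowe et al. [19] stated that ethical knowledge, moral reasoning, and beliefs about ethical issues can readily be learned through the graduate medical school curriculum and therefore need not be assessed. In contrast, they stated that attitudes toward ethical issues and ethical sensitivity are innate and change little, so it is desirable to assess them with valid tests.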


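In our MMI, to assess the communication domain, applicants were asked to speak as if an imaginary person were in front of them. In other countries, standardized patients are sometimes used to assess the communication domain [5].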


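Applicants' responses to the MMI were very positive. In particular, they said that the kind and respectful attitude of the staff and interviewers throughout the interview process eased their tension and also gave them a favorable impression of the school.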


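Reiter et al. [20] found no difference in assessment results between applicants who had access to information about the MMI 2 weeks beforehand and those who did not, and concluded that releasing the cases in advance did not affect the assessment. Whether this finding can be generalized to Korea remains open to question.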

















Multiple Mini-Interview in Selecting Medical Students
HyeRin Roh,1,7 Hee Jae Lee,2 Sung Bae Park,1 Jeong Hee Yang,3,7 Dae-Joong Kim,4 Sang Hyun Kim,5 Seung-Joon Lee,6 and Gibong Chae1
1Department of Surgery, School of Medicine, Kangwon National University, Chuncheon, Korea.
2Department of Pharmacology, School of Medicine, Kangwon National University, Chuncheon, Korea.
3Department of Family Medicine, School of Medicine, Kangwon National University, Chuncheon, Korea.
4Department of Anatomy, School of Medicine, Kangwon National University, Chuncheon, Korea.
5Department of Microbiology, School of Medicine, Kangwon National University, Chuncheon, Korea.
6Department of Internal Medicine, School of Medicine, Kangwon National University, Chuncheon, Korea.
7Clinical Performance Center, School of Medicine, Kangwon National University, Chuncheon, Korea.

Corresponding Author: HyeRin Roh, Gibong Chae. Department of Surgery, Kangwon National University Hospital 17-1 Hyoja-3-dong, Chuncheon 200-701, Korea. TEL) 033-258-2306, CELL) 011-372-3621, FAX) 033-258-2169, Email: hyerinr@kangwon.ac.kr 


Abstract

Purpose

Selecting medical students through interviews seems difficult and the reliability of the results is one of the major concerns. The purpose of this study was to investigate the reliability and acceptability of the Multiple Mini-Interview (MMI) in selecting medical students of Kangwon National University.

Methods

Eighty-four applicants participated in the MMI, which consisted of three 8-minute stations with 9 checklist items and 3 global items. The 3 domains that we chose were motivation to become a doctor, communication and interpersonal skills, and ethical decision-making. We placed 2 interviewers in each room. The interviewers were chosen from our faculty. We analyzed the reliability of the MMI with urGENOVA for PC. We conducted a survey of these applicants and interviewers.

Results

The reliability was 0.791. Students answered that the interview was impressive and enjoyable. Students were also satisfied with the level and quality of the MMI cases. They described that they were evaluated objectively. Interviewers also responded positively. They stated that more stations and more efforts to develop the cases were needed to improve the reliability and validity.

Conclusion

The MMI was acceptable to our applicants and faculty. It is reliable for assessing medical school applicants in Korea. We should develop more stations and better cases to increase the reliability and validity of the MMI.

Keywords: Admission interview; Medical students selection; Generalizability; Multiple mini-interview.

