International Medical Education and Future Directions: A Global Perspective (Acad Med, 2006)

Ronald M. Harden, MD





Friedman argues that the early 21st century will be remembered not for military conflicts or political events but for the start of an entirely new era of globalization, in which the world is "flattening."

The beginning of the 21st century will be remembered, Friedman1 argues, not for military conflicts or political events, but for a whole new age of globalization—a “flattening” of the world.


Today, nearly every country holds three ambitions for higher education.

Today, almost every country has three ambitions regarding higher education.3 

  • The first is to widen access, that is, to have more students attend university.
    The first is to provide greater access—that is, to admit more students to a university. 
  • The second is to improve higher education so that it is competitive in the international market.
    The second is to improve higher education to compete in an international market. 
  • The third is to improve equity, so that students from disadvantaged groups are not penalized for their social, cultural, or ethnic background.
    The third is to increase equity—to offer university education to students disadvantaged because of their social, cultural, or ethnic background.

In 1968 Martin Luther King, Jr. argued that all life is interrelated: we are caught in an inescapable network of mutuality, and whatever affects one directly affects all indirectly. Education is no exception.

“It really boils down to this:” argued Martin Luther King, Jr5 in 1968, “that all life is interrelated. We are all caught in an inescapable network of mutuality, tied into a single garment of destiny. Whatever affects one directly, affects all indirectly.” Education is not exempt.



Factors Encouraging Internationalization


Globalization of health care delivery


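Some features of the international health care labor market are a cause for considerable concern, and the uneven distribution of health care workers is a near-universal problem. The outflow of physicians from developing countries has produced shortages in many parts of the world, and in nearly every Sub-Saharan country the physician-to-population ratio has stagnated or declined since 1960.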

Certain features of the international health care labor market have given rise to concern, and the maldistribution of health care workers is a near-universal problem.7 Migration of physicians from developing countries has created serious shortages of medical manpower in many parts of the world,8,9 and the physician to population ratio has stagnated or declined in nearly every Sub-Saharan country since 1960.




Government pressures


In Europe, for example, the "Bologna process" aims to have member states share learning outcomes, educational processes, and accreditation across European higher education.

In Europe, for example, the “Bologna process” has, as an objective, the creation of a European higher education area where learning outcomes, the educational process, and accreditation are shared by the member states.


In this top-down approach, higher-education institutions are regarded as instruments of government rather than as autonomous actors. The communiqué of the 2001 Prague conference stated that higher education is a public good and that governments are the agents in society responsible for providing public goods.

In this top-down approach, higher-education institutions are seen as instruments of government policy, not as autonomous actors. The communiqué from a conference held in Prague in May 2001 stated explicitly that “higher education is perceived as a public good and governments are the agents in society that are responsible for providing public goods.”11



Improved channels of communication


In response to the need to communicate and exchange views on the many exciting developments in medical education around the world, a meeting on the assessment of clinical competence was held in Ottawa in 1985. Jake Epp, Canada's Minister of National Health and Welfare, expressed the meeting's aspiration: he hoped it would encourage international standards of medical education and thereby improve health and health care worldwide. A series of Ottawa conferences followed. The eighth was hosted by the NBME, and the 12th, held in New York in 2006, drew about 1,000 participants from 55 countries. AMEE and APMEC hold comparable meetings.

In response to a perceived need to communicate or exchange views about the many exciting developments in medical education occurring around the world, an international meeting was held in Ottawa in 1985 on the assessment of clinical competence. Jake Epp, the Minister of National Health and Welfare for Canada, defined the aspirations of the meeting: “It is my hope that this meeting will encourage the development of international standards of medical education which will lead to further improvements in health care and health care delivery around the world.”12 There followed a series of international Ottawa conferences.
    • The eighth conference was hosted by the National Board of Medical Examiners in Philadelphia; and the 12th conference, in May 2006 in New York, attracted about 1,000 participants from 55 countries.
    • The annual international meeting on medical education organized by the Association for Medical Education in Europe (AMEE) now attracts about 2,000 participants from more than 80 countries.
    • The third Asia Pacific Medical Education Conference in Singapore in 2006 attracted more than 400 participants from the region.


The development of a common vocabulary


IIME recognized the importance of a common terminology and developed and published a glossary.

The importance of common usage of terms in medical education was recognized by the International Institute of Medical Education (IIME). This led to the development and publication of a glossary.14


There is also a relatively new online resource (MedEd Central).

A relatively new online medical education information source and glossary, “MedEd Central” (www.mededcentral.org), has been developed by AMEE in collaboration with MEDINE, the European Union (EU) medical education thematic network.


Outcome-based education and standards



Scotland

In Scotland, the five medical schools published learning outcomes that have now been adopted in a number of other countries in Europe.16 

Canada

The competencies and learning outcomes set out by the Royal College of Physicians and Surgeons of Canada17 and

ACGME

the Accreditation Council for Graduate Medical Education in the United States18 have also had a significant international impact. 

IIME

IIME19 has identified the “Global Minimum Education Requirements” with the express purpose of defining the minimum competencies that all physicians must have regardless of where they receive their general medical education or practice.


WFME

The World Federation for Medical Education20 has played an important role in internationalization of medical education through its development, publication, and dissemination of standards in basic medical education, postgraduate medical education, and continuing medical education.


The European Credit Transfer System (ECTS) was introduced to promote mutual recognition. Acceptance of the principle of "equivalence" in higher education within Europe has added to the pressure toward globalization. National policies on the Europeanization, internationalization, and globalization of higher education, and of medical education in particular, nevertheless vary.

The European Credit Transfer System was introduced to encourage a greater cross-recognition. Within Europe, the acceptance of the principle of “equivalence” of higher education programs has contributed to the pressures for globalization and the move toward a European higher education space.21 However, there have been diverse national policies regarding europeanization, internationalization, and globalization of higher education, particularly in medical education; Finland is often identified as widely accepting the supernational policies, and the United Kingdom and Greece are often considered least supportive.


FAIMER 

The Foundation for Advancement of International Medical Education and Research24 has had a significant impact in this area over the past five years through its faculty development programs; it is hoped that these efforts will help increase production of physicians in countries where there is an undersupply.



Competitiveness and commercialization



Indeed, the IAU found competitiveness to be the most important factor driving the internationalization of higher education. Universities have discovered international students as an additional source of income. In many countries, international activities are increasingly seen as a source of commercial advantage; in Australia, for example, education is now the nation's ninth largest export industry.

Indeed, the IAU6 found competitiveness to be the most important factor driving internationalization in institutions of higher education, a major shift from earlier findings. Universities have discovered international students as an alternative source of income.25 In many countries, a shift has occurred from seeing international activities as aid to perceiving them as providing commercial advantage. In Australia,26 for example, education is now recognized as the nation’s ninth largest export earner and an industry of 4.2 billion (Australian) dollars.




Views of Internationalization in Education





The student



UNESCO's 1997 data confirmed that students from poor or developing countries tend to choose to study in relatively wealthy countries, either in search of higher-quality higher education or to gain access to the host country's labor market. Broadhead and Muula pointed out the problem with sending students abroad in this way: sending young students away right after secondary education risks their not returning once they qualify. A telling illustration is the 1980s joke, close to the truth, that there were more Malawian doctors practising in Manchester than in the whole of Malawi.

The UNESCO 1997 data21 confirmed that students from relatively poor or developing countries tended to opt for study in a relatively rich country, hoping to gain access to a more advanced quality of higher education and possibly access to the labor market of the host country. The problem of sending students from developing to developed countries for their education, however, has been stated by Broadhead and Muula27: “Sending young students abroad immediately after their secondary education and in their formative years risks their not coming back when they qualify. This proved to be the case. By the 1980s, the joke, ironically true, was that there were more Malawian doctors practising in Manchester than in the whole of Malawi.”


Student mobility also occurs between developed countries. Many studies show, however, that studying abroad does not make students more internationally minded, nor does a short period of study abroad make them friendlier toward the host country.

This movement of students also takes place between developed countries that are more or less on equal terms. Many research projects, however, have shown that students become neither more internationally minded nor friendlier to their host country during a short period of study abroad.21



Many U.S. citizens go abroad to study medicine and return to the United States to practice. Currently, about 25% of residents and 25% of practicing physicians in the United States earned their medical degrees outside the United States or Canada.

Many U.S. citizens study medicine abroad and return to the United States to practice.28 Currently, approximately 25% of all residents and 25% of practicing physicians in the United States obtained their medical degrees outside the United States or Canada.


An estimated 160,000 students from China went abroad between 1978 and 1999. In 2000, the number of foreign students in China was about 44,711.

In China,29 it is estimated that 160,000 private students went abroad between 1978 and 1999 and that they studied in 103 countries. China has also tried hard to attract foreign students. In 2000, the number of foreign students in China was estimated as 44,711; the students came from 164 countries, with 71% of the students from Asia, 14% from Europe, 11% from America, and 3% from Africa.


The International Medical University in Kuala Lumpur offers a new model of international collaboration: students study in Malaysia for the first two and a half years, then transfer to one of 22 partner schools and receive their degree in that country.

The International Medical University in Kuala Lumpur offers an interesting model of international collaboration. Students complete the first two and a half years of their study in Malaysia. They then transfer to one of 22 partner schools in Australasia, Europe, and North America and are awarded the medical degree of the university in the country where they satisfactorily complete their training.



The teacher


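From 2000 onward, large numbers of English-language textbooks were introduced into China, and in 2002, 10 of China's most famous universities decided to import almost all of the textbooks used at Harvard, Stanford, and MIT.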

From 2000 onward, large numbers of textbooks written in English were imported and introduced into China,29 and in 2002, 10 of the most famous universities in China decided to buy and use almost all of the textbooks, including medical ones, used at Harvard University, Stanford University, and MIT.


In IVIMEDS, medical schools in 16 countries share learning resources.

For example, in the International Virtual Medical School (IVIMEDS),30 schools in 16 countries share learning resources.


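Learning objects are small chunks of learning material that can be combined to make up an educational program. Hodgins, an e-learning innovator, compared this approach to building with Lego: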

Learning objects are small chunks of learning material—such as a diagram, a clinical photograph, or a short instructional sequence—that can be combined to make up a learning program. Hodgins,31 a leading innovator in e-learning, has compared this approach to the use of the toy Lego, in that 

    • Small pieces of instruction can be stacked into a larger instructional structure, and the same pieces can be reused to build other structures.
      small pieces of instruction (Lego blocks) can be assembled (stacked together) into a larger instructional structure (eg, a castle) and reused in other instructional structures (eg, a spaceship).
    • Any learning object can be combined with any other, and learning objects can be assembled in whatever way creates an educational program that meets learners' needs.
      Any learning object (Lego block) is combinable with any other learning object (Lego block), and the learning objects (Lego blocks) can be assembled however users choose to create educational programs (toys) to meet their needs.
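
The Lego analogy above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names `LearningObject` and `LearningProgram` and all sample content are hypothetical and do not come from IVIMEDS or from Hodgins.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LearningObject:
    """One small, reusable chunk of learning material (a single 'Lego block')."""
    title: str
    kind: str  # e.g. "diagram", "clinical photograph", "instructional sequence"


@dataclass
class LearningProgram:
    """A larger instructional structure assembled from learning objects."""
    name: str
    objects: list = field(default_factory=list)

    def add(self, obj):
        # Any object can be stacked onto any program; return self for chaining.
        self.objects.append(obj)
        return self

    def titles(self):
        return [obj.title for obj in self.objects]


# Hypothetical content, invented for illustration only.
heart_diagram = LearningObject("Heart anatomy", "diagram")
murmur_clip = LearningObject("Listening for murmurs", "instructional sequence")
rash_photo = LearningObject("Typical rash", "clinical photograph")

cardiology = LearningProgram("Cardiology basics").add(heart_diagram).add(murmur_clip)
# The same block is reused in a second, different structure.
clinical_skills = LearningProgram("Clinical skills").add(murmur_clip).add(rash_photo)
```

The point of the sketch is reuse: `murmur_clip` is defined once and appears in both programs, just as one block can sit in both the castle and the spaceship.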

The IVIMEDS curriculum map
The IVIMEDS curriculum map30 provides a useful user interface which has embedded in it a view of medicine across different cultures.



The curriculum


The first approach is to use a local curriculum, one developed by local teachers for local students. This traditional approach is the one we are used to.

The first approach is the use of a local curriculum, a program of studies developed by local teachers for use by local students. This is the traditional approach to which most of us are accustomed.


The second approach is to export a curriculum developed in one country for use in another country.

The second approach is a curriculum developed for an institution in one country and exported for use in a different country.

Examples in medicine include joint programs 
        • in Malaysia with Monash University and with the Royal College of Surgeons in Ireland, 
        • in Qatar with Cornell University, and 
        • in Singapore with Duke University.


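Australia in particular has expanded its offshore campuses and twinning arrangements, from about 50 in 1996 to more than 1,000 in 2001. The branch-campus curriculum is designed to parallel that of the main campus, and the effort to make the programs equivalent often leaves the two curricula almost indistinguishable.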

Australia, particularly, has seen a dramatic increase in the number of offshore campuses and twinning arrangements, from about 50 in 1996 to more than 1,000 in 2001.26 The curriculum at the branch campus usually is designed to parallel that at the “main” campus, and often for the purpose of equivalence the program is made as indistinguishable as possible from that at the main campus.


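The third option for curriculum development is a truly international approach, better described as transnational or global: it takes local students' needs into account while resting on a strong international basis. In such a curriculum, local issues are placed in an international context.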

There is a third option for curriculum development. It is the development of a truly international— better described as transnational or global—curriculum that, while considering the local students’ needs relating to the topics covered, has a strong international basis. In such a curriculum, local issues are put into international context.





A three-dimensional model









Transnational Medical Education


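What makes all of this possible is the Internet and new approaches to education. Horton observes that Web and Internet technologies are transforming our world and opening opportunities that could only be imagined a few years ago, and that nowhere are these opportunities greater than in education and training.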

What makes all of this possible is the rapid development of the Internet and new pedagogical approaches. “Web and Internet technologies,” Horton33 has said, “are transforming our world, presenting opportunities we could only imagine a few years ago. Nowhere are these opportunities greater than in training and education.”


Koehn and Swick argue for moving from the current U.S. emphasis on cultural competence to transnational competence.

In the article “Medical education for a changing world: moving beyond cultural competence into transnational competence,” Koehn and Swick38 advocate moving from the current emphasis in the United States on cultural competence to a specified set of transnational competencies.


There is confusion over the term "transnational education."

There has been terminological as well as conceptual confusion about the term “transnational education.”

A report from Council of Europe/UNESCO40 included in transnational education “all types of higher education study programmes, or sets of study, or educational services (including those of distance education) in which the learners are located in a country different from the one where the awarding institution is based.”


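A contentious point in moving medical education toward a transnational/global model is the recognition of joint degrees. Joint degrees result from cooperation among several higher education institutions in different countries; although they may seem threatening to some institutions, they have enormous potential. The EU is moving in this direction: the Erasmus World program created about 90 inter-university networks offering 250 joint master's courses.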

A controversial aspect of moving toward a transnational/global model of medical education is that it could lead to calls for the award and recognition of joint degrees. Joint degrees resulting from cooperation among several higher education institutions located in different countries, although appearing threatening to some institutions, have considerable potential.41 The EU has already moved in this direction. The Erasmus World program42 included the development of about 90 inter-university networks to provide 250 joint master’s courses to students around the world.


The United States has been criticized for educational isolation, as shown by a National Geographic Society survey: among students from the eight countries surveyed, those attending American schools were the second least informed about world affairs and geography. Fortunately, medical education presents a different picture, with the ECFMG providing leadership at the international level.

The United States, for example, has been criticized for its educational isolation as evidenced by the findings of a National Geographic Society survey.35 Among students from eight countries surveyed, those attending American schools were the second most poorly informed about world affairs and geography. Fortunately, in medical education we see a different picture, with the ECFMG providing leadership in what can be achieved internationally.


Writing on global perspectives on e-learning, Carr-Chellman said the following.

Writing on global perspectives on e-learning, Carr-Chellman43 challenged us that


Progress over the next decade will likely be slow, perhaps best described as "plodding." Skeptics of the technology, encouraged by the fact that e-learning's failures have so far been more prominent than its successes, will challenge each new product and innovation. Ultimately, however, the appeal of learning anytime, anywhere will prove irresistible, educationally as well as financially.

progress over the next decade is likely to be slow, probably best described as plodding. The technology’s sceptics, emboldened by the fact that, to date, e-learning’s failures have been much more prominent than its limited successes, will challenge each new product and innovation. Ultimately, however, the lure of anywhere-anytime learning will prove irresistible, educationally as well as financially.



14 Wojtczak A. Medical education terminology. Med Teach. 2002;24:357.


38 Koehn PH, Swick M. Medical education for a changing world: moving beyond cultural competence into transnational competence. Acad Med. 2006;81:548–56.
















Acad Med. 2006 Dec;81(12 Suppl):S22-9.

International medical education and future directions: a global perspective.

Author information

  • 1International Virtual Medical School, Dundee, U.K. r.m.harden@dundee.ac.uk

Abstract

Internationalization, one of the most important forces in higher education today, presents a powerful challenge and an opportunity for medical schools. Factors encouraging internationalization include (1) globalization of health care delivery, (2) governmental pressures, (3) improved communication channels, (4) development of a common vocabulary, (5) outcome-based education and standards, (6) staff development initiatives, and (7) competitiveness and commercialization. A three-dimensional model--based on the student (local or international), the teacher (local or international), and the curriculum (local, imported, or international)--offers a range of perspectives for international medical education. In the traditional approach to teaching and learning medicine, local students and local teachers use a local curriculum. In the international medical graduate or overseas student model, students from one country pursue in another country a curriculum taught and developed by teachers in the latter. In the branch-campus model, students, usually local, have an imported curriculum taught jointly by international and local teachers. The future of medical education, facilitated by the new learning technologies and pedagogies, lies in a move from such international interconnected approaches, which emphasize the mobility of students, teachers, and curriculum across the boundaries of two countries, to a transnational approach in which internationalization is integrated and embedded within a curriculum and involves collaboration between a number of schools in different countries. In this approach, the study of medicine is exemplified in the global context rather than the context of a single country. The International Virtual Medical School serves as an example in this regard.

PMID: 17086041 [PubMed - indexed for MEDLINE]


Qualitative assessment of the long-term impact of a faculty development programme in teaching skills (Med Educ, 2007)

Amy M Knight, Joseph A Carrese & Scott M Wright






FDP evaluations are usually carried out right after the programme ends, and most are quantitative. Qualitative designs can capture participants' perspectives on a topic in greater breadth and depth. Although several studies of FDP outcomes have used qualitative methods, most included fewer than 20 participants or were conducted within 6 months of the programme.

Evaluations of FDPs are usually performed immediately or soon after their conclusion and the majority of these assessments have been quantitative in nature.6 Qualitative study designs may better identify the breadth and depth of subjects’ perspectives on a particular topic.7 Although several studies of FDP outcomes have used qualitative methodologies,8–14 most have included fewer than 20 participants8–11 or have occurred within 6 months of the conclusion of the programme.8,9,13


A 9-month FDP in teaching skills, held for one half-day each week, has been offered since 1987. A post-programme evaluation and quantitative results from a long-term follow-up survey have been published previously.

A 9-month, 1 half-day per week FDP in teaching skills (FDP/TS) has been offered annually at our institution since 1987. An immediate post-programme evaluation15 and results from a quantitative long-term follow-up survey16 have previously been published.



METHODS


Programme description


Groups of 5–8 meet with 1–2 facilitators from early September to late May, for one half-day each week. The programme runs in modules of 1 to 6 weeks each. Programme goals are given below.

Participants in the FDP/TS meet in groups of 5–8 participants with 1–2 facilitators between early September and late May for 1 half-day each week to work on modules that vary in length from 1 to 6 weeks. Programme goals are for participants to experience and gain expertise in concepts believed to be critical to educating medical learners, such as learner-centredness, self-directed learning, and the building of a supportive learning environment.


Module topics include

    • giving and eliciting feedback,
    • precepting (1-to-1 teaching),
    • time management,
    • communication and interviewing,
    • negotiation and conflict management,
    • giving lectures and presentations, and
    • small-group leadership skills.


The programme has been described in detail elsewhere.15


Study population


The 242 participants from 1987 through 2000.

In July 2002, we surveyed the 242 faculty members and fellows who had taken part in the FDP/TS from 1987 through 2000.


Survey design



Data collection





Analysis


  • Handwritten responses to the open-ended question about programme impact were transcribed verbatim and analysis was independently performed by 2 investigators (AMK and SMW) using an editing analysis style.18
  • Categories and subcategories of themes were generated and conceptually organised by each investigator.
  • A third investigator (JAC) independently compared these generated themes with the transcribed subject comments, looking for completeness, congruence and coherence.
  • The 3 investigators then had a series of meetings to discuss the analyses. 
  • Final domains and subcategories were agreed on by all 3 investigators, and the number of responses related to each subcategory was tabulated. 
  • Several representative quotes were selected for inclusion by consensus. The year in which programme participation began is provided with each quote.
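
The tabulation step in the list above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the domain and subcategory names are taken from the article, but the coded responses and the `tabulate` helper are hypothetical.

```python
from collections import Counter

# Hypothetical coded responses: each respondent's comment may be tagged
# with one or more (domain, subcategory) codes agreed on by the investigators.
coded_responses = [
    [("Development as a teacher", "Confidence in self as teacher")],
    [("Interpersonal development", "Ability to give and elicit feedback"),
     ("Development as a teacher", "Confidence in self as teacher")],
    [("Career development", "Influence on career path and planning")],
]


def tabulate(responses):
    """Count how many responses relate to each (domain, subcategory) code."""
    counts = Counter()
    for codes in responses:
        # set() makes a response count once per subcategory even if tagged twice.
        counts.update(set(codes))
    return counts


table = tabulate(coded_responses)
```

Counting each response at most once per subcategory is a design assumption of this sketch; the article does not say how repeated mentions within a single response were handled.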



RESULTS


Qualitative assessment of programme impact


The results are shown in Table 2.

Table 2 also notes how many responses to the open-ended question were related to each of the subcategories. Each domain was represented in comments from 1 or more respondents from each of the 14 cohorts studied. Descriptions of each domain and its subcategories follow, with supporting quotes.



Intrapersonal development


Commitment to reflection and self-awareness


Prioritising and setting goals


Organisation and time-management skills



Interpersonal development


Healthier relationships


Listening to and communicating with others


Ability to give and elicit feedback


Conflict management and negotiation skills


Leadership and group participation skills


Development as a teacher


Overall teaching skills and abilities


Confidence in self as teacher


Greater enjoyment and satisfaction in teaching


Being learner-centred and creating a supportive learning environment


The programme helped some participants (1) become more learner-centred, (2) respect learners more, (3) attend to learners' needs and (4) create a positive learning environment.

The FDP/TS has helped some past participants to:

1 become more learner-centred;

2 have more respect for learners;

3 be more aware of learners’ needs, and

4 establish a positive learning climate.


Continuing to use teaching methods learned and helping other teachers improve



Career development


Benefits from exposure to FDP faculty and peers


Some respondents said they had met FDP/TS faculty who went on to serve as role models and mentors. They also came to value networking and forming friendships with other participants.

Some respondents noted benefits deriving from being exposed to FDP/TS faculty who had continued to serve as role models and mentors. They also appreciated the value of networking and forming friendships with other participants:



Influence on career path and planning


Some participants said the programme taught them (1) what to expect from their careers as educators and (2) how to plan a career with success in mind.

Some participants credited the programme with giving them perspectives on: 

1 what to expect from their careers (particularly as educators), and 

2 how to structure their careers with success and advancement in mind.



Opportunities due to expertise gained



DISCUSSION


Past participants also described themselves as (1) learner-centred, (2) building supportive learning environments and (3) giving effective feedback.

Past participants were also more likely to describe themselves as:

1 learner-centred;

2 building supportive learning environments, and

3 giving effective feedback.15


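This broad impact is thought to stem from the longitudinal nature of the programme and the opportunities it gave for building relationships with FDP faculty and other participants.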

This broad impact probably results from the longitudinal nature of the programme4 and the opportunities it provides for building relationships with programme faculty and other participants.


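Many long-term follow-up studies of FDPs have focused on objective markers of academic success, such as presentations, publications and leadership positions. Programme outcomes defined this narrowly may fail to capture deeper and more lasting changes in participants, including personal growth and relationships with others.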

Many long-term follow-up studies of longitudinal FDPs have focused on objective markers of academic success, such as presentations, publications and leadership positions.10,12,19–22 These narrowly defined quantitative programme outcomes may preclude the detection of deeper and more sustained changes in participants’ professional or personal growth and relationships with others.


FDP/TS participants placed high value on the chance to develop relationships with programme faculty and other participants, a finding echoed in other studies. Building a network of academic colleagues is associated with career success and is an important by-product of FDPs.

Past participants in the FDP/TS highly valued opportunities provided by the programme to develop relationships with programme faculty and other participants. This finding has been borne out in other studies of FDPs.8–10,12–14 The development of a network of academic colleagues has been shown to be associated with career success23–25 and is an important by-product of FDPs such as that described in this manuscript.


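It is important that these varied responses to the open-ended question were spontaneous and unprompted. Subcategories mentioned less often should not be assumed to be less important than those mentioned more often.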

It is important to note that the variety of responses to the open-ended question about programme impact were spontaneous and unsolicited, and we should not assume that those subcategories mentioned fewer times are less valid than those mentioned more frequently.


15 Cole KA, Barker LR, Kolodner K, Williamson PR, Wright SM, Kern DE. Faculty development in teaching skills: an intensive longitudinal model. Acad Med 2004;79:469–80.


18 Crabtree BF, Miller WL. Doing Qualitative Research, 2nd edn. Thousand Oaks, CA: Sage 1999;145–61.












Med Educ. 2007 Jun;41(6):592-600.

Qualitative assessment of the long-term impact of a faculty development programme in teaching skills.

Author information

  • 1Johns Hopkins Bayview Medical Center, Baltimore, MD 21230, USA. aknight@jhmi.edu

Abstract

CONTEXT:

The long-term impact of faculty development programmes (FDPs) is poorly understood, and most assessments of them have been quantitative in nature.

OBJECTIVE:

This study aimed to use qualitative methods to better understand the long-term impact of an FDP in teaching skills (FDP/TS).

METHODS:

A survey was carried out in July 2002 of the 242 faculty members and fellows who had participated in a 9-month FDP/TS at any time from 1987 through 2000. The survey included 2 quantitative questions and an open-ended qualitative question about the impact of the programme on the participants' professional and personal lives.

RESULTS:

A total of 200 past participants (83%) responded to the survey. Participants from early and recent cohorts were similarly represented. In all, 82% of respondents said programme participation had had 'a moderate' or 'a lot' of impact on their professional life, and 49% said their personal life had been affected to this degree. Four major domains, each containing at least 3 subcategories, emerged from qualitative analysis. The domain intrapersonal development included changes participants reported in themselves and in their approach to self-management. Interpersonal development contained subcategories relating to how participants interact with others. Subcategories in the domain development as a teacher related to increased teaching ability and enjoyment. The domain career development included professional growth and career opportunities attributed to programme participation.

CONCLUSIONS:

Longitudinal FDPs can have broad and sustained positive effects on the professional and personal lives of participants. Qualitative evaluation methods may result in a richer and deeper understanding of the impact of these programmes.

PMID: 17518840 [PubMed - indexed for MEDLINE]


Kirkpatrick’s levels and education ‘evidence’ (Med Educ, 2012)

Sarah Yardley1 & Tim Dornan2









INTRODUCTION


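In a recent book, Donald Kirkpatrick explains how he arrived at the four descriptors of his evaluation model. He had observed that technical training could be evaluated in terms of reaction, learning, behaviour and impact. His purpose was to give managers promptly identifiable, easy-to-measure outcomes in learners and in the organisations where they worked. Business leaders need tangible evidence (sales volume, product quality, profitability) that training has worked. Reports of successful use in industry led to the model's spread to other fields. Kirkpatrick himself held that the four levels needed no validation, given the accolades that poured in. Yet although Kirkpatrick's model is widely used in medical education, there has been no review or critique of that use.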

In a recent book, Donald Kirkpatrick explains how he arrived at the set of four descriptors that are now widely used to evaluate the impact of interventions in education.2 He had observed that technical training could be evaluated by measuring learners’ reactions, learning and behaviour, and their impact on the organisations for which the learners worked.3 Kirkpatrick’s purpose was to provide managers with promptly identifiable and easy-to-measure outcomes in learners and the organisations for which they worked. Business leaders needing tangible evidence that training would enhance their sales volume, product quality and profitability quickly implemented his ideas. Reports of their successful use in business attracted interest from other fields and his ideas spread. Kirkpatrick himself said there was no need to validate the descriptors because accolades poured in.2 Despite the wide use of Kirkpatrick’s levels in medical education, there has been no review or critique of their use in this context.



METHODS



RESULTS


The suitability of Kirkpatrick’s levels for appraising education interventions


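Most studies used Kirkpatrick’s levels as heuristics. Only four critiqued the model, and one of these pointed out that the levels are applied uncritically in human resource development. Abernathy noted that the levels influence which questions are asked about training evaluation and which results are produced, and judged them unsuitable for evaluating ‘soft’ outcomes or continuous learning. Alliger and Janak identified three assumptions built into the model that can shape training findings: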

Most articles found by our search used Kirkpatrick’s levels as heuristics in education evaluation; just four critiqued their use11–14 and one of these found that Kirkpatrick’s levels were applied uncritically in the field of human resource development.14 Abernathy,12 noting that the levels could influence the questions asked and results produced, rejected them as unsuitable for evaluating either ‘soft’ outcomes or continuous learning (as opposed to time-limited interventions). Alliger and Janak identified three types of assumption by which Kirkpatrick’s model could tacitly shape research findings, comprising: 

    • A hierarchy is assumed by numbering the levels
      assumptions of hierarchy associated with the numeric labelling of levels; 
    • Causal links between levels are assumed
      assumptions of causal links between levels, and 
    • Positive correlations between levels are assumed
      assumptions that the levels are positively inter-correlated.11 

Blanchard et al. argued that, whatever the Kirkpatrick level, the purpose of a piece of research must be defined before it is evaluated.

Blanchard et al.13 argued that the purpose of any research had to be determined before any evaluation of it at any particular Kirkpatrick level could be considered. 


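None of these studies concerned medical education, but they appear applicable to it, and Kirkpatrick would probably have agreed with these authors, because he used the levels as a training heuristic rather than as a way of evaluating how experts develop through deliberate practice and social learning. He chose to measure short-term, tangible endpoints (sales volume, quality, profitability). His solution for intangible training outcomes, as he noted in his own book, was to link them to tangible benefits, because training aimed at specific, measurable behaviours can be assigned a market value.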

Although none of those studies concerned medical education, they seem applicable to it and Kirkpatrick himself might have agreed with these authors because he actually advocated using the levels as a training heuristic,2 not to evaluate how professionals become expert practitioners through deliberate practice and social learning. He chose the levels to measure very short-term and tangible endpoints like sales volume, quality and profitability. Kirkpatrick’s solution to intangible benefits of training, which he acknowledged in his original work, was to link them to tangible benefits because training orientated towards specific measurable behaviours could be assigned a market value.2


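None of the many references Kirkpatrick cited as successful applications of the levels relates to medical education, which differs from business in having to meet the needs of many different groups. A problem with the levels is that the beneficiary differs by level: levels 1–3 concern learners, level 4a organisations, and level 4b patients.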

Of his numerous references to successful applications of the levels,2 none came from a field as complex as medical education, which differs from business in that it is required to meet the needs, equitably, of a whole array of beneficiaries, including patients, students, practitioners, communities and health care organisations. A problem with Kirkpatrick’s levels is that different levels concern different beneficiaries: levels 1–3 concern learners; level 4a concerns organisations, and level 4b concerns patients.


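Teachers are missing from the scheme entirely. The model cannot accommodate the many outcomes that can be evaluated with qualitative and quantitative methods, nor does it explain how or why a given outcome follows from particular elements of a complex intervention. It is used only to measure expected outcomes and ignores unexpected ones: it asks ‘Did outcome A occur as intended?’ rather than ‘What were the outcomes of this intervention?’ A clinical analogy is measuring only the intended effects of a new drug and not its side-effects.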

Teachers are missing from the scheme altogether. The model does not allow for the rich variety of outcomes that can be evaluated using qualitative as well as quantitative methodologies, nor explain how or why such outcomes are consequential to particular elements of complex interventions. It tends only to be used to measure anticipated outcomes and ignores unanticipated consequences. That is, it asks ‘Was outcome X achieved as intended, or not?’ rather than ‘What were the outcomes of this intervention?’ A clinical parallel would be a clinical trial that measured only the intended effects of a new drug and not its side-effects.




Application of Kirkpatrick’s levels to medical education research


Forty years after Kirkpatrick’s model appeared, BEME adopted a modified version of the levels as a grading standard for literature reviews.

Forty years after Kirkpatrick’s original work, the BEME collaboration adopted a modified version of Kirkpatrick’s levels (which it named a ‘hierarchy’) as a grading standard for bibliographic reviews (Table 2). A prototype coding sheet, accompanied by explanatory notes, offered two complementary ways of appraising evidence, using either Kirkpatrick’s ‘hierarchy’ to grade the impact of interventions (Table 2) or a simple anchored rating scale of 1–5 of the ‘strength’ (Table 3) or trustworthiness of findings.


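Our first BEME review used the levels, accepted them as a hierarchy, and treated higher levels as more important outcomes. In that review, 24% of outcomes were at level 1, which we regarded at the time as unimportant, and the remaining 76% were at higher, more important levels. A total of 64% of outcomes were at level 2, leaving only 12% at levels 3 and 4 combined. When we also rated the ‘strength’ of findings, only 42% of published outcomes were both strong (rated 3 or higher) and important (level 2 or higher), and most of those were at level 2.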

Our own first BEME review9 (of early workplace experience in undergraduate medical education) used the levels and, accepting them as a hierarchy, treated a higher Kirkpatrick level as indicative of a more important outcome. This review found that 24% of outcomes were at level 1, which we then regarded as unimportant, and the other 76% were progressively more important according to their higher levels. A total of 64% of outcomes were found to be at level 2, leaving only 12% at levels 3 and 4 combined. When we added in an appraisal of the ‘strength’ of outcomes (Table 3), only 42% of published outcomes were both strong (rated at ≥ 3) and important (Kirkpatrick level ≥ 2) and then mostly at level 2 in the hierarchy.


We were not alone in finding very few level 3 and 4 outcomes: in only three of the 14 data analyses were half or more of the outcomes rated above level 2.

It shows we were not alone in finding relatively few Kirkpatrick level 3 or 4 outcomes. In only three of 14 data analyses (21%) were half or more of the outcomes rated at a level > 2. 
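The proportions quoted in the two preceding paragraphs are mutually consistent, which a short worked check makes explicit. This is only arithmetic on the figures reported in the text; the labels are ours, and the `align*` environment assumes the `amsmath` package.

```latex
\begin{align*}
\underbrace{24\%}_{\text{level 1}} + \underbrace{64\%}_{\text{level 2}} + \underbrace{12\%}_{\text{levels 3 and 4}} &= 100\% \\
64\% + 12\% &= 76\% \quad \text{(outcomes above level 1)} \\
\tfrac{3}{14} &\approx 21\% \quad \text{(analyses with half or more outcomes above level 2)}
\end{align*}
```

The 42% figure for outcomes that are both strong and important cannot be derived from these numbers, because it depends on the separate strength ratings.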


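Most studies described in Table 1 measured level 1 or levels 2a and 2b; these have been termed ‘description’ (level 1) and ‘justification’ (levels 2a, 2b) studies, and each has its own value. The difficulty lies with the outcomes of ‘clarification studies’, which are a rich basis for strengthening medical education and can fit any Kirkpatrick level. Unless we understand how and why particular effects result from particular elements of an intervention, it will be hard to refine education to maximise benefit. For example, a study clarifying how an educational intervention affects learners’ emotions would be placed at level 1 or 2 and could be seen as relatively unimportant, even though it is self-evidently important to learners’ professional development. Are outcomes always more important than processes?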

Most papers in Table 1 described what learners experienced (level 1) or measured what they learned (levels 2a and 2b); these have been more simply termed ‘description’ and ‘justification’ studies, respectively, and each has its own value.23 The snag is that outcomes in ‘clarification studies’, which are a rich basis on which to strengthen medical education,23 could fit under any or all of Kirkpatrick’s levels. Yet, unless we understand how, and why, effects are consequential to particular elements or interactions, it will be difficult to refine education to maximise benefit. To give a specific example, it is possible that a study clarifying how an educational intervention affected learners’ emotions could be classified as demonstrating outcomes at level 1 (reactions) or 2 (attitudes), which are regarded as relatively unimportant, despite being self-evidently important to the professional development of the learners. Are outcomes necessarily more important than processes (which are not included in Kirkpatrick’s levels)?


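Holton criticised the use of the levels as a hierarchy, in which lower-level outcomes are prerequisites for higher-level ones, because the levels lack important attributes of a theory and lack supporting evidence.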

Holton criticised their use as a hierarchy on the grounds that they lack important attributes of a theory and lack supportive evidence to indicate that lower-level outcomes are prerequisite to higher-level ones.14





Alternatives for appraising research in medical education


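For relatively simple training interventions, whose outcomes emerge quickly and are easily observed within classical experimental designs, the levels can direct attention to important beneficiaries other than learners (notably patients). The preceding review shows, however, that they are unsuitable for the many education interventions that are complex, in which long-term outcomes matter most, and in which evaluating process is as important as evaluating outcome. Indeed, our review found a body of opinion that applying Kirkpatrick’s levels to the wrong type of evidence could be harmful.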

When evaluating relatively simple training interventions, the outcomes of which emerge rapidly and are easily observed within classical experiment designs, the levels can direct attention to important beneficiaries other than learners (notably patients). The preceding review, however, leads us to conclude they are unsuitable for the higher proportion of education interventions, which are complex, in which the most important outcomes are longer-term, and in which process evaluation is as important as (perhaps even more important than) outcome evaluation. Indeed, our review found a body of opinion that considered that Kirkpatrick’s levels, applied to the wrong type of evidence, might be harmful.11–14



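What alternatives are there in which the type of evidence does not bias its evaluation? Put another way, how do we balance appropriate inclusiveness with rigour? The current state of knowledge should be represented, including negative findings and specific needs for new or more rigorous work that would usefully inform further research or practice.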

What alternative ways are there, then, to critique the quality of various types of evidence in a scholarly way without allowing the type of evidence to bias its evaluation? Put another way, how do we balance the right level of inclusiveness with rigour in our approach to value? It is important that the current state of knowledge, including ‘negative’ findings and specific needs for new or more rigorous work to usefully inform further research or practice innovation, is represented.


Varieties of systematic review

The scholarship of systematic review in clinical science takes its origins from a paper published 40 years ago by the epidemiologist Archie Cochrane, in which he berated medical practice for being ineffective or frankly harmful.27 

    • The Cochrane Collaboration (http://www.cochrane.org) came into existence to promote clinical trials, using systematic review and statistical meta-analysis to synthesise findings from their aggregated results. ‘Evidence’ was rated as ‘weak’ or ‘strong’ according to standard criteria, which appraised its ability to support the statistical estimation of effect sizes. The Cochrane approach is not the only one in the health domain. 
    • The Joanna Briggs Institute (http://www.joannabriggs.edu.au/about/home.php) and the W K Kellogg Foundation (http://www.wkkf.org), both of which seek to improve health care practice through multidisciplinary working, have taken a pluralistic approach and do not place randomised controlled trials at the top of a hierarchy, regardless of the question posed. 
    • Recognising that the hypothetico-deductive, experimental approach of natural sciences is ‘ill-equipped to help us understand complex, comprehensive, and collaborative community initiatives’ (http://www.wkkf.org/knowledge-center/resources/2010/w-k-kellogg-Foundation-Evaluation-Handbook.aspx), they allow questions to be asked and answered without forcing complex systems to fit the evaluative tools of one dominant research paradigm. 
    • By contrast, the Campbell Collaboration (http://www.campbellcollaboration.org), which reviews evidence related to education, crime and justice, and social welfare, has aligned itself with the Cochrane Collaboration in holding data that are suitable for statistical meta-analysis as of intrinsically higher quality.




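Different review methodologies therefore start from different ‘epistemological’ assumptions, where ‘epistemological’ refers to the relationship between the knower and the known. The Cochrane approach follows classical scientific methodology, rests on a positivist epistemology, and reduces complex situations to comparisons of variables within relatively simple experimental designs. Its standards of critical appraisal are consistent with that stance. Pope et al., however, note that systematic review, although strongly favoured in the clinical domain because it helps in choosing between alternatives, is not the only way of synthesising evidence.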

Thus, different review methodologies start from different ‘epistemological’ assumptions, where the term ‘epistemological’ refers to the relationship between the knower and the known. The Cochrane approach, drawn from classical scientific methodology, has a positivist epistemology which allows it to reduce complex situations to a comparison of variables within relatively simple experiment designs. Its standards of critical appraisal are consistent with its epistemological stance. Pope et al. noted that systematic review, although it is strongly favoured in the clinical domain because it helps in making choices between alternative treatments, is not the only way of synthesising evidence.28


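The Cochrane Collaboration’s use of evidence for ‘decision support’ can be distinguished from the (non-dichotomous) use of evidence for ‘knowledge support’. Aggregative and interpretive methods of synthesis mix qualitative with quantitative evidence, or synthesise qualitative evidence alone. They give better knowledge support and rest on a constructionist rather than a positivist epistemology. Medical education research, as our reviews show, is pluralistic. So where does that leave the four out of five reviewers whose bibliographic research does not lend itself to Kirkpatrick rating?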

The Cochrane Collaboration’s use of evidence for ‘decision support’ can be distinguished from the (non-dichotomous) use of evidence for ‘knowledge support’. Aggregative or interpretive methods of evidence synthesis that mix qualitative with quantitative evidence, or synthesise qualitative evidence alone, give better knowledge support and start from constructionist rather than positivist epistemological assumptions.28 Medical education research, our reviews have shown, is pluralistic. So where does that leave the four out of five reviewers whose bibliographic research does not lend itself to Kirkpatrick rating?



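Rather than defining a reference standard for critical appraisal, this review questions whether such a standard could ever exist and shows how many questions must be answered when planning an evidence synthesis. So as not to leave readers with no basis for appraising evidence, we conducted a thought experiment to define a logical approach.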
Far from defining a reference standard for critical appraisal, this review casts doubt on whether such a standard could ever exist and shows how many questions must be answered when planning an evidence synthesis. Rather than leave the reader with no basis on which to appraise evidence, we conducted a thought experiment in order to define a logical approach. 

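For experiments conducted on positivist principles, the critical appraisal tools of evidence-based medicine can be applied to education evidence. Under the conditions set out in the first paragraph of this section, such as the evaluation of relatively simple training interventions, Kirkpatrick’s levels are appropriate. In most cases (perhaps 80% of medical education evidence syntheses) a constructionist epistemology is appropriate, and critical appraisal then rests on simple global judgements of trustworthiness, such as the BEME 1–5 scale. Appraisal tools could be applied to each individual study within a review, but the gain in reliability would make little difference to the overall conclusions.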
For experimental research conducted on positivist principles, the critical appraisal tools of evidence-based medicine can be applied to education evidence. Under the conditions defined in the first paragraph of this section, such as in the evaluation of relatively simple training interventions, Kirkpatrick’s levels are appropriate. In the majority of cases (perhaps 80% of medical education evidence syntheses), a constructionist epistemology is likely to be appropriate, in which case critical appraisal will rest on simple global judgements of trustworthiness, such as the BEME scale of 1–5. Although critical appraisal tools appropriate to individual methodologies could be applied to individual studies included within a review, any gain in reliability is likely to make little difference to the overall conclusions pieced together from multiple different methodologies.


DISCUSSION


The art of evidence synthesis, we conclude, lies in making well-considered choices rather than fixing on one methodology or appraisal standard. This echoes Eva’s view that there is no single arbiter of quality, because the use to which evidence is put determines its utility.

The art of evidence synthesis, we conclude, lies in making well-considered choices rather than valorising one methodology or appraisal standard over another, echoing Eva’s view that there can be no single arbiter of quality because it is the use to which evidence is put that determines its utility.29


The different uses to which evidence is put, whatever they may be, dictate which method should be used.

The use of evidence to support policy, define outcomes, identify new research questions, answer practical teaching questions, inform teachers’ personal development, serve as a debating tool or establish the ‘state of knowledge’ on a subject can all dictate different methodologies.


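Even establishing the ‘state of knowledge’, which appears neutral, involves an ontological and epistemological position. If the question is the efficacy of a simple intervention compared with a placebo, a ‘naive realist’ ontology and epistemology would direct the use of Cochrane critical appraisal standards and the estimation of effect sizes. The more reductionist the review, the clearer its results, but perhaps the less applicable they are.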

Even the last of these, which is often presented as a neutral assessment, involves ontological and epistemological positioning. If the topic in question is the efficacy of a simple intervention compared with a placebo administered under controlled conditions, a ‘naive realist’ ontology and epistemology30 would direct the use of Cochrane critical appraisal standards and estimation of effect sizes. The more reductionist a review, the clearer its results, but perhaps also the less applicable they are.


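Where medical education departs from evidence-based medicine is in recognising the wide gap between the results of simple experiments and their application in real settings. Context as well as process affects educational outcomes. Moreover, rich nuances, or even the whole essence of the information, may be lost when stories of experience are omitted.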

Where medical education really deviates from evidence-based medicine is in its recognition of a wide gap between the results of simple experiments and their applicability in ‘real practice’. Context as well as process impacts on educational outcomes. Moreover, rich nuances or even the whole essence of information may be lost when stories of experience are omitted.


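For all these reasons, a reviewer needs to consider qualitative as well as quantitative evidence and ‘construct’ an argument fitted to the conversation he or she wants to join in the relativist, social world of education practice. If the reviewer wants to influence policy, a realist stance and its attendant methods may be appropriate, with the reviewer using pragmatic judgement to answer questions such as:

      • If I read the original papers as a practitioner, what would I take away from them? 
      • What would I accept within its context, and what needs a more refined or nuanced judgement? 
      • How can I stratify the studies on this topic to see where evidence is strong or limited without discarding partially helpful information?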

For all of these reasons, it is likely a reviewer will need to consider qualitative as well as quantitative sources of evidence and ‘construct’ an argument fitted to the conversation he or she wants to be part of in the relativist, social world of education practice. If the reviewer wants to influence policy, a realist stance and attendant methods may be appropriate,31 whereby the reviewer uses pragmatic judgement to answer questions like: If I were reading the original papers as a practitioner, what would I take away from them? What would I accept within context or pass judgement on in a more refined or nuanced manner than the current systematic review process allows? How can I stratify the studies on this topic to see where evidence is strongest or limited without unnecessarily discounting partially helpful information?



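Our broad conclusion is that the purpose for which evidence is used influences its trustworthiness and the best way of synthesising it. Having rejected the methodological assumptions of scientific experimentation and the clinical assumption of patient benefit as reference standards of evidence, we recommend that researchers synthesising evidence do the following.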

Our broad conclusion is that the purpose to which evidence is put influences its trustworthiness and the best way of synthesising it. Having rejected the methodological assumptions of scientific experimentation and the clinical assumption of patient benefit as reference standards of evidence, we suggest that researchers synthesising evidence should: 

    • state very clearly the aims of their work; 
    • make their epistemological and ontological assumptions explicit; 
    • admit any evidence that is appropriate to the aim, including complex and qualitative evidence; 
    • consider features of empirical research such as the strength of its theoretical orientation and its relevance to the review question when considering its weight in the final synthesis, and 
    • make absolutely transparent, when reporting a review, the decisions they took and their reasons for taking them.









Med Educ. 2012 Jan;46(1):97-106. doi: 10.1111/j.1365-2923.2011.04076.x.

Kirkpatrick's levels and education 'evidence'.

Author information

  • 1Keele University Medical School, Faculty of Health, Keele, UK. syardley@doctors.org.uk

Abstract

OBJECTIVES:

This study aims to review, critically, the suitability of Kirkpatrick's levels for appraising interventions in medical education, to review empirical evidence of their application in this context, and to explore alternative ways of appraising research evidence.

METHODS:

The mixed methods used in this research included a narrative literature review, a critical review of theory and qualitative empirical analysis, conducted within a process of cooperative inquiry.

RESULTS:

Kirkpatrick's levels, introduced to evaluate training in industry, involve so many implicit assumptions that they are suitable for use only in relatively simple instructional designs, short-term endpoints and beneficiaries other than learners. Such conditions are met by perhaps one-fifth of medical education evidence reviews. Under other conditions, the hierarchical application of the levels as a critical appraisal tool adds little value and leaves reviewers to make global judgements of the trustworthiness of the data.

CONCLUSIONS:

Far from defining a reference standard critical appraisal tool, this research shows that 'quality' is defined as much by the purpose to which evidence is to be put as by any invariant and objectively measurable quality. Pending further research, we offer a simple way of deciding how to appraise the quality of medical education research.

© Blackwell Publishing Ltd 2012.


The Effect of Differential Weighting of Academics, Experiences, and Competencies Measured by Multiple Mini Interview (MMI) on Race and Ethnicity of Cohorts Accepted to One Medical School (Acad Med, 2015)

Carol A. Terregino, MD, Meghan McConnell, PhD, and Harold I. Reiter, MD






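In medical education there have been calls for broad strategies that go beyond the compositional diversity of trainees or representational ratios. Increasing diversity in the health care workforce is one proposed approach to reducing group disparities. Cohen et al. cite improved access and optimal management of the health care system, in addition to equity and fairness, as pragmatic reasons for achieving workforce diversity.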

Within the context of medical education, there has been a call for broad strategies extending beyond measures of the compositional diversity of trainees or representational ratios.2 Enhancing diversity in the health care workforce has been proposed as one approach to address those group disparities.3 Cohen et al3 cite increasing access and ensuring optimal management of the health care system, in addition to issues of equity and fairness, as pragmatic reasons for attaining workforce diversity.


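Increasing trainee diversity matters for raising educational quality for all students, improving access to health care for rural, inner-city and minority populations, and accelerating progress in medical and public health research. Reliance on GPA and MCAT scores can severely constrain diversity in medicine, and researchers have argued that selection based on MMI scores may promote diversity.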

Increasing trainee diversity is important for shaping educational quality for all students, increasing access to health care in rural, inner-city, and minority populations, and accelerating advances in medical and public health research.22 Reliance on GPAs and MCAT scores may severely constrain diversity within medicine,23,24 and researchers have argued that basing admission selections on MMI scores may promote applicant diversity.17,25


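Changes in accreditation requirements have also drawn medical schools’ attention to diversity. The Holistic Review Project is a model that encourages considering both academic and personal competencies in selection, and in response RWJMS introduced a more holistic screening process.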

Changes in accreditation requirements reflect the enhanced attention to diversity expected of all medical schools.26 The Holistic Review Project has articulated a model that promotes the consideration of both academic and personal competencies in the application process.27 In response, Rutgers Robert Wood Johnson Medical School (RWJMS) began to implement a more holistic screening process;


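Importantly, applicants are admitted solely on the basis of their MMI scores. MCAT data support the use of academic thresholds. Using data from 11 medical schools, Julian showed that the risk of academic difficulty remained very low until MCAT scores fell below 8 for biological sciences, 7 for physical sciences and 6 for verbal reasoning. These findings suggest that, once applicants exceed acceptable academic thresholds, selection should be less concerned with academic performance and more with core personal competencies.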

Importantly, applicants are admitted exclusively on the basis of their MMI scores. MCAT data support this reliance on academic thresholds. Using data from 11 schools, a study by Julian28 demonstrated that the risk of academic difficulties remained very low until entering students’ MCAT scores fell below 8 for biological sciences, 7 for physical sciences, and 6 for verbal reasoning. These findings suggest that for students exceeding acceptable academic thresholds, selection procedures should be less concerned with academic performance and more concerned with core personal competencies performance.


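Consistent with this hypothesis, the first RWJMS cohort whose final admission decisions were based on the MMI alone performed as well as earlier cohorts in first- and second-year courses and on USMLE Step 1. MMI scores in this cohort also predicted the core personal competencies assessed during medical school (reliability, integrity, service/sensitivity to diversity).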

In support of this hypothesis, the first cohort at RWJMS whose final admissions decision was based solely on MMI scores performed equivalently in first- and second-year courses and on United States Medical Licensing Examination (USMLE) Step 1 relative to previous cohorts admitted on the basis of traditional interviews, academic scores, and experiences. Additionally, the MMI scores from this first cohort predicted scores for students’ core personal competencies assessed in medical school (reliability, integrity, service/sensitivity to diversity).29


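We examined whether academic measures, experience scores and personal competency scores differed by applicants’ self-reported race/ethnicity, and whether changing the weighting of these scores would affect the diversity of the entering class.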

Specifically, we examined whether academic measures (GPA, MCAT), experience scores (service, clinical, and research [SCR]), and personal competencies scores (MMI) varied as a function of applicants’ self-reported race/ethnicity, and whether change in weighting of scores would impact diversity by altering the demographic composition of the entering classes.



Method


Setting, study population, application screening process


Retrospective study

This is a retrospective study of previously collected and recorded data for the RWJMS admissions process for entering classes 2011–2013.


Academic criteria

We determined that applicants screened for MMI were academically and experientially prepared, based on threshold criteria previously set by the RWJMS Admissions Committee (

    • total GPA > 3.0, 
    • total MCAT > 22, 
    • MCAT biological science score > 8, and 
    • no other MCAT score < 6).


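Service, clinical exposure, research, the personal essay and letters of recommendation were scored on a 5-point scale (3: acceptable for an applicant).

    • A 5 for research: peer-reviewed presentation or publication
    • A 5 for service: founding a service organization; a 3: regular involvement in a service organization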


We scored service, clinical exposure, research, the personal essay, and letters of recommendation on a 1–5 Likert scale. The scale was developed so that a score of 3 is an acceptable score for an applicant. An example of a research rating of 5 would indicate culmination of the research experience with peer-reviewed presentation or publication. With respect to service, regular involvement in a service organization would be rated 3, whereas the founder of a service organization would be rated a 5.


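The summed screening scores were not used to rank applicants; they served only as thresholds (below which no interview was offered). The SCR score informed but did not dictate screening decisions because, for example, some students had no research experience. Applicants who met the academic criteria and had SCR, personal essay and letters scores of at least 3 were considered for interview. GPA, MCAT, experience screening scores, essays and letters were not revisited afterwards.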

The sums of the screening scores were not used to rank applicants but served as threshold scores below which an interview would not be offered. An SCR score was developed to inform but not dictate screening decisions, as some students did not have research experience. We considered for interview only applicants who met the academic criteria and who had SCR, personal essay, and letters scores of at least 3. We did not revisit the GPA, MCAT, experiences screening scores, essays, and letters after applicants were selected for interview.
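The screening rules described above can be sketched as a simple predicate. This is a minimal illustration, not the committee's actual procedure: the `Applicant` class, its field names and both function names are hypothetical, the three MCAT sections are taken from the sections named earlier in the text, and treating SCR as a single number is an assumption because the text does not say how it is aggregated.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    # Hypothetical record; field names are illustrative only.
    gpa: float
    mcat_total: int
    mcat_bio: int
    mcat_physical: int
    mcat_verbal: int
    scr: float      # service/clinical/research score (aggregation assumed)
    essay: float    # personal essay, 1-5 scale
    letters: float  # letters of recommendation, 1-5 scale


def meets_academic_thresholds(a: Applicant) -> bool:
    """Thresholds as stated: GPA > 3.0, total MCAT > 22,
    biological science > 8, and no other MCAT section below 6."""
    return (
        a.gpa > 3.0
        and a.mcat_total > 22
        and a.mcat_bio > 8
        and min(a.mcat_physical, a.mcat_verbal) >= 6
    )


def eligible_for_interview(a: Applicant) -> bool:
    """Academic thresholds plus SCR, essay and letters scores of at least 3."""
    return meets_academic_thresholds(a) and min(a.scr, a.essay, a.letters) >= 3
```

Because the scores act only as thresholds, the predicate returns a yes/no answer and never ranks applicants, which matches the statement that screening scores were not revisited after the interview invitation.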





The MMI process, committee deliberations, admissions decisions


MMI: six stations. All interview stems are unique on a given interview day.

The MMI process at RWJMS consists of a six-station MMI. Each station consists of a behavioral descriptor or situational judgment-type interview stem addressing a specific AAMC COA core personal competency4 or combination of competencies. All interview stems are unique on a given interview day and written by one of the authors (C.A.T.). The MMI process at RWJMS employs only the 30 members of the standing committee, who participate in modified frame-of-reference training prior to the sessions. Extensive interviewer training allows for the assumption of adequate reliability with a six-station MMI.


Table 1

Table 1 demonstrates the behaviorally anchored rating scale for communication.




The following are rated on a 5-point scale

In each station, interviewers evaluate applicants on the basis of 

    • communication, 
    • content/argument, and 
    • overall global impression 

using a behaviorally anchored 1–5 Likert scale.



Statistical analysis


We ran a series of “what-if” analyses with different weightings. Because the measures are on different scales, they were converted to z scores before the alternative weightings were applied.

In addition to comparing differences in mean performance scores as a function of applicant self-reported race/ethnicity, we also conducted a series of “what-if” analyses to determine whether alternative weighting methods would have changed final admissions decisions and entering class composition. Because the different performance measures are on different numeric scales, we converted performance measures (GPA, MCAT, SCR score, and MMI) to z scores before implementing alternative weighting schemes.
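The z-score conversion and re-weighting described above might look like the following sketch. It is an illustration under stated assumptions rather than the authors' analysis code: the function names and the toy data are hypothetical, the population standard deviation is an arbitrary choice (the text does not say which was used), and the top-fraction cut-off is a simplification of an admissions decision.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise raw scores to mean 0 and SD 1 (population SD assumed)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]  # undefined if all values are equal


def composite(measures, weights):
    """Weighted sum of z scores per applicant.

    measures: dict of measure name -> list of raw scores (same applicant order)
    weights:  dict of measure name -> weight; missing measures count as 0
    """
    z = {name: z_scores(vals) for name, vals in measures.items()}
    n = len(next(iter(measures.values())))
    return [sum(weights.get(name, 0.0) * z[name][i] for name in z) for i in range(n)]


def accepted(measures, weights, top_fraction):
    """Indices of applicants in the top fraction of the composite ranking."""
    scores = composite(measures, weights)
    k = round(top_fraction * len(scores))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical toy data for four applicants (not taken from the study).
toy = {
    "GPA": [3.9, 3.2, 3.6, 3.4],
    "MCAT": [35, 28, 31, 30],
    "SCR": [4, 3, 5, 3],
    "MMI": [9.0, 12.0, 10.0, 11.5],
}
mmi_only = accepted(toy, {"MMI": 1.0}, top_fraction=0.5)
half_mcat = accepted(toy, {"MCAT": 0.5, "MMI": 0.5}, top_fraction=0.5)
```

In this toy data the two weighting schemes select different applicants, which is the kind of shift the “what-if” analyses are designed to expose; comparing the demographic make-up of each selected set would be the next step.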





Results



Traditional performance measures


The applicant-by-station interaction accounted for 33% of the variance. This interaction means that applicants’ performance varied across MMI stations, an effect known as context specificity.

The interaction between applicant and MMI station accounted for the second largest amount of variance (33%). This interaction effect indicates that applicant performance varied across MMI stations, an effect commonly referred to as “context specificity.”15







Relation of traditional performance measures to applicant diversity







“What-if” analyses: The effects of alternative weighting of performance measures on race/ethnicity composition of accepted applicants


The proportion of URIM applicants accepted ranged from 57% down to 22% depending on the weighting.

The proportion of URIM applicants accepted into the undergraduate medical program would have declined from 57% to 22% depending on weighting.









Discussion


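Giving MMI scores more weight than traditional academic or experience scores may increase racial/ethnic diversity. To our knowledge, this is the only report from a U.S. medical school showing that the MMI, unlike the MCAT or GPA, is neutral with respect to URIM applicants.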

Our findings suggest that increasing use of MMI scores in admission decisions may enhance racial/ethnic diversity among entering medical students, relative to reliance on traditional academic measures and experience scores. To our knowledge this is the only report from a U.S. medical school showing the neutrality of the MMI for underrepresented applicants, contrary to the MCAT or GPA.31


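MMI performance did not differ between URIM and non-URIM applicants, consistent with a small Canadian study. Extrapolation from that study is limited by its size and by the very different social and cultural backgrounds of the United States and Canada.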

Our results revealed that there was no statistical significance in MMI performance between URIM and non-URIM groups, a finding consistent with a small Canadian study on five aboriginal applicants.25 Extrapolation from that study, however, is limited because of the size of that study, and the very different social and cultural backgrounds of the United States and Canada. 


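The change in the racial/ethnic makeup of the top 45% of ranked students is even more striking. Reiter et al. combined MMI results from six Canadian medical schools, focusing on the MMI’s effect on enhancing diversity, increasing access to medical school and neutralising the effect of academic variables. McMaster’s approach used 60% GPA and 40% autobiographical questionnaire for interview invitations, and 70% MMI and 30% GPA for final selection. The Canadian study found that these differential weightings did not affect the diversity of accepted cohorts, as measured by income and community size.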

The change in racial/ethnic makeup of the top 45% ranked students who would be offered acceptance is even more surprising. Reiter et al17 combined MMI results of six Canadian medical schools over two years, focusing on MMI effect on enhancing diversity, increasing access to medical school, and neutralizing the effect of academic variables. McMaster’s formulaic approach to invitation for interview was 60% GPA and 40% autobiographical questionnaire, and postinterview selection was 70% MMI score and 30% GPA. The Canadian study found that these differential weighting schemes did not impact the diversity of accepted cohorts, as measured by income and community size.17











Acad Med. 2015 Dec;90(12):1651-7. doi: 10.1097/ACM.0000000000000960.

The Effect of Differential Weighting of Academics, Experiences, and Competencies Measured by Multiple Mini-Interview (MMI) on Race and Ethnicity of Cohorts Accepted to One Medical School.

Author information

  • C.A. Terregino is senior associate dean for education and associate dean for admissions, Rutgers Robert Wood Johnson Medical School, Piscataway, New Jersey. M. McConnell is assistant professor, Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada. H.I. Reiter is professor, Department of Oncology, McMaster University, Hamilton, Ontario, Canada.

Abstract

PURPOSE:

To examine whether academic scores, experience scores, and Multiple Mini Interview (MMI) core personal competencies scores vary across applicants' self-reported ethnicities, and whether changes in weighting of scores would alter the proportion of ethnicities underrepresented in medicine (URIM) in the entering class composition.

METHOD:

This study analyzed retrospective data from 1,339 applicants to the Rutgers Robert Wood Johnson Medical School interviewed for entering classes 2011-2013. Data analyzed included two academic scores (grade point average [GPA] and Medical College Admission Test [MCAT]), service/clinical/research (SCR) scores, and MMI scores. Independent-samples t tests evaluated whether URIM ethnicities differed from non-URIM across GPA, MCAT, SCR, and MMI scores. A series of "what-if" analyses were conducted to determine whether alternative weighting methods would have changed final admissions decisions and entering class composition.
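To make the group comparison in the method concrete, the following is a minimal sketch of an independent-samples t statistic. It assumes the classic pooled-variance (Student) form; the abstract does not say which variant the authors used, and the function name and any sample scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for a pooled-variance t test.

    Assumption: equal-variance (Student) form; the study may have
    used a different variant.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two scores")
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df
```

A t value near zero, as the study reports for MMI scores, means the two group means are close relative to the spread of the scores. Turning t into a P value additionally needs the t distribution, which is outside this sketch.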

RESULTS:

URIM applicants had significantly lower GPAs (P < .001), MCATs (P < .001), and SCR scores (P < .001). However, this pattern was not found with MMI score (non-URIM 10.4 [1.6], URIM 10.4 [1.3], P = .55). Alternative weighting analyses show that including academic/experiential scores impacts the percentage of URIM acceptances. URIM acceptance rate declined from 57% (100% MMI) to 43% (10% GPA/10% MCAT/10% SCR/70% MMI), 39% (30% GPA/70% MMI), to as low as 22% (50% MCAT/50% MMI).
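The "what-if" weighting analysis can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: each measure is standardized to a z-score before weighting (an assumption, since the abstract does not describe scaling), applicants are ranked by the weighted sum, and the top `n_accept` are treated as accepted. All names and the record layout are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to mean 0 and SD 1 (all zeros if there is no spread)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite_scores(applicants, weights):
    """Weighted sum of per-measure z-scores; weights maps measure -> fraction."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    standardized = {m: zscores([a[m] for a in applicants]) for m in weights}
    return [
        sum(w * standardized[m][i] for m, w in weights.items())
        for i in range(len(applicants))
    ]


def urim_acceptance_rate(applicants, weights, n_accept):
    """Fraction of URIM applicants who fall in the top n_accept by composite score."""
    scores = composite_scores(applicants, weights)
    ranked = sorted(range(len(applicants)), key=lambda i: scores[i], reverse=True)
    accepted = set(ranked[:n_accept])
    urim = [i for i, a in enumerate(applicants) if a["urim"]]
    if not urim:
        raise ValueError("no URIM applicants in the pool")
    return sum(1 for i in urim if i in accepted) / len(urim)
```

Calling `urim_acceptance_rate` with different weight dictionaries, for example `{"mmi": 1.0}` versus `{"gpa": 0.3, "mmi": 0.7}`, mirrors the kind of comparison reported in the results, with each applicant represented as a dictionary of scores plus a boolean `urim` flag.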

CONCLUSIONS:

Sole reliance on the MMI for final admissions decisions, after threshold academic/experiential preparation are met, promotes diversity with the accepted applicant pool; weighting of "the numbers" or what is written about the application may decrease the acceptance of URIM applicants.

PMID: 26488572 [PubMed - in process]


Medical Education Reform: The Asian Experience (Acad Med, 2009)

Tai Pong Lam, MBBS, MFM, PhD, MD, and Yu Ying Bess Lam, MA






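As the health care needs of local communities and the learning needs of students change, Asia, like other parts of the world, faces a widening mismatch between what medical schools teach and the skills doctors actually need.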

As a result of changing health care needs of local communities and the learning needs of the students, Asia, like other parts of the world, is haunted by a mismatch between what is taught at medical school and the actual skills that are needed by doctors to provide health care service.1


Asia is the world's largest and most populous continent, home to nearly 60% of the world's population. According to the UN, Asia can be divided into five subregions.

Asia, the world’s largest and most populous continent, contains nearly three fifths of the world’s total population.2 According to the United Nations, Asia is divided into five subregions. 

  • China, Hong Kong, Macau, Japan, North Korea, South Korea, Mongolia, and Taiwan are known as Eastern Asia.
  • Southern Asia includes Afghanistan, Bangladesh, Bhutan, India, Iran, Maldives, Nepal, Pakistan, and Sri Lanka.
  • Brunei, Cambodia, East Timor, Indonesia, Laos, Malaysia, Myanmar, Philippines, Singapore, Thailand, and Vietnam belong to Southeastern Asia.
  • Central Asia is composed of Kazakhstan, Kyrgyzstan, Tajikistan, Turkmenistan, and Uzbekistan. 
  • Armenia, Azerbaijan, Bahrain, Cyprus, Gaza, Georgia, Iraq, Israel, Jordan, Kuwait, Lebanon, Oman, Qatar, Saudi Arabia, Syria, Turkey, United Arab Emirates, Palestine, and Yemen are located in Western Asia.


With so large a population, Asia's racial and cultural backgrounds and levels of social development vary widely. Western-inspired medical education reforms share a common origin, but once applied in Asia they show great variation across regions and countries rather than uniformity.

With such a huge population, varied racial and cultural backgrounds, and diverse social developments, it is not surprising that variation instead of uniformity is found in the Western-inspired medical education reforms that span over Asia in spite of their common origination from the West. As Hays and Baravilala3 pointed out, any reform effort in Asia’s medical education must take into consideration the need for adjustment and adaptation in the local context. In other words, the direct and outright application of a Western medical education model or reform may not be viable in the Asian context.




Eastern Asia


The traditional education system in Eastern Asia is rooted in Confucianism, which is characterized by rote learning, an examination-oriented mindset, and the superior status of the teacher. This is cited as a barrier to education reform. Experience with applying PBL in Asia, even setting aside that PBL itself faces challenges and that its effectiveness is not yet firmly established, offers insight into how the obstacles that cultural differences pose to Western-based reform can be overcome.

However, the fact that the traditional education system in Eastern Asia is deeply rooted in Confucianism (an educational tradition characterized by rote learning, an examination-oriented mindset, and the superior status of the teacher) may present a barrier to the implementation of education reforms. The experiences in applying problem-based learning (PBL) in the Asian setting, regardless of the facts that PBL is itself facing challenges6 and that evidence of its effectiveness is still limited, provide insights on how cultural differences and barriers which limit the application of Western-based reform may be overcome by indigenous solutions.


Fu-Jen Medical School in Taiwan. Because the traditional Asian model places the teacher in a superior position, students in Asia, and in Eastern Asia especially, expect high-quality teachers. Beyond teaching ability, Eastern Asian society also expects teachers to have good character. Fu-Jen's experience suggests that the quality of PBL tutors matters from the perspective of Asian students.

Established in 2000, the Fu-Jen Medical School in Taiwan has adopted PBL across its entire curriculum. The emphasis PBL puts on student-centered learning signifies a significant deviation from, if not a violation of, the traditional Asian model,7 which places the teacher in a superior position. Because of these traditions, students in Asia, especially Eastern Asia, expect high-quality teachers. Besides teaching abilities, Eastern Asian society also expects its teachers to possess a positive personality.7 Fu-Jen’s experience suggests that studies on the quality of PBL tutors from the perspective of Asian medical students are important.


In Korea, the traditional medical education system is likewise challenged by new societal needs. Korean medical students feel they are not competent enough to practice independently until they have had task-based training. For them, changing how students learn is as important as changing how teachers teach. They believe that certain traditional psychological and behavioral traits, such as extreme politeness, passiveness, and blind respect for teachers, hinder student learning in higher education, and especially in Western-style medical education.

In Korea, the traditional education system in medical schools also faces challenges in meeting new societal needs. Korean medical students feel that they are not competent enough to perform solo practice unless they have undergone task-based training. For them, changing the style of learning is as important as changing the style of teaching. They believe that certain traditional psychological and behavioral traits, such as extreme politeness, passiveness, and blind respect for teachers are hindrances to student learning in higher education, especially in modern Western medical education.8



Teacher-directed rather than learner-directed education is also common in Japan. A study at Tokyo Women's Medical University (TWMU), which has run PBL for more than 10 years, found that PBL learners in a teacher-directed tradition had difficulty extracting problems. To overcome this, strategic interventions that motivate and help first-year students to extract problems and learning objectives were used to improve the conventional PBL program. The results were positive.

The tradition of a teacher-directed rather than a learner-directed education is also prevalent in Japan. In a study conducted by the Tokyo Women’s Medical University, where PBL has been integrated as a component of the preclinical curriculum for more than 10 years, it was found that PBL learners early in their training encountered difficulties in extracting problems under the influence of a teacher-directed tradition.9 To combat the situation, strategic interventions aimed to motivate and facilitate freshman students to extract problems and derive learning objectives were employed to modify the conventional PBL program. The results were positive,


In 1997 the University of Hong Kong introduced a new curriculum that integrates the basic and clinical sciences, using a system-based approach with small-group PBL. The new curriculum leads first-year students to change their passive style into a self-directed, problem-based mode. A transitional course that eases the move from high school to medical school was found to help with this change.

Introduced by the University of Hong Kong in 1997, the new medical curriculum focuses on integrating the basic and clinical sciences; courses in the traditional disciplines have been removed from the curriculum.10 The curriculum’s system-based approach with small-group PBL is probably the most revolutionary of its kind in Asia today. The new curriculum encourages first-year medical students to transform their passive learning style into a self-directed, problem-based mode. To facilitate this change, a transitional course from high school to medical school was developed and has been found to be helpful.11


In China, medical education reform lags behind the country's economic reform in both scope and depth. To meet society's growing needs, innovation in Chinese medical education has become an important and urgent issue. In the attempt to adopt Western-style medical education, differences in social and cultural background have created considerable obstacles. Learning by rote with an exam-oriented mindset has been accepted and practiced in China for hundreds of years. Because Confucian teaching emphasizes self-control, the development of individuality and creativity has been neglected or even suppressed. As a result, the general mindset in China has become overcautious and fearful of risk, which limits imagination.

In China, medical education reform lags behind its economic reform in depth and in scope. To cope with increasing societal needs, the innovation of medical education in China has become an urgent and important issue.13,14 In its attempt to adopt Western methodology in medical education, differences in social and cultural background have created considerable obstacles to the reform effort. Learning by rote has been widely accepted and practiced for hundreds of years in China, nourished by the exam-oriented mindset that has historically flourished in China.15 Because Confucian teaching emphasizes self-control, the development of personality and creativity is neglected and even suppressed. The result is a general mindset in China of being overcautious and fearful of risk, which limits imaginative thinking.


Since the 1990s, PBL has gradually spread among Chinese colleges, with positive results.

Since the 1990s, tertiary colleges in China adopting PBL have gradually increased, and the results are encouraging.16


A recent study shows that differences in cultural background between Chinese and Western students have led to notable differences in how PBL is adopted in China.

A recent study also pinpointed that because of different cultural backgrounds between the Chinese and the Western students, despite the similarity of the teaching process and outcomes to Western experience, notable differences are found in the adoption of PBL in China.19


Apart from cultural barriers, there are also limited economic conditions, a high proportion of rural inhabitants, a shortage of health care workers in rural areas, and a lack of standardized training.

Apart from cultural barriers, limited economic conditions, a huge population with a large percentage of rural inhabitants, a growing manpower shortage in the health care industry of rural China, and variations in training standards are ruling factors that hinder the development of medical education reform in China.15


Southern Asia


Medical schools in Southern Asia traditionally have colonial, European roots. As a result, the curriculum is a direct imitation of the West that ignores local culture and medical practice and often diverges from community expectations. Physician training in Southern Asia therefore fails to understand or respond accurately to the needs of local communities.

Medical schools in Southern Asia have traditionally modeled their education systems on their colonial, European roots. The curriculum, as a result, is a direct imitation from the West, which neglects local cultures and practices and often falls short of community expectations. The medical training is inadequate so physicians in Southern Asia are unable to understand and respond to community needs in the region.1,21


Faced with a lack of resources, countries in Southern Asia have sought to meet their communities' health care needs through medical education reform.

Faced with a lack of resources, developing countries in Southern Asia have focused on reforming medical education to fulfill the health care needs of their communities.


Some of these reforms were carried out with help from international organizations. Curricula that were traditionally didactic, teacher-centered, and discipline-based were reoriented into curricula that respond to specific local needs. Developing community-based educational programs, which focus on learning in community settings and understanding the health needs of the local population, is seen as a good direction.

Some of these reforms are carried out with help and support of international organizations. They have reoriented the traditional curricula which emphasize didactic, teacher-centered teaching and a discipline-based approach to a curriculum which is responsive to the specific needs of the community.20,22 The development of community-based educational programs,1,20,21,23 which focus on learning in community settings and understanding the health needs of local population groups as well as individuals in the community, is seen as a viable direction.24


Although manpower and resources vary, medical schools across the region are working to introduce PBL, community-based education, and integrated teaching.

Although variations in manpower and resources exist, tentative steps have been taken by medical schools in the region to introduce problem-based, community-oriented, integrated teaching for basic and clinical sciences in their curricula.


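Apart from medical education, health education for the general public is also considered important in the community. Community health should improve when patients are better informed and doctors are better qualified.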

Apart from medical education, health education is also deemed important in the community. The interaction of better-informed patients and well-qualified doctors may significantly improve community health.25


Despite political and social challenges, Pakistan has a long history of practicing community-based medical education. Aga Khan University introduced a curriculum aimed at meeting community health needs 20 years ago. Training of community doctors takes place at the grassroots level, where community-based organizations and groups identify their own needs. Building on Aga Khan University's success, the Pakistani government extended this educational model to all medical schools in the country.

Despite political and social challenges, Pakistan has a long history of practicing community-based medical education. Aga Khan University introduced a medical education curriculum 20 years ago that addresses the health care needs of the community at large. Medical training is carried out in the communities at the grassroots level; community-based organizations and groups identify their own needs. On the basis of Aga Khan University’s successful experience, the Pakistani authority has formally extended its community-oriented teaching model to all medical colleges in the country.22


Community-based medical education has produced good results elsewhere as well, for example in Sri Lanka's medical education reform and in India's Kerala state.

Elsewhere, the practice of community-based medical education has also reaped impressive results. In 2005, the demand for medical education curricular reform emerged in Sri Lanka, and emphasis was put on community-based training. India’s Kerala state experienced similar medical education reform. Both Sri Lanka and the Indian state of Kerala have maintained policies to achieve gender and social equity. As a result, Sri Lanka and Kerala have the best health indicators,


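Medical education reform in India, the world's second most populous country, has spurred a more comprehensive approach to health care delivery. The Indian economy has grown steadily over the past 20 years, but its wealth is unevenly distributed. Indian communities are also culturally, linguistically, and ethnically very diverse. Addressing the problems of different communities requires special attention and cultural sensitivity.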

The need for medical education reform in India, the second most populous country in the world, has sparked the urge for a more comprehensive approach to health care delivery.27 Although the Indian economy has grown steadily over the last two decades, its wealth distribution is uneven.1 Furthermore, the Indian communities are culturally, linguistically, and ethnically diverse. Special attention and cultural sensitivity are essential when addressing the unique problems of different communities. The research, training, and service missions of academic medicine in India need to be better linked up to provide appropriate health care services.26


In Nepal, community-based learning has been carried out for years, yet most teaching still takes place in acute hospital settings because of domestic political conflict. The Institute of Medicine in Kathmandu runs community diagnosis programs, and Kathmandu University School of Medical Sciences revised its curriculum to include community-based learning, with students visiting remote areas and interacting closely with local residents. Unfortunately, community field trips are gradually being cut back because of the insurgency.

In Nepal, though community-based learning has been carried out for many years, the majority of teaching is still conducted in acute hospital settings because of ongoing political conflicts within the country.28 Apart from the community diagnosis programs organized by the Institute of Medicine in Kathmandu, Kathmandu University School of Medical Sciences has revised its medical curriculum to include an emphasis on community-based learning by having students visit remote areas and neighboring districts to interact closely with local communities. Dismayingly, the community field trips have been gradually curtailed because of the waves of insurgency.28


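In the past, postgraduate medical education (PGME) in Southern Asia was rarely subject to external review or internal quality control. However, medical schools in Sri Lanka and Pakistan have invited external examiners from the United Kingdom, Australia, Singapore, and New Zealand to oversee their final exams.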

In the past, postgraduate medical training programs in Southern Asia were rarely subjected to external review or internal quality control.29 However, medical schools in Sri Lanka and Pakistan have begun to invite external examiners from the United Kingdom, Australia, Singapore, and New Zealand to oversee their final exams.




Southeastern Asia


As Amin et al. point out, medical schools in Southeastern Asia are moving toward curriculum integration, although most countries in the region face financial constraints and resistance. PBL is part of a hybrid curriculum in at least half of the region's medical schools. Early clinical exposure and community-based education are also gaining support.

As Amin et al22 remark, medical schools in Southeastern Asia are moving towards curriculum integration, though most of the countries in the region are facing financial constraints and resistance to change. PBL is part of the hybrid curriculum in at least half of the medical schools in the region.21 Early clinical exposure and community-based education are also gaining ground, and assessment reform is under way.30


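Malaysia, a newly industrialized country in Southeastern Asia, emphasizes community-based health care services because of its multiethnic, multicultural, and multilingual society.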

Malaysia, a newly industrialized country in Southeastern Asia, puts emphasis on the provision of community-based health care service because of its multiethnic, multicultural, and multilingual society. 


The integrated basic and clinical curriculum at Universiti Malaysia Sabah School of Medicine, newly established in 2003, is tailored to the community of Sabah. The medical school of Universiti Malaysia Sarawak adopted PBL, integrated teaching, and community-based education to reflect the specific health needs of Sarawak.

The integrated curriculum of basic medical sciences and clinical skills at the Universiti Malaysia Sabah School of Medicine, newly established in 2003, was tailored for the local community of Sabah.31 The medical school of Universiti Malaysia Sarawak adopted a PBL, integrated, community-based curriculum that reflects the specific health care needs of the people of Sarawak. For example, doctors have to be familiar with and sensitive to the beliefs and cultural practices of the 26 ethnic groups in Sarawak.32


In Singapore, starting in 1999, the NUS medical school (Yong Loo Lin School of Medicine) changed its teacher-centered, lecture-based, discipline-oriented curriculum into a student-centered, integrated, interactive one. Compared with the traditional British-style curriculum, the new curriculum aims to train doctors who can meet the health needs of Singaporeans. PBL is the core of the new curriculum and makes up 20% of it. Because the environment is accustomed to traditional teaching, PBL was not adopted in full.

Starting in 1999, the National University of Singapore’s Faculty of Medicine (today known as the Yong Loo Lin School of Medicine) implemented a new hybrid curriculum which shifted from a teacher-centered, lecture-based, discipline-oriented curriculum to a more student-centered, integrated, interactive, faculty-directed curriculum. Compared with its traditional British-style curriculum, the school hopes the new curriculum can better prepare students to meet the challenges of health care needs for the people of Singapore. PBL is the key feature of the new hybrid curriculum and takes up 20% of the curriculum. But because of an environment deeply entrenched in traditional education, the Singapore medical school has decided not to fully adopt the PBL curriculum.33


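Worth noting are the private medical schools newly emerging in Southern and Southeastern Asia. Private medical schools in the Philippines, Indonesia, Malaysia, Pakistan, Bangladesh, and elsewhere are free of government constraints and quick to adopt innovations such as PBL and community-based education, but the quality of their training is a serious concern.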

Worthy of note is the emerging phenomenon of private medical schools among the Southern and Southeastern Asian countries such as the Philippines, Indonesia, Malaysia, Pakistan, and Bangladesh. While private medical schools enjoy the advantage of quick adaptation to education innovations such as problem-based, community-oriented teaching without hindrance from government bureaucracy, quality of training is a serious concern.25,35


Central and Western Asia


In Central Asia, there have been attempts to reach international standards of medical education since the collapse of the Soviet Union.

In Central Asia, attempts have been made to reach international standards of medical education after the collapse of the former Soviet Union’s centralized medical education system.36


In Western Asia, undergraduate curricula are still lecture-based, but comparative studies with Western medical schools have been done, and they support the appropriateness of curriculum reform.

In Western Asia, even though the undergraduate curricula in most of the medical schools remain lecture-based and teacher-centered, comparative studies with Western innovative medical schools have been done, and the appropriateness of curriculum reform is substantiated.37


Bahrain adopted PBL in full in 1982.

In Bahrain, the College of Medicine and Medical Science of Arabian Gulf University adopted a full-scale PBL curriculum at its inception in 1982 and continues to refine the curriculum.



Conclusion


Despite these challenges, several important insights have emerged.

Despite the challenges, a number of important insights are revealed in the reform effort undertaken by different regions in Asia.


In developing countries, the top priority is meeting the health care needs of the community. Health care personnel should therefore be trained in community-based settings, understand the needs of the local population, and work more closely with the local community. This trend is already growing in the West, but given the complexity of ethnic and cultural composition and the scarcity of resources in Asian societies, fitting medical education reform to the local community context is all the more important.

In the developing countries, top priority is placed on fulfilling the health care needs of the community. In response, health care personnel should be trained in community-based environments so that they will be able to understand the needs of the local population and be able to work more closely with the local community. While this is already a growing trend in the West, the need to place medical education reform in the context of local community seems more pressing in Asian societies in view of the scarcity of resources and the greater complexity in ethnic and cultural composition.


Furthermore, it is important to promote medical education research and narrow the gap between research and education. Data on medical education in the region are limited, but it is encouraging that research findings are gradually being adopted in teaching at Asian medical schools. In a survey of 30 medical schools in Southeastern Asia, 72% of respondents reported having a medical education unit, most of them established after 1990.

Moreover, promoting research in medical education and bridging the gap between research and education are crucial areas that Asian medical schools should seriously consider.34 While data pertaining to medical education in this region are limited,24 it is encouraging to see that research findings in education are gradually being incorporated into the practices of Asian medical schools.34 A survey of 30 medical schools in Southeastern Asia showed that 72% of the respondents have existing medical education units, most of which were established after 1990.21




1 Improving health by investing in medical education. PLoS Med. 2005;2:e424.







Acad Med. 2009 Sep;84(9):1313-7. doi: 10.1097/ACM.0b013e3181b18189.

Medical education reform: the Asian experience.

Author information

  • Family Medicine Unit, The University of Hong Kong, Ap Lei Chau, Hong Kong. tplam@hku.hk

Abstract

Medical education reform is taking place all over the world including Asia, which has 60% of the world's population. Confronted with diverse social and cultural needs as well as resource constraints, various regions in Asia have carried out medical education reform at different levels and directions. In this article, the authors describe the application of Western-inspired reforms and localization and adaptation of Western models to fit the cultural and community needs in the five different subregions of Asia: (1) Eastern Asia, (2) Southern Asia, (3) Southeastern Asia, (4) Central Asia, and (5) Western Asia. The article reviews whether the medical education reforms brought improvement to the medical curricula and effectively fulfilled the cultural and social needs of Asian countries. The authors also explore the establishment of medical education departments in many Asian medical schools and the incorporation of research findings into medical practice. Departments of medical education will facilitate localization and promote further development of medical education reform in Asia despite the challenges ahead.

PMID: 19707080 [PubMed - indexed for MEDLINE]


Faculty development: a ‘Field of Dreams’? (Med Educ, 2009)

Yvonne Steinert, Peter J McLeod, Miriam Boillat, Sarkis Meterissian, Michelle Elizov & Mary Ellen Macdonald






Objectives: Participants in FD workshops often remark that 'those who need FD the most attend the least'.

OBJECTIVES Participants in faculty development workshops often comment that ‘those who need faculty development the most attend the least’.



‘If you build it, [they] will come.’ Kinsella, 1980 (ref. 1)


Clinical teachers can no longer succeed on content expertise alone. Other researchers have also described the skills needed to succeed as a clinical teacher.

Clinical teachers can no longer succeed with mere content expertise. Others4,5 have identified the skills required to succeed as teachers, including the ability to 

  • create an appropriate environment,
  • observe and assess learners,
  • provide feedback,
  • teach in multiple settings, and
  • role model effectively.


Despite many studies of FD programmes, research on attendance and participation is scarce. So far only one paper has addressed barriers to participation in teaching improvement programmes, which include the following.

Despite many published descriptions of faculty development programmes and activities,5–7 the literature on attendance and participation is scant. To our knowledge, there is only one descriptive paper8 that outlines potential barriers to participation in teaching improvement programmes. These barriers include:

  • the attitudes and misconceptions of teachers;
  • insufficient support from the institution, and
  • a lack of convincing research on the benefits of teaching improvement methods.


Non-participant attitudes that diminish the likelihood of participation comprise:

  • a tendency to underestimate the need for faculty development programmes;
  • lack of belief in the utility of teaching skills, and
  • a belief that teacher training is unrelated to teaching excellence.8



METHODS



Design


Advantages of focus groups

We selected focus groups13 as the primary method of data collection for this descriptive study. Focus groups, which encourage participants to 

    • recount their beliefs and practices, 
    • provide an opportunity for group interaction and 
    • trigger memories of hitherto forgotten experiences,14 

allowed us to meet our study objectives.



Subject recruitment



Focus groups


Conducted by a medical anthropologist

A medical anthropologist, who was not involved in our programme, conducted the focus groups. The focus group questions were pilot tested with members of the Centre for Medical Education and tapped four main areas of inquiry: 


Areas of inquiry

    • perceptions of faculty development; 
    • reasons for non-participation; 
    • perceptions of effective teaching methods and preferred learning formats, and
    • perceived barriers to participation. 

Purpose of the probes and role of the facilitator

‘Probes’ accompanied each question to stimulate thinking, to encourage faculty members to give detailed responses, and to solicit examples of more general observations.13 The facilitator...

    • took minimal field notes during the focus groups, but 
    • wrote more extensive notes within 24 hours of the session in order to record her impressions, 
    • capture main themes and 
    • facilitate preliminary data analysis


When a theme emerged in one group, it was checked again in later groups.

The focus group questions evolved slightly as the process unfolded; when salient issues emerged in one group, they were reiterated in subsequent groups to test their relevancy.



Data analysis


  • We audiotaped and transcribed all focus groups using standard rules of transcription.14 
  • We removed identifiers and names from the final transcripts.
  • All transcripts were reviewed by one of the investigators for accuracy. 
  • Content analysis guided the data analysis. 
    • Three of the investigators independently read all transcripts, using multiple close readings; 
    • recurrent themes were identified and agreed upon, and 
    • similar themes noted across transcripts were assembled and analysed together, for both specialties. 
    • Additional codes for newly emerging topics were created as needed. 
    • The final step in the analysis included the development of major categories and 
    • the identification of exemplar quotations illustrating each theme.


RESULTS


Three main categories

We grouped our findings into three main categories: 

1 What are faculty members’ perceptions of faculty development? 

2 Why do some faculty members not participate?

3 How can we get faculty members to attend more often?



What are faculty members’ perceptions of faculty development?


A number of participants recognized the need for FD and noted that they had never received 'instructions on how to teach'. One participant called FD 'grammar school' for teachers.

A number of participants perceived the need for faculty development and commented on how they had never been given ‘instructions on how to teach’. One participant likened faculty development to ‘grammar school’ for teachers:


‘It’s sort of like kids going to school; they all know how to speak, except that they are learning the grammar which allows them to refine their art and perhaps be more effective at doing it.’


One participant pointed to a possible 'confusion' with CME and the updating of clinical skills. Another described FD as 'CME for teaching skills'.

One participant described what might be seen as a possible ‘confusion’ with continuing medical education (CME) and the updating of clinical skills. Another viewed faculty development as ‘CME for teaching skills’.


Many participants noted the role of FD in promoting career development, as this question reflects.

The role of faculty development in promoting career development was also noted by many of the participants, as reflected in this rhetorical question:


‘Does faculty development just mean teaching or being a productive member of the [university] community?’



Why do some faculty members not participate?


• clinical reality, which includes volume of work and a lack of (protected) time; 

• a perceived lack of direction from, and connection to, the Faculty of Medicine;

• a perceived lack of recognition and financial reward for teaching, and 

• the geographically central location of faculty development activities and other logistical issues.


Clinical reality, volume of work and lack of (protected) time


Perceived lack of direction from, and connection to, the Faculty of Medicine


A number of participants wanted the university to provide direction or career guidance to help them reach personal and professional goals. This included a wish for an orientation programme or package when first starting as faculty, and a wish for someone to advise them on what to do at the outset.

A number of participants highlighted a desire for direction or career guidance from the university to help them to achieve personal and professional goals. This sentiment included a strong wish for an orientation programme or package on first starting at the university, as well as the desire to have someone tell them what to do at the outset:



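Participants also expressed a strong wish for mentorship on personal and career development, and for knowing which resources they could access (information on promotion and tenure, grant writing). They also wanted instrumental support from the university, ranging from structural and educational tools to parking vouchers and online resources.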

Participants also expressed a strong preference for mentorship to help promote personal and career development, as well as knowledge about what resources they could access (e.g. information on promotion and tenure, grant writing possibilities and support). In addition, they voiced a desire for instrumental support from the university, ranging from the provision of structural and educational tools, to parking vouchers and online resources.



교육에 대한 인정 혹은 재정적 보상 부족

Perceived lack of recognition and financial reward for teaching


FDP가 개설되는 장소, 다른 로지스틱 이슈

Location of faculty development activities and other logistical issues



다수의 포커스그룹 참여자들은 4시간 워크숍은 너무 길고, 강력하게 '짧은 교육'을 선호했다. 실제로, 여러 참여자들은 그랜드라운드에서 교육 주제를 다룰 것을 요청하거나, FD를 모듈 형식으로 혹은 '위성(파견) 워크숍'형식으로 해주기를 바랐다.

A number of focus group participants commented that a 4-hour workshop was too long, and voiced a strong preference for ‘short snappers’. In fact, several participants requested educational topics at grand rounds, faculty development in a modular format, or ‘satellite workshops’:



어떻게 더 자주 참여하게 만들 수 있을까?

How can we get faculty members to attend more often?



지역에서 (그리고 더 짧은) 프로그램 제공

Offer local (and shorter) activities


교육을 인정하고 보상해주기

Recognise and reward teaching



FD를 요구(기대)사항으로 만들기

Make faculty development an expectation


여러 참여자들이 FD가 모든 임상교육자에게 의무화 될 것을 제안했다. 의무화에 반대하는 일부도 있었으나, 이들도 FD가 '기대사항(expectation)'이 되어야 한다고 제안했다.

Several participants suggested that participation in faculty development be mandatory for all clinical teachers. Others balked at this idea, but did suggest that faculty development be made an ‘expectation’:


상호연결성의 인식 확립하기

Build a sense of connection



DISCUSSION


가장 놀라운 결론은 참여자들이 FD를 교수로서 한 개인의 일반적 성장에 대한 것으로 인식하고 있었으며, 단순히 스킬 습득으로 인식하지 않는다는 점이었다. 사실 교수개발은 교수를 포괄적 관점에서 개발하는 것이며, 여기에는 personal and career 개발이 포함된다. 단순히 교육/연구/행정에 관한 특정한 역량 향상이 아니다.

One of the most surprising results of this inquiry was the finding that participants perceive faculty development as referring to a person’s general development as a faculty member, not just his or her skills acquisition. In fact, faculty development was perceived as the development of faculty in the broadest sense, which included personal and career development, and not merely the enhancement of specific competencies related to teaching, research or administration.


또 다른 기대하지 못한 결과는 많은 참여자들이 FD에 대해 물었을 때 '대학에 대한 실망'을 표했다는 점이다. 다수가 '단절'을 느꼈고, 이는 종종 심리적 문제였다.

Another unexpected finding was that many of the 16 focus group participants, when asked to describe their views on faculty development, described a sense of disappointment with the university at large. Many felt ‘disconnected’, often in a psychological sense.


많은 경우에 우리의 결과는 Baldwin이 generalist academic doctors의 요구도 조사 결과와 비슷하다. 이 연구자들도 심대한 변화를 이루기 위해 필요한 세 가지 포괄적 니즈를 발견했다.

In many ways, our findings resonate with the observations of Baldwin et al.9 in their needs assessment of generalist academic doctors. These researchers found that participants identified three global needs requiring significant change: 

  • a better understanding of and rewards for their academic activities; 
  • better networking with one another, and 
  • more control over their time and responsibilities.


다양한 방식으로 본 연구는 FDP가 학자로서의 스킬을 다룰 것을 강조했으며, 이는 FD에서 간과되는 영역이다.

In diverse ways, our study’s findings underscore the need for faculty development programmes to address professional academic skills, often neglected in faculty development,16 as well as institutional goals and priorities.7


Morzinski 등은 professional academic growth를 이루기 위해서 FD에서 멘토링의 중요성을 강조했는데, 여기에는 가치/지식/동료관계 등이 포함된다. 우리의 결과는 FD가 진로개발에 보다 초점을 맞출 것을 권고한다. 초기 연구에서 Steinert는 FD가 조직개발에 초점을 맞출 것을 강조했다. 또한 학문적 활동으로서 교육의 촉진, 교육적 문화 배양, 교육 리더십/혁신/수월성에 대한 보상 등도 강조했다.

Morzinski et al.17 highlighted the importance of mentoring as a faculty development strategy to address professional academic growth, which includes the values, knowledge and collegial relations needed to succeed as an academic. The findings of our inquiry further reinforce the need for faculty development to focus on career development. In an earlier paper, Steinert16 highlighted the need for faculty development to focus on organisational development: to play a role in promoting teaching as a scholarly activity and to create an educational climate that encourages and rewards educational leadership, innovation and excellence.


Skeff 등의 연구 및 Liben의 연구에 더하여 본 연구는 교수들이 중앙-기반(centrally based) FDP에 참여하지 않는 이유가 '시간이 없어서'라고 하였다.

The results of our study also build upon those of Skeff et al.8 and complement the findings of a more recent survey by Liben et al.,19 who observed that faculty members do not participate in centrally based activities because they ‘cannot afford the time’.


초반에 말한 것처럼 Skeff 등은 FDP 참여에 다양한 장애를 언급했는데, 흥미롭게도 우리 연구의 참여자는 기관 차원의 지원 부족만을 주요 장애로 언급했다. Skeff가 언급한 '태도'의 문제(FDP의 필요성이 낮다든가 교육스킬의 효용성이 없다든가)를 언급한 사람은 매우 소수였다.

As we noted at the outset, Skeff et al.8 described a number of barriers to participation in faculty development activities. Interestingly, the participants in our study only highlighted insufficient institutional support as a major deterrent. Very few of our teachers identified the attitudes noted by Skeff et al.,8 including a tendency to underestimate the need for faculty development programmes or a lack of belief in the utility of teaching skills.






 2009 Jan;43(1):42-9. doi: 10.1111/j.1365-2923.2008.03246.x.

Faculty development: a 'field of dreams'?

Author information

  • 1Faculty Development Office, McGill University, Montreal, Quebec, Canada. facdev.med@mcgill.ca

Abstract

OBJECTIVES:

Participants in faculty development workshops often comment that 'those who need faculty development the most attend the least'. The goals of this study were to explore the reasons why some clinical teachers do not participate in centralised faculty development activities and to learn how we can make faculty development programmes more relevant to teachers' needs.

METHODS:

In 2006, we conducted focus groups with 16 clinical teachers, who had not participated in faculty development activities, to ascertain their perceptions of faculty development, reasons for non-participation and perceived barriers to involvement. Content analysis and team consensus guided the data interpretation.

RESULTS:

Focus group participants were aware of faculty development offerings and valued the goals of these activities. Important reasons for non-participation emerged: clinical reality, which included volume of work and lack of (protected) time; logistical issues, such as timing and the central location of organised activities; a perceived lack of financial reward and recognition for teaching, and a perceived lack of direction from, and connection to, the university.

CONCLUSIONS:

Clinical reality and logistical issues appeared to be greater deterrents to participation than faculty development goals, content or strategies. Moreover, when asked to discuss faculty development, teachers referred to their development as faculty members in the broadest sense, which included personal and career development. They also expressed the desire for clear guidance from the university, financial rewards and recognition for teaching, and a sense of 'belonging'. Faculty development programmes should try to address these organisational issues as well as teachers' personal and professional needs.

PMID: 19140996


새천년의 교수개발: 도전과 미래 방향(Med Teach, 2000)

Faculty development in the new millennium: key challenges and future directions
YVONNE STEINERT
Department of Family Medicine, Sir Mortimer B. Davis Jewish General Hospital and Faculty of
Medicine, McGill University, Canada

 

 

 

 

 

 

FD는 의학교육에서 매우 중요한 요소가 되었다. FD활동은 모든 교육연속체에서 교사의 효과성 향상을 위해 설계된다.

Faculty development has become an increasingly important component of medical education. Faculty development activities have been designed to improve teacher effectiveness at all levels of the educational continuum.

 

이 논의에서 FD는 다음의 정의를 따른다. "기관이 교수들의 역할을 새롭게(renew)하거나 도와(assist)하기 위해서 활용하는 광범위한 활동". 즉, FD는 교수로서 일을 수행하는데 필수적으로 여겨지는 영역에 있어서 개인의 지식이나 스킬을 향상시키기 위해 설계된 모든 활동이며, 여기에는 교육, 연구, 행정이 모두 포함된다. 더 나아가 FD는 기관이나 교수들을 다양한 역할에 대비시키고 생산성과 생명력을 유지하기 위한 프로그램들도 포함한다.

For the purpose of this discussion, faculty development refers to that broad range of activities that institutions use to renew or assist faculty in their roles (Centra, 1978). That is, faculty development is considered to be any planned activity designed to improve an individual’s knowledge and skills in areas considered essential to the performance of a faculty member, including teaching, research and administration (Sheets & Schwenk, 1990). Moreover, faculty development includes those programs designed to prepare institutions and faculty members for their various roles and to sustain their productivity and vitality (Bland et al., 1990).

 

교수개발의 포커스

The focus of faculty development

 

지금까지, FDP의 대다수는 교수의 교육스킬 개발에 초점을 두어 왔으며, 개인의 성장이나 교수/조직적 요소(의사결정, 변화 프로세스)에는 관심을 덜 가져왔다. 비록 다른 FD 활동이 기술된 바 있지만, 교육 개선 또는 교육 효과성을 지나치게 강조한 측면이 있으며, 더 포괄적인 프로그램이 고려되어야 한다. 특히 FDP는 리더십이나 조직관리 스킬, 프로페셔널 학문 스킬, 조직개발 등이 교육개선 프로그램의 '메뉴'에 포함되어야 한다. 추가로, 특정 내용(정보 테크놀로지, 프로페셔널리즘, EBM) 과 '교육자들 교육시키기' 프로그램도 필요하다.

To date, the majority of faculty development programs have focused on the improvement of faculty members’ teaching skills (Hitchcock et al., 1993; Irby, 1996), with minimal attention being paid to the personal development of faculty members or organizational elements such as decision-making or the change process (Lipetz et al., 1986). Although other faculty development initiatives have been described, most notably in the area of research (e.g. Hekelman et al., 1995; Holloway et al., 1988), there has clearly been an over-emphasis on teaching improvement and instructional effectiveness, and more comprehensive programs should be considered. In particular, faculty development programs designed to enhance leadership and management skills, professional academic skills, and organizational development should be added to the ‘menu’ of teaching improvement programs. In addition, we should offer programs that focus on the teaching of specific content areas (e.g. information technology; professionalism; evidence-based medicine) and ‘educating the educators’.

 

 

 

리더십과 관리 기술

Leadership and management skills

 

유례없는 헬스케어 분야의 변화가 교수의 역할과 보상체계를 바꿔놓고 있다. 내적, 외적 영향력에 의해 의사들은 점점 관리자적 역할과 리더십 역할을 맡아야 하게 되었으며, 그러나 아직 어떻게 우리가 교수들을 이 역할에 대비시킬지는 잘 모른다. 일부 프로그램이 이 목적으로 설계되었으나 리더십, 관리자, 행정스킬 개발 등에 대한 강조가 더 필요하다.

Extraordinary changes in health care delivery have significantly altered faculty roles and rewards (Bland & Simpson, 1997). In response to internal and external forces, physicians are being asked to take on increasing administrative and leadership roles, and yet, how do we formally prepare our faculty members for these challenges? Several programs designed to address this need have been described (e.g. McGaghie et al., 1981; Morahan et al., 1998; Steinert et al., 1997a); however, an increased emphasis on leadership, management, and administrative skill development is essential in these times of change.

 

이런 분야에 해당하는 프로그램에는 다음과 같은 것이 있을 수 있다.

Content areas for such programs might include: 

    • 조직의 구조 understanding ‘formal’ and ‘informal’ organizational structures; 
    • 현 정치, 경제, 조직 압력 분석 analyzing current economic, political, and organizational pressures and trends; 
    • 리더십과 관리 스킬 leadership and management skills; 
    • 갈등 관리와 협상 conflict management and negotiation; 
    • 시간 관리 time management; 
    • 수행능력 평가 performance appraisal; and 
    • 재정 관리 financial management (Bogdewic et al., 1997; Burke et al., 1997; Irby, 1996). 


Bogdewic이 말한 것처럼 조직과 리더십 스킬은 전통적인 의미에서 교수가 맡는 교육/연구/진료적 역할에 부가적인 것이 아니라, 이제는 핵심적 중요성을 갖는 스킬이다.

As Bogdewic and colleagues (1997) have said, organizational and leadership skills can no longer be thought of as an adjunct to the traditional roles of teaching, research, and service. These skills are of central importance.

 

 

프로페셔널 학문 스킬

Professional academic skills

 

프로페셔널 학문 스킬은 학자로서 성공하기 위해 필요한 가치/지식/동료관계 등을 말한다. 이 스킬은 학계의 핵심 가치, 규범, 기대치에 대한 이해, 어떻게 생산성 높은 커리어를 관리하는지, 경험이 많고 박식한 동료와의 네트워크를 갖추는지 등을 필요로 한다.

Professional academic skills encompass the values, knowledge, and collegial relations needed to succeed as an academic (Morzinski et al., 1996). These skills include an understanding of the underlying values, norms and expectations of academia, knowing how to manage a productive career, and establishing a network of experienced and knowledgeable professional colleagues (Bland et al., 1990; Wilkerson & Irby, 1998).

 

관련 토픽

Examples of topics to be addressed include: 

    • academic promotion 달성 how to achieve academic promotion; 
    • 멘토 찾고 함께 일하기 how to identify and work with a mentor; 
    • 동료와 함께 일하기 how to work with colleagues; and 
    • 전문가 네트워크 개발 how to develop professional networks (Bland et al., 1990; Hitchcock et al., 1997).

 

조직 개발

Organizational development

 

여러 저자들이 조직시스템의 변화와 리더십 전략이 더 생산성높은 교육환경을 만들기 위해 필요하다고 주장한다. 그러나 비록 1980년대부터 조직개발이 FD의 한 부분이 되었지만, 이 분야에 특정한 FD노력은 미미하다. 이 영역에 포함되는 것들.

Several authors have suggested that changes in organizational systems and leadership strategies may be needed to promote more productive educational environments (Bland et al., 1990; Bogdewic et al., 1997). However, although organizational development became part of the language of faculty development in the 1980s (Ramsey & Hitchcock, 1980), few faculty development efforts have specifically targeted this content area. Initiatives in this domain should include 

    • 참여적 조직 정책과 구조 efforts to create participative and empowering organizational policies and structures; 
    • 우수한 교육의 평가와 보상 procedures to evaluate and reward teaching excellence; and 
    • 교육과정 운영과 교실간 협력 programs to enhance curriculum administration and collaboration across departmental boundaries (Irby, 1996). 


Lipetz 등은 "FD의 클라이언트는 누구인가?"라는 흥미로운 질문을 던졌다. 명백히, 우리는 개인과 조직의 니즈를 연결시켜야 하며, 조직개발과 개인의 스킬개발의 짝을 이룰 수 있어야 한다.

Lipetz and colleagues (1986) have posed an interesting question: “Who is the client in faculty development?” Clearly, we need to link individual and organizational needs (Bland & Simpson, 1997), and we should pair organizational development with individual skill development (Baxley et al., 1999).

 

특정 내용분야의 교육

The teaching of specific content areas

 

Cruess & Cruess 는 의학교육의 모든 레벨에서 변화하는 사회적 기대에 부응할 것을 강조했다. 커뮤니케이션 기술은 충분한 관심을 받지 못하고 있다. 비록 이 주제가 전통적으로 도제교육과 롤모델 분야에서 다뤄졌지만, 현재의 의료전달체계 맥락에서는 이러한 트레이닝 방법의 가능성은 낮으며, 더 공식화되고(formal) 조직적 방법이 필요하다.

Cruess & Cruess (1997a,b) have highlighted the need to teach professionalism at all levels of medical education in response to changing societal expectations. Communication skills are also not receiving the attention they deserve. Although these subjects have traditionally been addressed through apprenticeship and role modeling, the current context for health care delivery negates the potential of these training methods, and we need to consider more ‘formal’, systematic methodologies for addressing these content areas.

 

컴퓨터와 정보 테크놀로지. 

At the same time, computers and information technologies are transforming many aspects of our personal and professional lives (Irby & Hekelman, 1997). As a result, the demand for training in this area will increase significantly in the next decade. Crandall and colleagues (1997) outline a series of skills that might be included in such faculty development initiatives: 

    • accessing and managing the medical literature; 
    • planning and delivering lesson plans and presentations; 
    • using computers for research and writing; and
    • integrating computers into clinical practice.

 

 

교육자들 교육시키기

Educating the educators

 

교수개발자들은 개개 교수들의 교육 효과성을 향상시키기 위한 프로그램의 전달 측면에서 성공을 거뒀다. 그러나 이제는 교육에 있어서 리더십을 발휘할 수 있는 개인들을 어떻게 더 발전시킬 수 있는지, 어떻게 그들이 '교육의 멘토'로서 역할을 할 수 있는지, 혁신적 FDP를 어떻게 설계하고 전달할 수 있는지 고민할 시간이다.

Faculty developers have succeeded in delivering programs designed to enhance individual teachers’ instructional effectiveness. It is now time, however, to further develop individuals who will be able to provide leadership to educational programs, act as ‘educational’ mentors, and design and deliver innovative faculty development programs.


Cusimano & David가 기술한 것과 같이, 다른 사람을 교육하는 방법에 대해서 훈련된 사람이 더 많아야 한다. 그리하여 의학교육이 지속적으로 변화의 동력에 반응할 수 있게 해야 한다. 또한 우리는 교육 측면의 학자를 더 양성해서, 이들이 교육에 접근할 때 교육과 교육과정의 프로세스와 성과에 대해 질문하도록 해야 하며, 의학교육 연구를 수행하게 해야 한다.

As Cusimano & David (1998) have stated, there is an enormous need for more health care professionals trained in methods of educating others so that medical education will continue to be responsive to driving forces of change. We must also work to encourage the development of educational scholars, individuals who approach education with questions about the process and outcome of teaching and curricula (Wilkerson & Irby, 1998) and who conduct research in medical education.

 

 

 

트레이닝 방법과 형식들

Training methods and formats

 

 

공식 멘토십

‘Formal’ mentorships

 

멘토링은 교수들의 사회화/개발/성장을 촉진할 수 있는 흔한 전략이다. 

Mentoring is a common strategy to promote the socialization, development, and maturation of academic medical faculty (Bland et al., 1990). It has also been recommended as a faculty development strategy by a number of educators (Bower et al., 1998; Longhurst, 1994; Morzinski et al., 1994, 1996).

 

Daloz 는 멘토십 모델을 세 가지 핵심 요소의 균형으로 보았다. (지지, 도전, 비전)

Daloz (1986) has described a mentorship model that balances three key elements: support, challenge, and a vision of the individual’s future career. 

      • 불확실성과 불안 줄여주기 Support refers to those activities that affirm the value of the individual or try to reduce uncertainty or anxiety (Bower et al., 1998). 
      • 자신이 가진 가정을 점검하고 성찰의 가치 일깨움 Mentors challenge their colleagues by encouraging them to check out their assumptions and reflect on their values and competencies; and 
      • 롤모델링, 토론 they foster career vision through role modeling or guided discussion. 


이 세 가지 요소의 균형을 통해서 멘토는 변화와 성장에 필요한 핵심적 텐션을 만들 수 있다. 롤모델의 가치와 멘토의 가치는 Osler 시대부터 강조되어 왔으며, 이 방법이 주는 장점을 잊어서는 안된다.

By balancing these three components, mentors create a tension essential for change and growth. The value of role models and mentors has been highlighted since Osler’s time, and we should not forget the benefits of this method of professional development despite new technologies and methodologies.

 

 

통합적, 장기 프로그램 

Integrated, longitudinal programs

 

일부 연구자들은 '통합적, 장기 프로그램'의 가치를 강조했다 

Several authors (e.g. Elliott et al., 1999; Gelula, 1997) have highlighted the value of ‘integrated, longitudinal programs’ such as the Teaching Scholars Program in North Carolina (Stritter et al., 1994) and at McGill University, and we should build on this new faculty development practice.

 

비록 이들프로그램 대부분이 교수의 교육자로서 역할에 초점을 두지만, 행정, 관리, 연구에 대한 프로그램도 쉽게 개발 가능하다.

Although the majority of these programs to date have focused on the educational role of faculty members, such programs could easily be designed to promote expertise in administration, management, and research.

 

분권화된 활동

Decentralized activities

 

많은 부분 가정의학 분야에서 처음 시작한 FDP는 종종 각 학과 단위로 혹은 중앙에서 조직되어 운영된다. 커뮤니티 프리셉터가 늘어나고 외래-기반 교육이 늘어나면서 우리는 점점 FDP를 대학 바깥으로 '수출'해야 하는 상황이 되었다. 우리는 또한 주니어와 시니어 교수의 서로 다른 니즈를 해소해주기 위해서 노력해야 하며, minority 교수를 위한 것도 필요하다. 또한 지원이 적은 환경(underserviced setting)에서 근무하는 사람들을 위한 것도 필요하다.

Faculty development programs, many of which started in Family Medicine, are often departmentally based or centrally organized (i.e. faculty-wide). Given the increasing use of community preceptors and ambulatory sites for teaching, we should now ‘export’ faculty development programs outside of the university setting (e.g. Anderson et al., 1991; Baxley et al., 1999; Bing-You et al., 1999; DeWitt et al., 1993). We must also work harder to address the differing needs of junior and senior faculty members (Burke et al., 1997; Lipetz et al., 1999), minority faculty members (Johnson et al., 1998; Rust et al., 1998), and individuals who work in underserviced settings (Freeman et al., 1998). Our focus to date may have been too limited.

 

 

자기주도 학습 프로그램

Self-directed learning initiatives


Ullian & Stritter 이 말한 바와 같이 교수들은 성찰, 학생평가, 동료평가 등을 통해 스스로의 니즈를 결정해야 하며, 스스로의 자기개발 활동을 설계해야 한다.

As Ullian & Stritter (1997) have said, faculty must be encouraged and taught to determine their own needs through self-reflection, student evaluation, and peer feedback, and they must learn to design their own development activities.


Harris and colleagues 는 교수 효과성의 향상에 있어서 성찰의  가치를 강조했다. 실제로 우리는 FD에서도 성찰을 개인적 성장의 방법으로 삼아야 한다.

Harris and colleagues (1995) have underscored the value of reflection as a method of improving teaching effectiveness; indeed, we should take advantage of the increasing attention paid to reflection as a method of personal growth (Schön, 1987) in faculty development initiatives as well.

 

 

컴퓨터 기반 FD

Computer-based faculty development

 

컴퓨터 기반 FD의 장점

Computer-based faculty development would allow for individualized programs targeted to specific needs. Moreover, the technology is now in place so that interactive instructional programs can be created in all domains of faculty development (Westberg & Whitman, 1997).

 

 

 

프로그램 평가

Program evaluation

 

 

더 철저한 프로그램 평가

More rigorous program evaluations

 

    공통적으로 나타나는 문제는 대조군, 비교군의 부재, 자기보고 측정에 지나친 의존, 작은 샘플 크기

Common problems have included a lack of control or comparison groups, heavy reliance on self-report measures of change, and small sample sizes.

 

가능하다면, 더 철저한 연구를 해야 함.

Whenever possible, we should try to conduct more experimentally rigorous research studies and work to overcome commonly encountered design problems.


참여자에 의한 학습을 기록해야 하며, 가능하다면 참가자의 학생/동료/기관에 대한 효과도 기록해야 함.

Programs should document learning by participants, and whenever possible, the program’s effect on the participants’ students, colleagues, and institution (Skeff et al., 1997b).


동시에 만족도에 대한 재평가가 필요함. 비록 연구자들이 이 정보의 가치를 평가절하하긴 하지만 참가자의 만족도는 교수들이 배우고자 하는 동기부여가 되고, FD를 동료들에게 권장하는 데 중요한 변인이다. 또한 프로그램 기획자들에게 가치있는 피드백이다. 

At the same time, we should re-assess the value of participant satisfaction data. Although researchers have denounced the value of this source of information, participant satisfaction remains an important variable if faculty members are to be motivated to learn and to recommend faculty development initiatives to their colleagues. Participant satisfaction also gives valuable feedback to program planners.

 

 

프로그램 평가의 다른 모델들

Other models of program evaluation

 

교육 관련 문헌들을 보면 프로그램 평가와 관련한 다양한 모델을 제시한다. 그 중 많은 것들은 FD에서 잘 고려되지 않고 있다. 이러한 모델을 평가 구조에 포함시키는 것은 가치가 있을 것. 

The educational literature is rich with models of program evaluation, many of which have not been systematically considered in faculty development. Incorporating aspects of these models (Popham, 1975; Wholey et al., 1994) into our evaluation schema would now be worthwhile. 

    • 목표달성 모델 For example, the application of a goal attainment model (e.g. Tyler, 1942) would force us to clarify our program goals and ensure that we are assessing the attainment of our objectives; this model would also help us to consider unanticipated consequences, which occur frequently in this domain. 
    • 판단 모델 A judgmental model (e.g. Scriven, 1974) would have value if faculty development programs were to become part of the accreditation process and receive feedback on program design and implementation from a group of peers and experts. 
    • CIPP모델 The CIPP model (Stufflebeam, 1974) could be useful for examining the faculty development literature (Meurer & Morzinski, 1997) and for decision-making in times of budgetary restraint. CIPP is an acronym representing four levels of evaluation: 
      • the program objectives and the basis for those objectives (Context)
      • the educational strategies and how they were chosen (Input)
      • the actual implementation and how it compares with planned activities (Process); and 
      • how well the needs of the target population were met (Product).


 

질적 방법론

Qualitative methods

 

 

양적방법론의 한계, 질적방법론의 가치

A number of authors have noted the limitations of quantitative methods in evaluating the effectiveness of faculty development programs and activities, and they have highlighted the value of adding qualitative methodologies to more traditional assessments (Freeman et al., 1992; Hitchcock et al., 1993; Skeff et al., 1997b).

 

더 광범위한 평가

Broader focus of evaluation

 

모든 경우에 'impact' 수준의 평가는 매우 값진 것이다. FD활동이 더 광범위한 시스템이나 개개인의 커리어패스에 영향을 주었는지에 하는 것. 학문적 전파(발표나 출판), 프로그램 개발(트레이닝 자료, 메뉴얼), 프로그램 수행 등에도 관심을 가져야 함.

Indeed, in all situations, it would be worthwhile to assess change at the impact level, trying to identify whether faculty development activities have had an impact on the system at large or on individuals’ career paths. We should also consider the question of academic dissemination (e.g. presentations and publications), product development (e.g. training materials and manuals), and implementation (Blumberg & Deveau, 1995).

 

 

장기 변화에 대한 평가. 즉각적 단기성과를 넘어선 평가. 6개월 혹은 그 이상 후에 평가한 연구는 매우 적음

Finally, it would be worthwhile to focus on the assessment of longer-term change. It is essential for us to move beyond immediate short-term outcome measures. Too few studies have assessed change at 6 months or longer (Nasmith et al., 1997).

 

 

파트너십

Partnerships

 

 

자원이 한정되고 재정적 제약이 있을 때 협력은 점점 더 중요하다. 실제로 파트너십은 여러 수준에서 가능하다.

Collaboration is becoming increasingly important in the current environment of limited resources and financial constraints. Indeed, partnership is possible at a number of levels: 

    • among academic institutions; 
    • between academic institutions and professional societies and organizations; 
    • between faculty development and continuing medical education (CME); and 
    • at an international level.

학문기관 간, 그리고 학문기관 내 협력의 필요성은 프로그램 기획/수행/평가의 모든 단계에서 강조되어왔다. Skeff 등은 다양한 전문가 조직에 의해서 주최되는 지역과 국가 단위 미팅에서 제공되는 여러 FD활동을 조화(coordinate)시킬 것을 권고했다. 

The need for collaboration among, and within, academic institutions has been highlighted in the area of program planning, delivery and evaluation (Steinert et al., 1997b). Skeff and his colleagues (1997b) have also pointed out the need to coordinate faculty development activities that are offered at regional and national scientific meetings hosted by various professional organizations. The time to consolidate available activities, and avoid duplication, is upon us.

 

국제 파트너십도 중요하다. 북미와 유럽의 전문성을 감안할 때, 성공한 모델이 공유되어야 한다.

International partnerships also hold great promise. Medical schools in many countries wish to start academic medical programs but do not have specifically trained faculty available. Given the expertise in faculty development in North America and in Europe, successful models should be shared. 

    • For example, Johnson & Zammit-Montebello (1990) describe an interesting program to train Maltese general practitioners in Malta with a visiting professor of Family Medicine. 
    • Thompson & Spann (1997) provide an example of a faculty development program they developed for Latin American physicians, conducted in Spanish in an American University.

These models for enhancing academic skills could also be exported to other settings.

 

 

근본 원칙

Underlying principles

 

 

1. 기관의 맥락과 문화를 이해하고 이에 기반하여 FD하라

1. Understand and work within the institution’s context/culture


기관의 문화와 맞아야 하고 니즈에 반응해야 한다. 조직의 강점을 강조하고, 조직의 수장(리더)와 함께해야 한다. 추가로 현재의 맥락을 FD노력을 촉진하고 향상시킬 수 있게 활용해야. 예컨대, 큰 교육과정이나 교육의 개혁이 있는 시기에 FD활동이 중요하다. Rubeck과 Witzke가 언급한 바와 같이 "자연적으로 발생하는 기회"를 노려야 한다.

Faculty development programs need to match the institution’s culture and be responsive to its needs (Rubeck & Witzke, 1998). They should also capitalize on the organization’s strengths and work with the leadership to ensure success. In addition, we should remember that the current context can be used to promote, or enhance, faculty development efforts. For example, faculty development activities during times of substantial educational or curricular reform can take on added importance. As Rubeck & Witzke (1998) have stated, we should always remember to look for “natural opportunities”.

 

 

2. 니즈에 기반한 FD를 하라

2. Ensure that programs and activities are based on needs

 

교수의 니즈, 기관의 니즈, 학생의 니즈, 사회의 니즈, 환자의 니즈, 조직의 요구와 도전 등

Faculty development programs should anticipate, and base themselves on, the needs of faculty members as well as the institution in which they work. Student needs, patient needs, and societal needs, as well as organizational demands and challenges, should be considered in the design of all programs, for faculty development should aim to renew and assist faculty in their diverse roles and help to meet the needs of the organization in which they work.

 

3. 지지를 끌어내고 효과적으로 마케팅하라

3. Promote ‘buy in’ and market effectively

 

FD에 참여할지 말지에 대한 결정은 그렇게 단순하지 않다. 아래와 같은 요인이 있다.

The decision to participate in faculty development is not as simple as it might at first appear. It involves 

    • 특정 FDP에 관한 관심(reaction)
      the individual’s reaction to a particular faculty development offering, 
    • 특정 기술을 얻고자 하는 동기
      motivation to develop or enhance a specific skill, 
    • 시간이 가능한지
      being available at the time of the faculty development session, and 
    • 필요하다는 사실을 인정하는 심리적 장벽
      overcoming the psychological barrier of admitting need (Rubeck & Witzke, 1998). 


이러한 한계를 극복하고 우리의 '상품'을 팔 때에 그러한 저항이 학습의 자원이 되게 해야 한다.

As faculty developers, it is our challenge to overcome these potential obstacles and to market our ‘product’ in such a way that resistance becomes a resource to learning.

 

4. 다양한 프로그램과 방법을 제공하라

4. Offer diverse programs and methods

 

 

다양한 교수(역할, 발달단계 등)들의 니즈를 민감하게 반영해야 함. 

The need for diverse approaches to faculty development has been highlighted by many authors (Rubeck & Witzke, 1998; Steinert et al., 1997b). As discussed earlier, we must design programs that are sensitive to the needs of different faculty members. We must also consider differing faculty roles and address the various developmental stages of faculty members.

 

5. 성인학습의 원리와 다른 관련 이론틀을 활용하라

5. Incorporate principles of adult learning and other relevant theoretical frameworks

 

 

많은 경우 이들 원칙은, FDP의 초점이나 형식과 무관하게, Knowles가 설명한 것처럼, FDP의 개발과 운영의 지침이 되어야 한다. 이는 우리가 다음을 기억해야 함을 말한다.

In many ways, these principles, best articulated by Knowles (1980), should continue to guide the development and implementation of all faculty development programs, irrespective of their focus or format. That is, we should remember 

    • 의사들의 자기주도성과 경험 that physicians demonstrate a high degree of self-direction and that they possess many experiences that should be used as a learning resource; 
    • '알아야 할 필요'를 경험한 다음에 학습할 준비가 됨 that adults will only become ready to learn after a ‘need to know’ is experienced; and 
    • FDP는 과제-중심, 경험학습, 즉각적 적용을 강조해야 that faculty development programs should be task-centered, with an emphasis on experiential learning and immediacy of application (Carroll, 1993).

 

As Turnbull (1999) has so eloquently said, until recently those of us responsible for educating future physicians have emphasized the art of medical education and have tended to ignore the fundamental science of learning underlying our basic practice. The same can be said of faculty development activities.

 

 

6. 실용성을 놓치지 마라

6. Remain relevant and practical

 

교수의 활동과 관련성이 있어야 하며 실용적이어야 한다. 경험학습이 핵심이다.

Although it is important that theory inform practice, faculty development activities and programs must remain relevant and practical. As stated above, experiential learning is key.

 

또한 개념과 스킬을 가르칠 때 단순하고 명확해야 한다. 비록 FD의 영역들은 복잡하지만, 교수들은 단순한 메시지, 개념, 방향을 원하며, 복잡성을 지양하고 실용성을 추구하는 것이 우리의 책임이다.

The teaching of concepts and skills in this area must also remain clear and simple. Although the domains for faculty development are complex (Rubeck & Witzke, 1998), faculty members want simple messages, concepts, and directions, and it is our responsibility to avoid complexity and promote practicality.

 

 

7. 흔한 문제를 극복하기 위해 노력하라

7. Work to overcome common problems

 

Common problems include a lack of institutional support, limited resources, and limited time. They can be overcome through creative programming, skilled marketing, targeted fundraising, and the delivery of high-quality programs.

Common implementation problems include a lack of institutional support, limited resources, and limited faculty time (Steinert et al., 1997b). Faculty developers must work to overcome these problems through creative programming, skilled marketing, targeted fundraising, and the delivery of high quality programs.

 

 

8. Evaluate and demonstrate effectiveness

 

Remember that the evaluation of faculty development is more than an academic exercise.

The need to evaluate our programs and activities has been highlighted in a separate section. However, we must remember that the evaluation of faculty development is more than an academic exercise.

 

 

 

 

 

 

 


 

Faculty development in the new millennium: key challenges and future directions
DOI: 10.1080/01421590078814
Yvonne Steinert

pages 44-50

Abstract

Faculty development initiatives in the year 2000 will need to respond to changes in medical education and health care delivery, to build on the achievements and accomplishments of the past, and to continue to adapt to the evolving roles of faculty members. To remain at the forefront, faculty development programs will need to broaden their focus, consider diverse training methods and formats, conduct more rigorous program evaluations, and foster new partnerships and collaborations. Academic vitality is dependent upon faculty members' interest and expertise; faculty development has a critical role to play in promoting academic excellence and innovation.

 

A systematic review of FD initiatives to improve teaching effectiveness (BEME Guide No. 8) (Med Teach, 2006)

A systematic review of faculty development initiatives designed to improve teaching effectiveness in medical education: BEME Guide No. 8


YVONNE STEINERT1, KAREN MANN2, ANGEL CENTENO3, DIANA DOLMANS4, JOHN SPENCER5, MARK GELULA6 & DAVID PRIDEAUX7 

1McGill University, Montreal, Canada; 2Dalhousie University, Halifax, Canada; 3Austral University, Buenos Aires, Argentina; 4University of Maastricht, Maastricht, The Netherlands; 5University of Newcastle upon Tyne, Newcastle, UK; 6University of Illinois at Chicago, Chicago, USA; 7Flinders University, Adelaide, Australia











Results: Most interventions targeted practicing clinicians. All studies aimed at teaching improvement, and the interventions were classified as workshops, seminar series, short courses, longitudinal programs, and others.

Results: The majority of the interventions targeted practicing clinicians. All of the reports focused on teaching improvement and the interventions included workshops, seminar series, short courses, longitudinal programs and ‘other interventions’. The study designs included 6 randomized controlled trials and 47 quasi-experimental studies, of which 31 used a pre-test–post-test design.

 

Even allowing for methodological limitations, the literature supports the following outcomes.

Key points: Despite methodological limitations, the faculty development literature tends to support the following outcomes: 


  • Overall satisfaction with FDPs is high. Participants consistently reported that programs were acceptable, useful, and relevant to their goals.
    Overall satisfaction with faculty development programs was high. Participants consistently found programs acceptable, useful and relevant to their objectives.
  • Respondents reported positive changes in attitudes toward FD and teaching.
    Participants reported positive changes in attitudes toward faculty development and teaching.
  • Participants reported improved knowledge of educational principles and improved teaching skills. Where tests were used to check knowledge gains, the gains were significant.
    Participants reported increased knowledge of educational principles and gains in teaching skills. Where formal tests of knowledge were used, significant gains were shown.
  • Changes in teaching behavior were consistently reported by participants, and students reported them as well.
    Changes in teaching behavior were consistently reported by participants and were also detected by students.
  • Organizational change and improved student learning were not often studied. Where they were, the findings showed greater involvement in education and established networks among colleagues.
    Changes in organizational practice and student learning were not frequently investigated. However, reported changes included greater educational involvement and establishment of collegiate networks.
  • Experiential learning, provision of feedback, effective peer relationships, well-designed interventions that follow principles of teaching and learning, and the use of varied teaching methods within a single intervention all contribute to effective FD.
    Key features contributing to effective faculty development included the use of experiential learning, provision of feedback, effective peer and colleague relationships, well-designed interventions following principles of teaching and learning, and the use of a diversity of educational methods within single interventions.

 

Academic vitality depends on faculty members' interest and expertise. FD plays a key role in promoting academic excellence and innovation.

Academic vitality is dependent upon faculty members’ interest and expertise; faculty development has a critical role to play in promoting academic excellence and innovation. (Wilkerson & Irby, 1998)

 

A variety of FDPs have been designed and introduced to help faculty members fulfill their multiple roles, including workshops, seminars, and so on. Many of these activities were designed to improve teacher effectiveness across the medical education continuum, and they have been offered to healthcare professionals at local, regional and national levels.

To help faculty members fulfill their multiple roles, a variety of faculty development programs and activities have been designed and implemented. These activities include workshops and seminars, short courses and site visits, fellowships and other longitudinal programs. Many of these activities have been designed to improve teacher effectiveness across the medical education continuum (e.g. undergraduate and postgraduate education), and they have been offered to healthcare professionals at local, regional and national levels (Clark et al., 2004; Skeff et al., 1997).


Faculty development


 

FD has been defined in various ways.

Faculty development has been defined as

  • Activities that institutions use to renew or assist faculty in their roles
    that broad range of activities that institutions use to renew or assist faculty in their roles (Centra, 1978), and
  • Initiatives to improve faculty members' performance in teaching, research and administration
    includes initiatives designed to improve the performance of faculty members in teaching, research and administration (Sheets & Schwenk, 1990).
  • A planned program that prepares institutions and faculty members for their academic roles (teaching, research, administration, writing, career management)
    In many ways, faculty development is a planned program to prepare institutions and faculty members for their academic roles, including teaching, research, administration, writing and career management (Bland et al., 1990).
  • Improving practice and the management of change by enhancing individual strengths and abilities as well as organizational capacities and culture
    Faculty development is also meant to improve practice and manage change (Bligh, 2005), by enhancing individual strengths and abilities as well as organizational capacities and culture.

 

FDPs have been classified in various ways.

Faculty development programs have been classified in different ways.
  • Classified as organizational strategies, fellowships, comprehensive local programs, workshops and seminars, and individual activities
    Ullian & Stritter (1997) describe a typology that includes organizational strategies, fellowships, comprehensive local programs, workshops and seminars, and individual activities.
  • Professional orientation for new faculty members, instructional development, leadership development, and organizational development
    Wilkerson & Irby (1998) offer a different classification, ranging from professional orientation for new faculty members to instructional development, leadership development and organizational development. These authors also suggest that all four elements comprise a comprehensive approach to faculty development that is fundamental to academic vitality.
  • FDPs are an outward sign of the inner faith that institutions have in their staff, and successful FD shows up as improved teaching and better learning outcomes for students or doctors.
    Bligh (2005) has made a similar suggestion, stating that faculty development programs are outward signs of the inner faith that institutions have in their workforce, and that successful faculty development performance is expected to result in improved teaching and better learning outcomes for students or doctors.


To date, several publications have reviewed the effectiveness of FD activities.

To date, a number of publications have reviewed the effectiveness of faculty development activities.

  • Although FDPs are numerous, they are rarely evaluated, and evaluation is mostly limited to short satisfaction questionnaires.
    In 1984, Sheets & Henry observed that despite the growth in faculty development programs, evaluation of these initiatives was a rare occurrence, usually consisting of short questionnaires tapping participants’ satisfaction.
  • A review of FD for family medicine educators reached a similar conclusion and called for more rigorous evaluation based on observed behavior change.
    In 1990, Sheets & Schwenk reviewed the literature on faculty development activities for family medicine educators and made a similar observation, calling for more rigorous evaluations based on observed changes in participant behavior.
  • Summarizing earlier reviews, the authors found that the concept of FD was evolving and expanding. In particular, teaching skills were a prominent aspect of FD, fellowships were effective for recruiting and training new faculty, and the efficacy of FD needed more research.
    In 1992, Hitchcock et al. summarized earlier reviews of the faculty development literature (e.g. Stritter, 1983; Bland & Schmitz, 1986; Sheets & Schwenk, 1990) and concluded that the concept of faculty development was evolving and expanding. In particular, they observed that teaching skills were a prominent aspect of faculty development, that fellowships were being used effectively to recruit and train new faculty, and that the efficacy of faculty development needed better research documentation.
  • A review of 24 papers concluded that, although some positive outcomes were reported, methodological weaknesses stood in the way of definitive conclusions.
    In 1997, Reid et al. reviewed 24 papers (published between 1980 and 1996) and concluded that despite some positive outcomes for fellowships, workshops and seminars, methodological weaknesses precluded definitive conclusions regarding faculty development outcomes.
  • FD must respond to changes in medical education and health care and keep adapting to the evolving roles of faculty members. More rigorous program evaluation is also needed. FDPs should broaden their focus, consider diverse training methods and formats, and seek new partnerships and collaborations.
    In 2000, Steinert highlighted the need for faculty development to respond to changes in medical education and healthcare delivery, to continue to adapt to the evolving roles of faculty members, and to conduct more rigorous program evaluations. She also commented that faculty development programs need to broaden their focus, consider diverse training methods and formats, and foster new partnerships and collaborations.


Objectives

 

The review was limited to studies focused on strengthening faculty members' teaching abilities.

The goal of this review is to determine the effect of faculty development activities on faculty members’ teaching abilities and to assess the impact of these activities on the institutions in which these individuals work. We focused specifically on programs designed to improve faculty members’ teaching abilities because the majority of faculty development programs have targeted this particular role (Hitchcock et al., 1992; Irby 1996); instructional effectiveness is central to the mission of medical education; and we wanted to limit the scope of our search to a feasible task. We did not examine faculty development programs designed to improve research or writing skills, administrative or management skills, or professional academic skills (career development). We also chose to limit the review to faculty development programs designed for teachers in medicine, and did not examine those programs specifically designed for residents or other healthcare professionals (e.g. nurses; dentists). All types of faculty development interventions (e.g. workshops, short courses and seminars, and fellowships) were included in the review.



Review question


What features make FD effective?

What are the features of faculty development that make it effective? 


Does FD make a difference?

Does faculty development make a difference?

  •  What makes for effective faculty development?
  •  Does participation in faculty development improve faculty members’ teaching, research and administrative skills?
  •  Does faculty development have an impact on the institutional climate and organization?

 

What are the effects of FD interventions on teachers' knowledge, attitudes and skills, and on the institutions to which those teachers belong?

What are the effects of faculty development interventions on the knowledge, attitudes and skills of teachers in medical education, and on the institutions in which they work?

 In addition, we also explored the following questions: 

  • What characterizes the faculty development activities that have been described?
  • What are the methodological strengths and weaknesses of the reported studies?
  • What are the implications of this review for faculty development practices and ongoing research in this area?


Review methodology


Group formation


An international Topic Review Group (TRG) of individuals representing six countries was constituted. Three criteria were used to invite individuals for TRG participation: international diversity; practical experience in faculty development and medical education; and expertise in educational research methodology.



The pilot process


A two-step pilot process was undertaken to prepare for the formal, systematic review. 


Development of a conceptual framework


Figure 1

The pilot phase led to the development of a conceptual framework that guided this review (see Figure 1).


Kirkpatrick's model was used.

To classify and analyze outcomes, we used Kirkpatrick’s model of educational outcomes (Kirkpatrick, 1994), which offers a useful evaluation framework for this purpose (see Figure 2).


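Kirkpatrick held that these outcomes are not hierarchical and that the model is intended to give a more holistic and comprehensive evaluation to inform policy and program development.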

In his original work, Kirkpatrick (1967) asserted that these outcomes were not hierarchical and that the model is intended to provide a more holistic and comprehensive evaluation that can inform policy and program development. The model has also been used by other BEME groups (e.g. Issenberg et al., 2005) as well as other review groups (e.g. Freeth et al., 2003), and with some modifications, was well suited to our review.


Inclusion/exclusion criteria


Based on the pilot studies, the following criteria guided the selection of articles for review:


Search strategy and sources of papers


A literature search was conducted on Medline and ERIC using the following key words: staff development; in-service training; medical faculty; faculty training/development; and continuing medical education. (A copy of the search strategy is included in Appendix I, which is available on the BEME website: http://www.bemecollaboration.org)



Selection methods and judgment of methodological quality


The literature search resulted in a total of 2777 abstracts. A two-stage process was employed in the selection of studies eligible for review (Freeth et al., 2003) and is outlined in Figure 3.

 




Data management techniques


Data extraction, analysis and synthesis

What data were extracted, analyzed and synthesized?


Review findings


Overview of studies included in review


(a) Description of the interventions and expected outcomes, which will be further divided into: setting, professional discipline, focus of the intervention, program type, instructional methods, duration, and level of outcome assessed.

(b) Methodological quality of the studies, which will be further divided into: study goal and theoretical framework, study design, data-collection methods, data sources, and study quality and strength of findings.



(a) Description of the interventions and expected outcomes


Setting (country, institution)

Setting: Of the 53 papers reviewed, 38 studies (72%) took place in the US, the remainder being in Canada, Egypt, Israel, Malta, Nigeria, the UK, Switzerland and South Africa. Most activities were delivered in a university, hospital or community setting, with several initiatives offered by professional associations.

 

Professional discipline of participants (internal medicine, family medicine; 40% of initiatives included more than one discipline; some targeted basic scientists)

Number of participants (6 to 399, mean 60)

Professional discipline: The majority of faculty development interventions targeted practicing clinicians, with a preponderance of activities in family medicine and internal medicine. Interestingly, 21 of the faculty development initiatives (40%) welcomed more than one clinical discipline. Five interventions (10%) were designed for both clinicians and basic scientists; an additional two (4%) targeted basic scientists only. The number of participants in the interventions (which does not equal respondents for the evaluative component) ranged from six to 399, with a mean attendance of 60.



Table 2. Summary of faculty development outcomes by Kirkpatrick level.*


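Focus of the intervention: teaching improvement, clinical teaching, feedback and evaluation, small-group teaching, lecturing skills, learner-centered teaching, teaching of specific content, general teaching improvement, personal/career development, organizational change, administration and leadership, research skills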

Focus of the intervention: As a result of the selection criteria, all of the reports focused on teaching improvement. The majority aimed to improve clinical teaching, with a secondary emphasis on feedback and evaluation, small-group teaching and lecturing skills. Several studies highlighted ‘learner centeredness’ as an outcome, and several others focused on the teaching of specific content areas in addition to general teaching improvement (e.g. communication skills and medical interviewing; principles of family medicine and preventive medicine). Although the primary focus of these reports was instructional improvement, many also addressed personal/career development, organizational change, administration and educational leadership, and research skills.


Program type: workshops, seminar series, short courses, longitudinal programs, individual feedback, augmented feedback, site visits. Inconsistent and varied use of terms made classification difficult.

Program type: The majority of activities were workshops (n = 23; 43%), of varying duration. Ten (19%) of the interventions were described as a seminar series and six (11%) as a short course. Five (10%) were described as a longitudinal program (e.g. fellowship) and nine (17%) fell under ‘other’, which included a seminar method, individual or augmented feedback, or site visits. An inconsistent and variable use of terms (e.g. workshops and seminars; seminars and short courses) complicated this classification; however, whenever possible, the authors’ terminology was used.


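Instructional methods: lectures, small-group discussions, interactive exercises, role plays, simulations, and videotape review. No program was delivered by lecture alone, and most included an experiential component with feedback (such as micro-teaching). Some programs offered on-site training where teachers could apply what they learned right away. Although educational projects and in vivo practice were part of some interventions (mostly seminars and short courses), very few studies described a link to teachers' ongoing educational activities. Needs assessments were rarely carried out.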

Instructional methods: All reports described a wide range of instructional methods that included lectures, small-group discussions, interactive exercises, role plays and simulations, films and videotape reviews of performance. No programs were completely lecture-based, and the majority included an experiential component with opportunities for guided practice with feedback (i.e. micro-teaching). Some programs offered on-site training opportunities where teachers could readily apply what they learned. Few described a direct link to teachers’ ongoing educational activities, although educational projects and in vivo practice were part of several interventions (most notably seminars and short courses). Needs assessments were used sparingly.


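Duration: FD interventions ranged from one hour to one year. Workshops (generally one-time interventions) ranged from three hours to one week, with a median of two days. Seminar series (held over an extended period) ranged from 12 hours to one month, with a median of 14 hours. Short courses ranged from one week to one month. Fellowships were both full-time and part-time, and one intervention amounted to 50 hours over 18 months.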

Duration: The faculty development interventions ranged in duration from one hour to one year. Workshops, which were generally one-time interventions, ranged in duration from three hours to one week, with a median duration of two days. The seminar series, which occurred over time, ranged in duration from 12 hours to one month (with a median duration of 14 hours), and the short courses ranged from one week to one month. Fellowships were both full time and part time in nature, and one intervention, entitled a ‘longitudinal program’, was 50 hours in length over 18 months. 


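Outcomes assessed: 74% of studies assessed reaction (satisfaction, perceived usefulness, acceptability, value of the activity); 77% assessed learning (changes in attitudes, knowledge, skills); 72% assessed behavior; at the results level, 13% assessed organizational change and 6% assessed change in student or resident learning.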

Level of outcome assessed: Table 2 shows that 39 studies (74%) assessed reaction, which included participant satisfaction, perception of program usefulness and acceptability, and value of the activity. Forty-one studies (77%) assessed learning, which included changes in attitudes, knowledge or skills. Thirty-eight (72%) assessed change in behavior. At the results level, seven studies (13%) reported change in organizational practice and three (6%) assessed change in student or resident learning.
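The tally above can be reproduced with a short sketch. The dictionary layout, the level labels and the `share` helper are illustrative assumptions rather than anything defined in the review; only the counts come from the paragraph above. Because a single study could assess several levels, the counts are not expected to sum to 53.

```python
# Number of studies assessing each Kirkpatrick level, as reported in the text.
# Labels and structure are illustrative; a study may appear at several levels.
TOTAL_STUDIES = 53

level_counts = {
    "reaction": 39,
    "learning": 41,
    "behavior": 38,
    "results: organizational practice": 7,
    "results: student or resident learning": 3,
}


def share(count: int, total: int = TOTAL_STUDIES) -> int:
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


percentages = {level: share(n) for level, n in level_counts.items()}

for level, pct in percentages.items():
    print(f"{level}: {level_counts[level]} of {TOTAL_STUDIES} ({pct}%)")
```

Keeping the counts alongside the percentages makes it easy to see how few studies reached the results level compared with the first three levels.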

 

 


 

(b) Methodological quality of the studies


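Study goal and theoretical framework: All studies stated their goal, and some stated more specific objectives (e.g., evaluating the effect of an FDP on teaching behaviors or attitudes). All but seven cited relevant literature, and 57% linked their work to a conceptual or theoretical framework (mainly adult learning, instructional design, experiential learning, reflective practice).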

Study goal and theoretical framework: All 53 reports stated their objective, sometimes quite broadly (e.g. to describe, implement and evaluate a faculty development initiative). Some reports described more specific objectives, outlining a particular study question such as assessing the effectiveness of a faculty development program on teaching behaviors (Hewson, 2000) or attitudes (Schmidt et al., 1989). One study examined the effect of experience on workshop gains (Baroffio et al., 1999), and several others assessed different methods of assessment (Nasmith et al., 1997; Hewson et al., 2001) and program evaluation (Sheets, 1985). All but seven cited the relevant literature, though often in a very limited fashion. Thirty reports (57%) placed their work within a conceptual or theoretical framework, primarily drawing upon principles of adult learning, instructional design, experiential learning and reflective practice.


Study design: 11% RCTs, 89% quasi-experimental. Of the 45 single-group designs, 69% used a pre-test/post-test design; 26% of studies used a post-test only. None used a qualitative approach alone; 21% incorporated a qualitative method.

Study design: Of the 53 papers reviewed, there were six (11%) randomized controlled trials. The majority of studies (n = 47; 89%) were quasi-experimental in design, with two including a comparison group in the main part of the study. Of the 45 single-group designs, 31 (69%) employed a pretest–post-test design. Fourteen studies (26%) used a post-test only. None of the reports used a qualitative approach only, though 11 (21%) incorporated a qualitative method (or analysis) in their design.
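As a quick consistency check on the design counts above, the following sketch confirms that the categories partition cleanly and shows which denominator each reported percentage appears to use. The variable names and the `pct` helper are illustrative assumptions; the counts are taken from the paragraph.

```python
# Study-design counts as reported in the text; names are illustrative.
total = 53
rct = 6
quasi_experimental = 47
with_comparison_group = 2
single_group = 45
pre_post = 31
post_only = 14
qualitative_component = 11

# The reported categories should add up without gaps or overlaps.
assert rct + quasi_experimental == total
assert with_comparison_group + single_group == quasi_experimental
assert pre_post + post_only == single_group


def pct(part: int, whole: int) -> int:
    """Whole-number percentage of part within whole."""
    return round(100 * part / whole)


print(pct(rct, total))                    # share of all papers
print(pct(quasi_experimental, total))     # share of all papers
print(pct(pre_post, single_group))        # share of single-group designs
print(pct(post_only, total))              # share of all papers
print(pct(post_only, single_group))       # same count against the other base
print(pct(qualitative_component, total))  # share of all papers
```

The 69% figure for pre-test/post-test designs matches a base of 45 single-group studies, while the 26% figure for post-test-only designs matches a base of all 53 papers, so the two percentages should not be added together.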


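Data collection: end-of-workshop questionnaires, pre- and post-measures, student/resident/self-assessment, observation of teaching behavior, etc. Questionnaires were the most common method. 55% used a questionnaire only; 38% used a questionnaire plus another method. Most questionnaires were developed for the particular study, and very few studies reported psychometric properties. 30% carried out direct observation (e.g., of videotaped sessions).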

Data collection methods: Methods to evaluate faculty development programs included end-of-workshop questionnaires, pre- and post-test measures to assess attitudinal or cognitive change, student, resident and self-assessment of post-training performance, and direct observations of teaching behavior. Questionnaires were the most popular method of data collection. All but four of the interventions used a survey or questionnaire. Twenty-nine (55%) of the interventions used a questionnaire only; 20 (38%) used a questionnaire and another method (e.g. observation; expert opinion). Most questionnaires were designed for a particular study, and few reports described psychometric properties. Sixteen studies (30%) included direct observation (of live or videotaped teaching sessions) as part of their assessment methodology. 


Data sources: Most studies relied on self-reports of teaching, with very limited use of performance-based measures of change. 28% used student or resident ratings. Some used expert opinion. Student exam scores and patient ratings of resident behavior also appeared. Response rates were low or unspecified.

Data sources: The majority of programs relied on self-reported ratings of teaching, with a limited use of performance-based measures of change. Fifteen studies (28%) employed student or resident ratings to assess changes in teaching behaviors. An additional two used expert opinions to assess outcomes. One study assessed student exam scores; another included patient ratings of resident behaviors. In many studies, the response rates for outcome measures were low or unspecified; statistical methods or differences were often not described.


Study quality and strength of findings: On a five-point scale (subscales were originally included but were not reliable), study quality averaged 3.14, ranging from 1 to 5 (1 = low, 5 = high). Strength of findings averaged 2.88, ranging from 1 to 4 (1 = no clear conclusions can be drawn; 3 = conclusions can probably be based on results; 5 = results are unequivocal).

Study quality and strength of findings: Study quality was rated on a five-point scale (1 = low; 5 = high), and reviewers were asked to indicate study strengths and weaknesses. We had originally included subscales to rate the evaluation methods (e.g. appropriateness of and implementation of study design; appropriateness of data analysis), but this did not yield reliable results. We therefore chose to use an overall rating for this variable. Strength of findings was rated on a five-point scale with specific anchors (1 = no clear conclusions can be drawn; 3 = conclusions can probably be based on results; 5 = results are unequivocal). The mean rating for study quality was 3.14, with a range from 1 to 5. The mean rating for strength of findings was 2.88 (with a range of 1–4).



Summary of findings by intervention type


(a) Workshops


Twenty-three interventions were workshops, most of them single interventions of varying duration.

Twenty-three of the interventions reported were described as workshops, most commonly a single intervention of varying duration.

 

Only seven of the 23 described a conceptual or theoretical framework.

Only seven of the 23 stated a theoretical or conceptual framework.



(b) Short courses


Six of the 54 interventions took the form of a short course, ranging from one week to one month. All stated their objectives, and five of the six provided a theoretical framework.

Six of the 54 interventions (Sheets & Henry, 1984, 1988; Gordon & Levinson, 1990; Skeff et al., 1992b; DaRosa et al., 1996; Pololi et al., 2001) were in the form of a short course, ranging in duration from one week to one month. All had a stated objective and all but one provided a theoretical framework.


(c) Seminar series


Ten studies described a seminar series, characterized by sessions spaced over time.

Ten studies described a seminar series characterized by the fact that the sessions were spaced over time.


(d) Longitudinal programs and fellowships


One study reported a longitudinal program; all stated their objectives.

One report described a longitudinal program.

All had stated objectives and all but one incorporated a theoretical framework.



A closer look at the strongest studies

The focused picture


Eight studies scored 4 or higher on both study quality and strength of findings. Looking at these studies separately gives the following.

Eight articles scored 4 (or higher) for both study quality and strength of findings, and we chose to examine these separately in order to provide a more focused picture of faculty development.

 

Effect sizes could be calculated for four of the eight studies, using mean scores and standard deviations. Table 3 shows the results. Effect sizes varied, but all four studies showed moderate to high effect sizes. In other words, the interventions had an effect, both on particular aspects of teaching and for the groups of teachers who benefited from them.

Four of the eight studies included in our focused review provided data that allowed for the calculation of effect size (Baroffio et al., 1999; Skeff, 1983; Skeff et al., 1986; Mahler & Benor, 1984). Mean scores and standard deviations were drawn from the data and were converted into effect sizes (d) using Cohen's d calculation (Cohen, 1988). These effects are shown in Table 3, where these studies are summarized. While effect sizes varied, moderate to high effect sizes were found in all four studies, highlighting the effects of the interventions, particular aspects of teaching that were affected, and groups of teachers who might benefit from the intervention.
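To make the conversion from means and standard deviations to an effect size concrete, here is a minimal sketch of Cohen's d. It assumes the common pooled-standard-deviation form for two groups; the review does not say which variant was applied to each study, so the formula choice, the function names and the numbers below are all illustrative assumptions. The size labels follow Cohen's conventional benchmarks (0.2 small, 0.5 medium, 0.8 large).

```python
from math import sqrt


def cohens_d(mean_1: float, sd_1: float, n_1: int,
             mean_2: float, sd_2: float, n_2: int) -> float:
    """Standardized mean difference (group 2 minus group 1) using the pooled SD."""
    pooled_variance = (
        (n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2
    ) / (n_1 + n_2 - 2)
    return (mean_2 - mean_1) / sqrt(pooled_variance)


def size_label(d: float) -> str:
    """Label an effect size using Cohen's conventional benchmarks."""
    magnitude = abs(d)
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "medium"
    if magnitude >= 0.2:
        return "small"
    return "negligible"


# Hypothetical teaching ratings, NOT data from any of the reviewed studies.
d = cohens_d(mean_1=3.2, sd_1=0.6, n_1=30, mean_2=3.7, sd_2=0.7, n_2=30)
print(round(d, 2), size_label(d))
```

For a pre-test/post-test comparison within the same group, other denominators (for example the pre-test standard deviation) are sometimes used instead of the pooled value, which is one reason effect sizes from different designs should be compared with care.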


(a) Description of the interventions and expected outcomes


The interventions ranged from a 45-minute feedback session to a month-long seminar series.

The interventions described in these eight reports ranged from a 45-minute feedback session for clinical teachers (Marvel, 1991) to a month-long seminar series designed to facilitate dissemination of workshop concepts (Stratos et al., 1997). One study described two workshops aimed at improving tutor behavior, each consisting of several phases (Baroffio et al., 1999). Another study provided augmented feedback, consisting of norm-referenced graphic summaries of teachers’ clinical teaching performance ratings, together with individually written clinical teaching effectiveness guidelines, to attending staff and residents (Litzelman et al., 1998). Two studies assessed the benefits of a four-day workshop designed to improve teachers’ cognitive styles (Mahler & Benor, 1984; Mahler & Neumann, 1987), and two studies assessed the impact of an intensive feedback and seminar method on clinicians’ teaching behaviors (Skeff, 1983; Skeff et al., 1986).

 

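All studies assessed behavioral change, which corresponds to level 3 or 4. Four studies examined participant satisfaction, three examined changes in learning, seven assessed change in teaching behavior, and three assessed change at the level of the student or system.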

All of the studies assessed behavioral change, targeting level 3 or 4 of Kirkpatrick’s model. Four studies included participant satisfaction. Three studies examined changes in learning (i.e. knowledge, attitudes or skills); seven studies assessed change in teacher behavior and three assessed change at the level of the student or system. One study assessed outcome at all four levels (Skeff et al., 1986).


(b) Methodological quality of the studies


Three were RCTs and five were single-group designs, with one study including a non-equivalent control group. All eight used a pre-test/post-test design; three added a delayed post-test.

Three of the eight studies (38%) were randomized controlled trials; the remaining five (62%) were single-group designs, with one study including a non-equivalent control group for one part of the intervention. All eight studies employed a pre-test–post-test design, with the addition of a delayed post-test in three.


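Six of the eight used questionnaires (most of which were tested for reliability and based on a theoretical construct). Three of these six also measured performance objectively. The other two used only observed measures of performance.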

Six of the eight studies (75%) used questionnaires (the majority of which were tested for reliability and based on a theoretical construct). Three of these same six studies also incorporated objective measures of performance. The two remaining studies used observed measures of performance only.


All eight used data sources other than participants' self-report. Five used student and resident ratings of teaching behavior; five used ratings by trained observers.

All of the eight studies used data sources other than participants’ self-report. Five of the studies incorporated student and resident ratings of teacher behavior; five utilized trained observer ratings.




Discussion


Summary of outcomes


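High satisfaction with FDPs: Even though participants were volunteers, they consistently reported that FDPs were acceptable, useful, and relevant to their personal objectives. Practical and skills-based methods were rated highly.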

High satisfaction with faculty development programs: Overall satisfaction with faculty development programs was high. Notwithstanding the fact that the participants were volunteers, they consistently found the programs acceptable, useful and relevant to their personal objectives. The methods used, especially those with a practical and skills-based focus, were also valued by program participants.


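Changes in attitudes toward teaching and faculty development: Attitudes toward both changed positively. Participants became more aware of their personal strengths and weaknesses, were more motivated and enthusiastic about teaching, and came to value professional development. These effects appeared in both open-ended questions and pre-post measures.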

Changes in attitudes towards teaching and faculty development: Participants reported a positive change in attitudes towards faculty development and towards teaching as a result of their involvement in a faculty development activity. They cited a greater awareness of personal strengths and limitations, increased motivation and enthusiasm for teaching, and a notable appreciation of the benefits of professional development. This impact was observed both in answers to open-ended questions and in pre–post measures of attitudinal change.


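Gains in knowledge and skills: Participants reported increased knowledge of educational concepts as well as of various aspects of teaching (specific teaching strategies, a more learner-centered approach). They also gained skills (assessing learners' needs, promoting reflection, giving feedback). Tests of knowledge, though not often used, showed positive change.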

Gains in knowledge and skills: Participants often reported increased knowledge of educational concepts and principles as well as various aspects of teaching (e.g. specific teaching strategies; a more learner-centered approach). They also described gains in skills (e.g. assessing learners’ needs, promoting reflection and providing feedback). Formal tests of knowledge, though infrequently used, also demonstrated positive changes.


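Changes in teaching behavior: Self-perceived changes in teaching behavior were consistently reported. Although student evaluations did not always reflect the changes that FDP participants perceived, change in teaching behavior does appear to be detectable. For example, changes in teaching behavior were reported for 15 of 23 workshops and 7 of 10 seminar series.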

Changes in teaching behavior: Self-perceived changes in teaching behavior were consistently reported. While student evaluations did not always reflect the changes that participants perceived, there was evidence that change in teaching performance was detectable. For example, changes in teaching behavior were reported for 15 (of 23) workshops and seven (of 10) seminar series. New educational initiatives, designed and implemented during the intervention, were also described.


조직과 학생 학습의 변화: 흔히 평가되는 것은 아니지만, 이것을 평가한 소수 연구를 보면 새로운 교육활동에 더 적극적으로 참여하고, 동료들과 새로운/개선된 네트워크를 형성한다.

Changes in organizational practice and student learning: Changes in student (or resident) behavior as well as organizational practice were not frequently investigated. However, in those few studies that examined organizational practice, participants reported a greater involvement in new educational activities and the establishment of new and improved networks of colleagues. The latter outcome was most frequently noted for the seminar series and longitudinal programs.



핵심 특징 요약

Summary of ‘key features’


경험학습의 역할: 배운 내용을 적용하는 것, 스킬을 연습하는 것, 스킬에 대한 피드백을 받는 것의 중요성이 여러 연구에서 강조되었다. 이 연구자들은 모두 교수들이 배운 것을 연습할 필요가 있으며, 즉각적 관련성과 실용성이 핵심이라고 하였다.

The role of experiential learning: The importance of applying what has been learned (during the intervention and afterwards), practicing skills, and receiving feedback on skills learned was highlighted by several authors (Irby et al., 1982; Coles & Tomlinson, 1994; Hewson, 2000), all of whom suggest that faculty members need to practice what they learn, and that immediate relevance and practicality is key (e.g. Sheets & Henry, 1984, 1988).


피드백의 가치: 변화를 만드는 데 있어 피드백의 역할은 여러 보고된 인터벤션에서 명백하다. 추가로 여러 연구에서 인터벤션 전략으로서 피드백 활용을 조사하였으며, systematic한 건설적 피드백이 교육 행동의 개선을 가져올 수 있음을 보여주었다. 그러나 한 연구에서 augmented feedback은 일부 부정적 효과를 보였으며, 이러한 잠재적 효과는 추가 연구가 필요하다.

The value of feedback: The role of feedback in promoting change was evident in many of the reported interventions. In addition, several studies (Skeff, 1983; Litzelman et al., 1998) specifically examined the use of feedback as an intervention strategy and found that systematic and constructive feedback can result in improved teaching performance. However, in one study (Litzelman et al., 1998), augmented feedback was shown to have some negative effects; this potential effect should be considered and investigated further.


동료의 중요성: 많은 연구에서 동료 관계의 이점에 대해서 언급했다. 특히 동료를 롤모델로 삼는 것, 정보와 아이디어를 서로 교환하는 것, 변화를 촉진하고 유지하는데 동료의 지지의 중요성 등을 언급했다.

The importance of peers: A number of reports (DeWitt et al., 1993; Elliot et al., 1999) commented on the benefits of peer and collegial relationships. In particular, they highlighted the value of using peers as role models, the mutual exchange of information and ideas, and the importance of collegial support to promote and maintain change.


교수-학습의 원칙을 고수하는 것: 많은 FDP가 이론/개념적 프레임워크에 기반하고 있지 않지만, 많은 연구에서 성인학습을 인용하였으며 경험학습을 인용하였다. 실제로 이러한 원칙에 입각하여 진행하는 것이 더 효과적인 교수-학습을 가져온다는 컨센서스가 나타나고 있다. instructional design 원칙 역시 자주 언급된다.

Adherence to principles of teaching and learning: Although many of the programs were not grounded in a theoretical or conceptual framework, many cited principles of adult learning (e.g. Knowles, 1988) and experiential learning (e.g. Kolb, 1984) as an organizing structure. In fact, there appears to be a developing consensus that adherence to these principles promotes more effective learning and teaching. Principles of instructional design were also frequently cited.


목표 달성을 위한 다양한 교수법 활용: 앞서 언급된 바와 같이 모든 인터벤션은 여러 교수법을 활용하며(소그룹 토의, 상호작용 연습, 롤플레이, 시뮬레이션) 강의만 하는 것은 없다. 명백히 모든 프로그램은 다양한 학습스타일에 맞춰야 할 필요성과 더불어 다양한 목표를 달성하기 위해서는 다양한 방법이 필요함을 인식하고 있다.

The use of multiple instructional methods to achieve objectives: As mentioned earlier, all of the interventions included a wide range of instructional methods (e.g. small-group discussions; interactive exercises; role plays and simulations) and none relied on lectures alone. Apparently, each program was aware of the need to accommodate different learning styles as well as the fact that different methods are required to meet diverse objectives.



FD인터벤션과 관련한 관찰결과

Observations re faculty development interventions


맥락의 역할: 대부분의 연구는 특정 맥락에 있는 특정 교수 그룹의 니즈에 맞춰 개발된 프로그램을 기술하였다. 이러한 개발과 '맞춤(match)'이 종종 성공적이었던 만큼, 바라는 방향으로의 변화가 많이 보고된 것은 놀라운 일이 아니다. 이러한 관찰 결과에서 배워야 할 점은 '맥락'이 핵심이라는 것이며, 연구의 결과가 일반화가능하지 않을 수 있지만, FDP 개발의 원칙은 일반화가능할 수 있다는 점이다.

The role of context: The majority of reports describe programs that were developed to meet the needs of a particular group of faculty members, in a particular context. To the extent that this development and ‘match’ were often successful, it is not surprising that there were many reports of changes in the desired direction. One lesson to be learned from this observation is that context is key, and that although the results of these studies may not be generalizable, the principles of faculty development might be.


맥락은 또 다른 의미에서 중요한데, Kirkpatrick에 따르면 변화가 일어나려면 네 가지 조건이 맞아야 한다. (1)변하고자 하는 욕망이 있어야 하며, (2)무엇을 어떻게 할지에 대한 지식이 있어야 하고 (3)지지적 환경이 필요하며 (4)변화의 보상이 필요하다. 흥미롭게도 처음 두 개의 요소는 FDP를 통해 달성가능하나 나머지 두 개는 그렇지 않다. 그러나 우리가 바라는 변화는 이 지점에 있다.

Context is important in another way as well. According to Kirkpatrick (1994), four conditions are necessary for change to occur: the person must have the desire to change, knowledge of what to do and how to do it, a supportive work environment, and rewards for changing. Interestingly, the first two elements of change can potentially be achieved through faculty development activities; the last two cannot, and yet it is at this level that we expect change to occur.


참여의 특성: FDP에 참여하고자 하는 동기는 아직 해결되지 않은 의문이다. 왜 참여하는가? 왜 어떤 사람이 특정 프로그램에 특정 시점에 참여하고자 하는가? 지금까지 대부분의 참여자는 자발적 참여자였다. 아마 이제는 이 '자발성'을 넘어서야 할 때인지도 모른다. 개인적 차원의 것을 넘어서 FDP참여를 촉진하거나 방해하는 요인을 알아봐야 한다. '교육'이란 것은 '사회적 활동'이기 때문에, 참여의 사회적 결정요인에 대해 살펴볼 필요가 있을 수도 있다. 경험을 통해서 얻는 것과 워크숍을 통해서 얻는 것의 차이를 볼 필요도 있다.

The nature of participation: Motivation to attend faculty development activities remains an unanswered question. What motivates participation? What determines whether someone will take advantage of specific offerings at a particular time? To date, the majority of participants are volunteers. Perhaps it is time for us to move beyond ‘volunteerism’ as we strive to enhance teaching and learning. It would also be worth exploring factors beyond the individual that encourage or impede attendance. As teaching is a ‘social activity’ (D’Eon et al., 2000), the social determinants of participation merit further inquiry. It would also be worthwhile to conduct further studies to determine what is learned through workshops vs. experience. 

 

FD의 facilitator로 참여하는 것의 효과도 연구할 가치가 있을 것이다. "가르치는 것은 두 번 배우는 것과 같다" 라는 말이 있다. 흥미롭게도 지금까지 어떤 연구도 FD facilitator의 참여의 영향을 연구하지 않았다. FD intervention에 facilitator로 참여하기 위해서는 독특한 스킬과 자질이 필요할 것이라는 것이 우리의 생각이다.

The impact of participation on faculty development facilitators would also be worthy of investigation. It has been said that ‘‘to teach is to learn twice’’. Interestingly, no studies to date have examined the impact of participation on faculty development facilitators. It is our impression that facilitating a faculty development intervention requires a unique blend of skills and aptitudes that should be examined in greater depth.


 

확장 프로그램의 가치: 더 장기간 이뤄지는 프로그램(세미나 시리즈)이 단기성 프로그램보다 더 성과를 낼 가능성이 높아 보인다. 예컨대 세미나 시리즈는 네트워크를 형성시키고, 협력적 관계를 만들어준다. 이러한 인터벤션은(펠로우십 포함) FDP이후 교육활동에 더 많이 참여하게 하며, 이는 지속가능성을 시사한다. 단기 프로그램과 장기 프로그램의 더 철저한 비교가 필요하다.

The value of extended programs: Our review of findings by intervention type suggests that longer programs, extended over time (e.g. the seminar series), tend to produce outcomes not apparent in one-time interventions (e.g. short courses or workshops). For example, in several instances the seminar series resulted in the creation of networks and cooperative interactions among colleagues that are possible when a group meets over time (e.g. Rayner et al., 1997). These interventions, as well as fellowships, also reported more involvement in educational activities following the faculty development activity, implying sustainability over time. A more rigorous comparison of ‘short’ and ‘long’ interventions would be beneficial to test out the hypothesis that extended programs yield more long-term changes.


FDP의 다양한 대안 고려: 이번 연구에서 전통적인 면-대-면 방식의 FDP에 지나치게 의존하고 있음이 드러났다. 이러한 인터벤션이 일정관리에 장점이 있고, 관심이 있는 교육자들의 커뮤니티를 형성해주며, 동기를 더 부여해주는 것으로 나타나지만, 다른 방법(온라인 교육, 자기주도 학습, 피어-코칭)도 고려해봐야 하며 멘토링도 고려해봐야 한다. 여기에 속하는 일부 연구들이 'strength of findings'에서 높은 점수를 받았다.

The use of ‘alternative’ practices: The current literature demonstrates an over-reliance on traditional face-to-face methods such as workshops and seminars. Whereas these interventions seem to have the stated advantage of ease of scheduling, building a community of interested educators and increasing motivation, we should consider other methods that include online and self-directed learning, peer coaching (Flynn et al., 1994) and mentorship (Morzinski et al., 1996). It is interesting to note that some of the studies that scored highly on ‘strength of findings’ used alternative methods (e.g. individual feedback session).


 

방법론적 이슈와 관련한 관찰결과

Observations re methodological issues


더 철저한 연구설계의 필요성: 1992년 Hitchcock 등은 FDP를 더 철저한 질적/양적 설계로 평가해야 한다고 언급했다. 그 때부터 상황은 크게 달라진 것 같지 않다. 본 리뷰에서도 더 철저한 연구를 통해서 흔히 마주치는 연구설정상의 문제를 극복할 필요를 제시한다. 만약 가능하다면, RCT를 고려하거나 최소한 대조군을 포함해서 FDP가 정말 차이를 만드는지 더 일반화가능한 결론을 내야 할 것이다.

The need for more rigorous designs: In 1992, Hitchcock et al. commented on the need to better evaluate faculty development programs and use sound qualitative and quantitative designs to document outcomes. The situation does not seem to have changed significantly since then. The results of this review suggest the need to conduct more rigorous research studies and overcome commonly encountered design problems. If possible, we should consider the use of randomized controlled trials, or at least comparison groups, so that we can make more generalizable statements about whether faculty development does, indeed, make a difference. 

 

문헌을 검토한 바, 엄격한 질적연구방법이 잘 활용되지 않음을 발견했다. 동시에, 많은 저자들이 FD활동 이후에 교수들의 열정/갱신(renewal)/변화에 대한 직관적인 인상을 기술했다. 그러나 현재까지의 연구방법은 이러한 직관이나 관찰 일화를 잘 잡아내지 못하고 있다. 더 나아가 비록 FD활동이 교육활동에 대한 흥미에 불을 지핀다는 일반적 동의가 있지만, 이것이 어떻게 도달되는 것인지, 이 열망이 어떤 것인지 등이 더 면밀히 조사될 필요가 있다. 많은 경우 질적연구를 더 많이 활용할 경우 얻을 수 있는 이점이 많다.

In reviewing the literature, we perceived an underutilization of rigorous qualitative methodologies. At the same time, many authors described an intuitive impression of enthusiasm, renewal and change following a particular faculty development activity. Current methods do not adequately capture these intuitions or anecdotal observations. Moreover, although there is general agreement that faculty development activities kindle interest in educational activities, how this is achieved, and what this inspires, needs to be examined more carefully. In many ways, a greater use of qualitative methods (e.g. Freeman et al., 1992) would yield considerable benefits.

 

 

FDP는 복잡한 세팅에서 이뤄지는 복잡한 인터벤션이다. 우리의 개념틀에서 지적한 바와 같이, 많은 매개변수(개인 특성, 교사의 직위와 책임)가 통제불가능한 외적 요인과 상호작용한다. 이것이 평가가 어려운 이유이며(변화가 있다고 해서 프로그램의 기여가 아닐 수 있다), 새로운 연구방법론이 필요한 이유이다. Blumberg와 Deveau는 교육 혁신/인터벤션의 학문적 전파/교육제품(product)개발/도입을 평가하기 위한 모델을 개발하였다. 이것이 우리가 고려해야 할 것이며, 조직에 대한 영향과 더불어 기대 성과와 '기대하지 않은' 성과의 가치를 고려해야 한다.

Faculty development activities represent complex interventions in complex settings (Drescher et al., 2004). As noted in our conceptual framework, many intervening, mediating variables (e.g. personal attributes; teacher’s status and responsibilities) interact with uncontrollable, extraneous factors. This is one of the many reasons that evaluation of effectiveness is difficult (for even if changes are noted, they may not definitively be attributed to the program) and that new research methodologies are required (e.g. Campbell et al., 2000). Blumberg & Deveau (1995) have developed a model by which to evaluate an educational innovation/intervention that looks at academic dissemination, product development and implementation. This is something that we should consider in faculty development. We should also consider the value of examining anticipated and ‘unanticipated’ outcomes (e.g. Blumberg & Deveau, 1995), including impact on the organization.


 

참가자 만족도에 보다 관심을 기울이기: 참가자 만족도 자료의 가치를 다시 돌아보아야 할 때이다. 비록 FDP에 대한 반응은 초보 단계의 평가이지만, 변화의 토대가 된다. 참가자들의 만족이 중요한 이유는 높은 만족도가 더 학습하고자 하는 동기를 부여해주고, 전문성-개발 활동에 참여하게 해주기 때문이다. 또한 프로그램 개발자에게 가치있는 피드백이 되기도 한다. Belfield 등이 말한 바와 같이 참가자들의 만족도는 교육의 실질적 효과에 대한 대강의 대리지표이다. 그러나 특정 프로그램에 대한 만족도는 그러한 정보의 목적과 활용이 명확하기만 하다면 중요한 정보가 되기도 한다. 우리의 견해로는 만족도를 완전히 무시하기보다는 가치를 만들어나갈 수 있어야 한다. 참가자의 경험과 스토리에 대한 질적 연구방법(내러티브 분석, 결정적 사건 분석)은 또 다른 접근법이 된다.

Attention to participant satisfaction: It is time to re-affirm the value of participant satisfaction data. Although reaction to the program is an elementary level of evaluation, it is fundamental for change to occur. Participant satisfaction is important if faculty members are to be motivated to learn and to attend professional development activities. It also gives valuable feedback to program planners. As Belfield et al. (2001) have said, participant satisfaction is a crude proxy for the substantive effects of education. However, information on the reactions of participants to a specific program provides valuable information, as long as the purpose and use of such information is made explicit. In our opinion, we must build on the value of participant satisfaction rather than discredit it completely. Applying qualitative methodologies to participants’ experiences and stories (e.g. analysis of narratives; critical incident technique) is another approach worth pursuing as we try to understand participants’ reactions to faculty development offerings.


 

성과 측정: 지금까지의 연구결과를 보면 변화 측정에 있어 자기평가와 설문에 지나치게 의존한다. 더 나아가기 위해서 새로운 평가방법을 고려해야 한다. 예컨대 Simpson 등은 교수의 교육능력을 개발하기 위한 표준화된 교육 상황을 개발하였으며, Zabar 등은 objective structured teaching examinations 를 통해서 효과를 평가했다.

Outcome measures: The literature to date suggests an overreliance on self-assessments and survey questionnaires to assess change. To move forward, we should consider the use of novel assessment methods. For example, Simpson et al. (1992) have developed standardized teaching situations to develop faculty teaching skills; Zabar et al. (2004) have utilized objective structured teaching examinations to evaluate impact.


변화를 정확하게 측정하려면 reliable하고 valid한 척도가 필요하다. 본 리뷰의 대부분의 연구는 psychometric property가 보고되지 않은 설문지를 사용하였다. FD개발자들과 연구자들은 validity와 reliability가 확립된 설문지 활용을 고려할 필요가 있다. 혹은 그러한 척도를 개발하고자 노력해야 한다. 예컨대 교육 분야에는 교수 효과성에 대한 여러 scale과 척도가 개발되어 있으며, 가능하다면 이러한 평가도구를 사용하고 리소스를 더 공유해야 한다.

Accurately measuring change requires reliable and valid measures. The majority of studies in this review used questionnaires for which psychometric properties were not reported. Faculty developers and researchers interested in assessing change should consider using questionnaires that have already been tested for validity and reliability, or work to establish these measures. For example, a number of scales and measures of teacher effectiveness have been developed in education (e.g. Gibbs & Coffey, 2004). Whenever possible, we should try to make use of these assessment tools and collaborate in order to share resources more consistently. 

 

 

우리는 다양한 수행능력의 척도(자기평가, 비디오테입평가, 학생평가) 간 상관관계를 살펴봄으로써 모든 척도를 모든 연구에 사용하지 않아도 되게 해야 한다. 예컨대 일부 연구에서는 비디오테입 평가와 지식 검사의 강한 상관관계를 보고했다. 이들 연구결과는 입증되기만 한다면 항상 직접 관찰(비용과 시간이 많이 드는)을 사용하지 않아도 된다는 것을 시사한다. 비슷한 결과에 따르면 학생이나 레지던트의 교수 수행능력에 대한 평가를 (지식검사와 함께 사용하여) 비디오테입 녹화 대신 사용할 수 있다. 그러나 결과를 타당화하기 위한 삼각측량의 가치는 아무리 강조해도 지나치지 않다. 가장 높이 평가된 연구 중 일부는 성과 측정을 위해서 다양한 방법을 사용하였다.

We should also try to correlate different measures of performance (e.g. self-assessment questionnaires and videotape recordings; student assessments and faculty self-ratings) so that we do not need to include all measures of change in every study. For example, several studies (e.g. Mahler & Benor, 1984; Sheets & Henry, 1984) found a strong correlation between videotape ratings (albeit sometimes based on single observations) and knowledge tests. These findings, if corroborated, suggest the possibility of conducting reliable evaluations without always using direct observation (which can be costly and time-consuming). Based on similar results, we might be able to use student or resident evaluations of teachers’ performance (together with knowledge tests) instead of videotaped observations. However, the value of triangulation to validate results cannot be overstated. Some of the most highly rated studies (Skeff, 1983; Skeff et al., 1986) used multiple measures to assess outcome (e.g. self-ratings, videotaped observations and student ratings).

 

 

FDP의 중요한 성과는 학생들의 수행능력 향상이다. 따라서 우리는 교수들의 교육행동 변화와 학습자 성과 간의 관계에 대한 근거를 찾아야 한다. 즉, 학생과 레지던트의 자료(학습자 행동 지표 포함)를 더 철저하게 수집해야 한다. 학생에 의한 교수 역량 평가는 매우 유용하지만, 학생과 레지던트 자신의 지식/태도/술기 변화에 대한 면밀한 평가로 보완되어야 한다.

An important outcome of faculty development is improved student performance. We must therefore work to seek evidence of a relationship between changes in faculty members’ teaching behaviors and learner outcomes. That is, we need to collect student and resident data (including indices of learner behaviour) more rigorously. Student evaluations of teaching competencies are invaluable; they need to be augmented, however, by a careful assessment of changes in students’ and residents’ own knowledge, attitudes and skills. 


 

반응 전환 편향(response shift bias)에 관심 가지기: '반응 전환 편향'에 더 주의를 기울여야 한다. Skeff 등이 언급한 바와 같이 FDP 이후의 자기평가는 종종 기대보다 낮거나, 상승이 기대되는 상황에서 오히려 떨어지기도 하는데, 이는 개개인이 시작 시에는 스스로를 과대평가하다가 과정이 끝나면 스스로를 더 정확하게 평가하기 때문일 수 있다. Skeff 등이 주장한 바와 같이 후향적 사전-사후 검사의 가치를 더 체계적으로 고려하여 이러한 편향을 극복해야 한다. 한 흥미로운 연구에서 후향적 사전검사는 일반적인 사전검사보다 학생들의 워크숍 이전 교수 수행 평가와 더 높은 상관관계를 보였다. 이에 더하여 후향적 사전-사후 검사에서는 교육에 대한 태도에 유의미한 차이가 나타났는데, 이는 전통적인 사전-사후 검사에서는 잘 드러나지 않았다.

Attention to response shift bias: The notion of ‘response shift bias’ warrants more careful attention. As noted by Skeff et al. (1992a), post-course self-ratings are often lower than expected, and occasionally decrease, when increases are expected. This may occur because individuals overrate themselves at the beginning of a course, and then after the course (when they have a better idea of what is meant by different aspects of teaching and learning), they rate themselves more accurately (Nayer, 1995). As Skeff et al. have argued, we should more systematically consider the value of retrospective pre–post testing to overcome this possible response shift bias. In an interesting study (Skeff et al., 1992a), retrospective pre-tests correlated better with students’ pre-workshop evaluations of their teachers’ performance than did the regular pre-test. In addition, the retrospective pre- and post-tests showed significant differences in attitudes towards teaching that were not apparent in more traditional pre- and post-tests.


 

시간 변화에 따른 변화 평가: 소수의 연구에서 시간 변화에 따른 FDP성과의 유지를 보았다. 많은 경우 1년까지 그 변화가 유지됨을 보여주었다.

Assessment of change over time: A few studies assessed the maintenance of change over time. Most of them (Mahler & Benor, 1984; Skeff et al., 1986; Steinert et al., 2001)


 

FD 전략간 비교: 비록 우리가 효과적인 FDP의 '핵심 특징'을 따로 떼어 놓았지만, FDP의 어떤 요소가 가장 유용한지에 대한 비교 연구는 거의 없으며, 한 방법이 다른 방법보다 우월한지에 대한 연구도 없다. 예컨대 비록 워크숍이 가장 흔한 방식이지만, 많은 연구자들이 지속적 변화를 가져오기에는 너무 짧다고 지적한다. 그러나 워크숍은 여전히 가장 많이 쓰이는 방법이다. 우리의 연구에 따르면 더 긴 인터벤션이 더 오래 지속되는 성과를 가져온다.

Comparison of faculty development strategies: Although we have attempted to tease apart key ‘features’ of effective faculty development, there is little comparative research on which components of faculty development interventions are most useful (e.g. micro-teaching; role plays) and whether one method (e.g. seminar series) is more effective than another (e.g. short courses). For example, although workshops are one of the most common methods, many have suggested that they are too short to bring about lasting change. At the same time, they persist as a method of choice. Our findings suggest that longer interventions may have more durable outcomes. This, too, requires further investigation.


 

이론과 실천에 기반한 FD: 리뷰 결과에 따르면 단 하나의 '완벽한 인터벤션'을 찾으려는 노력을 경계해야 한다. 실제로 다양한 접근법이 존재하며, 적절한 활용이란 환경에 따라 다 다르다. 그러나 FDP는 이론과 실제적 근거에 기반해야 한다. 아직 교육이론이 어떻게 학습이 일어나는지에 대한 통합된 이해를 제공해주지는 못하나, 학습에 있어 상당한 지지를 받는 모델이나 원칙이 있고, 이것을 기획/성과측정/효과분석에 활용해야 한다. 여기에는 다음과 같은 것들이 있다.

Grounding faculty development in theory and practice: Based on the findings of our review, we should caution ourselves against searching for the single ‘perfect intervention’. In fact, an array of approaches exists and their appropriate use may differ from activity to activity and across settings. However, the work of faculty development should be grounded in both theory and empirical evidence. While educational theory has not yet provided us with a unified understanding of how learning occurs, there are well-supported models and principles of learning that can inform us in planning interventions, measuring outcomes and analysing effects (Mann, 2002). These include principles that draw

  • 인지과학 on the science of cognition (e.g. how individuals make meaning of information and store it in memory) (Regehr & Norman, 1996);
  • 사회적 학습과 환경 on understandings of social learning (e.g. how learning occurs from and with others; the influence of the learning environment) (Bandura, 1986);
  • 경험 learning through experience (Kolb, 1984);
  • 성찰 and making meaning of learning and experience through reflection (Schön, 1987; Moon, 1999).
  • 실천 커뮤니티에의 참여 More recently, the idea of learning through participation in communities of practice has also been explored (Lave & Wenger, 1991; Boud & Middleton, 2003), and this notion will have important implications for faculty development.


프로그램과 다양한 전공간 협력: 자원을 공유하고 프로그램간 협력을 해야 한다. 교육 영역에서 배울 것이 많은데, 우리의 결과를 보면 다른 대학 교수들의 트레이닝의 리뷰에서의 결과와 유사하다. 많은 경우 이들 연구로부터 배우고 우리에게 적용해야 한다.

Collaborating across programs and disciplines: The value of sharing resources and collaborating across programs has been highlighted earlier in this review. There is also much for us to learn from colleagues in the field of education. For example, many of our findings resemble what has been found in reviews of research on training of university teachers (Gibbs & Coffey, 2004); in many ways, it would be wise to learn from these studies and incorporate their methodologies (and findings) into our work.



FDP 실천에 대한 함의

Implications for practice:

We need to:


  • 우리의 성공을 토대로 하자. 성공적 프로그램에 들어있는 식별가능한, 복제가능한 요소들을 활용하자
    Build on our successes. The literature describes successful programs, with recognizable, replicable elements. It is now important to tease apart the elements that work.
  • 이론과 교육 원칙을 설계와 개발에 더 잘 활용하자. 더 나아가 이론을 실천에 연결시켜야 한다. 교수들의 실제 교육행위를 더 잘 이해하고, 실제로 맞닥뜨리는 문제를 이해하여 이 정보를 이론에 관련지어 더 향상된 인터벤션을 개발과 효과성 평가로 이끌어야 한다.
    Make more deliberate use of theory (particularly theories of learning) and educational principles in the design and development of our faculty development programs. Further, we need to link theory with practice, in an iterative cycle of asking questions in practice, studying these questions and testing our answers. We also need to better understand teachers’ educational practices and the real problems that teachers encounter so that we can use this knowledge to inform theory, which can help us in developing improved interventions and evaluating effectiveness.
  • 맥락의 중요성을 인정하자. 조직문화, 교육과정, 교사와 학생이 모두 '맥락'에 기여한다.
    Acknowledge the importance of context. The organizational culture, the curriculum, teachers and students all contribute to a context that is critical to the effectiveness of educational change.
  • 장기간에 걸쳐 진행되는 프로그램을 개발하자. 학습, 실천, 성장을 축적하자
    Develop more programs that extend over time, to allow for cumulative learning, practice and growth.
  • 학습자 간 성찰과 배움을 촉진하는 프로그램을 개발하자. 그들이 스스로를 교사로 인식하게 하자. 이것이 교수자-지도 인터벤션이 아니라 지속적인 자기주도발전의 토대가 될 것이다.
    Develop programs that stimulate reflection and learning among participants, raising their awareness of themselves as teachers. This would form the basis for ongoing self-directed development rather than the need to primarily have ‘teacher-directed’ interventions.
  • 자발적 참여의 문제를 다시 생각하자. 많은 경우 효과적인 교육을 위한 준비라는 요건은 참여가 기대되고 의무화되지 않으면 충족되지 않을 수 있다. 더 나아가 FD의 자발성이란 특징을 생각할 때 조직문화와 그 조직이 교수-학습에 (명시적/암묵적으로) 부여하는 가치를 고려해야 한다.
    Re-examine the question of voluntary participation. In many contexts, the requirement to prepare for teaching effectiveness may not be met unless participation is expected and required. Moreover, the voluntary nature of faculty development raises questions about the institutional culture and the values (both explicit and implicit) that it places on teaching and learning. 


연구에 대한 함의

Implications for future research:

We need to:

  • 더 철저한 연구를 수행하자. 통제그룹과 비교집단, 질적연구를 활용하자. 성과를 더 잘 정의하고, 프로그램의 시작시부터 평가를 계획하고 연구를 함께 하는 동료들과 협력하라
    Conduct more rigorous research studies, using control or comparison groups and qualitative methodologies. This requires careful definitions of outcomes, planning for evaluation at the inception of any program, and closer collaboration with research colleagues. We must also find a way to corroborate anecdotal observations and capture faculty members’ stories.
  • 결과 중심의 연구에 더하여 과정 중심의 연구도 수행하자. 즉, 어떻게 변화가 일어나는 것인지 더 잘 이해할 필요가 있다. (어떻게 교수의 신념이 변하는가, 인터벤션이 교수의 성찰기술을 향상시켰는가) 질적연구방법이 더 적합할 것이다.
    Carry out process-oriented studies in addition to outcome-oriented ones. That is, we need to better understand how change occurs, both as a result of the intervention and within the individual (e.g. how did teachers’ beliefs change; did the intervention result in improving teachers’ reflective skills). In fact, qualitative methods may be more appropriate here.
  • 수행능력-기반 변화를 측정하고 이를 위한 척도를 개발하자.
    Continue to develop and utilize performance-based measures of change. The use of these methods, which do exist, is an essential and natural next step. 
  • 데이터 수집에 다양한 방법을 사용하자
    Use multiple methods and data sources to allow for triangulation of data.
  • 평가도구의 타당도와 신뢰도를 평가하고 보고하자. 적절한 도구가 있다면 새로운 도구 개발에 앞서 먼저 고려되어야 한다. 표준화된/비교가능한 도구를 사용하자.
    Assess and report the validity and reliability of instruments used. Further, where appropriate instruments exist, these should be considered in preference to developing new instruments. Using standardized or comparable measures across studies will help to understand the field and improve the quality of research in this area.
  • 다양한 변수가 예측불가능하게 돌아가는 복잡한 환경에서 이뤄지는 인터벤션에 대한 연구를 장려하자. 여러 요인 간 상호작용이 있는 연구를 더 해야 한다.
    Promote studies in which an intervention is recognized as occurring in a complex environment in which many unforeseen and unpredictable variables play a role. We need to conduct more studies in which the interaction between different factors is investigated, highlighting under what conditions and why an intervention might be successful or not.
  • 서로 다른 FD 방법 간 비교하자
    Compare different faculty development methods to enable an analysis of which features of faculty development contribute to changes in teacher performance.
  • 시간에 따른 변화를 평가하자.
    Assess change over time. This is important both in determining any enduring effects, and in understanding which interventions or factors may be associated with more sustained change. Longitudinal follow-ups may also help us to understand the development of faculty members throughout their careers.
  • 기관이나 조직에 대한 FD의 효과를 더 철저하게 평가할 수 있는 수단 개발
    Develop means of assessing the impact of faculty development on the institution/organization in a more rigorous and systematic fashion.
  • 이론/개념틀 안에서 연구 진행. 결과 해석에 이론 활용
    Embed our research studies in a theoretical or conceptual framework, and utilize theory in the interpretation of our results.
  • 의학 분야 안팎의 동료들과 협력
    Collaborate with colleagues within and outside medicine.



HITCHCOCK, M.A., STRITTER, F.T. & BLAND, C.J. (1992) Faculty development in the health professions: conclusions and recommendations, Medical Teacher, 14(4), pp. 295–309.




 


 







Med Teach. 2006 Sep;28(6):497-526.

A systematic review of faculty development initiatives designed to improve teaching effectiveness in medical education: BEME Guide No. 8.

Author information

  • 1Faculty of Medicine, McGill University, Montreal, Quebec, Canada. yvonne.steinert@mcgill.ca

Abstract

BACKGROUND:

Preparing healthcare professionals for teaching is regarded as essential to enhancing teaching effectiveness. Although many reports describe various faculty development interventions, there is a paucity of research demonstrating their effectiveness.

OBJECTIVE:

To synthesize the existing evidence that addresses the question: "What are the effects of faculty development interventions on the knowledge, attitudes and skills of teachers in medical education, and on the institutions in which they work?"

METHODS:

The search, covering the period 1980-2002, included three databases (Medline, ERIC and EMBASE) and used the keywords: staff development; in-service training; medical faculty; faculty training/development; continuing medical education. Manual searches were also conducted. Articles with a focus on faculty development to improve teaching effectiveness, targeting basic and clinical scientists, were reviewed. All study designs that included outcome data beyond participant satisfaction were accepted. From an initial 2777 abstracts, 53 papers met the review criteria. Data were extracted by six coders, using the standardized BEME coding sheet, adapted for our use. Two reviewers coded each study and coding differences were resolved through discussion. Data were synthesized using Kirkpatrick's four levels of educational outcomes. Findings were grouped by type of intervention and described according to levels of outcome. In addition, 8 high-quality studies were analysed in a 'focused picture'.

RESULTS:

The majority of the interventions targeted practicing clinicians. All of the reports focused on teaching improvement and the interventions included workshops, seminar series, short courses, longitudinal programs and 'other interventions'. The study designs included 6 randomized controlled trials and 47 quasi-experimental studies, of which 31 used a pre-test-post-test design.

KEY POINTS:

Despite methodological limitations, the faculty development literature tends to support the following outcomes: Overall satisfaction with faculty development programs was high. Participants consistently found programs acceptable, useful and relevant to their objectives. Participants reported positive changes in attitudes toward faculty development and teaching. Participants reported increased knowledge of educational principles and gains in teaching skills. Where formal tests of knowledge were used, significant gains were shown. Changes in teaching behavior were consistently reported by participants and were also detected by students. Changes in organizational practice and student learning were not frequently investigated. However, reported changes included greater educational involvement and establishment of collegiate networks. Key features of effective faculty development contributing to effectiveness included the use of experiential learning, provision of feedback, effective peer and colleague relationships, well-designed interventions following principles of teaching and learning, and the use of a diversity of educational methods within single interventions. Methodological issues: More rigorous designs and a greater use of qualitative and mixed methods are needed to capture the complexity of the interventions. Newer methods of performance-based assessment, utilizing diverse data sources, should be explored, and reliable and valid outcome measures should be developed. The maintenance of change over time should also be considered, as should process-oriented studies comparing different faculty development strategies.

CONCLUSIONS:

Faculty development activities appear highly valued by participants, who also report changes in learning and behavior. Notwithstanding the methodological limitations in the literature, certain program characteristics appear to be consistently associated with effectiveness. Further research to explore these associations and document outcomes, at the individual and organizational level, is required.

PMID: 17074699 [PubMed - indexed for MEDLINE]



의학교육에서 비디오의 효과적 활용을 위한 열두가지 팁(Med Teach, 2015)

Twelve tips for the effective use of videos in medical education

Chaoyan Dong & Poh Sun Goh





스탠포드 의과대학은 Khan Academy와 함께 핵심 교육과정을 짧은 비디오 클립을 활용한 거꾸로 교실 모델로 만들었다. 최근 몇 년간 Medtube 와 UndergroundMed는 기본적 내용과 임상역량의 도입 수업을 담당했고, 심지어 YouTube 비디오도 의료전문직 교육에 널리 사용되고 있다.

Stanford University School of Medicine has collaborated with the Khan Academy to develop and teach the core curriculum using short video clips in a flipped classroom model (Prober & Khan 2013). Medtube and UndergroundMed have been covering basic content and clinical competencies for introductory classes in recent years, and even YouTube videos have been widely used in health professions education.


많은 학생들이 이미 강의를 건너뛰고 독자적으로 디지털 형식의 온라인 교육자료에 접근하고 있다. 이상적으로는 교육이론과 최고의 교육실천법으로부터 비디오 활용을 이끌어가야 한다. 교육에서 효과적인 테크놀로지의 활용에 대한 연구를 보면 교육자들은 내용전문가가 되어야 할 뿐만 아니라 내용 전달을 위한 테크놀로지 활용에 대한 실제적 이해도 함께 가지고 있어야 한다. 또한 특정 교육형식을 사용하는 것과 관련하여 교수-학습에 관련한 교육학 지식도 필요하다.

Many students already skip lectures and solely access educational materials online in digital format (Billings-Gagliardi & Mazor 2007; Kircher et al. 2010; Traphagan et al. 2010). Ideally, educational theories and best practices should guide the use of videos. Research on the effective use of technology in education shows that instructors need to be not only subject matter experts, but also have an empirical understanding of the technology to be used to deliver the content, as well as the teaching and learning pedagogy underpinning the use of a particular instructional format (Harris et al. 2009; Roblyer & Doering 2012).





비디오 활용의 교육적 장점

Tip 1: The pedagogical advantages of using videos


비디오는 디지털 세대인 요즘 학생의 니즈에 잘 맞는다.

Video use meets the needs of the current digital generation of students (Prensky 2001; Prober & Khan 2013). 


비디오를 활용함으로써 더 많은 숫자의 학생에게 물리적/지리적 한계 없이 내용을 전달할 수 있다. 거꾸로교실 FLCL 상황에서 학생은 비디오 형식으로 된 강의를 각자 알아서 보고, 수업시간을 보다 학생-중심 활동으로 활용할 수 있다. 그렇게 함으로써 교수자의 역할은 '강의자'에서 '촉진자'로 변화하게 된다.

Video presentations allow teaching to be scaled up to a large population of students without physical or geographical limitations. In a flipped classroom situation, students can review lectures in video format on their own, and spend class time on more student-centered activities such as interaction with peers and instructors (Baker 2011). As a result, the teacher’s role changes from that of lecturer to facilitator.


사람이 직접 할 때는 그 형태가 다양해질 수 있으나 비디오를 통하면 의학적 절차 역시 표준화된 형태로 전달할 수 있다. 비디오는 실제 임상 시나리오를 활용함으로써 호기심을 자극하고 학생의 관심을 끌 수 있다. 또한 authentic learning을 촉진할 수 있다. 비디오는 지식의 닻(anchor)로서 역할할 수 있으며, 이를 중심으로 학생이 더 탐구하고, 질문을 던지고, 학습활동에 깊이 참여할 수 있다.

Medical procedures can be demonstrated in a standardized manner to avoid inconsistencies between different live human demonstrators (Bellini & Akullian 2007; van Det et al. 2011). Videos also stimulate curiosity and engage students’ attention by situating them in realistic clinical scenarios, and promoting authentic learning (Graham & Johnson 2011). Videos also serve as knowledge anchors, around which students can explore, ask questions, and deepen involvement in a learning activity (CTGV 1990).




교사에게 필요한 것

Tip 2: The requirements for teachers


성공적으로 비디오를 교육과정에 통합시키기 위해서는 비디오 테크놀로지, 교육 내용, 멀티미디어 학습에 대한 현재 이론, 비디오를 활용한 최선의 실제 교육행위에 대한 지식이 있어야 한다. 비디오가 어떤 역할을 할 것인지 (강의를 보충하는 것인지, 전통적 강의를 대체하는 것인지), 언제, 어떻게 비디오를 활용할 것인지, 비디오를 사용하는 것이 학습과정과 성과에 어떠한 영향을 줄 것인지 와 같은 질문을 생각해봐야 한다. 

Successful video integration into the curriculum should be guided by knowledge of video technology, the subject matter to be covered, current theory of multimedia learning, and empirical best practice in using video (Shulman 1986; Mayer 2001; Mishra & Koehler 2006; Triola et al. 2012). What roles do videos play; for example, supplementing lectures or replacing traditional lectures with online modules? When and how to use videos? How does video use affect the learning process and outcomes?



대상 학생이 누군지 결정하기

Tip 3: Identify who the target students are


의학교육의 테크놀로지와 관련된 이전 연구는 교육내용을 설계하는데 있어서 학생 집단을 이해하는 것이 중요함을 강조한다. 교육 내용과 관련한 대상 학생의 사전 지식을 이해하고, 테크놀로지 관련 스킬을 이해하고, 비디오 활용과 관련한 태도와 동기부여에 대해 이해해야 한다.

Prior research on technology in medical education highlights the importance of understanding the student cohort in designing educational content (Ruiz et al. 2006; Cook et al. 2008). It is important to get a thorough understanding of the target students’ prior knowledge in the content area, skills in technology, as well as attitudes and motivation regarding the use of videos. 


반대로, 비디오 활용에 대한 부정적 태도는 학습에 안 좋은 영향을 미칠 수도 있다.

Otherwise, negative attitudes toward video use may have a deleterious effect on their learning (Davis 1989; Reisslein et al. 2005).




학생들을 비디오 교육에 적응(Orient)시키기

Tip 4: Orient students to the video content


학생들이 비디오를 통해서 무엇을 보게 될 것인지에 대해 준비시키는 것이 중요하다. 비디오를 시청하기에 앞서서 질문을 던지거나 토론을 하게 하는 것이 학생들이 비디오에서 얻을 정보를 처리하는데 있어 인지적 토대를 마련하게 해줄 것이다.

It is important to prepare students for what they are about to see in the video. Questions or discussions before viewing the video help to build the cognitive foundation for students to process the information (O’Neill & Wyness 2005).


그러나 중간중간 잠시 멈추면서 질문 등을 통해서 학생들이 참여하게 하는 것이 더 참여를 촉진할 수도 있다.

However, intermittent pauses to invite participation such as answering questions can promote deeper engagement.


질문이나 퀴즈를 포함시킴으로써 능동적 시청과 수동적 시청의 균형을 맞출 필요가 있다. 비디오를 수동적으로 시청하는 것은 심화 이해나 어려운 이론에 대한 이해, 추상적 주제, 비판적 사고의 촉진 등에 효과적이지 않다. 상호작용적 요소를 넣어서 학습을 촉진할 수 있다.

A balance between passive and active viewing with embedded questions or quizzes should be sought. Passive viewing of video is like information transmission to students, which is often not particularly effective in deepening understanding of difficult theoretical or abstract topics or promoting critical thinking (The Nielsen Company 2013). Interactive elements can facilitate learning of these topics.


상호작용적 요소를 통해서 참여를 높이기

Tip 5: Use interactive elements to promote students’ participation


대부분의 비디오는 선형적 형식을 따르며, 시청 전후로 토론이 있을 때 효과가 있다. 토론은 학생들의 관심을 핵심 교육내용으로 향하게 한다. 이를 통해서 단순히 수동적 시청자가 아니라 능동적 참여자가 된다.

Most videos are presented in a linear format, which works well when discussions happen before and after viewing. Discussion helps to direct students’ attention to the key teaching points of the video (CTGV 1990). By doing so, students are no longer passive listeners, but become active participants.


컴퓨터 프로그램들도 있다.

Computer programs allow teachers to embed different formats of questions and tests within videos, such as 

  • YouTube, 
  • Adobe Presenter (http://www.adobe.com/products/presenter.html), 
  • Camtasia Studio (http://www.techsmith.com/camtasia.html), 
  • Adobe Captivate (http://www.adobe.com/products/captivate.html), 
  • Articulate Storyline (https://en-uk.articulate.com/products/storyline-overview.php), 
  • Lectora Inspire (http://lectora.com/e-learning-software/), 
  • Windows Movie Maker (http://windows.microsoft.com/en-us/windows-live/movie-maker#t1=overview), and 
  • Raptivity (http://www.raptivity.com/). 


They also make it possible for students to bookmark, annotate, search for certain points, fast-forward, play back, or pause the video, and view it as many times as they need.



비디오를 학습목표와 과목 목표와 일치시켜라

Tip 6: Align videos with learning objectives and course outcomes


비디오는 - 다른 학습 테크놀로지나 교수법과 마찬가지로 - 교수자가 세심하게 학습목표를 설정하고, 학습활동을 설계하고, 나머지 교육과정과 비디오 활용을 연계시켰을 때 학습목표 달성에 도움이 된다. 

Videos only promote learning outcomes when instructors carefully set learning objectives, design learning activities, and align the use of video with the rest of the curriculum, similar to the use of any other learning technology or instructional method (Biggs 2007).




파워포인트 슬라이드, 교수자 사진, 자막/대본 등과 통합하라

Tip 7: Integrate PowerPoint slides, lecturer’s image, on-screen captions, and transcript


비디오는 글자와 이미지로 정보를 전달하고, 뇌의 서로 다른 영역에서 처리된다. 글자는 화면에 띄우거나 오디오로 제공할 수 있으며, 정지된 이미지는 파워포인트로, 움직이는 이미지는 애니메이션으로 전달 가능하다.

Video utilizes words and images to deliver information, with each being processed in different parts of the brain (Mayer & Moreno 2003). Words can be presented as on-screen text or narrated audio. Images may be static such as PowerPoint slides or dynamic as animations or moving images. The multiple formats of information presentation serve students’ preferences for either auditory, visual or verbal channels of learning (Mayer 2001).



– PPT 슬라이드: 널리 사용됨. 효과도 증명됨. 스크린 캡쳐 도구들 PowerPoint slides. PowerPoint slides are widely used during lectures to highlight key points, and their impact on learning has been validated (Blalock & Montgomery 2005; Nouri & Shahid 2005). Screen capture tools such as 

  • Camtasia (http://www.techsmith.com/camtasia.html) and 
  • Adobe Presenter (http://www.adobe.com/products/presenter.html) 

have been used widely to capture PowerPoint presentations and lectures.


– 교수자의 존재 여부: 교수자의 존재 여부와 관련하여 합치된 의견은 없다. 우리의 경험과 Khan Academy 비디오 리뷰에 따르면, 일반적인 방식은 교수자가 비디오의 시작 부분에만 나오는 것이다. 교수자의 얼굴을 보는 것이 학생으로 하여금 학생-교수간 거리를 더 가깝게 느끼게 하고, 실제 교실과 비슷한 환경에서 있는 것처럼 느끼게 한다. 강의의 단조로움을 탈피하기 위해서는 교수자의 얼굴을 몇 차례 보여주는 것이 도움이 된다. 그러나 교수자의 얼굴이 모든 강의 내내 등장한다면, 학생의 집중력을 발표(교육)내용에서 흐트러뜨릴 수 있다. 또한 교수자도 전문가적인 페르소나를 유지하기 위해서 노력해야 한다.

– Presence of lecturers. There is no agreement regarding whether the instructor should be seen in educational video (Prober & Khan 2013). Based on our experience and review of videos by Khan Academy (https://www.khanacademy.org/), the common practice is that the lecturer appears at the beginning of the video. We speculate that seeing the lecturer’s face helps to reduce the distance between students and the lecturer, and to situate students in a simulated classroom environment. To break the monotony of lecture, it is helpful to show the lecturer’s face a few times. However, if the lecturer’s face is displayed during the whole lecture, it can distract students from focusing on the presentation. In addition, the presenter should strive to maintain a professional persona.


– 자막: 자막을 활용하는 것은 다양한데, 다음을 권장한다. 만약 비디오가 삽관과 같은 어떤 절차(procedure)에 대한 것이라면 자막이 핵심 단계를 강조하는데 유용하다. 만약 비디오가 대화 상황을 보여주는 것이라면, 자막은 필요하지 않으나 핵심적인 부분을 마지막 부분에 언급할 수 있다. 만약 비디오가 강의를 녹화한 것이라면 강의의 대본이나 lecture note가 온라인에서 접근가능한 것이 좋다.

– On-screen text. The use of text in video varies, a topic widely studied in the area of multimedia learning (Mayer 2001; Kalyuga et al. 2004). Based on a review of these studies and our experience using educational videos, we recommend the following guidelines. If the video demonstrates a procedure, for example, how to perform intubation, captions are useful to highlight the key steps. If the video narrates, for example, showing an interprofessional clinical scenario, captions are not necessary; however, key commentary points should be presented at the end of the video. If the video is a record of a lecture, the transcript of the lecture or lecture notes should be made available online.



인지 과부하를 주의하라

Tip 8: Avoid cognitive overloading


문자와 이미지는 뇌에서 서로 다른 channel에서 처리된다. 그러나 각 채널은 한정된 정보만 처리할 수 있다. 인지적 용량을 초과하는 정보가 제공되면 인지과부하가 생기고 학습에 안 좋은 영향을 미칠 수 있다. 인지과부하를 방지하기 위해서는 verbal and video channel의 동시적 활동이 통합되어야 한다. 비디오의 구조는 학생이 학습한 내용을 인지구조에 조직화시키고, 관련된 사전 지식과 통합하며, 정보를 새로운 상황에 적용하여 문제를 해결할 수 있는 능력을 길러줘야 한다.

Text and images in video are processed separately in verbal and visual channels in our brain (Sweller 1988; Mayer 2001). However, each channel can only process a certain amount of information because of our limited working memory capacity (Miller 1956; Sweller 1988, 1999; Mayer 2001). When the information presented exceeds our cognitive capacity, this results in cognitive overload, which has been shown to have a detrimental impact on learning (Mayer & Moreno 2003). To avoid cognitive overloading, concurrent activities in both channels should be integrated (Clark 2007; van Merriënboer 1997). The structure of the video should facilitate students’ ability to organize content into a coherent cognitive structure, to integrate it with relevant prior knowledge, and to apply the information in new situations to solve problems (Elman et al. 1997; Quartz & Sejnowski 1997).



비디오 제작에 학생을 참여시키기

Tip 9: Engage students in the video production


비디오 제작에 학생을 참여시키는 것은 내용에 대한 학습을 향상시키고, 학생들의 디지털 문해력을 향상시킨다. 학생들은 비디오의 내용과 그 내용을 가장 효과적으로 전달하는 방법을 결정하는데 참여할 수 있다. 이러한 접근법은 learning by teaching이라 불린다.

Engaging students in video production is a student-centered activity that enhances content learning and improves student digital literacy (O’Neill & Wyness 2005). Students can be involved in selecting the content of the video and the most effective way to deliver the content. This approach is called learning by teaching, which has been used widely in professional education (Moore 1973).


Kaufman은 "학습자는 교육 프로세스에서 능동적 기여자가 되어야 한다"라고 하였다.

Kaufman (2003) advocates that ‘‘the learner should be an active contributor to the educational process’’ (p. 215).




비디오의 길이를 제한하라

Tip 10: Limit video duration


2011년 Nielsen Social Media Report 에 따르면 밀레니엄 세대는 비디오 시청에 다양한 디바이스를 활용한다. 스마트폰 플랫폼의 등장은 비디오의 길이를 특히 더 중요하게 만들었다. 비디오의 길이가 길수록 집중력을 유지하기가 어려워진다. Prober and Khan은 "'10분'은 일반적인 성인에서 학습이 최고조에 달하는 기간(peak learning period)에 잘 맞으며, 저장과 검색이 쉽다는 장점이 있다"라고 하였다. 비디오의 길이가 더 길어질 때는 작게 나눠서 각 부분 사이에 상호작용적 요소를 넣거나, 섹션별 제목을 달아서 학생들이 다시 보고자 할 때 그 섹션을 쉽게 찾을 수 있게 해야 한다.

According to the Nielsen Social Media Report of 2011, the millennial generation tends to use multiple devices such as smart phones to view videos (The Nielsen Company 2011). The smart phone platform makes attention to video length especially important. Longer videos demand greater effort to sustain focused attention. The longer the video, the less likely a student will watch it completely. Prober and Khan (2013) suggested that ‘‘Ten-minute videos have the advantages of being sensitive to the typical peak learning period for adults and are easily archived and searchable’’ (p. 1409). A longer video should be divided into shorter segments with interactive elements in between, or include section titles so that students can easily find the segments that they need to replay.
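
As a rough illustration of dividing a longer video into shorter segments, the sketch below computes cut points for a recording of a given length. The 600-second ceiling simply mirrors the ten-minute figure quoted from Prober and Khan; the function name `split_into_segments` and the choice to make the segments roughly equal in length are assumptions made for this example, not something the authors prescribe. In practice, cut points would more sensibly follow topic boundaries.

```python
import math


def split_into_segments(total_seconds, max_segment_seconds=600):
    """Return (start, end) pairs, in seconds, that cover the whole video.

    Hypothetical helper: every segment is at most max_segment_seconds
    long, and segment lengths differ from each other by at most one second.
    """
    if max_segment_seconds <= 0:
        raise ValueError("max_segment_seconds must be positive")
    if total_seconds <= 0:
        return []

    # Smallest number of segments that keeps each one under the ceiling.
    count = math.ceil(total_seconds / max_segment_seconds)
    base, extra = divmod(total_seconds, count)

    segments = []
    start = 0
    for index in range(count):
        # Spread any leftover seconds over the first few segments.
        length = base + (1 if index < extra else 0)
        segments.append((start, start + length))
        start += length
    return segments


# A 25-minute lecture recording becomes three segments of 500 seconds each.
for start, end in split_into_segments(25 * 60):
    print(f"{start // 60:02d}:{start % 60:02d} - {end // 60:02d}:{end % 60:02d}")
```

Each boundary between segments is a natural place for an embedded question or a section title.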



신뢰할 수 있는 양질의 비디오를 찾아보라
Tip 11: Identify credible professional quality videos

Open Educational Resource (OER) movement에 따라서 온라인에서 무료로 접근가능한 비디오들이 많다. 기존의 작업물에 따라 다음의 가이드라인을 제안할 수 있다.

Review of existing work on this topic suggests the following guidelines. 


  • 첫째, 비디오는 교수자의 교육적 목표 달성을 보완할 수 있어야 한다. 
    First, the video should complement your instructional objectives. 
  • 둘째, 비디오의 출처(교수자, 기관, 내용)가 신뢰할 수 있는 것이어야 한다.
    Second, the video should come from credible sources. To evaluate this you should ask: Is the author an authority on this topic? What are the author’s credentials related to this topic? In which institution does he or she work? Does the author have peer reviewed publications on this topic? Is the information current and accurate? 
  • 셋째, 비디오의 품질이 준수해야 한다. 출처가 신뢰할 수 있다고 해서 품질이 우수한 것은 아니다. 좋은 품질의 이미지, 오디오, 텍스트를 갖춰야 한다. 비디오는 접근이 용이해야 한다.
    Third, the videos should have professional quality. Credible sources do not necessarily correlate with high quality videos. Professional quality videos contain high quality images, audio, and text. The video should be easy to access. 
  • 넷째, 저작권을 위반해서는 안된다. 
    Fourth, using these videos should not infringe copyright. You should find out your institutional policies regarding use of online videos.



각자 비디오를 만들 때 필요한 기술적 요소들

Tip 12: Pay attention to technical requirements when producing your own video



– 촬영 Shooting. Advanced video technology does not necessarily make it easier to produce a professional quality video. Do you have the skills and equipment to shoot the video? Will you hire a professional videographer? What is your budget and the timeframe involved? 


– 대본 Scripting. A script should be written with two columns, one representing the narration and the other indicating the video shots. A storyboard uses pictures to represent the shots. The script should indicate the framing and camera movements for each shot. 


– 장소 Location. A sound-proof location is generally required to film the video. It is often better to film in a lecturer’s office than in a classroom or lecture hall. The background should be kept minimalistic so as not to distract the audience. 


– 편집 Editing. Do you have the skills and software to edit the video footage? What are the costs and timeframe involved? A few video editing programs to consider are: Jing (http://www.techsmith.com/jing.html), JayCut (http://jaycut.com/), Tellagami (https://tellagami.com/), Plotagon (https://plotagon.com/), and GoAnimate (http://goanimate.com/). 


– 형식 Format. Will the video be compressed for online viewing? PCs and Macs have different video and software compatibility requirements, so it is important that the video plays on any type of computer. 


– 호스팅 Hosting. Cloud-hosted and self-hosted are two hosting options. For cloud hosting, there are free platforms like YouTube or fee-based ones such as Amazon cloud video hosting. Using free hosting platforms like YouTube can raise ethical issues because the video content is available to anyone with Internet access. This is an issue when the video is related to patient care. However, YouTube does give you the option to control access to your own channels. Many institutions use a password-protected online platform to host videos. For self hosting, you do have more control over the hosted content. However, institutional platforms normally do not offer unlimited server space. Compressed videos should be used if this is the case. 


– 피어리뷰 Peer review. Get another subject matter expert to review your video before publishing online.





Prober CG, Khan S. 2013. Medical education reimagined: A call to action. Acad Med 88(10):1407–1410. 






Med Teach. 2015 Feb;37(2):140-5. doi: 10.3109/0142159X.2014.943709. Epub 2014 Aug 11.

Twelve tips for the effective use of videos in medical education.

Author information

  • National University of Singapore, Singapore.

Abstract

Videos can promote learning by either complementing classroom activities, or in self-paced online learning modules. Despite the wide availability of online videos in medicine, it can be a challenge for many educators to decide when videos should be used, how to best use videos, and whether to use existing videos or produce their own. We outline 12 tips based on a review of best practices in curriculum design, current research in multimedia learning and our experience in producing and using educational videos. The 12 tips review the advantages of using videos in medical education, present requirements for teachers and students, discuss how to integrate video into a teaching programme, and describe technical requirements when producing one's own videos. The 12 tips can help medical educators use videos more effectively to promote student engagement and learning.

PMID: 25110154 [PubMed - indexed for MEDLINE]


성인학습이론과 의학교육: 리뷰 (Malta Med J, 2009)

Adult learning theories and medical education: a review

Jürgen Abela





Introduction

과거에는 해당 주제를 아주 잘 알면 그것을 가르칠 수 있다는 생각이 있었다. 그러나 의료행위의 복잡성으로 인해서 UME와 PGME에서 그 복잡성을 적절히 다룰 수 있는 교육전략이 필요하게 되었다. 실제로 교수들에게 교육 기술을 가르치는 것은 더 나은 학습성과로 나타난다.

In the past, there has been an assumption that if a person knows the subject very well, then he will be able to teach it. However, the complexity involved in practising medicine must be tackled with appropriate educational strategies in the training and education of undergraduate and postgraduate students. In fact, training teachers in educational techniques translates into better student learning outcomes.1



성인학습이론

Adult learning theories


대부분의 교육이 이뤄지는 의학의 맥락에는 '성인'이 포함되며, 따라서 성인학습이론에 초점을 맞추는 것이 논리적이다. 여기에는 도구적 학습, 자기주도 학습....등등이 포함된다. 이 중에서 자기주도학습은 학습자 개개인에 초점을 맞추며, 자기주도 이론은 성인학습에 초점을 둔다.

Given that in the medical context most education involves adults, it is logical to focus on adult learning theories. There are many adult learning theories, which can be grouped into five main classes. These include instrumental learning, self-directed learning, experiential learning, perspective transformation and situated cognition.2 Of these, self-directed learning focuses particularly on the individual learner. Prominent amongst these self-directed theories is andragogy.


'성인학습'이라는 용어는 1833년 Alexander Kapp에 의해서 그리스 철학자 플라톤의 이론을 설명하기 위해 처음 사용되었다. 그는 continuing education에서 성인이 참여하게되는 일반적 과정을 표현하고자 했다. 20세기에 존 듀이, 에듀어드 린데만, 마타 앤더슨 등은 성인학습이론을 추구하였으나 미국에서는 대체로 관심을 받지 못했다. 1980년대에 상황이 변하면서 Malcolm Knowles가 지지하면서 개념을 더 정교화했다.

The term andragogy (andra – meaning “man”; agogos – meaning “learning”) was first used by Alexander Kapp in 1833 to describe the educational theory of the Greek philosopher Plato. He used it to refer to the normal process by which adults engage in continuing education. In the 20th century, various respected intellectuals, such as John Dewey, Eduard Lindeman, and Martha Anderson pursued theories of andragogy, but were largely ignored in the US. Things changed in the 1980s, with the work of Malcolm Knowles, who championed this theory and further elaborated the concept.3


성인학습은 성인에 대해서 다음을 가정한다.

Andragogy assumes that adults: 

• are independent and self directing

• have (various degrees of) experience 

• integrate learning to the demand of their everyday life 

• are more interested in immediate problem centred approaches and 

• are motivated more by internal than external drives.


다른 특징은 성인학습환경과 관련되었는데, 이는 교수자와 학습자의 상호존중, 그리고 학습자간의 상호존중이 중요하다. 존중은 안전한 학습환경의 촉매이기에 중요하다.

Another characteristic deemed to be relevant to adult learning environments is the importance of mutual respect between teacher and learner and also amongst the learners themselves. Respect is important since it is a catalyst for a safe educational environment.4


안타깝게도, 성찰(reflection)은 성인학습의 중요한 요소임에도 Knowles의 성인학습 개념에서 배제되어 있다. 사실 '성찰'은 Kolb의 학습사이클에서 네 단계 중 두 번째 단계이다. 이에 더하여 성찰은 성인학습과 아동학습의 중요한 차이점이다. 또한 성찰은 성인학습에서 학습동기를 향상시키는 역할을 한다.

Unfortunately, reflection is left out of Knowles’ concept of adult learning, despite being an important component of adult learning skills.5,6 In fact, reflection is the second of the four steps in Kolb’s Learning Cycle.7 In addition, the importance of reflection can be appreciated even more when one considers it to be an important difference between adult learning (andragogy) and child learning (pedagogy) theories. Finally, reflection can be seen to enhance adult learning by increasing motivation to learn.8


동기부여는 성인학습을 쌓아나가는 또 다른 중요한 기둥이다. 동기부여에 대한 이론 중에는 두 가지 주요 그룹이 있다.

Motivation is another important pillar on which adult learning is built. There are two major groups of theories describing motivation: 

• 내용 이론: 무엇이 사람들에게 동기를 부여하는가  content theories: these describe what motivates people, and 

• 절차 이론: 어떻게 사람들이 동기부여되는가 process theories: these describe how people are motivated.9


내용 이론중 가장 유명한 것 중 하나는 Maslow의 욕구위계이론이다.

One of the most popular of the content theories is Maslow’s Hierarchy of Needs.10




그러나 Maslow의 모델은 너무 경직되어 있다. 또한 개개인은 학습 궤적동안 만족하면서 불만족할 수 있다. 따라서 이는 부적절하게 보일 수 있다. 더 매력적인 내용이론은 ERG이다.

However, Maslow’s model may be seen to be too rigid. In addition, an individual may be satisfied and unsatisfied with various needs simultaneously throughout his learning trajectory. It can thus be seen to be inadequate. A more appealing content theory is the one put forward by Clayton Alderfer, who describes and summarises motivation in three needs, ERG:9 

• 존재 Existence – this is more or less equivalent to Maslow’s safety and physical well being steps 

• 관계 Relatedness – stresses the importance of interpersonal and social relationships 

• 성장 Growth – intrinsic individual desire for personal growth


절차이론은 특정 행동은 특정 자극에 의해서 발생한다는 생각에 기초한다. 이러한 것 중하나는 기대이론(Expectancy Theory)이다.

Process theories of motivation, on the other hand, are based on the idea that certain behaviours are produced by particular stimuli. One such theory is the Expectancy Theory which states that motivation depends on two perceptions:11 

1. 성과가 기대한 보상을 가져올 것이다.

2. 요구되는 행동을 수행할 능력이 그 사람 안에 있다.

1. an expectation that an outcome will bring the desired rewards 

2. the required performance is within the capability of the person.


성인학습에서 Knowles는 성인학습자는 본질적으로 내적동기를 부여한다고 기술했다. 그는 외적동기를 기술하지 않았으며, 특히 주된 동기부여의 원천으로 교사의 역할을 기술하지 않았다. 실제로 Peyton은 대부분의 성인학습자가 효과적인 학습을 하기 위해서는 교사에 의해서 부여되는 동기가 필요하다고 지적했다.

In andragogy, Knowles states that adult learners are self (intrinsically) motivated.3 He fails to mention extrinsic motivation and especially, the role of the teacher as major source of motivation. In fact, as Peyton9 points out, most adult learners require the motivation provided by teachers for effective learning to take place.


모든 성인학습이 동등하게 내적동기부여가 되는 것은 아니며, 이는 성인학습이론이 동기부여에 관해서는 부적절할 수 있음을 보여준다. 실제로 교수-주도 학습에서 학생-주도 학습까지 다양한 학습전략을 혼합할 필요가 있다. 이는 학습자와 교수자의 스타일이 맞춰져야 할 필요를 시사한다.

Not all adult learners are equally intrinsically motivated, and this further highlights the inadequacy of andragogy with respect to motivation. In fact, there necessarily arises the need of a mix of learning strategies, ranging from teacher-directed to student-directed learning.13 This implies that there needs to be a “match” (Figure 2) between the learner and the teaching styles used.14



분명히, 이는 교사가 어느 정도는 유연해야 함을 시사한다. 그러한 역학관계에 도달하기 위해 가장 중요한 것은 학생의 요구를 사정하는 것이며, 요구사정이 없는 교육은 진단 없는 치료와 다를 바 없다.

Certainly, this involves an amount of flexibility on the part of the teacher. The most important step to clinch such a dynamic relationship is to carry out a needs assessment of the students/trainees involved. Without such needs assessment, teaching would be tantamount to treatment without a diagnosis.


예컨대 첫 번째 임상실습에서 학생은 교사가 어떻게 병력을 청취하는지를 봐야 하며, 이후 몇 년이 지나며 검진 기술에 대한 학습전략을 거쳐 몇달이 더 지나면 감별진단과 치료까지 논의할 수 있게 된다.

For example, whereas in the first clinical attachments, students would be dependent on the teacher to show them how to take a history, during the subsequent years, the learning strategies should deal with examination skills, going on further along the months to discussing the differential diagnosis and treatment of the patient’s symptoms.


이러한 역동적 과정은 다양한 도구를 통해 가능하다

This dynamic process can be successfully achieved with a variety of tools: 

• Reflective diary/practice – this will stimulate reflection and facilitate in-depth search on certain topics in addition to allowing for personal development.

• The relevance of what is being taught to medical practice should always act as a background for any discussion on topics.

• Use of the trainees’ experiences to discuss issues in practice, especially at postgraduate level. 

• Small group work on abstract or “difficult” concepts e.g. end of life. 

• Problem based learning. 

• Open discussions on “hot topics” such as medico-legal litigation.


성인학습이 적절히 '성찰'과 '동기부여'를 다루는데 실패했다고 할 때, Mezirow의 '전환학습'이론이 보다 적합해보인다. 이 이론의 핵심은 성인이 경험을 축적하고 이해해가는 메커니즘과 구조에 있다. 전환학습은 성인학습자가 사용하는 확립된 기준점으로부터 변화를 가져오는 것에 목적이 있다. "frame of reference"라 불리는 이것은 사람들이 경험에 대해서 의미를 찾기 위한 구조를 말한다. 따라서 이 FOR이 한 성인의 유전적 구성과 문화적 축적을 반영하는 것이 명확하다. 이 FOR은 다양한 과정을 통해서 변할 수 있는데, 주로 FOR을 구성하는 assumption에 대한 '비판적 성찰'을 통해서 변화한다.

Given that andragogy fails to adequately address reflection and motivation, Mezirow’s concept of Transformative Learning seems more appropriate.12 Crucial to this theory are the structures and mechanisms through which adults assimilate and understand their experiences. Transformative learning aims to effect change in established reference points used by the adult learner. These so-called frames of reference are the meaning which people give to experiences and the structures used to arrive at such meaning. It is thus clear that these frames of reference are a reflection of the genetic make-up and cultural assimilation of the particular adult. These frames of reference can be transformed in a variety of ways, but primarily by critically reflecting on the assumptions which make up each frame of reference.



성인학습에서 교사가 동기부여의 역할을 한다는 점에서, 전환학습은 교사가 학습자에게 질문을 던지고, 자신과 타인의 assumption에 대해 성찰하는 것을 촉진해야 함을 강조한다. critical incident analysis, small group work 등의 방법이 있다.

In line with the motivating role of the teacher in adult learning, Transformative Learning stresses the importance of the teacher in facilitating learners to question and reflect on their own and others’ assumptions.12 Methods that may be particularly useful in this situation include critical incident analysis, small group work to formulate ideas on particular topics and reflective practice.


이러한 생각은 "The Inner Apprentice."와도 유사하다. Neighbour 가 1992년 주장한 이 개념은, 피훈련자의 학습과정을 묘사한다. Neighbour 는 The Inner Apprentice 라는 용어를 통해서 "정확한 정보가 정확한 장소에 정확한 시점에 제공될 때, 본질적으로 자기-교육 의 성격을 가지는 무의식적 학습기전"이라 설명했다. 우호적인 학습환경이 있다면 inner apprentice(피훈련자)는 kairos라는 단계를 거치면서 cognitive dissonance 에서 cognitive resonance로 나아가며 지식을 습득한다. 그리스어로 Kairos란 행동의 적절한 시점을 말하며, "Kairos"의 기간에 피훈련자는 가장 명확하게 문제의 핵심을 인식할 수 있고, 변화하는 정보를 가장 잘 받아들인다. mutative information은 FOR의 변화를 일으키고, cognitive resonance에 도달하게 한다.

It is very similar to and indeed complements the idea of The Inner Apprentice.16 This concept was put forward by Neighbour in 1992, to describe the learning process of trainees. Neighbour put forward this concept to highlight what he called The Inner Apprentice, i.e. the unconscious learning mechanism that is intrinsically self-educating, provided the right information is provided in the right place and at the right time. Given such a favourable learning climate, the inner apprentice (trainee) acquires knowledge (learns) by moving from cognitive dissonance to cognitive resonance through stages of “kairos.” Kairos in Greek means the right time of action, and by analogy, during points of “kairos” the trainee can most clearly recognise the nub of the issue and is most receptive to mutative information. This mutative information eventually leads to changes in the frames of reference to achieve cognitive resonance.


성인학습 이론을 살펴나갈 때, 그것이 의미하는 바가 무엇인지에 대해 길을 잃을 수 있다. 즉 성인학습을 촉진하고 효과적인 교육을 촉진하는 것이 무슨 의미인지 하는 것이다. 다른 말로는, 의학 영역에서 이는 의학역량의 성취를 말한다. 실제로 의학역량을 달성하는 것은 의학교육 상황에서 궁극적 동기부여요인이 되어야 한다. 이는 많은 경우 당연한 것으로 여겨지지만 실제로 의학교육에서 많은 경우 역량달성에 실패한다. 의학역량은 다음과 같이 정의된다.

Going through the theories of adult learning, one runs the risk of losing track of what they stand for – to enhance adult learning and facilitate effective teaching. In other words, in the medical field, this means the achievement of medical competence, whatever the speciality, by the trainee. Indeed, achieving medical competence should be (and usually is) one of the ultimate motivations of any medical educational setup. This statement is often taken for granted, but medical education may at times actually lead to incompetence.17 Medical competence can be defined as: 


“The habitual and judicious use of communication, knowledge, technical skills, clinical reasoning, emotions, values and reflection in daily practice for the benefit of the individual and the community being served.”18


의학교육이 지식/술기/태도의 상호연관된 영역에 기반하고 있을 때 이는 이론적 개념을 실천할 때 타당하고 효과적인 적용이 필요함을 의미한다. 지식은 이렇게 정의된다.

This implies that there needs to be a sound and effective application in practice of theoretical concepts, using the fact that medical education is based on three interrelated domains, which are knowledge, skills and attitudes.19 Knowledge can be defined as: 


지식이란: “…a background of facts and interactions between facts that should lead to an understanding of the material being learned”20


지식의 결여는 종종 발견하기 어렵다. 지식에 관한 한 가지 흥미로운 묘사방법은 JoHari Window(JW)를 사용하는 것이다.(JoHari Windows (JW)put forward by Joseph Luft and Harry Ingham (hence: Joseph & Harry = JoHari))

Lack of knowledge is occasionally difficult to identify. An interesting way to picture knowledge (and the lack of it) in learners is by using the concept of JoHari Windows (JW), put forward by Joseph Luft and Harry Ingham (hence: Joseph & Harry = JoHari).21 They developed this model in the 1950s while working on group dynamics.




이 분류를 통해서 지식의 다양한 측면을 설명할 수 있다.

Through this categorisation, various aspects of knowledge can be addressed accordingly:

  • 4영역: 다루기 어려움. Area 4 (the Unknown Area) is not amenable to modification. 
  • 2영역: Blind Spot. Area 2 is called the Blind Spot since it refers to knowledge not known to the trainee. This can be addressed through didactic-interactive lecturing, where new information is provided. 
  • 3영역: Facade Area. Area 3 is called the Facade Area and refers to what the person knows about himself that other people do not know. This can be tackled using discussions and small group work. 

Teaching of both areas 2 and 3, and possibly area 4, is augmented with reflective practice.


Skill은 마치 만병통치약처럼 여겨져왔다. 최근까지도 "시행하는 것을 한번 보고, 한번 직접 해보고, 다른 사람에게 한번 가르쳐 보라"라는 말이 사용되었다. 이 방법은 skill의 다양한 측면, 예를 들면 무언가 잘못되었거나 부작용이 생겼을 때에 대한 추가적 탐색을 제한시킨다. 이에 더하여 의사소통과 같은 어떤 skill은 이러한 형식에 들어맞지 않는다.

Skills are very much the panacea of medical institutions. Until recently the adage used to be “see one, do one, teach one.” This method fosters a sense of competition and pride in the medical profession, but at the same time it creates undue tension in the learner and may inhibit exploration of various aspects of the studied skill, for example what to do when things go wrong or when complications arise during or after the particular procedure. In addition, certain skills, such as communication skills, do not lend themselves readily to this format.


George와 Doto는 흥미로운 스킬-교육 프레임워크를 제시했다.

George & Doto offer an interesting skill-teaching framework:22 

1. 개요: 왜 이 스킬이 필요하며 어떻게 관련이 되는가. 스킬의 기본 개념 Overview: introduction to why the skill is needed and its relevance in the area of practice of the learner. Basic concepts on the skill. 

2. 코멘트 없이 보여주기 Demonstration without comment: allows the learner to observe a whole picture of required skill. 

3. 코멘트와 함께 보여주기(분절해서, 한 번에 다룰 수 있는 분량씩) Demonstration with comment: allows fragmentation of the skill into more manageable portions. 

4. 학습자가 skill을 구두 시연하기 Verbalisation: learner talks through the skill. 

5. 학습자가 skill을 실제 시연하기 Practice: the learner executes the skill.



마지막 단계에서는 긍정적 피드백과 격려를 해줄 수 있다.

In addition, it is felt that the final stage can be further supplemented by positive feedback and encouragement from the trainer.


George & Doto는 스킬 습득을 방해하는 요인도 기술하였다.

George & Doto go further and describe reasons which may prevent the acquisition of the required skill such as 

  • inadequate demonstration/description, 
  • imprinting of previous wrong exposures and 
  • improper correction.


의학교육의 마지막 영역은 태도이다. 그러나 교육과정에서 적절한 수준의 인정을 받지 못하고 있다. 

The last domain in medical education is attitudes. In guidelines of desired medical conduct, the attitudes of the medical professional are highly regarded.23 However, it is generally felt that in medical curricula this aspect is not given its due recognition. By their very nature, attitudes are difficult to describe, quantify and address. Passing on desirable attitudes seems even more difficult.


다음과 같이 정의할 수 있다.

Attitudes can be defined as 

“…a learned predisposition to respond in a consistently favourable or unfavourable manner with respect to a given object.”19


'태도'를 교육할 수 있는 다양한 방법이 있다. 적절한 시나리오를 사용할 수 있다.

There are various ways in which attitudes can be addressed. In the undergraduate scenario, certain specialties, more than others, are useful in passing on particular attitudes. General practice and palliative care, for example, through their philosophies of holistic assessment and “total care” respectively, are suitable to pass on attitudes related to managing the patient and family.24,25 


실제로 "의학 수련과정에 완화의료를 포함시키는 것은 완화의료의 질을 향상시킬 뿐만 아니라 의사의 도덕적 수준 향상에도 기여한다"

In fact, “incorporating palliative care into medical training not only improves the quality of palliative care, but also contributes to the moral quality of the doctors being trained.”19


또 다른 방법은 임상환경에서의 OTJT(On the Job Teaching)이다. 지식과 술기를 전달할 뿐 아니라 다양한 이슈에 대해서 토론할 기회가 된다.

Another relevant way of passing on appropriate attitudes is teaching in the clinical environment (also known as On the Job Teaching - OTJT). Together with providing an opportunity to pass on skills and even knowledge, OTJT offers an opportunity to discuss, albeit briefly, various issues which may crop up from different clinical scenarios.26 Such issues may include ethical questions and dealing with one’s own feelings when faced with a sick patient.


OTJT가 성공적이기 위해서는 교사의 계획과 헌신이 중요하다. 시간의 압박이 있을 경우에는 교육이 어려워질 수 있다.

For a successful OTJT experience, planning and commitment on the part of the teacher are paramount. In addition, time pressures will certainly make teaching more difficult, especially during a busy ward round or outpatient session.


OTJT를 수행하는 다양한 방법이 있다.

There are various methodologies of carrying out OTJT which have been highlighted in a recent systematic review and are summarised in Table 1.27 




모든 상황에서 피드백은 Pendleton's rule을 따라야 하며 학생이나 피훈련자의 지위/평판(standing)을 훼손해서는 안된다.

Feedback, in such situations and indeed in all situations, should be given along Pendleton’s rules, thereby not undermining the standing of the student or trainee.28


  • first, the student says what went well, 
  • followed by what the teacher thinks went well; then 
  • the student talks about what could be improved and how, 
  • followed by what the teacher thinks could be improved and how



Conclusion


However, the educational cycle is a useful concept for planning teaching activities. It consists of four steps:7 

1. Assessing the needs of the learner 

2. Setting educational objectives 

3. Choosing and using a variety of methods 

4. Assessing that learning occurred.




22. George JH, Doto FX. A simple five-step method for teaching clinical skills. Fam Med. 2001;33:577-8.










Abstract


Adult learning theories describe ways in which adults assimilate knowledge, skills and attitudes. One popular theory is andragogy. This is analysed in detail in this review. The importance of extrinsic motivation and reflective practice in adult learning is highlighted, particularly since andragogy fails to address adequately these issues. Transformative Learning is put forward as an alternative concept. Using the three recognised domains of knowledge, skills and attitudes, ways of applying these theoretical concepts in medical education are subsequently discussed.





교수 스타일: 우리는 지금 어디에 있는가? (New directions for adult and continuing education, 2002)

Teaching Style: Where Are We Now?

Joe E. Heimlich, Emmalou Norland





"교수 스타일"은 종종 서로 다른 것을 표현한다. 교육 방법이나 교육 기술을 표현하기도 하지만 다음의 것도 있다.

Teaching style is a phrase sometimes used to describe different things. Although some authors use it as if it is synonymous with teaching method or technique, most researchers who have defined teaching style refer to style as 

  • 교수 행동에 관한 선호, 교육자의 교수행동과 교수신념 사이의 일치 a predilection toward teaching behavior and the congruence between an educator’s teaching behaviors and teaching beliefs (Heimlich and Norland, 1994), 
  • 교육 내용이 변하더라도 유지되는 교육활동의 질 a pervasive quality in the educational activities of an educator that persists even when content changes (Fisher and Fisher, 1979), 
  • 한 교사에게서 지속적으로 나타나는 특성 the distinct qualities a teacher displays that are persistent (Conti, 1998), or 
  • 개인이 정보를 수집하고, 조직하고, 유용한 지식으로 전환하는 방법 the characteristic ways each individual collects, organizes, and transforms information into useful knowledge (Cross, 1979).


'스타일'이란 방법이 아니라, 교수-학습 상호교환 전체와 관련된 더 큰 것이다.

Style is not method but something larger that relates to the entire teaching-learning exchange.


어떤 교육사건에서든지 일관되게 나타나는 다양한 요소들이 있다. 이 다섯 개의 요소(teacher, learner, group, content, and environment)가 교수-학습 교환(teaching-learning exchange)의 모델을 구성한다.

In any educational event, several elements are constant: there is an educator who conveys or facilitates the content to each learner and the group of learners within a situation that is both physical and the affective reaction to the physical environment. These five elements (teacher, learner, group, content, and environment) comprise a model of the teaching-learning exchange.


대부분의 교육자들은 모든 학습자들이 학습에 대해 서로 다른 선호와 스타일을 가지고 있으며, 한 학습사건에서 다양한 학습스타일을 만족시킬 수 있는 기술과 전략을 사용하는 것이 중요함을 안다. 그러나 교육현장에서의 교수와 학생 간 상호작용에 대한 스스로의 신념에 대해 성찰해본 사람은 더 적다. 교육자들이 교수-학습에 관한 스스로의 신념과 가치를 아는 것이 중요하나, 그들의 가치와 신념, 철학을 행동과 매칭시키는 것이 더 중요하다. 이러한 일치(match, congruence)가 교수스타일을 이해하는데 중심이 된다.

Most educators understand that all learners have different preferences and styles of learning and believe that it is important to teach using techniques and strategies that will satisfy the variety of learning styles in the learning event (Seaman and Fellenz, 1990). Fewer educators, however, have reflected on their own beliefs regarding the interaction around the educational event between the teacher and the learner that we call the teaching-learning exchange. Although it is important for educators to know their own beliefs and values regarding learning and teaching, it is more important for them to understand the match between their values and beliefs, or their philosophy, with their behavior in the exchange. This match, or congruence, is the central element of understanding teaching style (Brookfield, 1990).


교수-학습 상호교환의 합치(congruence)를 통한 학습의 효과가 제시하는 바는, 스타일을 이해함으로써 교수자와 학습자 마음 사이에 상호교환이 더 성공적일 수 있다는 것이다.

The impact on learning through congruence in the teaching-learning exchange suggests that understanding style can enhance the likelihood that the exchange will be successful in both the learners’ and the educators’ minds.


그러나 스타일은 성찰적 실천 혹은 교육 철학과 동일한 것이 아니다. 성찰은 교육자가 자신의 스타일을 점검하는데 중요하지만 교수스타일은 단순히 행동과 관련된 것이 아니기에 그와 다르며, 스타일은 마찬가지로 신념하고만 관련된 것이 아니므로 철학과 다르다. 스타일은 '합치'에 대한 것이다. '합치'를 이루기 위해서 교육자들은 그들의 교육과 학습에 대한 가치를 고려하고, 교수-학습 상호교환에 대한 신념을 점검해야 한다. 그리고나서 이 신념을 실천과 비교하면서 다양한 방식으로 '합치'를 위해 노력할 수 있다.

But style is not the same as reflective practice or philosophy of teaching. Reflection is an important activity when an educator examines his or her style, but teaching style differs from reflective practice in that it is not just about behaviors, and style is different from philosophy in that it is not just about beliefs. Style is about congruence. To achieve congruence, educators must consider their values about teaching and learning and examine their beliefs regarding each of the elements of the teaching-learning exchange. They must then compare this set of beliefs with their practice and work for congruence in one of several ways.


왜 교수스타일을 공부해야 하는가?

Why Should We Study Teaching Style?


대부분의 성인교육자에 대한 연구에 있어서 우리는 교육자를 평생학습자로 대하지 않고 있다. 성인교육자에 대한 Knowles의 가설부터 Brookfield의 목표까지, 성인교육분야는 '학습자를 이해하기 위해서 무엇이 필요한가'에 대한 특정 측면을 정의했다. 평생학습자에 대한 이해는 성인을 가르치는 사람에 대한 것까지 확장되어야 한다. 모든 성인교육자들이 해야 하고 할 수 있는 것은 스스로에 대해서 공부하는 것이며, 그 결과를 교육에 적용해야 한다.

One major concern is that in much of the study of adult educators, we are not treating the educator as a lifelong learner. From Knowles’s (1980) assumptions to Brookfield’s (1986) goals for adult educators, the field of adult education has defined certain aspects of what is necessary to understand learners. The understanding of the lifelong learner should and must extend to ourselves as the teachers of adults. One of the things all adult educators can and should continue to study is themselves, and the application of the resultant understanding to their teaching.


스타일에 대한 연구

The Study of Style


스타일에 대한 연구는 교육자가 가진 신념/가치/태도/근무철학/기술/인격 등으로부터 시작한다. 교육의 합치를 이루기 위해서는 '내가 누구이며 내가 믿는 것은 무엇인가'를 찾아가는 과정을 요구하며 이는 끝나지 않는다. Eble이 말한 바와 같이  '교수 스타일을 습득하는 것은 총체적이며 일생에 걸친 과정이다. 비록 스타일이 스킬과 테크닉의 형태로 나타나더라도 스타일을 개발한다는 것은 그 이상의 것이다'

The study of style starts with what each educator holds: beliefs, values, attitudes, working philosophy, skills, and personality. The core of the individual is what makes that individual a unique, potentially powerful educator of adults. Congruence in teaching demands that the personal exploration of “who I am and what I believe” be unending. As Eble (1980) suggests, the acquisition of teaching style “is a whole and lifetime process, and . . . though style may manifest itself in skills and techniques, the development of style involves much more than these” (p. 1).


교육자의 선호가 무엇인지 결정하는 것에는 여러 차원이 관계된다. 대표적인 두 가지 Inclusion과 sensitivity

Many dimensions could be used to determine an educator’s preferences and predilections in teaching. Two used to measure the beliefs about teaching are those of 

  • inclusion, which can be considered as level of control of the exchange held by the educator, and 
  • sensitivity or orientation to the five elements in a continuum of the nonhuman to the most human of considerations (Heimlich and Norland, 1994). 

Zinn’s (1983, 1994) Philosophy of Adult Education Inventory provides a measure of the educator’s philosophy regarding decisions and actions the educator holds regarding determination of the purpose and outcomes of the learning activity.


Principles of Adult Learning Scale (PALS) 

Conti’s (1985) Principles of Adult Learning Scale (PALS) compares the frequency of an educator’s practice with the principles described in the adult education literature. Seevers (1991) found that sensitivity and inclusion, followed by number of adult education courses taken and attitude toward being an adult educator, were the best predictors of teaching style as measured by PALS.


교수 스타일을 탐색하는 것은 교육자의 행동을 철학과 매칭시키는 과정을 포함한다. 교수 스타일은 나쁜 교육의 변명이 될 수 없으며, 교실에서의 부적절한 행동, 잘못된 교육법 등의 변명도 될 수 없다. 교수 스타일에 깔린 전제는 비록 '나쁜' 스타일은 없다 하더라도, 교육자의 교육행위 중 안 좋은 것은 있다는 것이다. 스타일을 공부하는 목적은 개개 교육자들이 그들의 신념이 무엇인지 알고 그들의 신념이 그들의 행동과 어떻게 연결되는지를 이해함으로써 학생의 학습기회를 개선하는 것이다. 교수 스타일은 교육자로하여금 스스로의 교육을 살펴볼 수 있는 시작점이 된다.

The exploration of teaching style ultimately involves matching the educator’s behavior with his or her philosophy. Teaching style is not an excuse for bad teaching, inappropriate classroom behaviors, or the use of poorly conducted teaching methods. An underlying premise of teaching style is the understanding that although there are no “bad” styles, there are poor practices by educators. The purpose of studying style is for individual educators to understand better what they believe and how those beliefs can be congruent with their teaching behaviors in order to improve the opportunity for learning by students or participants in programs. Teaching style gives educators a starting point for exploring their own teaching.


종종 교육자들은 내용과 학습자에 맞는 교육법을 쓸 것을 요구받는다. 좋은 교육은 언제나 다양한 교육법을 다양한 학습의 지향과 학습 감각에 맞게 사용하는 것이지만, 모든 교육법은 어떤 교수 스타일에도 맞게 조정될 수 있다. 스타일이 반드시 달라지는 것은 아니며, 달라져서도 안 된다. 어떻게 교육자들이 교수 전략을 선택하고 테크닉을 적용하는가는 교육법에 관한 신념과 가치의 함수이며 교육자의 독특한 신념체계에 맞게 수정될 수 있다.

Often, educators are implored to match their methods to the content and the learner (Draves, 1997; Lovell, 1987). Good teaching always involves using a variety of methods to appeal to multiple learning orientations and senses, but all teaching methods are adaptable to every teaching style; style does not necessarily change, nor should it. How educators select their teaching strategies and implement techniques is a function of their beliefs and values regarding the methods and can be modified to fit within the unique belief system of the educator.


합치를 이루는 방법

Options for Congruence


There are three ways in which educators who are exploring their beliefs and their behaviors can move to congruence: (1) a change of teaching behaviors, (2) a change of beliefs, or (3) a change in both or neither.


행동을 바꾼다.

Changing Behaviors.


많은 교수자가 학습자가 교육자-학생의 상호교환을 조절해야 하며, 학습과정에 포함되어야 한다고 생각하면서도, 70% 이상의 교육시간은 강의와 같은 발표 형식으로 이뤄진다. 이들 교육자는 학습자의 참여가 높아야 하며, 교육자의 통제가 낮아야 한다는 신념을 가지지만, 강의는 통제수준이 높고 학습자-중심 수준은 낮아 신념과 행동 사이의 불일치를 시사한다. 

The most obvious area for exploration is that of methods used in the exchange. In a study of nonformal adult educators, Heimlich and Meyers (1999) found that a large majority of the educators held beliefs that learners should control the exchange and be involved in the learning process. Yet over 70 percent of instruction time was spent in presentation methods (lectures and lectures with visuals). These educators held beliefs that suggested a high degree of inclusion of the learner in the learning event and low control by the educator over the learners, but the high control and low learner orientation of the predetermined lecture, even with questions and answers or visual aids, suggests dissonance between beliefs and behaviors. In practice, the need to “excuse” or “apologize” for teaching in a certain way is often an indicator of dissonance between beliefs and behaviors. Understanding the purposes of different methods and then exploring ways in which the method can better be used to match the beliefs would strengthen the teaching-learning event.


신념과 행동은 완전히 구분되는 것이 아니며, 정서-행동-인지 사이에는 상호작용이 있다.

It is impossible to view beliefs and behavior as fully separate, and it is well understood that there is interaction among affect, behavior, and cognition (Eagly and Chaiken, 1993).



신념을 바꾼다.

Changing Beliefs.


신념을 바꾸는 것은 종종 더 어려운 과정이지만, 더 근본적이고 더 오래가는 변화일 수 있다. 많은 성인교육자들은 그들이 배워온 신념체계를 미리 정해진 신념체계인 것처럼 해석한다. 공식교육과정에는 교육자가 가져야 하는 다양한 신념을 제시한다.

The process of changing beliefs is often a more difficult, but perhaps more fundamental and long-lasting, change. Many adult educators have been instructed in a system of beliefs about teaching and learning that they may interpret as suggesting a prescribed series of beliefs. Formal education, too, has laundry lists of suggested beliefs that educators should hold: 

    • student centered is better than teacher or content centered; 
    • teach to the various learning styles of the students; 
    • engage the students in defining learning outcomes or qualities of success; and so on.


학습자-중심을 지향하는 교육자도 서로 완전히 다를 수 있다.

As an illustration, any two educators can be learner centered in dramatically different ways: 

    • An educator can be high control, low sensitivity and still be oriented to the needs of the learner and be correct in the manner in which he or she is learner centered. 
    • Another educator can involve the learners in defining their learning needs, organizing their learning activities, and guiding the learning process and be no more or less student centered than the other educator.


교육 내용에 대한 교수자의 방식도 다양하다.

The teacher’s orientation toward content also varies widely. Budak (1993) found, for example, that a teacher’s philosophy was not significantly related to training, attitude toward teaching, nature of content, or physical environment but was related to experience and the status of content held by the teacher.


Rokeach은 신념을 양파껍질에 비유했는데, 더 핵심부로 갈수록 더 변할 가능성이 낮다. 행동의 근간이 되는 핵심 가치는 소수이며, 문화적으로 규정된다. 핵심 신념에서 멀리 떨어진 신념일수록(주변부 peripheral) 상황에 따라 변하기 쉽다. 이들 신념은 경험, 가족, 교육, 상황, 외부 환경에 따라 형성된다.

Rokeach (1968) described beliefs as an onion skin in which the closer to the core the beliefs are held, the less likely they are to change. At the core, he suggests, there are only a few central values on which all behaviors are based and that tend to be culturally bound. The beliefs that lie further from the core are those (derived and peripheral) likely to vary depending on the situation; these beliefs are formed from experiences, family, instruction, situations, and outside influences.


종종 이들 external belief는 공개적으로 천명되는 신념으로, 개인이 마땅히 믿어야 한다고 여겨 받아들인 것이다. Bem은 자기지각이론에서, 사람들이 자신이 특정한 행동을 한다고 인식하면 그 인식에 부합하는 태도를 일관되게 보고한다고 요약했다. 성인교육자에게 이는 그 직종의 이상을 반영하여 표명된 신념일 수 있으나, 교육자가 더 깊은 곳에 가지고 있는 신념과 반드시 부합하는 것은 아니다.

Sometimes these more external beliefs are professed beliefs and are assumed by an individual as the things they should believe, which Bem (1967) summarizes in his self-perception theory stating that if people perceive themselves to have certain behaviors, they will report consistent attitudes to match that perception. For adult educators, these may be the stated beliefs that echo the ideals of the profession but do not necessarily match the more deeply held beliefs of the educator.


모든 사람은 상반되거나 서로 경쟁하는 신념을 가지고 있으나, 이 신념들도 스스로에 대한 이해에 통합된다.

Contradictory or competing beliefs exist in all people. Yet these contradictory beliefs somehow become integrated into an individual’s understanding of self.


많은 교육자들이 쉽게 빠지는 함정은 교육과 관련한 신념을 더 넓은 차원 - 자신의 삶 - 의 신념과 무관하게 생각하는 것이다. 교육자들이 그들 자신을 자신의 삶에서 떼어놓는 것은 교육의 가장 인간적 특성을 거부하는 것과 같다. 교육자도 하나의 인간으로서 다른 사람과 연결을 통해서 통찰과 지식과 인식과 정서와 기술을 습득한다. 교육자가 자신의 whole self를 교수-학습 상호교환에 더 많이 통합시킬수록 그 상호작용의 초점이 교수자와 교육모델이 아닌 학습과 학습과정에 집중될 수 있다.

The trap for many educators is to explore their beliefs around teaching and learning without placing those beliefs in the context of their larger belief systems, that is, their lives. To suggest that educators are able to separate themselves completely from their life outside the teaching event is to deny the very human nature of teaching. The educator is a human who, by connecting with other humans, is facilitating acquisition of insights, knowledge, awareness, affect, or skills. The more fully the educator is able to integrate his or her whole self into the teaching-learning exchange, the more the focus of the exchange can be on the learning and the learning process rather than the teacher and the methods or models for teaching (Tight, 2000).



행동과 신념을 모두 바꾸거나 모두 바꾸지 않는다.

Changing Both Beliefs and Behaviors or Neither.


If, through reflection and consideration, an educator finds that her professed belief system does not match what she truly believes on a deeper level, and that her behaviors do not match what she thinks she truly believes, she can choose to change both or change neither.


Changing both suggests that an educator has found no clarity in either her current philosophy or behavior. This is not to say that an individual seeking to change both philosophy and behavior is not a good teacher, but that this person has discovered that her beliefs may be inconsistent or inherently contradictory and that what she does in a teaching event may not always feel genuine. Seeking change of both beliefs and behavior requires intense critical reflection and a willingness to grow in ways that may be somewhat difficult, at least for a while.


There are, of course, those who may find beliefs or behaviors inconsistent or forced but choose to change neither. In some cases, the fear of change or the fear of trying something new can create a barrier to an individual’s willingness to change. In other situations, an educator may truly believe there is no reason to change. Challenging one’s beliefs or behaviors is what we call growth as educators. Not all educators are prepared for, or willing to work on, growth at all times during their careers.


In any of these situations, the educator will be unlikely to grow in congruence or as a teacher. In other situations, educators may understand that there is potential for growth, but they do not have the tools, resources, or access to tools and resources to know how to effect change.









Teaching Style: Where Are We Now?

Joe E. Heimlich and Emmalou Norland

Article first published online: 20 MAR 2002

DOI: 10.1002/ace.46

New Directions for Adult and Continuing Education

Special Issue: Contemporary Viewpoints on Teaching Adults Effectively

Volume 2002, Issue 93, pages 17–26, Spring 2002





의학교육에서 자기성찰(reflection) 가르치기 위한 12가지 팁(Med Teach, 2011)

Twelve tips for teaching reflection at all levels of medical education

Louise Aronson






Tip 1 Define reflection


Critical reflection을 Mezirow는 다음과 같이 묘사했다.

Critical reflection, by contrast, has been described by Mezirow as follows:


어떻게, 그리고 왜 우리의 예상(presupposition)이 우리가 세상을 인식하고 이해하고 느끼는 방법을 제한하는지 비판적으로 인식하는 과정이다. 또한 이러한 가정(assumption)을 재형성하여 더 수용적이고, 사리분별이 있고, 투과가능하고, 통합적 관점을 만드는 과정이다. 또한 이러한 새로운 이해에 따라 의사결정을 내리거나 행동하는 과정이다. 더 수용적이고, 사리분별이 있고, 투과가능하고, 통합적 관점은 성인이 선택할 수만 있다면 선택하는 우월한 관점이며, 왜냐하면 성인은 자신의 경험의 의미를 더 잘 이해하려는 동기를 가지고 있기 때문이다.

...the process of becoming critically aware of how and why our presuppositions have come to constrain the way we perceive, understand, and feel about our world; of reformulating these assumptions to permit a more inclusive, discriminating, permeable and integrative perspective; and of making decisions or otherwise acting on these new understandings. More inclusive, discriminating, permeable and integrative perspectives are superior perspectives that adults choose if they can because they are motivated to better understand the meaning of their experience (Mezirow 1990).


단순히 말하자면, 비판적 성찰은 경험을 분석하고, 경험에 대하여 의문을 가지고, 경험을 재구성하여 그 경험을 평가함으로써 학습(성찰적 학습)을 하거나 실천을 개선(성찰적 실천)하는 과정이다.

Simply put, critical reflection is the process of analyzing, questioning, and reframing an experience in order to make an assessment of it for the purposes of learning (reflective learning) and/or to improve practice (reflective practice).


효과적인 성찰은, 따라서, 시간과 노력, 그리고 행동과 그 기저의 신념 및 가치에 의문을 제기하고 다양한 관점을 구하려는 의지를 필요로 한다. 이 트리플-루프 접근법은 단순히 미래의 비슷한 경험에 대한 대안을 찾는 것(싱글-루프), 또는 결과의 이유를 찾는 것(더블-루프)을 넘어서 그 기저에 깔린 개념틀과 힘의 시스템에 대한 의문을 갖는 것이다.

Effective reflection, then, requires time, effort and a willingness to question actions, underlying beliefs and values and to solicit different viewpoints. This ‘‘triple loop’’ approach moves beyond merely seeking an alternate plan for future similar experiences (single loop) or identifying reasons for the outcome (double loop) to also questioning underlying conceptual frameworks and systems of power (Argyris & Schön 1974; Carr & Kemmis 1986).



(http://managementhelp.org/misc/learning-types-loops.pdf)



성찰을 실천하기 위한 학습 목표를 결정하라

Tip 2 Decide on learning goals for the reflective exercise


학습 목표를 설정할 때, 교육자들은 다음의 질문을 생각해봐야 한다. 더 집중해야 하는 핵심 역량/태도/내용/기술이 있는가? 어떻게 성찰을 통해서 학습자들이 (1)기존의 지식과 새 학습내용을 통합하고, (2) 인지적 경험과 정서적 경험을 통합하며, (3) 과거와 현재, 현재와 미래를 통합하는지를 배울 수 있을 것인가? 성찰적 학습 또는 성찰기술이 explicit focus가 되어야 하는가?  등등 
In selecting learning goals, educators should answer the following questions: Are there key competencies, attitudes, content areas, or skills in need of greater attention or assessment? How can the exercise be used to help learners integrate (1) new learning with existing knowledge; (2) affective with cognitive experience; and/or (3) past with present or present with future practice? Will reflective learning or reflective skill building be an explicit focus of the exercise? Is one of the goals to identify learning or practice needs and strategies to address them?

성찰을 자극하는 것은 어떤 형태든지 가능하지만, 가장 유용한 것은 학습자로 하여금 "혼란을 주는 딜레마"를 선택하게끔 하는 것이다. 즉, 이 전의 문제해결전략으로는 해결되지 않는 상황을 말한다. 이와 같은 딜레마는 다음과 같은 상황에서 발생한다 (1) 충분한 지식과 기술을 갖추지 않은 상황 (2) 잘 진행되었지만, 왜 그렇게 잘 되었는지 확신하지 못하는 상황 (3) 복잡하거나 놀라운, 임상적으로 불확실한 상황, (4) 개인적, 전문직적으로 도전을 느끼는 상황

Prompts can take any number of forms but are most useful if they ask the learner to choose a ‘‘disorienting dilemma,’’ i.e. a situation that cannot be resolved using previous problem solving strategies (Mezirow 2000). Such dilemmas generally arise from experiences which triggered questions or concerns, such as: (1) a situation where they did not have the necessary knowledge or skills; (2) a situation that went well but they are not entirely sure why; (3) a complex, surprising, or clinically uncertain situation; or (4) a situation in which they felt personally or professionally challenged (Schön 1983).


성찰을 위한 적절한 교수법을 선택하라

Tip 3 Choose an appropriate instructional method for the reflection


분명히, oral 성찰은 Schön이 말한 "reflection-in-action" 혹은 Eva와 Regehr가 "self-monitoring"이라고 말한 것에 가장 적합하다. 이는 당황스럽거나 곤란한 상황 중에 일어나는 성찰을 말한다. 의학교육에서 대부분의 성찰은 "reflection-on-action"으로 어떤 사건이 벌어진 이후에 일어나는 것이다. 이러한 유형의 성찰에서는 written 연습과 일부 디지털 기록 미디어가 장점이 될 수 있다.

Certainly, oral reflection is most suitable to what Schön called reflection-in-action and what Eva and Regehr call self-monitoring, reflection that occurs during a surprising or troubling experience (Schön 1983; Eva & Regehr 2008). In medical education, most reflection is reflection-on-action which occurs after the event. For this type of reflection, written exercises and perhaps some of the new digitally recorded media offer multiple advantages.


구조화된 접근법과 비구조화된 접근법 중 무엇을 사용할지 결정하고, 성찰을 자극할 프롬프트를 만들어라

Tip 4 Decide whether you will use a structured or unstructured approach and create a prompt


성찰에 대한 교육이나 가이드 없는 상황에서 대부분의 학습자는 학습이 결여된 일화의 기록물만 생성한다. 이것이 일부 학습자 - 그리고 일부 교육자 - 들이 성찰을 거부하는 이유 중 하나일 수 있다. 대부분의 초보 성찰자가 '무성찰'에서 '비판적 성찰'의 연속체 속에서 낮은 위치에 있음을 감안하면, 더 효율적인 방법은 사전 가이드와 피드백을 모두 제공하는 것이다.

Absent guidance and education about reflection, a majority of learners produce reflections which are largely anecdotes devoid of learning (Wong et al. 1995; Niemi 1997). This may in part be why learners – and some educators – object to reflection. Given the low placement of most novice reflectors on the continuum of non-reflection to critical reflection, the more efficient approach is to provide both upfront guidance and feedback.


이는 구조화된 자극상황을 활용함으로써 가능한데, 이것은 비판적 성찰의 요소들을 더 명확하게 만들어준다.

This can be done by using a structured prompt which makes explicit the components of critical reflection: 

  • discussion of processes and assumptions as well as actions and thoughts; 
  • consideration of the role of associated emotions and relevant past experiences; 
  • solicitation of feedback and review of relevant literature where appropriate; 
  • explicit notation of lessons learned; and 
  • creation of a plan to improve future behavior and outcomes.

구조화된 성찰에 반대하는 논리들은 그러한 구조가 성찰에서 끌어내고자 하는 응답을 제약하고 비뚤어지게 하며, 통찰력있는 분석보다 생각없이 "정해진 칸을 채워넣는"것만 하게 만들 위험이 있음을 우려한다. 이러한 우려를 낮출 수 있는 한 가지 방법은 자유기술로부터 시작해서 구조화 분석으로 이어지게 하는 것이다. 

Arguments against structured reflections include concerns that structure limits and distorts the very response the exercise is designed to elicit and that it risks encouraging mindless ‘‘recipe following’’ rather than insightful analysis (Boud & Walker 1998; Branch & Paranjape 2002). One potential strategy to mitigate these concerns is to start with a free write approach and follow that with a structured analysis.



윤리적, 정서적 우려에 대한 계획을 마련하라

Tip 5 Make a plan for dealing with ethical and emotional concerns


성찰은 치료가 아니다. 교육자들은 이것을 처음부터 명확하게 함으로써 부적절한 폭로를 방지해야 한다. 그러나 이렇게 조심하더라도 성찰일지를 읽는 사람은 종종 우려스러운 폭로를 접하곤 한다. 여기에는 작성자의 심리적 스트레스, 부적절한 행위에 대한 기록, 불법적 사실, 작성자 혹은 다른 사람의 문제의 소지가 있는 행동이나 기술 등이 포함된다. 교육자들은 반드시 그러한 내용을 어떻게 다룰 것인지 미리 계획을 세워야 한다. 접근법을 계획할 때 성찰일지가 단순히 상황에 대한 한 가지 관점만 보여준다는 것을 명심해야 하며, 부정확하거나 사실을 호도할 가능성이 있다는 것도 생각해야 한다. 동등하게 불법행위나 학습자/환자/기타 사람들에게 위험이 될 만한 가능성이 기록되어 있을 때 이를 무시하는 것 역시 무책임한 것이 될 수 있다.
Reflection is not therapy. Educators should make this clear at the outset of the exercise so as to avoid inappropriate disclosures. Even with this caveat, however, readers of reflections sometimes will come across concerning revelations. These typically consist of psychological distress on the part of the writer or depictions of unprofessional, illegal, or troublesome statements or actions by the writer or others. Educators must plan in advance for how they will handle such material. In deciding on an approach, it is crucial to remember that a reflection presents just one view of a situation and as such may be misleading or inaccurate. Equally, it would be irresponsible to disregard comments which suggest the possibility of illegality or danger to the learner, patients, or others.

이러한 상황에 대한 가장 적절한 대처는 실용적인 혹은 기관 차원의 가이드라인을 마련하여 개개 교육자가 그 다음에 어떤 일을 해야할지를 조직 차원의 지원이 없는 상태로 개별적으로 결정하지 않아도 되게 해주는 것이다. 가이드라인에는 다음의 것이 포함되어야 한다.

The best way of dealing with such situations is to develop programmatic or institutional guidelines so individual educators do not have to decide on next steps under trying circumstances and manage the situation without organizational support. Some key considerations in designing guidelines include:


  • In cases of reflector distress: Is the reflector of danger to self or others or merely in need of support? If in need of support, is the educator for the reflection exercise qualified to provide that support and if not, who is?
  • In cases of inappropriate behavior: Is this a legal issue or a professional one? If the latter, is this a learning opportunity or an occasion for referral to a disciplinary body (or both)?
  • In cases of explicit or implicit accusations: If accusations have been made, implicitly or explicitly, who will determine the facts of the situation and how?



Creating a mechanism to track learners' follow-up plans

Tip 6 Create a mechanism to follow up on learners’ plan


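Reflection is iterative. The goal of reflection is to learn from experience, but to know what was learned and what was useful, it must be applied again in practice. Whether through a structured prompt or through feedback, the learner should plan to close learning gaps and test the behavioral hypotheses generated by their own analysis. Ideally, the reflector will go beyond the individual experience and make explicit how the topic relates to their practice. If they do not, educators or peers should help them see the wider picture during the feedback session.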

Reflection is iterative. The goal is to learn from experience, but in order to ascertain whether what was learned was useful, it needs to be applied (Kolb 1984). Either in the reflection itself, perhaps with the help of a structured prompt, or in the feedback, the learner should be encouraged to make a plan to address learning gaps or test out behavioral hypotheses generated by their analysis. Ideally, the reflector will state explicitly the relevance of the topic to their practice beyond the individual described experience. If not, educators and/or peers can help them see the larger issue in the feedback session.





Fostering an environment that supports learning

Tip 7 Create a conducive learning environment


For a reflective exercise to succeed, it needs a positive learning climate: an authentic context and a safe, supportive environment for reflection. The authenticity of the exercise depends on how well it is connected to the wider educational program and to the learner's needs at the time of the exercise. Good learning objectives are a necessary but not sufficient condition for linking reflection to the learner's current activities. For example, reflection on surgical skills is appropriate during a surgical rotation and would be less useful after the rotation has ended, on the day before a written test of surgical knowledge.

To succeed, reflective exercises require the establishment of a positive learning climate through the use of an authentic context and creation of a safe and supportive environment for reflection. The authenticity of the exercise depends on how well it is tied into the larger educational program and the individual learners’ needs at the time of the exercise. Good learning objectives are necessary but not sufficient to link reflection to the learners’ current activities. For example, reflecting on surgical skills would be appropriate partway through a surgical rotation but less useful at the conclusion of the rotation on the eve of a pen-and-paper test of surgical knowledge.


Other important environmental elements

Other critical environmental elements include 

  • Enough time: providing enough time for the reflective activity, 
  • Respect and support in group discussion: insistence upon respectful and supportive treatment of others in group discussions of reflection, 
  • Hindsight bias and authenticity: explicitly acknowledging hindsight bias and the inclination to present an expected rather than an authentic persona, and 
  • Access, feedback, and type of assessment: making clear at the outset who will have access to the reflection and for what purposes, who will provide feedback, and whether assessment will be formative or summative.


Teaching learners about reflection before asking them to reflect

Tip 8 Teach learners about reflection before asking them to do it


The conflation of reflection and critical reflection has led to the misconception that educators can ask learners to reflect without teaching them how. Before starting a reflective exercise, educators should define for learners what reflection (or critical reflection) is, present evidence of its educational and practical benefits, and explain the components of good reflection.
The conflation of reflection and critical reflection has led to the misperception that educators can ask learners to reflect without teaching them how to do so first. Before initiating a reflective exercise, educators need to define reflection (or preferably, critical reflection as discussed above) for their learners, provide them with evidence of the educational and practice-related benefits of reflection, and outline the components of good critical reflections, such as 
  • (1) linking past, present, and future experience; 
  • (2) integrating cognitive and emotional experience; 
  • (3) considering the experience from multiple perspectives; 
  • (4) reframing; 
  • (5) stating the learning or lessons learned; and 
  • (6) planning for future behavior.

Providing feedback and follow-up

Tip 9 Provide feedback and follow-up


Feedback can come from an individual, a group, faculty, or peers, and any feedback is better than none. The literature shows that shared reflection is better than individual reflection and that self-assessment is often inaccurate. In reflection, others can see what the reflector cannot. Done well, feedback offers multiple perspectives on the experience, supports the integration of affective and cognitive experience, discourages uncritical acceptance of experience, and guides what Eva and Regehr have called "self-directed assessment seeking."

Feedback can be individual, group, faculty, or peer and any feedback is better than none. The literature shows that shared reflection is better than individual and self-assessment is often inaccurate (Branch & Paranjape 2002; Eva & Regehr 2008). In reflection, others often see things the reflector cannot see. When done well, feedback provides multiple perspectives on the experience, supports integration of affective and cognitive experience, discourages uncritical acceptance of experience and guides what Eva and Regehr have called ‘‘self-directed assessment seeking.’’


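Feedback serves two purposes: supporting the relevant learning and developing reflective skill. Educators should give feedback not only on the content but also on the learner's reflective skill. Comments can address many aspects of the reflection. The aim is not comprehensive feedback but feedback that is challenging without being overwhelming, tied to the learning objectives, and educationally useful. Aim for two or three key teaching points, one of which concerns the learner's reflective skill.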

The nature of the feedback merits note as well since reflective exercises often serve two purposes: addressing the relevant learning objectives and developing reflective skill. Educators should provide feedback not just on the content of a reflection but on the learner’s reflective skill as well. Often, it will be possible to comment on many different aspects of the reflection. The goal should not be comprehensive feedback but feedback which is challenging rather than overwhelming, aligned with the learning objectives, and educationally useful. Aim for 2–3 key teaching points, one of which addresses the learner’s reflective skill.



Assessing the reflection

Tip 10 Assess the reflection


Assessment may be linked to feedback or done separately. The goal of feedback is deeper learning. The goal of assessment may include learning, but it also involves evaluating the learner's abilities in the topic of the reflection and/or in reflection itself. Assessment can be done in narrative form or with validated and reliable scoring rubrics.

Assessment can be linked to or distinct from feedback. The goal of the feedback is deeper learning. The goal of assessment may include learning but also involves evaluation of the learners’ abilities in the topic areas of the reflection and/or in reflection itself. Assessment can be done in narrative by stating judgments about the learners’ abilities or engagement with the exercise or by using validated and reliable scoring rubrics (Learman et al. 2008; Wald et al. 2009).


Educators must decide whether assessment will be formative (aimed at developing the learner's abilities) or summative, used for purposes such as grading, decisions on advancement, or awarding CME credit. Some argue that because the goal of reflection is to cultivate a skill that trainees can apply throughout their careers, assessment of reflection should always be low-stakes and formative. Others hold that a purely formative approach lets learners focus on complex topics and professional vulnerabilities without fear of negative evaluation. But these arguments confuse evaluation of reflective skill with evaluation of the reflector. Extensive data show that assessment drives learning.

Educators must decide whether assessment will be formative, with the exclusive goal of developing learners’ abilities, or summative and used for grading purposes in courses or clerkships, advancement in a training program or certification process, or award of continuing medical education (CME) credit. Some have argued that the goal of reflection is to nurture a skill the trainee or practitioner can apply throughout their career so its assessment should always be low stakes and formative. Others believe an exclusively formative approach encourages focus on complex topics and professional vulnerabilities without fear of negative evaluations. But such arguments confuse evaluation of reflective skill with evaluation of the reflector. Extensive data demonstrate that evaluation drives learning.




Making the reflective exercise part of a larger curriculum

Tip 11 Make this exercise part of a larger curriculum to encourage reflection 


For trainees, the best approach is a longitudinal integrated curriculum in which learners reach different milestones, in both reflective skill and its practical application, as they move through the program. At the student level, a potential trajectory is as follows.

For trainees, the best approach to developing reflective skills may be a longitudinal integrated curriculum with different mileposts in terms of both reflective skills and application contexts as the learner moves through their professional program. At the student level, for example, one potential trajectory might 

  • Understanding the components: begin understanding the components of critical reflection, 
  • Demonstrating the ability to apply those components to learning strategies and clinically relevant skills, practiced in the preclinical years
    move to demonstrating the ability to apply those components to learning strategies and/or clinically relevant skills which can be practiced in the preclinical years such as leadership or teamwork, then 
  • Applying critical reflection to clinical practice and clinical reasoning
    apply critical reflection to clinical practice and clinical reasoning, and 
  • Reflecting critically, at the end, on development over the training period
    finally critically reflect on their development over the course of the training period.


Reflecting on the process of teaching reflection

Tip 12 Reflect on the process of teaching reflection


Put the skills you teach into practice yourself

Practice the skills you are teaching


Find someone who can give you feedback. If you use a structured approach to feedback, have that person use the same format when commenting on your reflection.

Identify someone from whom to seek feedback. If you will take a structured approach to feedback, have that person use your format to comment on your reflection.


Then re-examine your reflective exercise and revise it so that it is more effective and avoids potential pitfalls.

You can then re-examine your reflective exercise and modify it to more effectively avoid the potential pitfalls described by Boud and Walker, including: 

  • recipe following, 
  • reflection without learning, 
  • mismatch between the exercise and its learning context, 
  • intellectualizing, 
  • inappropriate disclosure,
  • uncritical acceptance of experience, and 
  • raising issues beyond the educator’s expertise (Boud & Walker 1998).









 2011;33(3):200-5. doi: 10.3109/0142159X.2010.507714. Epub 2010 Sep 27.

Twelve tips for teaching reflection at all levels of medical education.

Author information

  • Department of Medicine, Division of Geriatrics, University of California, 3333 California St, Suite 380, San Francisco, CA 94118, USA. aronsonl@medicine.ucsf.edu

Abstract

BACKGROUND:

Review of studies published in medical education journals over the last decade reveals a diversity of pedagogical approaches and educational goals related to teaching reflection.

AIM:

The following tips outline an approach to the design, implementation, and evaluation of reflection in medical education.

METHOD:

The method is based on the available literature and the author's experience. They are organized in the sequence that an educator might use in developing a reflective activity.

RESULTS:

The 12 tips provide guidance from conceptualization and structure of the reflective exercise to implementation and feedback and assessment. The final tip relates to the development of the faculty member's own reflective ability.

CONCLUSION:

With a better understanding of the conceptual frameworks underlying critical reflection and greater advance planning, medical educators will be able to create exercises and longitudinal curricula that not only enable greater learning from the experience being reflected upon but also develop reflective skills for life-long learning.

PMID: 20874014 [PubMed - indexed for MEDLINE]


The Education Review Board: A Mechanism for Preventing COI in Medical Education (Acad Med, 2015)

The Education Review Board: A Mechanism for Managing Potential Conflicts of Interest in Medical Education


Jonathan F. Borus, MD, Erik K. Alexander, MD, Barbara E. Bierer, MD, F. Richard Bringhurst, MD, Christopher Clark, JD, Kaley E. Klanica, JD, MPH, Erin C. Stewart, and Lawrence S. Friedman, MD




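As support for medical research and education from the NIH and CMS declined, some AMCs have looked to industry to support their academic missions. Industry support has come under close scrutiny from professional organizations and legislative bodies, and the Macy Foundation, Institute of Medicine, and Association of American Medical Colleges have all issued reports criticizing some of these relationships.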

This heightened concern developed during a period of shrinking federal support for medical research and education from the National Institutes of Health and from the Centers for Medicare and Medicaid Services, which prompted some academic medical centers (AMCs) to look to industry to help fund their academic missions. Industry support has been closely scrutinized by professional organizations as well as by legislative bodies, and the Macy Foundation, Institute of Medicine, and Association of American Medical Colleges all have issued reports criticizing some of these relationships.4–6


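In 2007, Partners HealthCare System (hereafter Partners) established a Commission on Interactions with Industry to seek solutions to potential conflicts between industry and Partners providers and trainees. In its 2009 report, the commission noted that relationships with industry are important to Partners' academic mission but must be transparent, and that care must be taken so that industry does not inappropriately influence patient care, research, or educational programs. It made specific recommendations for reporting, reviewing, and managing potential conflicts with industry, and established three new bodies to implement them.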

In 2007, Partners HealthCare System (Partners), the Boston-based health care system that includes two major AMCs (Massachusetts General Hospital and Brigham and Women’s Hospital); several rehabilitation, psychiatric, and community hospitals; and a large number of affiliated primary care and subspecialty practitioners, established a Commission on Interactions with Industry to explore solutions to potential conflicts in the relationships between industry and Partners providers and trainees. In its 2009 report, the commission concluded that interactions with industry were important to Partners’ ability to carry out its charitable academic mission but must be transparent and managed to ensure that industry does not inappropriately influence patient care, research, or educational programs.9 It made specific recommendations for reporting, reviewing, and managing potential conflicts with industry, and established three new bodies to implement its recommendations: 

  • (1) an Office for Interactions with Industry (OII) that reports to the Partners general counsel and provides legal and administrative support relating to Partners interactions with industry; and two peer professional committees: 
  • (2) an Education Review Board (ERB) to oversee industry support of all Partners-sponsored educational activities and 
  • (3) a Committee on Conflicts of Interest to oversee research and other interactions between industry and Partners-affiliated individuals or groups, including those Partners faculty who speak at non-Partners educational activities.9


The ERB's charge and development

Charge and Development of the ERB


The ERB's mission and goals

Mission and goals of the ERB


Much as an IRB safeguards research integrity, the ERB was created to ensure the educational integrity of Partners' industry-supported educational programs.

Analogous to an institutional review board whose mission is to ensure research integrity, the ERB was created to ensure the educational integrity of Partners industry-supported educational programs.


The ERB has the following duties.

The ERB was charged with 

    • Review, approval, and oversight of all industry-supported educational programs
      reviewing, approving, and monitoring all industry-supported educational programs; 
    • Additional review of presentations or programs of particular concern because of faculty ties to industry sponsors
      conducting an additional review of presentations or programs deemed to be of particular concern because of faculty members’ connections to the program’s industry sponsors; 
    • Review of the amount and source of industry funding for educational presentations and programs
      reviewing the amount and source of industry funding for educational presentations and programs; and 
    • Confirming compliance with accrediting body standards
      ensuring the adherence of such programs to accrediting body standards. 


It also has the following duty.

The ERB also was charged with 

    • Implementing the commission's specific educational recommendations
      implementing the commission’s specific educational recommendations, foremost among which were that all industry-supported educational activities (presentations, conferences, training programs, etc.) have multiple funders and that all conferences, lectures, or other presentations meet the Accreditation Council for Continuing Medical Education (ACCME) Standards for Commercial Support,10 whether or not the program is offered for CME credit.


The ERB's structure

Structure of the ERB


Established in 2009; co-chairs.

The ERB was established in September 2009; its co-chairs (JFB and LSF) were chosen for their seniority and professorial status, expertise in and commitment to education, and ability to withstand pressure if the ERB made unpopular decisions that required changes in relationships with, and potential support of, industry.


Regular monthly meetings

The ERB has met monthly since January 2010 

    • to develop guidelines for system-wide implementation of the commission’s recommendations, 
    • to review and monitor individual educational programs seeking support from industry, and 
    • to use its experience with such “cases” to improve ways of avoiding both actual and perceived conflicts of interest.


Developing the ERB guidelines for industry support

Development of the ERB Guidelines for Industry Support





In the first four years, guidelines were developed in several areas within the multi-funder framework

During the first four years, within this multifunder framework, the ERB developed guidelines in a number of areas, including 

  • Industry support for conferences, fellowships, and trainees' attendance at external educational conferences and training programs
    industry support of educational conferences, clinical fellowships, and trainees’ expenses for attending external educational conferences and training programs; 
  • Gifts of textbooks and other educational materials
    gifts of textbooks and other educational materials; 
  • Promotional opportunities tied to Partners educational activities
    promotional opportunities associated with Partners educational activities;
  • Partners educational activities under contract with industry
    Partners educational activities under contract with an industry entity; and 
  • Industry-run educational programs that use Partners resources
    industry-run educational programs using Partners resources. 


Guidelines for each of the areas above are summarized in Appendix 1

The ERB deliberated on the details of each of these guidelines to find the right balance between receptivity to funding from industry and the actuality, or appearance, of inappropriate industry influence on educational programs. The ERB was especially sensitive to the vulnerability of trainees to such influence. See Appendix 1 for a summary of these guidelines.



Reactions to the ERB guidelines

Responses to the ERB Guidelines


Generally well received internally. HMS and others have adopted similar guidelines.

The ERB guidelines generally have been well received internally and independently; Harvard Medical School (HMS), the academic affiliate of most Partners practitioners, as well as several non-Partners HMS-affiliated hospitals not subject to the ERB’s jurisdiction, adopted similar guidelines for educational programs in 2011.


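However, some in the Partners community who had grown used to industry support are pushing back. They say "we've always done it this way" and complain that such support "is essential for our educational programs" and that without it "we could not fund trainees."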

However, some in the Partners community who had grown accustomed to receiving ready support from industry for their educational programs have “pushed back” against the ERB policies. They have argued that “we’ve always done it this way”; “industry support is essential for our educational programs, which are among the best in the country, and the ERB guidelines make it more difficult to obtain that funding”; and “without such support we would not be able to fund trainees, which would decrease unique learning opportunities for young physicians (particularly subspecialty fellows) to receive cutting-edge training.”


Some also worry that the ERB's policies are overly bureaucratic and will discourage industry sponsors.

In addition, some have complained that the ERB’s policies are overly bureaucratic and likely to discourage industry sponsors from funding Partners educational programs when other AMCs have more permissive policies.


The ERB is open to refining its guidelines further.

The ERB has been open both to refining its guidelines based on case experience or changes in the external environment and to making exceptions to guidelines based on extenuating circumstances.


Industry has also generally accepted the ERB guidelines.

Industry funders of Partners educational activities generally seem to have accepted the ERB guidelines.


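No such data were collected before the ERB, so the ERB and OII were tasked with collecting them for the first time. The number of industry grants for Partners educational programs stayed roughly constant, while total industry funding fell by about 22%.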

Before the inception of the ERB, comprehensive system-wide data on Partners educational activities, including the annual number of industry gifts and total amount of industry funding, were not collected, so the ERB and OII were tasked with systematically collecting such data for the first time. Many changes in industry funding at the national level occurred around the same time as the creation of the ERB. Reviewing the data for the first three full calendar years (2011–2013) of ERB activity, we were surprised to find—despite our increased oversight, declining sole-funded industry support, and increasing external constraints on industry support—that the number of industry gifts for Partners educational programs remained relatively stable, while the total amount of industry funding declined approximately 22% (see Figures 1 and 2).








Outcomes of introducing the ERB

Outcomes of Implementing the ERB Guidelines


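Industry-supported fellowships that predate the ERB continue, and they now have multiple funders. Fellows' participation in industry-run conferences is limited, fellows are told when their program has industry support, and they can report concerns about a funder's influence to the ERB and OII.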

Many of the pre-ERB industry-supported fellowships have continued, but they now have multiple funders, fellows’ participation in industry-run conferences is now limited, and fellows are aware when their program has industry support and are encouraged to notify the ERB and OII if they have concerns about a funder’s influence on their training.


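The multi-funder rule has broadened requests for industry funding for both conferences and training programs, and in some cases total industry support for an educational activity has increased rather than decreased.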

The multi-funder rule has stimulated more widespread requests for industry funding for both conferences and training programs, in some cases resulting in more rather than less total industry support of an educational activity.


Presentations are no longer the "Company X talk," and fellows are no longer seen as the "Company Y fellow." Textbooks are not given directly to trainees, nor identified as coming from a particular company.

Presentations no longer are seen as the “Company X talk,” and fellows no longer are seen as the “Company Y fellow.” Textbooks may not be given directly to trainees or identified as coming from a particular company.


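All conferences, with or without CME credit, are held to the ACCME standards, and it has become clearer that faculty themselves, not the industry funders, are responsible for presentation content.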

All conferences, whether for CME credit or not, are held to the ACCME standards, and it is clearer to faculty that they, rather than the industry funders, are responsible for all presentation content.


The ERB has defined who counts as faculty at such conferences, which keeps out "pseudo-faculty" participants.

The ERB has defined who are considered faculty at such events, preventing “pseudo-faculty” participants from having their expenses paid.


Social events within industry-supported educational activities have been curbed, and such activities are held in Partners hospitals or nearby hotels rather than at distant, resort-type venues.

Social events as part of industry-supported educational activities have been curbed, and such educational activities are now held in Partners hospitals or nearby hotels rather than more distant, resort-type venues.


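Budgets for educational activities are monitored more closely so that they cover only educational expenses, and any surplus revenue must be used for educational purposes.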

Educational activities’ budgets have been tightened and are more closely monitored so that only educational expenses are covered; any surplus revenues (resulting from the sum of clinical revenues, registration fees, and industry support) must be used for educational purposes.












 2015 Dec;90(12):1611-7. doi: 10.1097/ACM.0000000000000788.

The Education Review Board: A Mechanism for Managing Potential Conflicts of Interest in Medical Education.

Author information

  • J.F. Borus is Stanley Cobb Distinguished Professor, Department of Psychiatry, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts. E.K. Alexander is associate professor, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts. B.E. Bierer is professor, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts. F.R. Bringhurst is associate professor, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts. C. Clark is director, Office for Interactions with Industry, and senior counsel, Office of the General Counsel, Partners HealthCare, Boston, Massachusetts. K.E. Klanica was manager, Office for Interactions with Industry, Partners HealthCare, Boston, Massachusetts, at the time this article was written. She is currently senior associate general counsel, Allina Health, Minneapolis, Minnesota. E.C. Stewart is senior project specialist, Office for Interactions with Industry, Partners HealthCare, Boston, Massachusetts. L.S. Friedman is Anton R. Fried Chair, Department of Medicine, Newton-Wellesley Hospital, Newton, Massachusetts, and professor of medicine, Harvard Medical School and Tufts University School of Medicine, Boston, Massachusetts.

Abstract

Concerns about the influence of industry support on medical education, research, and patient care have increased in both medical and political circles. Some academic medical centers, questioning whether industry support of medical education could be appropriate and not a conflict of interest, banned such support. In 2009, a Partners HealthCare System commission concluded that interactions with industry remained important to Partners' charitable academic mission and made recommendations to transparently manage such relationships. An Education Review Board (ERB) was created to oversee and manage all industry support of Partners educational activities. Using a case review method, the ERB developed guidelines to implement the commission's recommendations. A multi-funder rule was established that prohibits industry support from only one company for any Partners educational activity. Within that framework, the ERB established guidelines on industry support of educational conferences, clinical fellowships, and trainees' expenses for attending external educational programs; gifts of textbooks and other educational materials; promotional opportunities associated with Partners educational activities; Partners educational activities under contract with an industry entity; and industry-run programs using Partners resources. Although many changes have resulted from the implementation of the ERB guidelines, the number of industry grants for Partners educational activities has remained relatively stable, and funding for these activities declined only moderately during the first three full calendar years (2011-2013) of ERB oversight. The ERB continually educates both the Partners community and industry about the rationale for its guidelines and its openness to their refinement in response to changes in the external environment.

PMID: 26083402 [PubMed - in process]


Medical school admission interviews: Are structured interviews more reliable than unstructured interviews? (Teaching and Learning in Medicine, 2010)

Medical School Preadmission Interviews: Are Structured Interviews More Reliable Than Unstructured Interviews?


Rick Axelson and Clarence Kreiter

Department of Family Medicine, University of Iowa, Iowa City, Iowa, USA

Kristi Ferguson

Office of Consultation and Research in Medical Education, University of Iowa, Iowa City, Iowa, USA

Catherine Solow and Kathi Huebner

Office of Student Affairs and Curriculum, Carver College of Medicine, Iowa City, Iowa, USA





One common way to improve the reliability of scores is to use structured interviews. The structure of an interview is defined by the use of a scoring rubric, standardization of questions, use of probing, and other factors. However, direct evidence supporting structured interviews is scant, and according to a study by Kreiter et al., there is no logical rationale, in terms of fairness or reliability, for presenting the same questions to every applicant. This result may seem counterintuitive. However, interview questions should be regarded as a random measurement facet, and sampling a small number of questions effectively equates for question difficulty across applicants.

One commonly advocated method for enhancing score reliability is to use a structured interview format.3,4 The level of interview structure is defined by the use of a scoring rubric, question standardization, the use of probing, and other factors. There is, however, little direct evidence to support the practice of using structured interviews, and a recent study by Kreiter et al.5 suggests there is no logical rationale related to fairness or reliability that would support presenting the same questions to all applicants. This finding may appear counterintuitive; however, it is easily demonstrated that interview questions should be regarded as a random measurement facet and that sampling a small number of questions effectively equates for question difficulty across applicants.









Methods

METHODS


A 25-minute interview with two faculty members. Each faculty interviewer evaluates 8 to 10 applicants per year.

The University of Iowa Roy J. and Lucille A. Carver College of Medicine (UICCOM) is a public medical school with a total enrollment of 572 students. As part of the application process to UICCOM, particularly well-qualified candidates are selected to participate in a 25-min interview with two faculty members. A pool of faculty interviewers is recruited by the director of medical admissions each year (average interviewers used per year are approximately 150) to conduct the interviews. Each faculty member interviews approximately eight to ten applicants per year.


The interview is divided into two parts

Hence, there are two parts to the interview: 

  • (a) a structured component—where candidates are read predetermined questions and their responses are scored on a scale from 1 to 5 using an established scoring rubric, and 
  • (b) an unstructured component—where there is a free-flowing exchange between faculty and the candidate on any appropriate topic of interest to the faculty interviewer and/or the candidate. 

The unstructured part is also rated on a 5-point scale. There was no explicit scoring rubric; the anchors are 5 (excellent) and 1 (poor).

Scores ranging from 1 to 5 are also awarded on this unstructured portion of the interview but, given the variable nature of these exchanges, are not guided by explicit scoring rules or rubrics. For each of these 5-point rating scales, the anchors are 5 (excellent) and 1 (poor).


Interview protocol

The interview protocol is as follows. 

  • Structured opening: Each interview begins with a highly structured component that asks the same four standard questions of all applicants being interviewed. 
  • Question pool: Questions vary somewhat from year to year, but they are drawn from a standard pool of interview questions.1 
    • Motivation: In general, these questions ask about applicants’ motivation for pursuing a career in medicine, 
    • Handling challenges: how they might deal with various challenges encountered in practicing medicine, and 
    • Past experience and personal attributes: how applicants’ experiences and/or attributes will enable them to be outstanding physicians. 
  • Independent rating after each answer: Immediately following the applicant response to a question, the two faculty raters, guided by a scoring rubric, independently rate each of the applicant’s responses before moving on to the next question. 
  • No follow-up questions: Interviewers are not allowed to probe or ask follow-up questions. 
  • No time limit per question: There is no time limit set for responses to each question; candidate responses are typically about 2 to 3 min per question. 
  • Open conversation: After all the structured questions are completed, the remaining minutes of the interview are devoted to an open conversation with the applicant.

Interviewer training

  • First-time interviewers: Training is provided for all first-time interviewers. 
  • Session content: Training sessions provide faculty with an overview of the interview protocol, scoring rubrics for structured questions, and an opportunity to score some sample (fictitious) videotaped responses using the scoring rubrics. 
  • Discussion with the trainer: After each sample response is viewed and scored, faculty discuss their rationale for awarding a given score with the trainer. 
  • Unstructured portion: Trainers also review the protocol for the unstructured portion of the interview, emphasizing what are considered appropriate and inappropriate topics for discussion. 
  • Feedback from an observer: To facilitate adjustment to actual interview sessions, first-time interviewers receive feedback from an observer who is present during their initial day of interviewing. 
  • Role of observers
    Observers are experienced interviewers who provide feedback regarding any areas where new interviewers may be straying from the established interview protocols and scoring procedures.


Variance components

Table 5 shows variance components and reliability obtained from two complete interview occasions, each employing a structured and unstructured format [p• × o•], and provides information related to a complete replication of an interview using both the structured and unstructured format.






The proportion of person variance for the structured format was 22% compared with 30% for the unstructured format and implies the unstructured format will yield more consistent scores across replications. The universe score correlation between the formats was .82, suggesting the formats may not assess identical attributes of the applicant. 



DISCUSSION


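Contrary to previous findings, the unstructured format proved more reliable both in terms of interrater agreement and in the random replications (test-retest) analysis. Moreover, the two formats appear to assess related but distinct constructs: the universe score correlation and the Person × Format interaction show that they do not measure identical constructs of the applicant. Finally, reliability can be raised by combining the two measures (structured + unstructured).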

Contrary to the predominant view in the research literature, we found that the unstructured format was more reliable from both an interrater agreement perspective and in the random replications (test–retest) analysis. Further, it appears that the different formats are measuring related, yet distinct, constructs. The universe score correlation (ru = .82) and Person × Format interaction indicated that the two formats do not measure identical constructs related to the applicant. Last, we found that reliability can be increased by combining the two measures into a composite score. An examination of weighted composite scores indicates a sum score with approximately equal weights on both formats maximizes reliability and the information obtained.








 2010 Oct;22(4):241-5. doi: 10.1080/10401334.2010.511978.

Medical school preadmission interviews: are structured interviews more reliable than unstructured interviews?

Author information

  • 1Department of Family Medicine, University of Iowa, Iowa City, Iowa 52242, USA. rick-axelson@uiowa.edu

Abstract

BACKGROUND:

The medical education research literature consistently recommends a structured format for the medical school preadmission interview. There is, however, little direct evidence to support this recommendation.

PURPOSE:

To shed further light on this issue, the present study examines the respective reliability contributions from the structured and unstructured interview components at the University of Iowa.

METHODS:

We conducted three univariate G studies on ratings from 3,043 interviews and one multivariate G study using responses from 168 applicants who interviewed twice.

RESULTS:

Examining interrater reliability and test-retest types of reliability, the unstructured format proved more reliable in both instances. Yet, combining measures from the two interview formats yielded a more reliable score than using either alone.

CONCLUSIONS:

At least from a reliability perspective, the popular advice regarding interview structure may need to be reconsidered. Issues related to validity, fairness, and reliability should be carefully weighed when designing the interview process.

PMID: 20936568


Past-behavioural vs situational questions in a postgraduate selection MMI: comparing reliability and acceptability (BMC Med Educ, 2015)

Past-behavioural versus situational questions in a postgraduate admissions multiple mini-interview: a reliability and acceptability comparison

Hiroshi Yoshimura1,2,3*, Hidetaka Kitazono2, Shigeki Fujitani2, Junji Machi2,3, Takuya Saiki4, Yasuyuki Suzuki4 and Gominda Ponnamperuma5






Settings and participants


TBUIMC overview; specialties; mission; educational outcomes; MMI administration

TBUIMC is a Japanese general hospital, which newly introduced three specialty training programmes: internal medicine, surgery, and emergency medicine. To accomplish the trans-specialty mission of ‘fostering high-quality generalist physicians providing holistic patient care’, the educational committee of TBUIMC decided to introduce the Accreditation Council for Graduate Medical Education (ACGME) six general competencies [36] as educational outcomes. In 2013, the MMI took place at the partitioned TBUIMC conference room, in three separate weekends. Of the 26 candidates who applied for the TBUIMC programmes, 13, 10, and 3 were invited for the MMI on the first, the second, and the third day of the MMI, respectively.


Interview scheduling; candidates; examiners

Three separate days were set for candidates’ convenience, having better access to selection opportunities in TBUIMC; this facilitated the recruitment process. All candidates were Japanese medical graduates, whose level of training ranged from Post Graduate Year (PGY)-2 to PGY-4. They were either in the second year of, or had concluded the two-year National Obligatory Initial Postgraduate Clinical Training Programme (NOIPCTP), following their graduation from Japanese medical schools, and the Japanese National Licensure Examination [37]. A total of 18 examiners, including TBUIMC’s educational committee members (most of whom were US specialty board certified) and clinical supervisors, were all Japanese physicians in the aforementioned three specialties. All candidates, regardless of their applying specialties or the PGY level, were examined by all examiners, who were randomly allocated to the stations. All examiners stayed within the same station, on all three days.


Intervention


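Of the six ACGME competencies, medical knowledge was excluded; each of the remaining five competencies was given one station; each competency has 2 to 8 sub-domains; two examiners per station; the STAR approach was used for PBQs. In SQs, examiners were not allowed to probe independently.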

To base stations on the competencies of the ACGME, except ‘medical knowledge’, 5 stations were created to assess one competency (domain) per station. Out of the 2 to 8 sub-domains in each competency [36], two sub-domains (one for the PBQ, and the other for the SQ) per station were selected so that one PBQ followed by one SQ was administered within the same station (Table 1). The same questions were asked from all candidates. Two examiners were assigned to one station and they alternated questioning roles.

  • In PBQs, Situation-Task-Action-Result (STAR) approach was applied for guiding interviews [38]. 
  • In SQs, presenting a scenario with a dilemma and making the candidates describe what they would do, in a situation where the candidate had to choose between two or more mutually exclusive courses of action [21,22] were followed by structured probing [27]. Examiners were not allowed to probe independently. 

A sample of instructions to examiners for one of the stations is shown in Table 2.




Interview guide





Fewer than 10 stations may be sufficiently reliable. Factors other than the question format probably contributed as well.

The current study suggests that less than 10 stations of the MMI with one examiner per station may be sufficiently reliable. In addition to the question format, other structuring processes may have contributed to this, e.g.

  • An established framework: basing stations on an established competency framework;
  • Minimal rapport building: minimising unnecessary rapport building between examiners and candidates;
  • Identical, planned questions: asking exactly the same questions from each candidate with planned probing;
  • Three distinguishable rubrics: using three distinguishable rating rubrics;
  • Ratings tied to specific anchors: rating candidates on points anchored with detailed descriptors; and
  • Examiner training: providing examiner training.

These structuring efforts probably helped reduce the number of stations needed.

These structuring efforts would help reduce the number of stations, especially where only limited examiner resources are available for a relatively smaller number of candidates.
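
The projection that fewer than 10 stations may be sufficiently reliable is the kind of estimate a decision (D) study produces. A minimal Python sketch follows, assuming a simplified person × station design with one examiner per station. The variance components are hypothetical and are not taken from the study, and the function names are illustrative only.

```python
from typing import Optional


def g_coefficient(var_person: float, var_residual: float, n_stations: int) -> float:
    """Relative G coefficient when scores are averaged over n_stations.

    var_person:   variance attributable to true differences between candidates
    var_residual: person-by-station interaction confounded with error
    """
    return var_person / (var_person + var_residual / n_stations)


def stations_needed(var_person: float, var_residual: float,
                    target: float = 0.80, max_stations: int = 50) -> Optional[int]:
    """Smallest number of stations whose projected G coefficient reaches target."""
    for n in range(1, max_stations + 1):
        if g_coefficient(var_person, var_residual, n) >= target:
            return n
    return None  # target not reachable within max_stations


# Hypothetical variance components (assumption, for illustration only).
VAR_PERSON = 0.35
VAR_RESIDUAL = 0.65

if __name__ == "__main__":
    for n in (5, 7, 8, 10):
        print(n, round(g_coefficient(VAR_PERSON, VAR_RESIDUAL, n), 3))
    print("stations needed:", stations_needed(VAR_PERSON, VAR_RESIDUAL))
```

With these made-up components, eight stations are needed to reach 0.80. The study's own figure depends on its actual variance components, so the numbers here only show the shape of the calculation.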


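The highly structured station format may have contributed to the positive but modest reactions of examiners and candidates. Interestingly, candidates and examiners reacted differently to SQs and PBQs: candidates favoured SQs, while examiners favoured PBQs. Notably, all participants acknowledged the fairness of the current MMI, and most stressed the importance of using both SQs and PBQs.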

As non-medical personnel selection studies have suggested [27], the highly structured nature of the station interview formats and other structuring efforts in the present study may be responsible for the positive but modest candidate and examiner reaction compared with previous studies [1,7-9,11-15]. Interestingly, this study also indicates contrasting acceptability for SQs and PBQs amongst candidates and examiners, i.e. SQs being more favourable for candidates as opposed to PBQs being more favourable for examiners. Of particular note, all participants admitted fairness of the current MMI and most expressed importance of using both SQs and PBQs. As to how best PBQs and SQs could be combined, the participant reactions could be used as a guide for generating a discussion on both question formats at a given level (undergraduate or postgraduate [foundation, specialty, or subspecialty]) of admissions MMIs in the future, as is being discussed in the area of SSPIs in non-medical personnel selection [27].

























 2015 Apr 14;15:75. doi: 10.1186/s12909-015-0361-y.

Past-behavioural versus situational questions in a postgraduate admissions multiple mini-interview: a reliability and acceptability comparison.

Author information

  • 1Educational Committee, Prefectural Okinawa Nanbu and Children's Medical Centre, Haebaru Town, Okinawa Prefecture, Japan. yoshimura.hiroshi@gmail.com.
  • 2Educational Committee, Tokyo Bay Urayasu-Ichikawa Medical Centre, Urayasu City, Chiba Prefecture, Japan. yoshimura.hiroshi@gmail.com.
  • 3Department of Surgery, University of Hawaii, John A. Burns School of Medicine, Honolulu, State of Hawaii, USA. yoshimura.hiroshi@gmail.com.
  • 4Educational Committee, Tokyo Bay Urayasu-Ichikawa Medical Centre, Urayasu City, Chiba Prefecture, Japan. hkitazono@gmail.com.
  • 5Educational Committee, Tokyo Bay Urayasu-Ichikawa Medical Centre, Urayasu City, Chiba Prefecture, Japan. shigekifujitani@gmail.com.
  • 6Educational Committee, Tokyo Bay Urayasu-Ichikawa Medical Centre, Urayasu City, Chiba Prefecture, Japan. junji@hawaii.edu.
  • 7Department of Surgery, University of Hawaii, John A. Burns School of Medicine, Honolulu, State of Hawaii, USA. junji@hawaii.edu.
  • 8Medical Education Development Centre, Faculty of Medicine, Gifu University, Gifu City, Gifu Prefecture, Japan. saikitak@gifu-u.ac.jp.
  • 9Medical Education Development Centre, Faculty of Medicine, Gifu University, Gifu City, Gifu Prefecture, Japan. ysuz@gifu-u.ac.jp.
  • 10Faculty of Medicine, University of Colombo, Colombo, Western Province, Sri Lanka. gomindap@hotmail.com.

Abstract

BACKGROUND:

The Multiple Mini-Interview (MMI) mostly uses 'Situational Questions' (SQs) as an interview format within a station, rather than 'Past-Behavioural Questions' (PBQs), which are most frequently adopted in traditional single-station personal interviews (SSPIs) for non-medical and medical selection. This study investigated reliability and acceptability of the postgraduate admissions MMI with PBQ and SQ interview formats within MMI stations.

METHODS:

Twenty-six Japanese medical graduates, first completed the two-year national obligatory initial postgraduate clinical training programme and then applied to three specialty training programmes - internal medicine, general surgery, and emergency medicine - in a Japanese teaching hospital, where they underwent the Accreditation Council for Graduate Medical Education (ACGME)-competency-based MMI. This MMI contained five stations, with two examiners per station. In each station, a PBQ, and then an SQ were asked consecutively. PBQ and SQ interview formats were not separated into two different stations, or the order of questioning of PBQs and SQs in individual stations was not changed due to lack of space and experienced examiners. Reliability was analysed for the scores of these two MMI question types. Candidates and examiners were surveyed on this experience.

RESULTS:

The PBQ and SQ formats had generalisability coefficients of 0.822 and 0.821, respectively. With one examiner per station, seven stations could produce a reliability of more than 0.80 in both PBQ and SQ formats. More than 60% of both candidates and examiners felt positive about the overall candidates' ability. All participants liked the fairness of this MMI when compared with the previously experienced SSPI. SQs were perceived more favourable by candidates; in contrast, PBQs were perceived more relevant by examiners.

CONCLUSIONS:

Both PBQs and SQs are equally reliable and acceptable as station interview formats in the postgraduate admissions MMI. However, the use of the two formats within the same station, and with a fixed order, is not the best to maximise its utility as an admission test. Future studies are required to evaluate how best the SQs and PBQs should be combined as station interview formats to enhance reliability, feasibility, acceptability and predictive validity of the MMI.

PMID: 25890189
PMCID: PMC4427914


Threats to validity (Med Educ, 2004)

Validity threats: overcoming interference with proposed interpretations of assessment data

Steven M Downing1 & Thomas M Haladyna2






Validity concerns how meaningful an interpretation of a test score is.

Validity refers to the degree of meaningfulness for any interpretation of a test score. In a previous paper in this series1 validity was discussed and sources of validity evidence based on the Standards for Educational and Psychological Testing2


Anything that interferes with meaningful interpretation is a threat to validity.

Any factors that interfere with the meaningful interpretation of assessment data are a threat to validity.


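Messick named two major threats: construct under-representation (CU) and construct-irrelevant variance (CIV). CU is undersampling or biased sampling of the content domain. CIV is systematic error (not random error) in the assessment data caused by variables unrelated to the construct being measured.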

Messick3 noted 2 major sources of validity threats: construct under-representation (CU) and construct-irrelevant variance (CIV). Construct under-representation refers to the undersampling or biased sampling of the content domain by the assessment instrument. Construct-irrelevant variance refers to systematic error (rather than random error) introduced into the assessment data by variables unrelated to the construct being measured.




Written examinations


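In written examinations, CU can arise from a test that is too short. Other examples: item content that does not match the test blueprint, so that some areas are oversampled and others undersampled; items that test only low-level cognition (recall, recognition of facts) when the instructional objectives call for higher-level cognition; and items that ask only about trivial content unrelated to future learning.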

In a written examination, such as an objective test in a basic science course, CU is exemplified in an examination that is too short to adequately sample the domain being tested. Other examples of CU are: test item content that does not match the examination specifications well, so that some content areas are oversampled while others are undersampled; use of many items that test only low level cognitive behaviour, such as recall or recognition of facts, while the instructional objectives require higher level cognitive behaviour, such as application or problem solving; and, use of items that test trivial content that is unrelated to future learning.4


A test needs enough items to sample adequately, generally at least 30.

Tests must have sufficient numbers of items in order to sample adequately (generally, at least 30 items)


In written examinations, CIV often affects some examinees rather than all. CIV is the unintended measurement of an off-target construct, one that is not the primary construct of interest, and it therefore threatens validity.

Construct-irrelevant variance represents systematic noise in the measurement data, often associated with the scores of some but not all examinees. This CIV noise represents the unintended measurement of some construct that is off-target, not associated with the primary construct of interest, and therefore interferes with the validity evidence for assessment data.


CIV also arises from statistically biased items (on which some subgroup does markedly better or worse than expected) or from items that offend students through culturally insensitive language.

Construct-irrelevant variance is also introduced by including statistically biased items6 on which some subgroup of students under- or over-performs compared to their expected performance, or by including test items which offend some students by their use of culturally insensitive language.


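If the reading level of the items is inappropriate for the students, reading ability becomes a CIV variable. This matters especially for students taking a test in a language other than their native one.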

If the reading level of achievement test items is inappropriate for students, reading ability becomes a CIV variable which is unrelated to the construct measured, thereby introducing CIV.7 This reading level issue may be particularly important for students taking tests written in a language that is non-native to them.


A final example of CIV concerns indefensible passing scores. All methods for setting a passing score, relative or absolute, are arbitrary. Even so, the methods and their results should not be capricious.

A final example of CIV for written tests concerns the use of indefensible passing scores.10 All passing score determination methods, whether relative or absolute, are arbitrary. These methods and their results should not be capricious, however.



Performance examinations

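Examinations such as the OSCE are simulations of the real world, not the real world itself. Students are rated by trained SPs in a controlled environment on a limited number of selected cases that require maximum performance. This is not real-world performance; instead, inferences are drawn from the ratings to the performance domain, with a specific interpretation attached to the checklist or rating scale data.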

They are simulations of the real world, but are not the real world. The performance of students, rated by trained SPs in a controlled environment on a finite number of selected cases requiring maximum performance, is not actual performance in the real world; rather, inferences must be made from performance ratings to the domain of performance, with a specific interpretation or meaning attributed to the checklist or rating scale data.


About 12 SP encounters of roughly 20 minutes each are needed for even minimally generalisable inferences about a domain. Insufficient generalisability is a CU threat.

Approximately 12 SP encounters, lasting 20 minutes each, may be required to achieve even minimal generalisability to support inferences to the domain.16 Lack of sufficient generalisability represents a CU threat to validity.


If SPs are not trained well enough to portray the patient consistently, students do not all encounter the same patient problem or stimulus, so the construct of interest is misrepresented.

If the SPs are not sufficiently well trained to consistently portray the patient in a standardised manner, the construct of interest is therefore misrepresented, because all students do not encounter the same patient problem or stimulus.


  • Difficulty: inappropriate difficulty for students
  • Ambiguous items: checklist or rating scale items that are ambiguous
  • Subgroup bias: statistical bias for 1 or more subgroups of students, which is undetected and uncorrected
  • Rater bias: racial or ethnic rater bias

Students may bluff SPs, particularly on non-medical aspects of a case, and such students may then be rated higher than they should be.

It is possible for students to bluff SPs, particularly on non-medical aspects of SP cases, making ratings higher for some students than they actually should be.


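Generalisability must be estimated for this type of examination using generalisability theory. For high-stakes performance examinations, the generalisability coefficient should be at least 0.80. The phi-coefficient is the appropriate estimate for criterion-referenced performance examinations (those with absolute rather than relative passing scores).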

Generalisability must be estimated for most performance-type examinations, using generalisability theory.17,18 For high-stakes performance examinations, generalisability coefficients should be at least 0.80; the phi-coefficient is the appropriate estimate of generalisability for criterion-referenced performance examinations (which have absolute, rather than relative passing scores).16
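
The difference between the generalisability coefficient and the phi-coefficient can be illustrated with a small calculation. The following is a sketch for a simple persons × cases design, in which the relative coefficient ignores the case (difficulty) variance while phi counts it as error. The variance components are hypothetical and the function names are illustrative only.

```python
def relative_g(var_p: float, var_pc: float, n_cases: int) -> float:
    """Generalisability coefficient for relative (norm-referenced) decisions.

    Only the person-by-case interaction (confounded with error) counts as error.
    """
    return var_p / (var_p + var_pc / n_cases)


def phi(var_p: float, var_c: float, var_pc: float, n_cases: int) -> float:
    """Phi coefficient for absolute (criterion-referenced) decisions.

    Case difficulty variance is added to the error term, so phi <= relative G.
    """
    return var_p / (var_p + (var_c + var_pc) / n_cases)


# Hypothetical variance components (assumption, for illustration only).
VAR_PERSON = 0.50   # true differences between students
VAR_CASE = 0.20     # differences in case difficulty
VAR_PC = 0.80       # person x case interaction plus residual error
N_CASES = 12

if __name__ == "__main__":
    print("G  :", round(relative_g(VAR_PERSON, VAR_PC, N_CASES), 3))
    print("phi:", round(phi(VAR_PERSON, VAR_CASE, VAR_PC, N_CASES), 3))
```

Because phi adds case difficulty to the error term, it can never exceed the relative coefficient, which is why it is the more conservative choice when a fixed passing score is applied.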


Performance cases should be pretested with a representative group of students before their final use.

Performance cases should be pretested with a representative group of students prior to their final use, testing the appropriateness of case difficulty and all other aspects of the case presentation.



Ratings of clinical performance


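In medical education, ratings of clinical performance in clerkships or preceptorships are often a major assessment modality. The method relies mainly on faculty observation of student performance in a naturalistic setting.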

In medical education, ratings of student clinical performance in clerkships or preceptorships (on the wards) are often a major assessment modality. This method depends primarily on faculty observations of student clinical performance behaviour in a naturalistic setting.


Here the CU threat is too few observations, or too few rated behaviours, by faculty. Williams et al. suggest that 7 to 11 independent ratings are needed to obtain data generalisable enough to be useful and interpretable.

The CU threat is exemplified by too few observations of the target or rated behaviour by the faculty raters (Table 1). Williams et al.20 suggest that 7–11 independent ratings of clinical performance are required to produce sufficiently generalisable data to be useful and interpretable.


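The major CIV threat comes from systematic rater error. Raters are the main source of measurement error in these assessments, but CIV is tied to systematic errors such as severity or leniency error, central tendency error, and restriction of range (using only part of the scale). The halo effect arises when the rater ignores the individual traits to be rated.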

The major CIV threat is due to systematic rater error. Raters are the major source of measurement error for these types of observational assessments, but CIV is associated with systematic rater error, such as rater severity or leniency errors, central tendency error (rating in the centre of the rating scale) and restriction of range (failure to use all the points on the rating scale). The halo rater effect occurs when the rater ignores the traits to be rated and treats all traits as if they were one.


Although more training can reduce undesirable rater effects, another way to deal with rater severity or leniency is to estimate how severe or lenient each rater is and correct for it in the final ratings.

Although better training may help to reduce some undesirable rater effects, another way to combat rater severity or leniency error is to estimate the extent of severity (or leniency) and adjust the final ratings to eliminate the unfairness that results from harsh or lenient raters.
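
One very simple way to estimate and remove rater severity or leniency is to centre each rater's scores on the grand mean. The sketch below assumes that all raters see comparable students, which rarely holds exactly; an operational programme would more likely use a model-based adjustment. The function name and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

Rating = Tuple[str, str, float]  # (rater, student, score)


def rater_severity(ratings: List[Rating]) -> Dict[str, float]:
    """Severity estimate per rater: rater mean minus grand mean.

    Negative values indicate a harsh rater, positive values a lenient one.
    """
    grand_mean = mean(score for _, _, score in ratings)
    by_rater: Dict[str, List[float]] = defaultdict(list)
    for rater, _, score in ratings:
        by_rater[rater].append(score)
    return {rater: mean(scores) - grand_mean for rater, scores in by_rater.items()}


def adjust_ratings(ratings: List[Rating]) -> List[Rating]:
    """Subtract each rater's estimated severity from the scores they gave."""
    severity = rater_severity(ratings)
    return [(rater, student, score - severity[rater])
            for rater, student, score in ratings]


if __name__ == "__main__":
    # Invented data: rater A is harsh, rater B is lenient.
    sample = [("A", "s1", 2.0), ("A", "s2", 3.0),
              ("B", "s1", 4.0), ("B", "s2", 5.0)]
    for row in adjust_ratings(sample):
        print(row)
```

After adjustment both raters give student s1 the same score, so the difference that came only from who did the rating is removed while the ordering of students within each rater is preserved.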


Rating scales are commonly used. If the items are poorly written, raters may be confused by the wording or may end up rating something other than the intended characteristic.

Rating scales are frequently used for clinical performance ratings. If the items are inappropriately written, such that raters are confused by the wording or misled to rate a different student characteristic from that which was intended,


The methods used to set passing scores or grades can also be a source of CIV.

the methods used to establish passing scores or grades may be a source of CIV.




What about face validity?


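The term 'face validity', though common among some medical educators, has been derided by educational measurement professionals since the 1940s. Face validity can mean several different things. The most pernicious meaning, according to Mosier, is that "the validity of the test is best determined by using common sense in discovering that the test measures component abilities which exist both in the test situation and on the job". Clearly, face validity in this sense has no place in the literature or vocabulary of medical educators, and relying on it is a major threat to validity.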

The term face validity, despite its popularity in some medical educators’ usage and vocabulary, has been derided by educational measurement professionals since at least the 1940s. Face validity can have many different meanings. The most pernicious meaning, according to Mosier, is: …the validity of the test is best determined by using common sense in discovering that the test measures component abilities which exist both in the test situation and on the job. 23(p 194) Clearly, this meaning of face validity has no place in the literature or vocabulary of medical educators. Thus, reliance on this type of face validity as a major source of validity evidence for assessments is a major threat to validity.


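Face validity in the sense above is not endorsed by contemporary educational measurement researchers. It is not a legitimate source of validity evidence and can never substitute for any of the other sources of validity evidence.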

Face validity, in the meaning above, is not endorsed by any contemporary educational measurement researchers.24 Face validity is not a legitimate source of validity evidence and can never substitute for any of the many evidentiary sources of validity.2


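Given that the term is nonetheless used in medical education, can it have any legitimate meaning? If face validity means that an assessment has superficial qualities that make it appear to measure the intended construct (for example, an SP case that looks like it assesses history-taking skills), this may be an essential characteristic of the assessment, but it is not validity. This characteristic bears on whether students and faculty accept the assessment, and it may matter to administrators and even the public, but it is not validity. Avoiding this kind of face invalidity was what Messick endorsed. Appearing valid is not validity: appearance is not scientific evidence derived from hypothesis and theory, supported or refuted by empirical data, and formed into logical arguments.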

However, as the term face validity is sometimes used in medical education, can it have any legitimate meaning? If by face validity one means that the assessment has superficial qualities that make it appear to measure the intended construct (e.g. the SP case looks like it assesses history taking skills), this may represent an essential characteristic of the assessment, but it is not validity. This SP characteristic has to do with acceptance of the assessment by students and faculty or is important for administrators and even the public, but it is not validity. (The avoidance of this type of face invalidity was endorsed by Messick.3) The appearance of validity is not validity; appearance is not scientific evidence, derived from hypothesis and theory, supported or unsupported, more or less, by empirical data and formed into logical arguments.


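An alternative term for face validity is needed. For example, if an objective test looks as if it measures the construct of interest, that appearance may add value and importance to the assessment and help it succeed, be accepted, and be used. It is not, however, sufficient evidence of validity. Agreement between the superficial look and feel of an assessment and solid validity evidence might be called congruent or sociopolitical meaningfulness, but it is clearly not a primary type of validity evidence and cannot substitute for any of the five primary sources of validity evidence.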

Alternative terms for face validity might be considered. For example, if an objective test looks like it measures the achievement construct of interest, one might consider this some type of value-added and important (even essential) trait of the assessment that is required for the overall success of the assessment programme, its acceptance and its utility, but this clearly is not sufficient scientific evidence of validity. The appearance of validity may be necessary, but it is not sufficient evidence of validity. The congruence between the superficial look and feel of the assessment and solid validity evidence might be referred to as congruent or sociopolitical meaningfulness, but it is clearly not a primary type of validity evidence and can not, in any way, substitute for any of the 5 suggested primary sources of validity evidence.2



2 American Educational Research Association, American Psychological Association, National Council on Measurement in Education. Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association 1999.









 2004 Mar;38(3):327-33.

Validity threats: overcoming interference with proposed interpretations of assessment data.

Author information

  • 1University of Illinois at Chicago, College of Medicine, Department of Medical Education, Chicago, Illinois 60612-7309, USA. sdowning@uic.edu

Abstract

CONTEXT:

Factors that interfere with the ability to interpret assessment scores or ratings in the proposed manner threaten validity. To be interpreted in a meaningful manner, all assessments in medical education require sound, scientific evidence of validity.

PURPOSE:

The purpose of this essay is to discuss 2 major threats to validity: construct under-representation (CU) and construct-irrelevant variance (CIV). Examples of each type of threat for written, performance and clinical performance examinations are provided.

DISCUSSION:

The CU threat to validity refers to undersampling the content domain. Using too few items, cases or clinical performance observations to adequately generalise to the domain represents CU. Variables that systematically (rather than randomly) interfere with the ability to meaningfully interpret scores or ratings represent CIV. Issues such as flawed test items written at inappropriate reading levels or statistically biased questions represent CIV in written tests. For performance examinations, such as standardised patient examinations, flawed cases or cases that are too difficult for student ability contribute CIV to the assessment. For clinical performance data, systematic rater error, such as halo or central tendency error, represents CIV. The term face validity is rejected as representative of any type of legitimate validity evidence, although the fact that the appearance of the assessment may be an important characteristic other than validity is acknowledged.

CONCLUSIONS:

There are multiple threats to validity in all types of assessment in medical education. Methods to eliminate or control validity threats are suggested.

PMID: 14996342


How medical school applicants' race, ethnicity, and socioeconomic status relate to MMI-based admissions outcomes (Acad Med, 2015)

How Medical School Applicant Race, Ethnicity, and Socioeconomic Status Relate to Multiple Mini-Interview–Based Admissions Outcomes: Findings From One Medical School

Anthony Jerant, MD, Tonya Fancher, MD, MPH, Joshua J. Fenton, MD, MPH, Kevin Fiscella, MD, MPH, Francis Sousa, MD, Peter Franks, MD, and Mark Henderson, MD





Little is known about how adopting the MMI affects underrepresented racial/ethnic minority (URM) and lower SES applicants. This matters because URM and lower SES students are disproportionately few in U.S. medical schools.

Little studied is how underrepresented racial/ethnic minority (URM) and lower socioeconomic status (SES) applicants may be affected by adoption of the MMI. This is a key issue given that U.S. medical schools admit disproportionately few URM and lower SES individuals.6–8


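Traditional unstructured interviews have long been criticised as vulnerable to interviewer bias. Implicit biases that disfavour racial/ethnic minority and lower SES persons are common in U.S. society, including among physicians. The effects of bias in interviews can be reduced by increasing structure (removing ambiguity and the tendency to rely on stereotype-driven judgments) and by pooling evaluations from multiple raters, which can dilute individual biases.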

A long-recognized problem with traditional nonstructured interviews is vulnerability to interviewer biases triggered by various applicant characteristics.17–22 Implicit (i.e., unconscious) biases disfavoring racial/ethnic minority and lower SES persons are common in U.S. society,23 including among physicians.24 The effects of bias during interviews can be reduced by increasing structure (removing ambiguity and, therefore, the tendency to rely on stereotype-driven judgments) and pooling evaluations from multiple raters (potentially diluting or offsetting individual biases).20,25–27


To our knowledge, only three studies have examined how URM status or SES relates to MMI performance.

Only three studies to our knowledge have explored the associations of medical school applicants’ racial/ethnic minority status or SES with MMI performance.


No studies have examined whether race/ethnicity affects acceptance after an MMI, or whether race/ethnicity or SES affects the likelihood of being invited to an MMI.

To our knowledge, no studies have examined whether applicants’ race/ethnicity influences acceptance following MMI participation, or whether race/ethnicity or SES influences the likelihood of being invited to an MMI.



Method


Application, screening, and MMI invitation and scheduling


Criteria used to decide MMI invitations:
Faculty evaluated secondary applications for invitation to an MMI based on cumulative GPA and MCAT scores, personal statements, extracurricular activities, recommendation letters, and other characteristics that could contribute to fulfilling the educational and service missions of the school.

MMI process and scoring


2 minutes of reading plus 8 minutes of tasks per station; 10 stations covering the following domains

The MMI consisted of 10 individual 10-minute stations. At each station, applicants had 2 minutes to read a brief set of instructions, and 8 minutes to address the assigned tasks on entering the room. Nine stations assessed skills in the following domains: 

    • integrity/ethics, 
    • professionalism, 
    • interpersonal communication, 
    • diversity/cultural awareness, 
    • teamwork, 
    • ability to handle stress, and 
    • problem solving. 
    • An additional station asked applicants to explain their choice to pursue a career in medicine

Most stations were adapted from content developed at McMaster University and marketed by ProFitHR.34


One trained rater, blinded to the applicant's AMCAS application information, was assigned to each station

A single trained rater, blinded to participants’ AMCAS application information, attended each station.


A total of 216 different raters

There were 216 different raters during the study period; 

    • Mean stations rated: the mean number of MMI stations that each evaluated was 104 (standard deviation [SD] 61.9; range 8–276). 
    • Women: women made up 61% of raters. 
    • Rater background: rater professional backgrounds were as follows: physicians, 31%; medical students, 15%; other clinicians (e.g., nurses), 11%; basic science faculty, 6%; patients, 2%; and various nonclinician leaders (e.g., deans), professionals (e.g., lawyers), and high-level administrative staff (e.g., curriculum manager), 35%. 


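Rater backgrounds were deliberately varied, on the view that diverse perspectives help select physicians who can work effectively with people from all walks of life. Mandatory rater training was a one-hour course covering the admissions process, rater roles and duties, and the need to avoid pursuing protected class issues (e.g., race/ethnicity, gender).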

The range of rater backgrounds reflected the conviction that diverse perspectives are helpful in selecting future physicians who will be able to work effectively with people from all walks of life. Mandatory rater training included a one-hour course reviewing the admissions process, rater roles and duties, and the need to avoid pursuing protected class issues (e.g., race/ethnicity, gender).36


Scoring at each station (four-point scale)

At each station, raters scored overall applicant performance using an anchored four-point scale: 

    • 0, < 25th percentile performance (relative to other applicants); 
    • 1, 25th–50th percentile; 
    • 2, 51st–75th percentile; or 
    • 3, > 75th percentile. 

Raters were also asked to consider both the applicant's communication ability and the content of their statements.

Raters were instructed to consider both the applicant’s communication abilities and the content (e.g., comprehensiveness) of their statements in assigning ratings. The total MMI score was the mean of each applicant’s individual station scores. Scale internal consistency (Cronbach alpha = 0.67) was comparable to that observed in other MMI studies.2,18,37–41
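As a rough illustration of the scoring arithmetic described above, the sketch below averages one applicant's station ratings into a total MMI score and computes Cronbach's alpha with the standard formula. The function names and the toy ratings are assumptions made for illustration; they are not the study's code or data.

```python
from statistics import mean, variance


def total_mmi_score(station_scores):
    """Total MMI score: the mean of one applicant's station ratings (each 0-3)."""
    return mean(station_scores)


def cronbach_alpha(score_matrix):
    """Cronbach's alpha for a list of applicants, each a list of k station scores.

    alpha = k / (k - 1) * (1 - sum of per-station variances / variance of row totals)
    """
    k = len(score_matrix[0])
    station_vars = [variance([row[j] for row in score_matrix]) for j in range(k)]
    total_var = variance([sum(row) for row in score_matrix])
    return k / (k - 1) * (1 - sum(station_vars) / total_var)


# Hypothetical ratings: 4 applicants x 3 stations (the study used 10 stations).
toy_ratings = [
    [3, 2, 3],
    [1, 1, 2],
    [2, 2, 2],
    [0, 1, 0],
]
toy_alpha = cronbach_alpha(toy_ratings)
```

Alpha approaches 1 when applicants who score high at one station also score high at the others, which is why a value of 0.67 across ten quite different stations is usually read as moderate rather than poor.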



입학 판정

Acceptance recommendation


Subsequently, the committee made one of the following recommendations: reject, low waitlist, high waitlist, or offer acceptance.


URM 상태

URM status


URM status was determined from AMCAS application information

We determined URM status (URM [black, Southeast Asian, Native American, or Pacific Islander race and/or Hispanic ethnicity] versus not [all other responses]) from self-reported race/ethnicity information in the AMCAS application.



SES 불이익

Socioeconomic disadvantage


A composite SES measure was developed from AMCAS application information

We developed a composite measure of SES using self-reported information in the AMCAS application,


The following information was used

The following predictors (yes/ no items except where indicated) were significant and maximized the area under the receiver operating characteristic curve (0.95):

    • fee assistance received for medical school application (yes/no); 
    • childhood spent in an underserved area; 
    • family recipients of family assistance program; 
    • income level category of applicant’s family (< $25,000; $25,000 to < $50,000; $50,000 to < $75,000; or > $75,000); 
    • applicant contributed to family income; 
    • any financial-need-based scholarship(s) in paying for postsecondary education; 
    • percentage of postsecondary education costs contributed by the family; and 
    • parents’ highest level of educational attainment (< high school, high school graduate, some college, or college graduate).
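The area under the ROC curve mentioned above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison. The scores and labels are hypothetical, and the binary reference label used to fit the composite is not specified in this excerpt, so the example shows only the metric, not the authors' model.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count as half).

    scores: predicted values, higher meaning more likely positive
    labels: 1 for positive cases, 0 for negative cases
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical composite scores for four applicants.
example_auc = roc_auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

An AUC of 0.5 means the score separates the two groups no better than chance, and 1.0 means perfect separation, so the reported 0.95 indicates that the eight predictors discriminate very well.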


Applicant characteristics



MMI invitation




MMI score



Acceptance recommendation





Discussion


URM applicants were no less likely than non-URM applicants to receive an MMI invitation, scored similarly on the MMI, and were just as likely to be recommended for acceptance.

Further, URM applicants were no less likely than non-URM applicants to receive an MMI invitation, performed similarly on the MMI, and were just as likely to be recommended for acceptance.


The similar MMI scores of URM and non-URM applicants suggest that a structured interview incorporating multiple raters' perspectives is less vulnerable to individual implicit bias. We did not measure raters' implicit biases, but the literature shows that such biases are pervasive in U.S. society, physicians and other professionals included, and that they affect employment interview outcomes in many fields, medicine among them. Implicit biases were therefore probably present among our raters, yet they had no significant net effect, since mean MMI scores did not differ between URM and non-URM applicants. Because the underrepresentation of URMs in medicine is widely acknowledged, biases against URM applicants may have been offset by favorable ratings from raters trying to address limited racial/ethnic diversity.

The similar MMI scores for URM and non-URM participants support the notion that structured interview processes that incorporate the perspectives of multiple evaluators like the MMI may be less vulnerable to the effects of individual evaluator implicit biases.20,25–27 Although we did not measure rater implicit biases regarding racial/ethnic minorities, such biases have been documented to be pervasive in U.S. society, including among physicians and other professionals,23,24 and can affect the outcomes of employment interviews in various fields including medicine.17,19–22 Thus, it is likely that implicit biases were present among our raters; however, they did not exert a significant net influence, given that mean MMI scores did not differ between URM and non-URM applicants. Because lack of URMs in medicine is a widely acknowledged problem,6,7,13,33,42–44 it is possible that biases against URM applicants were offset by ratings biased in favor of URM applicants, made by raters seeking to address limited racial/ethnic diversity in the physician workforce.


By contrast, lower SES applicants received lower MMI scores.

In this context, our finding that lower SES applicants had worse adjusted MMI performance may be cause for concern. 


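Even so, the effect of lower SES on MMI score was small: across the 0–1 range of the SES score, the MMI score fell by about 0.12 points. The lower MMI scores were also more than offset by a greater likelihood of MMI invitation and acceptance recommendation. These results may reflect the shift from purely metric-based applicant review toward the more holistic process advocated by the AAMC.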

Nonetheless, the decrement in MMI performance with decreasing SES in our study was small: The MMI score (scale of 0–3 points) declined by a mean of 0.12 points across the 0–1 range of the SES score. Further, the lower MMI scores among lower SES applicants were more than offset by their greater likelihood of being invited to an MMI and recommended for acceptance. These findings may reflect the ongoing shift from a purely metric-based applicant review process toward the more holistic process advocated by the Association of American Medical Colleges.12,15


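Lower SES applicants may have fewer life experiences that build the skills the MMI assesses. Similar reasoning has been offered to explain their lower MCAT scores. Less affluent applicants are more likely to have done paid work during postsecondary education, but those jobs probably did not involve MMI-type screening. Because prior experience with an interview format is associated with better later performance in that format, this lack of experience may be a disadvantage in the medical school MMI. Lower-level jobs also may not foster the higher-level communication, critical thinking, and problem-solving skills the MMI requires, and the time spent in such jobs may limit activities that do build them.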

Lower SES applicants may have fewer life experiences bolstering skills assessed by the MMI. Similar reasoning has been suggested to explain the lower MCAT scores among such applicants.45 Although less affluent applicants are more likely to report paid employment during postsecondary education, their financial circumstances may require taking jobs that do not require MMI-type preemployment screening. Lack of prior experience with MMI-type screening may be a disadvantage in the medical school MMI because prior experience with a particular interview format is associated with better future performance with that format.46 Lower-level jobs also may not facilitate the higher-level communication, critical thinking, and problem-solving skills the MMI assesses, and the time required for such jobs may limit participation in pursuits that build such skills (e.g., scholarly presentations, volunteer clinic work).


Prior work indicates that use of language unfamiliar to the typical rater can trigger a biased low rating. Applicants' verbal skills shape interviewers' immediate impressions and, in turn, their final appraisals. SES-based disparities in the physician workforce have received less attention than race/ethnicity-based disparities, so raters were less likely to have consciously favored lower SES applicants.

Prior work indicates that applicant factors such as use of language unfamiliar to the typical rater could trigger a biased low rating.20,21 Applicants’ verbal skills have been shown to determine immediate interviewer impressions and, in turn, final appraisals.49 The issue of SES-based physician workforce disparities has received less attention than race/ethnicity-based disparities.6 Thus, it is less likely that raters consciously biased their evaluations in favor of lower SES applicants to address SES-based physician workforce disparities.


34 Advanced Psychometrics for Transitions Inc. Welcome to ProFitHR. http://www.profithr.com/. Accessed April 4, 2015.




















Acad Med. 2015 Dec;90(12):1667-74. doi: 10.1097/ACM.0000000000000766.

How Medical School Applicant Race, Ethnicity, and Socioeconomic Status Relate to Multiple Mini-Interview-Based Admissions Outcomes: Findings From One Medical School.

Author information

  • A. Jerant is professor, Department of Family and Community Medicine, Center for Healthcare Policy and Research, University of California, Davis, School of Medicine, Sacramento, California. T. Fancher is associate professor, Division of General Internal Medicine, Department of Internal Medicine, University of California, Davis, School of Medicine, Sacramento, California. J.J. Fenton is associate professor, Department of Family and Community Medicine, Center for Healthcare Policy and Research, University of California, Davis, School of Medicine, Sacramento, California. K. Fiscella is professor, Department of Family Medicine, University of Rochester School of Medicine and Dentistry, Rochester, New York. F. Sousa is assistant dean, Admissions and Student Development, and volunteer clinical professor, Department of Internal Medicine, University of California, Davis, School of Medicine, Sacramento, California. P. Franks is professor, Department of Family and Community Medicine, Center for Healthcare Policy and Research, University of California, Davis, School of Medicine, Sacramento, California. M. Henderson is associate dean, Admissions and Outreach, and professor, Division of General Medicine, Department of Internal Medicine, University of California, Davis, School of Medicine, Sacramento, California.

Abstract

PURPOSE:

To examine associations of medical school applicant underrepresented minority (URM) status and socioeconomic status (SES) with Multiple Mini-Interview (MMI) invitation and performance and acceptance recommendation.

METHOD:

The authors conducted a correlational study of applicants submitting secondary applications to the University of California, Davis, School of Medicine, 2011-2013. URM applicants were black, Southeast Asian, Native American, Pacific Islander, and/or Hispanic. SES from eight application variables was modeled (0-1 score, higher score = lower SES). Regression analyses examined associations of URM status and SES with MMI invitation (yes/no), MMI score (mean of 10 station ratings, range 0-3), and admission committee recommendation (accept versus not), adjusting for age, sex, and academic performance.

RESULTS:

Of 7,964 secondary-application applicants, 19.7% were URM and 15.1% self-designated disadvantaged; 1,420 (17.8%) participated in the MMI and were evaluated for acceptance. URM status was not associated with MMI invitation (OR 1.14; 95% CI 0.98 to 1.33), MMI score (0.00-point difference, CI -0.08 to 0.08), or acceptance recommendation (OR 1.08; CI 0.69 to 1.68). Lower SES applicants were more likely to be invited to an MMI (OR 5.95; CI 4.76 to 7.44) and recommended for acceptance (OR 3.28; CI 1.79 to 6.00), but had lower MMI scores (-0.12 points, CI -0.23 to -0.01).

CONCLUSIONS:

MMI-based admissions did not disfavor URM applicants. Lower SES applicants had lower MMI scores but were more likely to be invited to an MMI and recommended for acceptance. Multischool collaborations should examine how MMI-based admissions affect URM and lower SES applicants.
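The results in this abstract can be read directly from the confidence intervals: an odds ratio is statistically distinguishable from no association when its 95% CI excludes 1, and a mean difference when its CI excludes 0. The sketch below applies that rule to the intervals quoted above. The helper names are illustrative assumptions, and the standard-error back-calculation assumes a symmetric 95% interval on the log-odds scale.

```python
import math


def excludes_null(ci_low, ci_high, null_value):
    """True when the confidence interval does not contain the null value."""
    return not (ci_low <= null_value <= ci_high)


def log_or_standard_error(ci_low, ci_high, z=1.96):
    """Approximate SE of log(OR), assuming a symmetric 95% CI on the log scale."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * z)


# Intervals quoted in the abstract (odds ratios use null = 1, differences null = 0).
urm_invitation = excludes_null(0.98, 1.33, 1.0)      # URM status and MMI invitation
ses_invitation = excludes_null(4.76, 7.44, 1.0)      # lower SES and MMI invitation
ses_mmi_score = excludes_null(-0.23, -0.01, 0.0)     # lower SES and MMI score
urm_mmi_score = excludes_null(-0.08, 0.08, 0.0)      # URM status and MMI score
```

Read this way, the SES effect on MMI score is detectable but sits close to the null boundary (upper limit -0.01), which matches the authors' description of it as small.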

PMID: 26017355


MMI에서 면접관의 특성과 평가 점수의 관계(Acad Med, 2004)

The Relationship between Interviewers’ Characteristics and Ratings Assigned during a Multiple Mini-Interview

Kevin W. Eva, PhD, Harold I. Reiter, MD, MSc, Jack Rosenfeld, PhD, and Geoffrey R. Norman, PhD






The MMI yields a reliable estimate of candidate performance, but attention must be paid to biases that may arise from the different vantage points of heterogeneous raters.

This Multiple Mini-Interview (MMI) has been shown to provide a reliable estimate of candidates’ performance,1 but the new protocol demands that attention be paid to the biases that might arise as a result of the different vantage points held by heterogeneous raters.



배경

Background


The problem is content specificity: selection decisions seek stable qualities, yet average performance across many encounters says more about an individual than any single encounter does.

The problem is one of content specificity. In making selection decisions, as indicated by Albanese et al. “one is most interested in stable qualities that have a high probability of occurrence in an almost infinite number of different situations.”2,p.317 Although debate exists regarding whether such “stable qualities” exist, it has become clear in various contexts that the average performance an individual displays over the course of many encounters is a more generalizable indication of that individual’s qualities than is any single encounter.5


MMI

The Multiple Mini-Interview


The MMI is essentially an OSCE for admissions, but we changed the name because the judgments are not objective and the stations are deliberately nonclinical.

Although essentially an admissions OSCE, we have opted to change the name of the protocol to make explicit the facts that the judgments are not objective and the stations are intentionally nonclinical.


This process should reflect the educational philosophy of the institution the admissions committee serves, as well as broader documents describing the core competencies of practicing physicians. Reiter and Eva have developed a process for doing so.

This process should be informed by the educational philosophy adopted by the institution in which the admissions committee works as well as broader documents that outline the key competencies of practicing physicians.6,7 A process for doing so has been developed by Reiter and Eva.8


Previous research shows that the MMI assesses candidates' abilities reliably, that overall reliability improves more by adding stations than by adding raters per station, and that candidates and examiners both view it positively. The open question is whether faculty and nonfaculty raters differ in their ratings. Heterogeneity has always been a fundamental principle at McMaster, on the belief that breadth of experience across students enriches the scholastic experience. To maximize student heterogeneity, interviewers have been drawn from varied populations, including faculty, medical students, and community members. Because we propose a single interviewer per station, it matters whether faculty and community members assign consistent ratings.

Previous research has shown that the MMI provides a reliable assessment of candidates’ abilities, that the overall test reliability improves to a greater extent by maximizing the number of stations rather than by maximizing the number of observers per station, and that the MMI is viewed positively by both candidates and examiners alike.1 Remaining unanswered, however, is the question of whether faculty members and nonfaculty members are distinguishable by their ratings. At McMaster, heterogeneity has always been a fundamental principle because it is believed that breadth of experiences across students enriches the scholastic experience.9 To try to maximize heterogeneity across students, interviewers have traditionally been drawn from various populations, including faculty members, medical students, and individuals from the community at large. As we propose assigning a single interviewer to each station, the question of whether faculty members and individuals from the community assign performance ratings consistent with one another becomes an increasingly important question.



방법

METHOD


참가자

Participants


In addition, 18 health sciences faculty members and 18 community members drawn from the legal profession and human resource departments of both local businesses and the university were recruited to act as examiners. In two instances, faculty members had to withdraw; they were replaced with current medical students.


절차

Procedure


On the study weekend, three sessions were run sequentially on each of two days with a 40-minute break for the examiners between sessions. Two examiners were assigned to each station. 

    • Faculty only: three of the nine stations were staffed by two faculty members, 
    • Community only: three by two community members, and 
    • Mixed: three by one member of each group. 

Before the first MMI on each day the authors of this article met with the examiners to ensure that the procedure was clear, to answer any last-minute queries, and to reinforce that the ratings should be assigned independently.



결과

RESULTS


점수

Scores

Internal consistency was high, so only the overall performance score was used.

Table 1 shows the average score and standard deviation assigned to candidates for each of the four items on the evaluation form. The internal consistency (i.e., the average relationship between pairs of questions) was found to equal .96, indicating a high degree of redundancy. As a result, only the “overall performance” score was used in subsequent analyses.



To determine whether the ratings faculty members assigned were biased relative to those community members assigned, a repeated measures ANOVA was performed on the data collected within the three stations that were staffed by both a community and a faculty member. The mean score assigned by faculty members (4.66) bordered on being significantly less than that assigned by community members (4.96; F(1,53) = 3.972, mean squared error = 1.790, p < .06).




신뢰도 분석

Reliability Analysis



평가자의 특성과 평가 점수와의 관계

The Relationship between Interviewers’ Characteristics and Ratings


Generalizability was highest, at .58, for stations with two community members, .46 for stations with two faculty members, and .31 for stations with one of each. Every pairwise difference was statistically significant.

The generalizability for the three stations that were staffed by two community members was highest at .58. The three stations that were staffed by two faculty members revealed the second highest generalizability (.46). Least reliable were the three stations that were staffed by one member of each group (generalizability = .31). Each pairwise difference is statistically significant: .58 versus .46, z(106) = 2.78, p < .05; .46 versus .31, z(106) = 3.12, p < .05; .58 versus .31, z(106) = 5.90, p < .05.


Either way, generalizability was lowest at stations rated by one community member and one faculty member, which suggests larger inconsistencies between the two groups than within either.

In either case, the generalizability of the MMI appears to be lowest among stations evaluated by one community member and one faculty member, suggesting that there are larger inconsistencies in the way that community members rate candidates relative to the way that faculty members rate candidates than there are within either group of raters.



Post-MMI Surveys








DISCUSSION


These findings show that adequate interrater reliability alone is not sufficient evidence that an interview measures stable, generalizable applicant characteristics. Instead, applicants vary considerably and unpredictably from one interview to the next, so the result of one interview predicts the next poorly.

These findings suggest that the demonstration of adequate interrater reliability, which has been used in the past as an argument for standardized interviews, is insufficient evidence to ensure that an interview is measuring stable and generalizable applicant characteristics. By contrast, the findings suggest that applicants will vary considerably, in unpredictable fashion, from one interview to another. Consequently, the scores derived from any one interview will be a poor predictor of performance in a second interview.


At the very least, these results support the claim by Ferrier et al. that heterogeneous raters may produce a more heterogeneous class. The difference in mean scores between faculty and community raters might be overcome with more rater training, but the absolute difference will not matter as long as every circuit contains an equal proportion of examiners from each group.

At the very least these results support Ferrier et al.’s9 claim that using heterogeneous raters may result in a more heterogeneous class. The difference we observed in the mean scores faculty and community raters provide may be overcome with further training, but the absolute difference in scores will not matter as long as all circuits contain an equal proportion of examiners from each group. It should be noted that the distinction drawn in this study between raters of different backgrounds is very broad.


A further advantage of the MMI is that it can serve the four purposes of admissions interviews identified by Edwards et al. (information gathering, decision making, verification, and recruitment) without confounding them within a single interview. It also corrects the inefficient use of time noted in traditional interviews.

Additional advantages to the MMI include the potential to achieve the four purposes of admissions interviews identified by Edwards et al.4 (i.e., information gathering, decision making, verification, and recruitment) without confounding these purposes within a single interview (e.g., one station could be designed as a recruitment station without the goal of attracting the best candidates affecting the rest of the interview process). The MMI also corrects for the inefficient use of time that has been identified by Litton-Hawes et al.12 as a problem in more traditional interviews.


"깐깐한" 혹은 "널럴한" 면접관에게 배정될 가능성이 무작위였지만 더 많은 수의 평가자에 의해 평가되면 이 효과는 사라질 것이다.

Similarly, any chance effects of being randomly assigned to an “easy” or “hard” panel of interviewers will be diluted with the MMI as candidates are exposed to a greater number of examiners.


Also of interest: community members' ratings agreed less with faculty members' ratings than ratings did within either group.

Of further interest is the finding that community members’ ratings were less consistent with those provided by faculty members than were the ratings provided within either group.




8. Reiter HI, Eva KW. Reflecting the relative values of community, faculty, and students in the admissions tools of medical school. Submitted manuscript.


Background: In defining the characteristics of medical students that society and the medical profession find desirable, little effort has been spent assessing the relative value of the dozens of characteristics that have been identified. Furthermore, many institutions go to great lengths to ensure equal representation across stakeholder groups in an effort to maximize the heterogeneity of the pool of students accepted to study medicine; however, the extent to which different stakeholders value different characteristics has yet to be determined. 


Purpose: This study was an attempt to assess the relative value of the characteristics of medical students that society and the medical profession find desirable. 


Methods: Using documents created internationally to identify the core competencies of medical personnel, a series of 7 characteristics were generated for inclusion in a study that adopted the paired comparison technique. Of 347 surveyed, 292 respondents indicated the rank ordering they would assign to each characteristic by circling the more important characteristic in all possible pairings. 


Results: Overwhelmingly, “ethical” was deemed to be the most important characteristic on which selection tools should be based. Surprisingly, the pattern of responses was highly consistent regardless of stakeholder group and degree of affiliation with the undergraduate medical program. 


Conclusions: The generalizable features of this study not only include the empirical findings but also demonstrate useful survey protocol that can be adapted by any admission committee to guide the generation of an institution-specific admissions blueprint. A novel protocol that provides the necessary flexibility is discussed.














Acad Med. 2004 Jun;79(6):602-9.

The relationship between interviewers' characteristics and ratings assigned during a multiple mini-interview.

Author information

  • Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada. evakw@mcmaster.ca

Abstract

PURPOSE:

To assess the consistency of ratings assigned by health sciences faculty members relative to community members during an innovative admissions protocol called the Multiple Mini-Interview (MMI).

METHOD:

A nine-station MMI was created and 54 candidates to an undergraduate MD program participated in the exercise in Spring 2003. Three stations were staffed with a pair of faculty members, three with a pair of community members, and three with one member of each group. Raters completed a four-item evaluation form. All participants completed post-MMI questionnaires. Generalizability Theory was used to examine the consistency of the ratings provided within each of these three subgroups.

RESULTS:

The overall test reliability was found to be .78 and a Decision Study suggested that admissions committees should distribute their resources by increasing the number of interviews to which candidates are exposed rather than increasing the number of interviewers within each interview. Divergence of ratings was greater within the pairing of community member to faculty member and least for pairings of community members. Participants responded positively to the MMI.

CONCLUSION:

The MMI provides a reliable protocol for assessing the personal qualities of candidates by accounting for context specificity with a multiple sampling approach. Increasing the heterogeneity of interviewers may increase the heterogeneity of the accepted group of candidates. Further work will determine the extent to which different groups of raters provide equally valid (albeit different) judgments.

PMID: 15165983


MMI 시험 특성: 지원자에게 상상하길 요구하기보다는 회상하길 요구하라(Med Educ, 2014)

Multiple mini-interview test characteristics: ‘tis better to ask candidates to recall than to imagine

Kevin W Eva1 & Catherine Macala2





By definition, the MMI gathers information about a candidate through a series of independent observations (usually interviews), blueprinted against the goals and desires of the selecting institution and the profession the candidate is entering. It should therefore be seen as an assessment process rather than a tool or instrument, and a question such as "For what does the MMI select?" is meaningless because the answer depends entirely on implementation.

By definition, it involves collecting (and aggregating across) a series of brief independent observations of the candidate (typically in the form of interviews), preferably blueprinted against the goals and desires of both the institution making the selection and the profession to which the candidate is applying. As a result, the MMI should be considered a process of assessment rather than a tool or instrument, and generic questions such as ‘For what does the MMI select?’ are meaningless because the answer is entirely dependent on implementation.


Just as an MCQ examination can be built to represent diverse content areas, an MMI can be built from highly variable stations.

Just as one can populate a multiple-choice question (MCQ) examination with questions representative of diverse content areas, one can populate an MMI with highly variable stations.


Prior research has identified general principles: reliability rises with the number of observations and plateaus at about 10 to 12 stations, longer stations add little, and observing performance across several independent situations helps more than adding raters within each situation.

Research has identified general principles, including that the reliability of measurement improves with increasing number of observations, often reaching a plateau in the 10–12 range,2 that extending the length of the interactions has little discernible benefit,3 and that observing performance across independent situations has a greater beneficial impact on the reliability of measurement than does incorporating the opinions of multiple raters within each situation.4,5
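The stations-versus-raters principle follows from how error variance is averaged in a generalizability (decision) study. The sketch below assumes a design in which raters are nested within stations and crossed with candidates, and uses made-up variance components chosen only to show the pattern; none of the numbers come from the studies cited.

```python
def g_coefficient(var_person, var_person_station, var_residual, n_stations, n_raters):
    """Generalizability coefficient for candidates x (raters nested in stations).

    Candidate-by-station variance is averaged over stations only, while the
    residual (rater within station, plus error) is averaged over every rating.
    """
    error = var_person_station / n_stations + var_residual / (n_stations * n_raters)
    return var_person / (var_person + error)


# Hypothetical variance components (illustrative assumptions, not study estimates).
VAR_PERSON = 1.0
VAR_PERSON_STATION = 3.0
VAR_RESIDUAL = 2.0

# Same total of 10 ratings per candidate, allocated two different ways.
five_stations_two_raters = g_coefficient(VAR_PERSON, VAR_PERSON_STATION, VAR_RESIDUAL, 5, 2)
ten_stations_one_rater = g_coefficient(VAR_PERSON, VAR_PERSON_STATION, VAR_RESIDUAL, 10, 1)
```

With these assumed components, ten single-rater stations give a coefficient of about 0.67 against about 0.56 for five two-rater stations, because extra raters cannot reduce the candidate-by-station term. The gain from each added station also shrinks as the count grows, which is consistent with the plateau described above.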



배경

Background


The MMI process rests on two foundations: sampling and structure.

The MMI process was largely designed on two foundations: sampling and structure.


The emphasis on sampling stems from concerns about trait-based models of human behaviour. The words used to describe people (smart, eloquent, professional) imply fixed traits, but actual behaviour is highly context-specific.

The priority placed on sampling is drawn from empirically derived concerns about trait-based models of human behaviour.6 Whereas the adjectives we use to describe people (e.g. ‘smart’, ‘eloquent’, ‘professional’) imply unwavering features of the individual, behaviour has been shown repeatedly to be context-specific.7


A single observation tells us no more about a person's clinical ability than a single MCQ answer tells us about that person's knowledge.

One observation tells us no more about an individual’s clinical prowess than one MCQ answer tells us about the extent of an individual’s knowledge base.


Against the argument that an 8-minute interview cannot show every side of a candidate's ability, we regard this logistical necessity as a strength rather than a weakness. Studies show that the added value of longer interviews is illusory, because examiners form impressions very quickly. More time also gives the applicant more opportunity to steer the conversation away from the intended focus.

Contrary to the argument that 8-minute selection interviews do not allow sufficient time to yield a full perspective on a candidate’s ability, we view this logistic necessity as a strength rather than a liability. A variety of studies have demonstrated that the added value of longer interviews is illusory as examiners tend to form impressions very quickly.9,10 Further, more time yields greater opportunity for the applicant to sway the conversation to issues that are distinct from the intended focus of the interview.11


9 Ambady N, Bernieri F, Richeson J. Toward a histology of social behaviour: judgmental accuracy from thin slices of the behavioural stream. Adv Exp Soc Psychol 2000;32:201–72.

10 Ambady N, Rosenthal R. Thin slices of expressive behaviour as predictors of interpersonal consequences: a meta-analysis. Psychol Bull 1992;111:256–74.


The value of the second foundation, structure, is less clear. When the MMI was created, the literature held that the inter-rater reliability of panel-based interviews varied widely but tended to improve when interviews were structured with a specific set of questions. That is intuitively appealing, but recent research calls it into question. Kreiter et al. pointed out that earlier studies offered only indirect comparisons. Using generalisability analyses of 25-minute medical school selection interviews containing five structured questions, they showed that the variance attributable to question was negligible. They concluded that asking multiple questions (that is, more sampling) washes out differences in question difficulty, so structuring the questions offers no advantage. A few years later an unstructured component was added to the structured interview at the same institution, and Axelson et al. reported that its scores had greater inter-rater and test–retest reliability than the structured component. The conclusion remains ambiguous.

The value of the second foundation, structure, however, has become less clear over time. When the MMI was created, the literature on panel-based interviewing practices revealed that the inter-rater reliability of such exercises was highly variable, but tended to be greater when interviews were structured by giving interviewers a specific set of questions.16 This remains intuitively appealing, but recent research has led us to question this assumption. Kreiter et al.17 critiqued the literature for offering only indirect comparisons. Using data collected from a set of 25-minute medical school selection interviews containing five structured questions, they used generalisability analyses to illustrate that the variance attributable to question had a negligible influence on the reliability observed. These findings led the authors to argue that asking multiple questions (i.e. increased sampling) washes out differences in difficulty level across questions such that structuring questions offers no advantage. A few years later, an unstructured component was added to the end of the structured interview at the same institution, and Axelson et al.18 reported that resulting scores had greater inter-rater and test–retest reliability than the structured component. As the authors noted, it is unclear whether the performance of the unstructured interview derived from the fact that it followed the structured interview or whether the benefit of such structuring is illusory.


This question matters because creating structured stations is one of the main barriers to adopting the MMI process. Concern about test security has led most institutions to build or buy a database of stations (although the impact of security breaches remains uncertain). If the benefits of the MMI turn out to be unrelated to structure and to come mainly from sampling, the cost of creating an MMI could fall substantially.

This is an important question because the creation of structured stations is one of the primary barriers to adoption of the MMI process.19 Concern about test security breaches derived from the repeated use of set questions has led most institutions we have encountered to generate or purchase a database of stations to reduce this risk (although the impact of such breaches remains questionable20,21). If the benefits that have been observed to accrue from the adoption of MMI practices are unrelated to structure and, instead, are derived dominantly from the sampling it promotes, then the cost inherent in creating an MMI might be substantially reduced.


The most common type of MMI station has the candidate discuss a relevant issue with an examiner. What counts as 'relevant' depends on the institution's blueprint, and published examples mostly present a dilemma tied to situations the candidate is likely to face. In the organisational and industrial psychology literature, such interview dialogues are either experience-based (recalling past experience) or situation-based (imagining a situation one might encounter). Which type is more effective has been much debated.

The most common type of MMI station involves asking a candidate to discuss an issue of relevance with an examiner. The definition of ‘relevance’ depends on the blueprint the institution establishes, but published examples indicate a tendency towards describing a dilemma about which the candidate is expected to engage in dialogue. The organisational and industrial psychology literature defines such dialogues as generally being ‘experience-based’ (i.e. candidates are required to recall their particular experiences and the behaviours they demonstrated) or ‘situation-based’ (i.e. candidates are required to imagine and describe what they would do if they were to encounter a particular situation).22 There has been considerable debate in this literature regarding which type of interview is most effective.


Proponents of situation-based interviewing argue that interviews should be future-oriented, so that candidates without comparable past experience still get the chance to show their personal qualities in the given situation.

Proponents of experience-based interviewing, by contrast, argue that past behaviour is the most accurate predictor of future behaviour, and that interviews should avoid hypotheticals and focus on past experience.


Impression management (managing how one comes across) takes different forms depending on the interview type: situation-based interviews tend to elicit ingratiation (inducing liking, opinion conformity), whereas experience-based interviews tend to elicit self-promotion (attributing one's success to one's own competence rather than to other factors).

Those who favour situation-based interviewing argue that structure is important and that interviews should be future-oriented so that interviewees without previous experience in a given context are granted the opportunity to demonstrate their personal qualities; those who favour experience-based interviewing argue that past behaviour is most predictive of future behaviour and, as a result, one should avoid discussion of the hypothetical and focus on previous experience.20 Impression management (i.e. attempts to control the image one projects) appears to take place in different ways according to interview type, with situation-based interviews tending to induce ingratiating tactics (i.e. behaviours aimed at inducing liking, such as opinion conformity) and experience-based interviewing tending to induce self-promotion (i.e. behaviours aimed at indicating that one’s success is attributable to competence rather than other factors).11



Participants


Four circuits, 12 stations, 48 examiners.

Four distinct circuits of 12 stations required the participation of 48 examiners.



Materials


All stations were based on the CanMEDS framework.

All stations were focused upon the Professional role promoted within the CanMEDS framework presented by the Royal College of Physicians and Surgeons of Canada.25


The four SJ stations asked candidates to imagine a situation that could arise during later training and to say what they would do.

Four SJ stations were designed around this role, the operational definition being that the station had to present a situation that could plausibly occur during medical training and would require the candidate to imagine and discuss what he or she would do in that situation.


Four examiners per station (one per circuit); instructions posted on the door; a one-page description of the station's intent; up to six prompting questions per station. Examiners were to hold a genuine conversation rather than read the questions as a script; the questions were only there to keep the conversation going. Background on CanMEDS. Scoresheet: three qualities rated on 6-point scales: (i) communication skills, (ii) reasoning ability, and (iii) professionalism.

This information was provided to the four examiners who were assigned to that station (one per circuit) and posted on the doors of their rooms for candidates to read. In addition, examiners were given one page of information outlining both the intent of the station and a list of up to six questions they could ask the candidate. They were told that they should engage in actual dialogue with candidates rather than treating the list of questions as a script (i.e. the questions were presented simply as prompts that examiners might find useful if conversation stalled). Examiners were also given a page of background information outlining aspects of the CanMEDS competencies that were relevant to the situation described, along with a copy of the scoresheet on which they were to offer their assessment. None of the background information or prompting questions contained content that was specific to the instructions given to candidates and thus the same information could be given to examiners in other experimental conditions. The scoresheet consisted of a series of 6-point scales (1 = weak, 2 = below average, 3 = average, 4 = very good, 5 = excellent, 6 = exceptional) on which examiners were asked to rate each candidate’s (i) communication skills, (ii) reasoning ability, and (iii) professionalism. Brief definitions were provided for each quality.


The SJ stations were slightly modified to create the four BI stations.

To generate the four BI stations, each of the SJ stations was modified so that the candidate was instructed to think of a time in which he or she had experienced a situation analogous to the scenario presented in the SJ station.


All other information was identical to that for the SJ stations.

All other information provided to the examiners on these stations was identical to that provided to the SJ station interviewers with the exception of minor wording revisions to ensure that the grammar remained appropriate.


For the FF stations, examiners were told to hold a conversation that would let them assess the candidate's suitability.

To generate the four FF stations, examiners were told simply that we wanted them to conduct a conversation that would help them evaluate the candidate’s suitability for the Professional role. They were given the same background information as used in other stations, but the prompting questions were removed. The station instruction, as presented to candidates, said simply:



Procedure


Candidates were randomly assigned.

Candidates were randomly assigned to a circuit and a starting station.


Two minutes to read the instructions; the interview ends after 7 minutes and the candidate moves to the next room. The 3 minutes between stations are split into 1 minute for a candidate survey and 2 minutes to read the next station's instructions.

At the start of the MMI, candidates were given 2 minutes to read the first station, after which a buzzer was sounded to alert them to enter the interviewing rooms. Seven minutes later, another buzzer was sounded to indicate that the interview was complete and that the candidate should move to the next station. From this point onward, a pause of 3 minutes was provided between stations and candidates were asked to spend 1 minute completing a candidate survey about the preceding station and 2 minutes reading and preparing for the next station.
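
The timings above imply a total circuit length, which can be worked out as a small sketch. The function name and the assumption that no 3-minute pause follows the final station are illustrative choices made here, not details taken from the paper.

```python
def mmi_duration_minutes(n_stations, read_first=2, interview=7, pause=3):
    """Elapsed minutes for one candidate to complete a circuit.

    Assumption (not stated in the source): there is no pause after
    the final station, so only n_stations - 1 pauses are counted.
    """
    if n_stations < 1:
        raise ValueError("n_stations must be at least 1")
    return read_first + n_stations * interview + (n_stations - 1) * pause


if __name__ == "__main__":
    # 2 min initial reading + 12 interviews of 7 min + 11 pauses of 3 min
    print(mmi_duration_minutes(12))
```

Under these assumptions a 12-station circuit takes just under two hours per candidate.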



Analysis


Context specificity shows up as the Applicant x Station interaction. The study design makes it hard to separate out rater effects, so a pure test of context specificity is not possible. This design choice rests on three reasons.

Context specificity is generally indicated by a large Applicant X Station interaction. The design of this study did not allow us the capacity to separate rater influences from station influences and therefore a pure test of context specificity is not available. This design decision was based on three reasons: 

  • Rater effects are present in every experimental condition.
    (i) rater effects are likely to be present in all experimental conditions; 
  • One examiner per station is common practice in MMIs and OSCEs.
    (ii) the inclusion of one examiner per station is common practice in MMIs, objective structured clinical examinations (OSCEs) and other comparable assessment activities, and 
  • Previous studies show that rater variance contributes little compared with station variance.
    (iii) previous work has robustly indicated that rater variance tends to contribute little error relative to station variance.4,5



RESULTS


Reliability


The Applicant x Station error was largest, followed by the residual error and then Applicant.

Table 1 reveals that the dominant source of variance in all cases was the Applicant X Station interaction. The residual error (Item X Station X Applicant [Circuit]) was next most dominant, followed by Applicant differences, which accounted for 10.0–18.7% of the variance.


Variance attributable to Applicant ran BI > SJ > FF, indicating that BI stations discriminated best between candidates. The main effects of Station, Item and Circuit, and their interactions, were negligible.

The variance attributable to Applicant declined from BI to SJ and then to FF stations, suggesting that BI stations offered better capacity to consistently discriminate between applicants relative to the other forms of interview. The main effects of Station, Item and Circuit, and their interactions, were negligible, generally contributing < 3% of the variance in scores.


Inter-station reliability, which concerns whether ratings are consistent across stations, was highest for BI.

Inter-station reliability, reflecting the extent to which the scores assigned are consistent across stations, suggested that BI stations allowed better measurement than SJ or FF stations.
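
To make the link between variance components and inter-station reliability concrete, here is a minimal sketch of a relative G coefficient for a simplified design in which applicants are crossed with stations and each station carries several rating items. The variance values are hypothetical numbers chosen only to respect the ordering reported above (Applicant X Station largest, then residual, then Applicant); they are not the study's estimates, and the function is not the authors' analysis code.

```python
def g_coefficient(var_applicant, var_applicant_station, var_residual,
                  n_stations, n_items):
    """Relative G coefficient for the mean over n_stations x n_items ratings.

    Error variance shrinks as more stations (and items) are averaged.
    """
    error = (var_applicant_station / n_stations
             + var_residual / (n_stations * n_items))
    return var_applicant / (var_applicant + error)


if __name__ == "__main__":
    # Hypothetical variance proportions (illustration only).
    var_a, var_as, var_res = 0.15, 0.50, 0.30
    for n_stations in (1, 4, 12):
        g = g_coefficient(var_a, var_as, var_res, n_stations, n_items=3)
        print(n_stations, round(g, 2))
```

The sketch shows why sampling matters: with the same single-station variance components, reliability climbs steadily as more stations are aggregated.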




Relationship to the actual admission MMI

SJ, r = 0.45; BI, r = 0.57, and FF, r = 0.42.

The correlations between the average of the four stations within each station type and the average of the 9-station MMI used for the actual admission decision were: SJ, r = 0.45; BI, r = 0.57, and FF, r = 0.42.



Acceptability


Candidates found the FF stations more difficult and more anxiety-provoking.

In general, candidates considered the FF stations to be more challenging and more anxiety-provoking than either the SJ or BI stations (Table 4). 


Examiners' perceptions differed little across station types.

In general, examiners’ perceptions of their ability to assess candidate performance and the amount of strain MMI stations placed on candidates were insensitive to station type, although BI stations were rated relatively low on one question (Table 5).



DISCUSSION


Because the different aspects of utility used to judge the quality of an assessment process often do not align (e.g. raising reliability tends to reduce feasibility), compromises are usually required. We were surprised that the various outcomes gave consistent conclusions, both internally and relative to validity studies. Moderate reliability can be reached simply by aggregating many observations (G = 0.66 for FF), but structuring stations brought gains in acceptability and in reliability, the latter only for BI. Even if SJ were judged equal to BI in reliability, the greater feasibility (ease of creation) and equivalent acceptability of BI stations favour their use.

Given that the various aspects of utility used to assess the quality of assessment processes commonly do not align (e.g. increasing reliability tends to decrease feasibility), thereby requiring that compromises are made,14 we were surprised by the extent to which the various outcomes considered yielded consistent conclusions both internally and with respect to validity studies that have been conducted in other domains of selection. Although moderate reliability can be achieved simply by aggregating across many observations (G = 0.66 in the FF condition), there did appear to be some benefit from the structuring of stations in terms of both acceptability and reliability, the latter being true only when BI techniques were used (G = 0.77). Even if SJ stations were to be considered equal to BI stations in terms of their reliability, the greater feasibility (i.e. ease of generation) and equivalent acceptability of BI stations would support the prioritising of their use.


Speculatively, using BI stations, which make candidates reflect on their own experiences, may also address an early criticism of the MMI: that candidates have no opportunity to present autobiographical details.

Speculatively, the use of BI stations, which require candidates to reflect on and discuss personal experiences they have had, may also help MMI administrators to address one of the more robust early criticisms of the MMI process, which claims that candidates desire an opportunity to present autobiographical details during their interview.1







17 Kreiter CD, Solow C, Brennan RL, Yin P, Ferguson K, Huebner K. Examining the influence of using same versus different questions on the reliability of the medical school preadmission interview. Teach Learn Med 2006;18 (1):4–8.


18 Axelson R, Kreiter C, Ferguson K, Solow C, Huebner K. Medical school preadmission interviews: are structured interviews more reliable than unstructured interviews? Teach Learn Med 2010;22 (4):241–5.


20 Reiter HI, Salvatori P, Rosenfeld J, Trinh K, Eva KW. The effect of defined violations of test security on admissions outcomes using multiple mini-interviews. Med Educ 2006;40:36–42. 


21 Griffin B, Harding DW, Wilson IG, Yeomans ND. Does practice make perfect? The effect of coaching and retesting on selection tests used for admission to an Australian medical school. Med J Aust 2008;189:270–3.


23 Taylor PJ, Small B. Asking applicants what they would do versus what they did do: a meta-analytic comparison of situational and past behaviour employment interview questions. J Occup Organ Psychol 2002;75 (3):277–94.


24 Klehe U-C, Latham G. What would you do – really or ideally? Constructs underlying the behaviour description interview and the situation interview in predicting typical versus maximum performance. Hum Perform 2006;19:357–82.


















 2014 Jun;48(6):604-13. doi: 10.1111/medu.12402.

Multiple mini-interview test characteristics: 'tis better to ask candidates to recall than to imagine.

Author information

  • 1Centre for Health Education Scholarship, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada.

Abstract

CONTEXT:

The multiple mini-interview (MMI), used to facilitate the selection of applicants in health professional programmes, has been shown to be capable of generating reliable data predictive of success. It is a process rather than a single instrument and therefore its psychometric properties can be expected to vary according to the stations generated, the alignment between the stations and the qualities an institution prioritises, and the outcomes used. The purpose of this study was to explore the MMI's test characteristics when station type is manipulated.

METHODS:

A 12-station MMI was established in which four stations were presented in three different ways. These included: situational judgement (SJ) stations, in which applicants were asked to imagine what they would do in specific situations; behavioural interview (BI) stations, in which applicants were asked to recall what they did in experienced situations, and free form (FF) stations, which were unstructured in that the examiner was simply given a brief explanation of the intent of the station without further guidance on how to conduct the discussion. Four circuits of the 12 stations were run with one examiner within each station. Candidates and examiners were surveyed regarding their experience. The reliability of the scores derived from the assessment was analysed separately for each station type.

RESULTS:

A total of 41 medical school candidates participated after completing the regular admission process. Although the score assigned did not differ across station type, BI stations more reliably differentiated between candidates (g = 0.77) than did the other station types (SJ, g = 0.69; FF, g = 0.66). The correlation between actual MMI scores and BI stations was also greatest (BI, r = 0.57; SJ, r = 0.45; FF, r = 0.42). Candidates' opinions indicated that FF stations were more anxiety-provoking, less clear, and more difficult than structured stations (SJ and BI stations). Examiner opinions indicated equivalence on these measures.

CONCLUSIONS:

The results suggest that structuring stations has value, although that value was gained only through the use of BI stations, in which candidates were asked to recall and discuss a specific experience of relevance to the purpose of the interview station.

© 2014 John Wiley & Sons Ltd.

PMID: 24807436 [PubMed - indexed for MEDLINE]


Should candidate scores be adjusted for interviewer stringency or leniency in the multiple mini-interview? (Med Educ, 2010)

Chris Roberts,1 Imogene Rothnie,2 Nathan Zoanetti3 & Jim Crossley4






Theoretical framework for interviewer performance


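There are three broad kinds of interviewer-related error: (1) stringency/leniency, (2) interviewer subjectivity (candidate-specific and question-specific), and (3) interactions.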

There are broadly three areas of interviewer-related error within the MMI,1,4,8 which are expanded upon in Fig. 1.



However, because of the complexity of the assessment design, no MMI dataset has yet allowed precise estimation of each first-order effect and of the second-order effects (interactions). This is essentially because, in large-scale interviewing plans, interviewers are nested in questions. Current estimates put candidate-to-candidate variance at 22% to 25%. MMI question difficulty accounts for 0–3%, and among interviewer-related factors, stringency/leniency accounts for 14%.

However, because of the designs inherent in complex assessment procedures,6 no set of MMI data has thus far allowed for precise estimates of each first-order effect and their second-order interactions using G theory. This is because of confounding within the naturalistic large-scale interviewing plan, in which interviewers are usually nested in MMI questions. Current estimates suggest candidate-to-candidate variance ranges from 22%4 to 25%.1 MMI question difficulty variance is in the range of 0–3%.1,4 Of the interviewer-related factors, interviewer stringency/leniency accounts for 14% of error,4


Interviewer candidate-specific subjectivity has been estimated at up to about 45%.

Variance reflecting interviewer candidate-specific subjectivity has been estimated to be as high as 45% in a study of assessments which used two interviewers within each station.8


Kumar et al. have offered preliminary insight into the tensions interviewers experience when making judgements in the MMI.

Kumar et al.9 have provided some preliminary insights into the tensions that arise in the process of making such decisions. 

  • The value of independent decision making versus consensus on the standard expected of entry-level students
    These highlight, firstly, the contrast between appreciation of independent decision making and the need to achieve a consensus around the standards expected of entry-level students. 
  • Whether interviewers feel they are assessing entry-level reasoning skills as opposed to communication skills
    The second source of tension concerns the extent to which interviewers may feel they are assessing entry-level reasoning skills in professionalism domains compared with communications skills. 
  • How can interviewers overcome their subjectivity towards candidates?
    The third source relates to how interviewers overcome their subjectivity towards certain candidates and 
  • How should they handle their concerns about 'failing' candidates?
    the fourth to how they handle their concerns over ‘failing’ candidates. 
  • Candidates actively use their interaction with interviewers to elicit a favourable judgement, independently of the quality of their answers.
    Finally, candidates are actively interacting with interviewers using their impression management skills to promote a favourable decision for themselves, which is not necessarily related to the quality of their answers.9



Methodological approaches


Use of IRT

Researchers have turned to item response theory (IRT)11 to provide this opportunity.


Use of MFRM

Roberts et al.12 applied multi-faceted Rasch modelling (MFRM) to the MMI, but they focused on differences in the performance of MMI questions in an item bank rather than on differences between the interviewers themselves. However, they did note that questions appeared to be measuring a unidimensional construct, ‘entry-level reasoning skills in professionalism’, as suggested by a good fit to the IRT model.12 The consistency of judgements within and between judges and candidates has been the focus of a number of papers.13–17 IRT software such as FACETS provides easily derived estimates of candidate ability, interviewer stringency/leniency and question difficulty.


The 'observed average score' is based on raw scores, whereas the 'fair average score' is the score expected if the elements of all other facets were at their average values. In this setting the fair average is the score adjusted for interviewer stringency/leniency.

An ‘observed average score’ is the average rating based on raw scores received by the candidate. The ‘fair average score’ is the measure that would have been observed if all the measures of the other elements on all other facets had been located at the average measure.18 In this setting, the fair average for candidates is the score that has been adjusted for interviewer stringency/leniency and question difficulty.
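
The difference between the two averages can be illustrated with a deliberately simplified linear analogue. This is not how FACETS computes a fair average (that is done on the logit scale of the Rasch model); the additive adjustment, the function names and all the numbers below are assumptions made purely for illustration.

```python
from statistics import mean


def observed_average(ratings):
    """Plain mean of the raw scores a candidate received."""
    return mean(r["score"] for r in ratings)


def adjusted_average(ratings, severity, difficulty):
    """Linear stand-in for a fair average.

    Each raw score is credited with the stringency of the interviewer
    who gave it and the difficulty of the question asked, both expressed
    in raw-score units and centred on zero (assumption).
    """
    return mean(
        r["score"] + severity[r["interviewer"]] + difficulty[r["question"]]
        for r in ratings
    )


if __name__ == "__main__":
    # Hypothetical candidate who happened to meet stringent interviewers.
    ratings = [
        {"score": 3, "interviewer": "A", "question": "Q1"},
        {"score": 4, "interviewer": "B", "question": "Q2"},
    ]
    severity = {"A": 0.5, "B": 0.25}      # positive = more stringent
    difficulty = {"Q1": 0.0, "Q2": 0.25}  # positive = harder question
    print(observed_average(ratings), adjusted_average(ratings, severity, difficulty))
```

A candidate who drew harsher interviewers or harder questions ends up with an adjusted average above the observed one, which is the direction of change the fair average is meant to capture.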


McManus et al. showed that adjusting for stringency/leniency leaves 95.9% of outcomes unchanged, while 2.6% of candidates who failed on raw marks would pass and 1.5% who passed on raw marks would fail after adjustment. Harasym estimated that as many as 11% of candidates could be affected.

For example, in the case of a clinical examination for entry into a professional college, McManus et al.14 found that if examination scores were adjusted for examiner stringency/leniency and the same pass mark was kept, the outcome for 95.9% of candidates would be unchanged using adjusted marks, whereas 2.6% of candidates would pass, although they had failed on the basis of raw marks, and 1.5% of candidates would fail, despite having passed on the basis of raw marks. However, Harasym17 estimated that as many as 11% of candidates in an MMI might be affected by adjusting for interviewer stringency/leniency,




Psychometric analysis


Software

Multi-facet Rasch modelling was used in FACETS Version 3.65 (Winsteps.com, Chicago, IL, USA) to perform a concurrent estimation of several independent first-order facets and their associated error variances. A model was specified that included identification of the individual facets, the rating scale and how the interviewer was expected to interact with the rating scale.
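
For readers unfamiliar with this model family, the rating-scale form of the many-facet Rasch model is commonly written as below. This is the general textbook form, not the exact specification used in the study, which the excerpt does not give; the symbols are notation introduced here.

```latex
\log\frac{P_{nijk}}{P_{nij(k-1)}} = B_n - D_i - C_j - F_k
```

Here $P_{nijk}$ is the probability that candidate $n$ receives category $k$ from interviewer $j$ on question $i$, $B_n$ is candidate ability, $D_i$ is question difficulty, $C_j$ is interviewer stringency, and $F_k$ is the difficulty of the step from category $k-1$ to $k$, all in logits. A common variant replaces $F_k$ with $F_{jk}$ so that each interviewer has an individual rating-scale structure; whether the study used that variant is not stated here.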



Setting


Details of the MMI design principles have been reported elsewhere.4,9,12 Candidates were applying to a 4-year, graduate-entry, problem-based learning (PBL) programme. From 2007 onwards, candidates were applying for medicine or dentistry or both. The MMI in this study was designed to assess entry-level reasoning skills in professionalism and had eight stations, with each candidate rotating through the circuit and meeting a different single interviewer at each station. Questions were sourced from a preprepared bank and took the format of a non-clinical scenario followed by structured prompts. Each question had five prompts marked with a 4-point Likert scale, giving a total of 20 raw marks per station and 160 for the whole assessment. In this design, although the performance of a candidate on any particular MMI question was assessed once only by a single interviewer, the total performance was rated by eight interviewers. Furthermore, each MMI question was assessed by several interviewers during the course of the MMI process. This created a network through which every parameter was linked to every other parameter with these connecting observations, allowing the measures estimated from the observations to be placed on one common scale.11 This naturalistic interviewing plan also allowed for the partially nested G study design.4



Interviewers

Each interviewer interviewed a median of 22 candidates. There were 89 faculty members, 47 community members and 39 graduates.

Each interviewer had interviewed a median of 22 candidates (SD 18.44, range 4–121). Complete details were available in the database for 117 interviewers. Of the 207 used, 88 interviewers were known to be male and 95 were known to be female. Twenty-two were aged 18–34 years, 27 were aged 35–44 years and 68 were aged > 45 years. They included 89 faculty members, 47 community members and 39 graduates.


Multi-facet Rasch modelling (MFRM)


Moving up the ruler, interviewers become more stringent, candidates more able and questions more difficult.

Reading the ruler (Fig. 2) from bottom to top shows increasing interviewer stringency, increasing candidate ability and increasing question difficulty.


Both Fig. 2 and Table 1 show that interviewers are more variable than MMI questions.

Both Fig. 2 and Table 1 show that interviewers are more variable than MMI questions and the spread of interviewers is nearly 3.5 times that of MMI questions.


Interviewer J over-fits the model: the ratings are too predictable, suggesting a possible halo effect. Interviewer G under-fits, with too much randomness in scoring.

Interviewer J appeared to be over-fitting the model and his or her ratings were too predictable, suggesting a halo effect. Interviewer G seems to be under-fitting the model with too much randomness in his or her scoring.







Making adjustments for interviewer leniency and question difficulty


Candidate E met stringent interviewers, so the observed average score (OAS) is low at 3.50 but the fair average score (FAS) is 3.64.

Here, candidate E has a lower observed average score of 3.50, but a higher fair average score of 3.64 because he or she answered harder MMI questions and saw more stringent interviewers.


If the fair average score were used instead of the observed average score, 31 of the 270 successful candidates (11.5%) would move from an offer to no offer. Importantly, this is a two-way movement: someone else would be offered a place instead.

Let us assume a scenario in which the fair average rather than observed average scores are used to rank the candidates. In our situation, in which 270 student places were on offer, if the MMI were the sole determinant of ranking, 31 of 270 (11.5%) candidates who were offered a place on the basis of their observed score rankings would not have been offered a place on the basis of their fair average rankings. This is a two-way movement.
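
The two-way movement can be sketched as a comparison of the top-N sets under the two rankings. The helper below and the four-candidate scores are hypothetical; only the 31-of-270 figure comes from the text.

```python
def selection_changes(observed, fair, n_places):
    """Return (lost, gained): candidates who drop out of, or enter,
    the top n_places when ranking by fair instead of observed scores.

    observed and fair map candidate id -> score. Ties are broken by
    candidate id (an arbitrary choice made for this sketch).
    """
    def top(scores):
        ranked = sorted(scores, key=lambda c: (-scores[c], c))
        return set(ranked[:n_places])

    observed_top, fair_top = top(observed), top(fair)
    return observed_top - fair_top, fair_top - observed_top


if __name__ == "__main__":
    # Hypothetical scores for four candidates competing for two places.
    observed = {"A": 3.90, "B": 3.60, "C": 3.50, "D": 3.40}
    fair = {"A": 3.80, "B": 3.50, "C": 3.64, "D": 3.30}
    lost, gained = selection_changes(observed, fair, n_places=2)
    print(lost, gained)

    # Share of offers affected in the study: 31 of 270 places.
    print(round(100 * 31 / 270, 1))
```

Because the number of places is fixed, every candidate who loses an offer is matched by one who gains it, so the two returned sets always have the same size.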





Interviewer goodness-of-fit statistics


For the interviewer, the infit mean square statistic ranged from 0.74 to 1.58 (mean 1.03, SD 0.74). This was a high-stakes assessment and was similar to a clinical rating situation and well within the accepted lower- and upper-control limits of 0.5 and 1.7 to indicate acceptable model fit.19
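
As a rough guide to what this statistic measures, the sketch below uses the usual information-weighted definition of infit (sum of squared residuals divided by the sum of model variances) and applies the 0.5 and 1.7 control limits quoted above. The ratings, expected values and variances are made-up numbers; in practice they would come from the fitted Rasch model, and the function names are illustrative.

```python
def infit_mean_square(observed, expected, variance):
    """Information-weighted mean square: sum of squared residuals
    divided by the sum of the model variances of the observations."""
    squared_residuals = sum((x - e) ** 2 for x, e in zip(observed, expected))
    return squared_residuals / sum(variance)


def fit_flag(mean_square, lower=0.5, upper=1.7):
    """Classify an interviewer against the control limits quoted in the text."""
    if mean_square < lower:
        return "overfit"    # ratings too predictable
    if mean_square > upper:
        return "underfit"   # ratings too erratic
    return "acceptable"


if __name__ == "__main__":
    # Hypothetical ratings by one interviewer with model expectations.
    observed = [3, 2, 4, 1]
    expected = [2.5, 2.5, 3.0, 2.0]
    variance = [0.5, 0.5, 0.5, 0.5]
    value = infit_mean_square(observed, expected, variance)
    print(value, fit_flag(value))
```

Values near 1 mean the ratings vary about as much as the model predicts; both ends of the reported range (0.74 and 1.58) fall inside the limits.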



Number of candidates examined

Interviewer stringency was significantly and negatively correlated with the number of candidates interviewed; that is, interviewers who saw more candidates were more lenient. This is the opposite of the finding of McManus et al.

Interviewer stringency/leniency showed a significant but inverse correlation with the number of candidates examined (r = −0.21, n = 207, p = 0.002). Thus, interviewers who interviewed more candidates tended to be somewhat more lenient. McManus et al.14 found examiners became more stringent with more candidates. Our finding contrasts with this, but we do not have data to show whether more lenient interviewers participated in more assessments or whether more interviewing caused interviewers to become more lenient.
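
A quick plausibility check of the reported significance, for an inverse correlation of magnitude 0.21 across 207 interviewers, can be done with the standard t transformation of a correlation coefficient. The p-value below uses a normal approximation to the t distribution, which is an assumption made here to stay within the standard library; the paper does not say which test was used.

```python
import math


def t_statistic(r, n):
    """t value for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def two_sided_p_normal(t):
    """Two-sided p-value using a normal approximation, reasonable for large df."""
    return math.erfc(abs(t) / math.sqrt(2))


if __name__ == "__main__":
    t = t_statistic(-0.21, 207)
    print(round(t, 2), round(two_sided_p_normal(t), 4))
```

With 205 degrees of freedom the approximation gives a p-value of roughly 0.002, in line with the figure reported.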



Implications


Translating IRT output into variance components is important. This passage concerns the use of MFRM.

The translation of IRT output into variance components is important. Some have reported a number of limitations in applying IRT models to assessments which measure the performance of skills or behaviours, as in the MMI.14 These arose because of claims that the MFRM analysis could not take into account the second-order effects of interviewer-by-station, interviewer-by-candidate and candidate-by-station variance. There was concern that, as in an incorrectly designed G study,6 error would be apportioned wrongly and hence any calculation of reliability or standard error of measurement was likely to be inflated. The use of MFRM to isolate variance components is very new and there has been some misunderstanding in the medical education literature about how they can be estimated and reported with software such as FACETS. This has inflated reliability estimates undermining the credibility of the IRT method for this type of assessment. For example, McManus et al.14 reported variation between examinees in a clinical examination for entry into a professional college as an unrealistic 87%. This resulted from a calculation which partly assumed that the three first-order effects of examiner, item and person were proportions of 100% and thus neglected to take account of the bias or interactions and the residuals that MFRM also reports.
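
The mechanism behind this kind of inflation can be shown with the variance percentages quoted in this paper's own abstract. The sketch does not reproduce the McManus et al. calculation; it only shows how the candidate share grows when interactions and residuals are left out of the denominator. Treating everything not listed in the abstract as "other interactions plus residual" is an assumption made here.

```python
# Percentages of total variance for a single rating, as quoted in the abstract.
reported = {
    "candidate": 19.1,
    "interviewer": 8.9,
    "interviewer_by_question": 5.1,
    "question": 2.6,
}

# Assumption: the unlisted remainder is other interactions plus residual error.
remainder = 100.0 - sum(reported.values())


def share_of_first_order_only(components):
    """Candidate share if only the three first-order effects are counted."""
    first_order = (components["candidate"]
                   + components["interviewer"]
                   + components["question"])
    return 100.0 * components["candidate"] / first_order


if __name__ == "__main__":
    print(f"candidate share of total variance: {reported['candidate']:.1f}%")
    print(f"candidate share of first-order effects only: "
          f"{share_of_first_order_only(reported):.1f}%")
    print(f"unlisted remainder: {remainder:.1f}%")
```

Dividing by the first-order effects alone roughly triples the apparent candidate share, which is why interactions and residuals have to stay in the denominator.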


FACETS can be used to decompose variance components.

An iterative relationship between the FACETS software developer and the educational research measurement community has ensured that later iterations of FACETS are able to provide the decomposition of variance components, including interactions, with naturalistic data.


During MMI training, interviewers ask whether they should be told who is a 'hawk' and who is a 'dove'. However, stringency/leniency in the MMI appears to be relatively stable whether measured by IRT or by G theory, which matches the findings of McManus et al. The implication, as McManus et al. suggest, is that interviewers should carry on as they always have rather than try to correct their stringency or leniency.

In MMI training, interviewers often ask whether they should be given feedback on which of them are ‘hawks’ and which are ‘doves’ so that they can try to correct their tendencies to mark higher (leniently) or lower (stringently) on the rating scale. The finding that interviewer stringency/leniency seems to be a stable characteristic in the MMI, whether measured by IRT or by G theory, is remarkable and echoes the findings of McManus et al.14 in examiner stringency in clinical rating situations. The implication, as McManus et al.14 suggest, is that interviewers should not try to correct their hawkish or dove-like tendencies, but should instead continue to behave as they have always done.


As Kumar et al. point out, theory about interviewers' experience of the MMI process and about the effect of training is underdeveloped.

As Kumar et al.9 have noted, theoretical development in the area of interviewers’ experience of the process and impact of training is lacking.



13 Downing SM. Threats to the validity of clinical teaching assessments: what about rater error? Med Educ 2005;39:353–5.





















 2010 Jul;44(7):690-8. doi: 10.1111/j.1365-2923.2010.03689.x.

Should candidate scores be adjusted for interviewer stringency or leniency in the multiple mini-interview?

Author information

  • 1Sydney Medical School-Northern, University of Sydney, Sydney, New South Wales, Australia. christopher.roberts@sydney.edu.au

Abstract

CONTEXT:

There are significant levels of variation in candidate multiple mini-interview (MMI) scores caused by interviewer-related factors. Multi-facet Rasch modelling (MFRM) has the capability to both identify these sources of error and partially adjust for them within a measurement model that may be fairer to the candidate.

METHODS:

Using facets software, a variance components analysis estimated sources of measurement error that were comparable with those produced by generalisability theory. Fair average scores for the effects of the stringency/leniency of interviewers and question difficulty were calculated and adjusted rankings of candidates were modelled.

RESULTS:

The decisions of 207 interviewers had an acceptable fit to the MFRM model. For one candidate assessed by one interviewer on one MMI question, 19.1% of the variance reflected candidate ability, 8.9% reflected interviewer stringency/leniency, 5.1% reflected interviewer question-specific stringency/leniency and 2.6% reflected question difficulty. If adjustments were made to candidates' raw scores for interviewer stringency/leniency and question difficulty, 11.5% of candidates would see a significant change in their ranking for selection into the programme. Greater interviewer leniency was associated with the number of candidates interviewed.

CONCLUSIONS:

Interviewers differ in their degree of stringency/leniency and this appears to be a stable characteristic. The MFRM provides a recommendable way of giving a candidate score which adjusts for the stringency/leniency of whichever interviewers the candidate sees and the difficulty of the questions the candidate is asked.

PMID: 20636588 [PubMed - indexed for MEDLINE]


Measuring Diversity and Inclusion in Academic Medicine: The Diversity Engagement Survey (Acad Med, 2015)

Sharina D. Person, PhD, C. Greer Jordan, PhD, MBA, Jeroan J. Allison, MD, MS,

Lisa M. Fink Ogawa, PhD, RN, CNE, Laura Castillo-Page, PhD, Sarah Conrad, MS,

Marc A. Nivet, EdD, MBA, MS, and Deborah L. Plummer, PhD, MEd






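Persistent health disparities, changing demographics and the growing number of insured patients since the ACA make it a challenge for U.S. academic medical centers to graduate physicians who are more diverse, prepared for high-quality care and culturally competent. Institutions therefore need to assess their own capacity for diversity and inclusion and respond appropriately to the insights gained.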

Persistent health disparities,1 changing population demographics,2 and growing numbers of insured patients following the enactment of the Affordable Care Act3 make graduating a diverse physician and scientific workforce prepared to advance high-quality, culturally competent health care and research increasingly challenging for U.S. academic medical centers. Therefore, it is imperative for institutions to assess their organizational capacity for diversity and inclusion and respond effectively to insights gained.4


We created the DES (Diversity Engagement Survey) to measure how well academic medical centers (AMCs) are responding to the diversity of their communities.

We created the Diversity Engagement Survey (DES) to measure how well academic medical centers are responding to the diversity of their community members (i.e., their faculty, staff, and students).



Method


Background


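The factors underpinning the DES came from years of reviewing the diversity, inclusion and engagement literature and from applied diversity-management experience. Earlier versions of the instrument, used in 12 organizations, were useful for assessing members' perspectives on diversity, but were less effective for diagnosis or for guiding future interventions. To improve the instrument, we examined how an institution's cultural conditions are shaped by engagement and inclusion.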

The factors that undergird the DES emerged from years of study of the diversity, inclusion, and engagement literature and applied diversity management experience. Previous iterations of the instrument were used with 12 organizations (6 corporations, 4 hospital systems, 1 government agency, and 1 social service organization). These previous iterations were useful in evaluating perspectives about diversity among individuals at the participating institutions. However, they were not as effective in providing diagnostic data and strategic direction for future interventions. To improve the effectiveness of the instrument, we recognized the need to focus on how the cultural conditions of an institution are influenced by the interplay of engagement and inclusion.



DES의 개념 근간

Conceptual underpinnings of the DES


Unlike culture, climate, or general-purpose engagement surveys, the DES targets the aspects of institutional culture and social dynamics tied to engagement and inclusion, specifically those shown to relate most strongly to productivity and employee retention.

Unlike culture, climate, or general-purpose engagement surveys, the DES is designed to reveal the aspects of institutional culture and social dynamics related to engagement and inclusion that have been shown to be the most strongly related to productivity and employee retention.5,6


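In the DES framework, diversity encompasses all aspects of human difference and is a core value that embodies inclusiveness, mutual respect, and awareness of multiple perspectives. Inclusion is a set of social processes that influence an individual's access to information, sense of belonging, job security, and social support. Without an institutional culture that includes differing perspectives, experiences, and knowledge, the potential of diversity cannot be fully realized.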

Within the DES framework (described below), diversity is conceptualized as encompassing all aspects of human differences and is viewed as a core value that embodies inclusiveness, mutual respect, and awareness of multiple perspectives.7 Inclusion is conceptualized as a set of social processes that influence an individual’s access to information and sense of belonging, job security, and social support received from others.8,9 Without an institutional culture that supports the inclusion of the differences in perspectives, life experiences, and knowledge that individuals bring to the institution, the full potential of diversity cannot be realized.4


Engaging every member of the institution is the foundation of a truly inclusive AMC. Successful engagement starts from meeting members' basic intellectual and emotional needs.

Engagement of every member of the institution is the foundation on which a truly inclusive academic medical center is built. Successful employee engagement is derived from meeting the basic intellectual and emotional needs of workers.10–14


Engagement grows out of a shared sense of the organization's vision and purpose, camaraderie, and appreciation of members' contributions.

Engagement results from cultural conditions that foster a shared sense of the vision and purpose of the organization as well as camaraderie and appreciation of employees’ contributions to the institution.

  • 비전과 목적의 공유: 조직의 미션에 기여해야 하는 정당한 근거를 제시 A sense of vision and purpose provides employees with a compelling reason to contribute to the organization’s mission.
  • 동지애: 소속감을 제공함과 더불어 주변에 손을 내밀 수 있게 만들어줌 Camaraderie gives employees a sense of belonging and provides them with opportunities to reach out and personally connect with those around them.
  • 인정: 조직에 대한 개개인의 기여와 가치를 인정해줌 Appreciation recognizes individuals’ contributions and values what each person brings to the organization.

DES 프레임워크

The DES framework


  • 1. 공동의 목표 
    1. Common purpose: Individuals experience a connection to the mission, vision, and values of the organization.
  • 2. 신뢰
    2. Trust: Individuals have confidence that the policies, practices, and procedures of the organization will allow them to bring their best and full self to work. 
  • 3. 개인의 기여에 대한 인정
    3. Appreciation of individual attributes: Individuals perceive that they are valued and can successfully navigate the organizational structure in their expressed group identity. 
  • 4. 소속감 
    4. Sense of belonging: Individuals experience their social group identity as being connected with and accepted in the organization. 
  • 5. 열려있는 기회 
    5. Access to opportunity: Individuals perceive that they are able to find and utilize support for their professional development and advancement.
  • 6. 동등한 보상과 인정
    6. Equitable reward and recognition: Individuals perceive the organization as having equitable compensation practices and nonfinancial incentives.
  • 7. 문화적 역량
    7. Cultural competence: Individuals believe the institution has the capacity to make creative use of its diverse workforce in a way that meets business goals and enhances performance. 
  • 8. 존중
    8. Respect: Individuals experience a culture of civility and positive regard for diverse perspectives and ways of knowing.


DES 도구

The DES instrument

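Items were developed from the literature review and the authors' field experience. The final survey has 22 items covering the eight engagement and inclusion factors. Each item was written to capture the essence of the relationship between the institution and its members, rather than individuals' perceptions of how they experience institutional practices. All items are written in the first person and phrased positively. An open-ended question is included at the end.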

We proposed survey items derived from a review of literature and our own experience in the field relative to the framework’s factors. The final DES consisted of 22 items chosen to reflect the eight engagement and inclusion factors (see Table 1). Each item was created to capture the essence of the relationship between the institution and its members, not individuals’ perceptions about how they, and those who share a group identity with them, perceive or experience institutional practices. All items were written in the first person and phrased positively. We also included a final open-ended question (“If you wish, please provide additional comments on the diversity and inclusion efforts”) to provide the respondents the opportunity to express any concerns, insights, or experiences related to their institutional context.


Responses use a 5-point scale.

All responses on the 22-item instrument were scored on a 5-point Likert scale.





파일럿 테스트

Pilot testing and survey implementation


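Face and content validity were assessed by a review panel of representative respondents at the home institution of one of the authors. The same survey was piloted at one AMC in March 2011. An invitation to take part was then sent to all AAMC member institutions through the AAMC and the Group on Diversity and Inclusion, and the survey was subsequently administered at 13 additional AMCs.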

Face and content validity of the survey were assessed and improved through a review panel consisting of representative respondents at the home medical institution of one of the authors. The same survey was piloted at an academic medical center in March 2011. After the pilot, an invitation to participate in the survey benchmarking process was sent through the Association of American Medical Colleges (AAMC) and the Group on Diversity and Inclusion to all AAMC member institutions. The survey was subsequently administered to 13 additional U.S. academic medical centers from March 2011 through April 2012.


통계 분석

Statistical analysis


내적 일관성

Internal consistency.


We measured the internal consistency of the eight engagement and inclusion factors by calculating Cronbach alphas.
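As a rough illustration of the statistic named above, the sketch below computes Cronbach's alpha with the standard textbook formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score). The response data and the three-item factor are hypothetical; the source does not say which software or which variance estimator the authors used.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for one factor.

    rows: one list of item scores per respondent, all of the same length.
    Uses sample variances (n - 1 denominator); this is an assumption.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Variance of each respondent's summed score over the k items.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point responses from five respondents to a three-item factor.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Running the same function once per factor, and once over all 22 items, would give per-factor and overall values of the kind reported in the Results.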


구인 타당도 

Construct validity.


Confirmatory factor analysis (CFA) was performed, with CFI and SRMR used to assess model fit.

Based on the expected mapping of survey items to engagement and inclusion factors, we performed confirmatory factor analysis (CFA) via structural equation modeling to investigate construct validity and to examine the dimensionality of the DES. We examined item correlations and selected two representative fit indices—comparative fit index (CFI)18 and the standardized root mean square residual (SRMR)19—to assess model fit. 

      • CFI is an index that ranges from 0 to 1; values greater than 0.90 are considered an indicator of a good fitting model.18 
      • The SRMR is an absolute measure of fit and is defined as the standardized difference between the observed and predicted correlation; models with an SRMR value less than or equal to 0.08 are considered good.19
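The two cutoffs above can be made concrete with a small sketch. It uses the commonly cited definitions of CFI (from model and baseline chi-square statistics) and SRMR (root mean squared residual over the lower triangle of the correlation matrix, diagonal included). These formulas are an assumption about the authors' software, which the source does not name, and all numbers are hypothetical.

```python
from math import sqrt


def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative fit index from model and baseline chi-square statistics."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_baseline - df_baseline, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def srmr(observed, implied):
    """SRMR for two p x p correlation matrices given as nested lists."""
    p = len(observed)
    squared = sum(
        (observed[i][j] - implied[i][j]) ** 2
        for i in range(p)
        for j in range(i + 1)  # lower triangle, diagonal included
    )
    return sqrt(squared / (p * (p + 1) / 2))


# Hypothetical values, chosen only to exercise the two cutoffs.
fit_cfi = cfi(chi2_model=50.0, df_model=40, chi2_baseline=1000.0, df_baseline=55)
fit_srmr = srmr([[1.0, 0.5], [0.5, 1.0]], [[1.0, 0.4], [0.4, 1.0]])

good_fit = fit_cfi > 0.90 and fit_srmr <= 0.08
print(good_fit)
```

Some packages compute SRMR with slightly different conventions (for example, excluding the diagonal), so values from real software may differ a little from this sketch.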


The authors also compared the experiences of different demographic groups.

We also sought to demonstrate the instrument’s usefulness in understanding specific disparities within a given institution by distinguishing between the experiences of different demographic groups.



준거 타당도

Criterion validity.


Differences in DES factor mean scores were examined across key respondent characteristics suggested by the literature.

As a final step in assessing the utility of the DES, we examined criterion validity, which is a measure of how well a construct predicts an outcome based on information from other variables.17 Here, we examined differences in DES factor mean scores based on key respondent characteristics suggested by the literature, such as race/ethnicity, gender, and sexual orientation. Respondents had the opportunity to self-identify as lesbian, gay, bisexual, transgender, queer, questioning, asexual, or other. For purposes of analysis and reporting we collapsed these responses into one category labeled LGBTQ.
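The comparison described above comes down to averaging a factor's item scores per respondent and then averaging those scores within each demographic group. The sketch below shows that arithmetic only. The group labels, the item-to-factor mapping, and the responses are all hypothetical, and the source does not describe the authors' exact scoring code.

```python
from collections import defaultdict
from statistics import mean


def factor_mean_by_group(respondents, items):
    """Mean factor score per group.

    respondents: list of dicts with a "group" label and an "answers" dict
                 mapping item number -> 1..5 response (hypothetical layout).
    items: the item numbers assumed to belong to the factor.
    """
    per_group = defaultdict(list)
    for person in respondents:
        # A respondent's factor score is the mean of that factor's items.
        score = mean(person["answers"][i] for i in items)
        per_group[person["group"]].append(score)
    return {group: mean(scores) for group, scores in per_group.items()}


# Hypothetical respondents and a hypothetical two-item factor (items 1 and 2).
sample = [
    {"group": "A", "answers": {1: 4, 2: 5}},
    {"group": "A", "answers": {1: 3, 2: 4}},
    {"group": "B", "answers": {1: 2, 2: 3}},
    {"group": "B", "answers": {1: 3, 2: 3}},
]
group_means = factor_mean_by_group(sample, items=[1, 2])
print(group_means)
```

A gap between group means is only descriptive here; any significance testing the authors applied is not reproduced.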



결과

Results


There were 13,694 respondents from across the United States. The average response rate across the 14 participating institutions was 26.7%.

Broad representation across each region of the United States was obtained through the 13,694 respondents to the DES. The average response rate across the 14 participating institutions was 26.7% (SD = 9.5), and institutional response rates ranged from 11% to 46%.



내적 일관성 Internal consistency


The Cronbach alphas for the eight engagement and inclusion factors of the DES ranged from 0.68 to 0.85 (Table 1), with an overall Cronbach alpha of 0.96.




구인 타당도 Construct validity


Institutions separated clearly into three groups: higher, middle, and lower.

The graphical displays of institutions’ mean engagement and inclusion factor scores clearly delineated institutions with higher, middle, and lower degrees of engagement and inclusion by their respondents (Figure 2).


Larger black-white disparities at the institutional level were strongly correlated with lower scores among black respondents.

We also found that greater disparity between black and white respondents at the institutional level was strongly correlated with lower black respondent scores. Spearman correlations for institutional rankings based on disparities and institutional rankings based on black respondent mean item scores ranged from 0.70 to 0.95 and were statistically significant for all items except 4 and 14 (see Supplemental Digital Table 1 at http://links.lww.com/ACADMED/A303).
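For readers unfamiliar with the statistic, a Spearman correlation is a Pearson correlation computed on ranks. The sketch below implements it with average ranks for ties. The institution-level numbers are hypothetical, and the sign of the result depends on the direction in which each ranking is built, which the source does not spell out.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (sx * sy)


# Hypothetical institution-level values for one survey item.
disparity = [0.10, 0.25, 0.40, 0.55, 0.30]   # white mean minus black mean
black_mean = [4.2, 3.9, 3.6, 3.1, 3.5]       # black respondent mean score
rho = spearman(disparity, black_mean)
print(round(rho, 2))
```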





준거 타당도 Criterion validity


Black and Hispanic/Latino respondents scored lower than white respondents, and female respondents scored lower than male respondents.

Analysis of the responses by demographic group revealed that black respondents and Hispanic/Latino respondents had lower mean factor scores than white respondents. Female respondents had lower mean factor scores than male respondents (Table 2).





Discussion


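For example, if an institution's overall score and a subgroup's score on a given factor are both low, organization-wide policy change may be needed. If the overall score is high but a particular subgroup's score is low, an intervention aimed at that subgroup may be needed.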

For example, if both an institution’s overall and subgroup scores for a given factor or item are equally low, changes in organization-wide policy may be needed. On the other hand, if the overall score is high but a subgroup score is low, a policy targeting the subgroup may be appropriate.
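The rule of thumb in this example can be written as a small decision function. This is only a sketch of the two cases the text mentions. The numeric cutoff for "low" is a hypothetical parameter, since the source gives no threshold, and the other score combinations are left explicitly uncovered.

```python
def suggested_focus(overall_score, subgroup_score, low_cutoff):
    """Map an overall and a subgroup score to the kind of response suggested.

    low_cutoff is a hypothetical threshold below which a score counts as low.
    """
    overall_low = overall_score < low_cutoff
    subgroup_low = subgroup_score < low_cutoff
    if overall_low and subgroup_low:
        return "organization-wide policy"
    if not overall_low and subgroup_low:
        return "subgroup-targeted policy"
    return "not covered by this rule"


# Hypothetical factor mean scores on the 5-point scale, with 3.5 as the cutoff.
print(suggested_focus(3.0, 2.8, low_cutoff=3.5))
print(suggested_focus(4.2, 3.1, low_cutoff=3.5))
```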


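Overall, the Cronbach alphas indicate that the instrument is reliable. The low alpha for the common purpose factor may reflect a violation of the essential tau-equivalence assumption, since the observed variances of its two items differed significantly. Because violating this assumption usually leads alpha to underestimate reliability, the reported value is best read as a lower bound on the true value. The authors retained the common purpose factor because the DES as a whole has face validity grounded in the literature and was vetted by the review panel.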

Overall, the Cronbach alpha results indicate that the DES is a reliable instrument. One possible explanation for the marginally low Cronbach alpha of the common purpose factor may be violation of the essential tau equivalence assumption,21 which is suggested because the observed variances of the two items comprising this factor were significantly different (data not shown). However, violation of this assumption usually leads to underestimation of the alpha coefficient, so it is reasonable to assume that the reported coefficient represents a lower bound for the true value. Because the entire DES has face validity based on existing literature and vetting with the review panel, we have chosen to retain the common purpose factor in the survey.
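The lower-bound argument can be worked through for a two-item factor such as common purpose. The derivation below is a sketch under assumptions that are not in the source: a single common factor F with unit variance, uncorrelated errors e_1 and e_2, and loadings λ_1 and λ_2 (notation introduced here for illustration). Here ρ is the reliability of the summed score X_1 + X_2.

```latex
\begin{aligned}
X_1 &= \lambda_1 F + e_1, \qquad X_2 = \lambda_2 F + e_2, \qquad \operatorname{Var}(F) = 1,\\
\sigma_X^2 &= \operatorname{Var}(X_1 + X_2)
            = (\lambda_1 + \lambda_2)^2 + \operatorname{Var}(e_1) + \operatorname{Var}(e_2),\\
\alpha &= \frac{4\,\operatorname{Cov}(X_1, X_2)}{\sigma_X^2}
        = \frac{4\lambda_1\lambda_2}{\sigma_X^2},
\qquad
\rho = \frac{(\lambda_1 + \lambda_2)^2}{\sigma_X^2},\\
\rho - \alpha &= \frac{(\lambda_1 - \lambda_2)^2}{\sigma_X^2} \;\ge\; 0 .
\end{aligned}
```

Under these assumptions alpha equals the reliability only when the two loadings are equal, which is the tau-equivalent case, and it falls below the reliability otherwise.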


To build institutional capacity for diversity, an institution must first understand how included and engaged its various groups feel.

To build institutional capacity for diversity, institutions must start with an understanding of the extent to which their various groups feel included and engaged.25









Acad Med. 2015 Dec;90(12):1675-83. doi: 10.1097/ACM.0000000000000921.

Measuring Diversity and Inclusion in Academic Medicine: The Diversity Engagement Survey.

Author information

  • S.D. Person is associate professor, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, Massachusetts. C.G. Jordan is associate vice chancellor, Diversity and Inclusion, and assistant professor, Departments of Nursing, Psychiatry, and Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, Massachusetts. J.J. Allison is associate vice provost, Health Disparities Research, and vice chair and professor, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, Massachusetts. L.M. Fink Ogawa is clinical assistant professor and director, Quality and Safety Scholarship, University of Kansas Medical Center School of Nursing, Kansas City, Kansas. L. Castillo-Page is senior director, Diversity Policy and Programs and Organizational Capacity Building Portfolio, Association of American Medical Colleges, Washington, DC. S. Conrad is senior research analyst, Association of American Medical Colleges, Washington, DC. M.A. Nivet is chief diversity officer, Association of American Medical Colleges, Washington, DC. D.L. Plummer is vice chancellor, Diversity and Inclusion, and professor, Departments of Psychiatry, Quantitative Health Sciences, and Nursing, University of Massachusetts Medical School, Worcester, Massachusetts.

Abstract

PURPOSE:

To produce a physician and scientific workforce that advances high-quality research and culturally competent care, academic medical centers (AMCs) must assess their capacity for diversity and inclusion and leverage opportunities for improvement. The Diversity Engagement Survey (DES) is presented as a diagnostic and benchmarking tool.

METHOD:

The 22-item DES consists of eight factors that connect engagement theory to inclusion and diversity constructs. It was piloted at 1 AMC and then administered at 13 additional U.S. AMCs in 2011-2012. Face and content validity were assessed through a review panel. Cronbach alpha was used to assess internal consistency. Confirmatory factor analysis (CFA) was used to establish construct validity. Cluster analysis was conducted to establish ability of the DES to distinguish between institutions' degrees of engagement and inclusion. Criterion validity was established using observed differences in scores for demographic groups as suggested by the literature.

RESULTS:

The sample included 13,694 respondents across 14 AMCs. Cronbach alphas for the engagement and inclusion factors (range: 0.68-0.85), CFA fit indices, and item correlations with latent constructs indicated an acceptable model fit and that items measured the intended concepts. Cluster analysis of DES scores distinguished institutions with higher, middle, and lower degrees of engagement and inclusion by their respondents. Consistent with the literature, black, Hispanic/Latino, female, and LGBTQ (lesbian, gay, bisexual, transgender, queer) respondents reported lower degrees of engagement than their counterparts.

CONCLUSIONS:

The DES is a reliable and valid instrument for assessment, evaluation, and external benchmarking of institutional engagement and inclusion.

PMID: 26466376 [PubMed - in process]


학습 접근법 향상: 학생과 교사를 위한 팁 (Med Teach, 2013)

Enhancing learning approaches: Practical tips for students and teachers

SAMY A. AZER1, ANTHONY P. S. GUERRERO2 & ALLYN WALSH3

1King Saud University, Saudi Arabia, 2University of Hawai'i, USA, 3McMaster University, Canada






According to Biggs (1994), approaches to learning can be defined as “ways in which students go about their academic tasks, thereby affecting the nature of their learning outcomes.”


주제 1: 심층 학습을 촉진하기 위한 구체적 테크닉

Theme 1: Apply specific techniques that foster deep learning 


좋은 질문을 하는 방법을 배우라

Tip 1: Learn how to ask good questions


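Students who mainly rely on surface learning find it hard to analyze what they have learned and to come up with new ideas. One key element of developing meaningful understanding is therefore learning to ask good questions about what one wants to learn.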

For example, students who usually use superficial or shallow learning find it difficult to analyze knowledge learnt or present new ideas. Therefore, one of the elements of developing meaningful understanding is to learn how to format good questions in relation to what students want to learn (Graig et al. 2006; Zhang et al. 2010).


왜 질문하는 법을 배워야 하는가?

Why do students need to learn how to ask questions? 

Asking good open-ended questions helps students to: 

  • Make their research for a learning issue more focused. 
  • Evaluate the different aspects about a concept. 
  • Identify what they know, and what they do not know. 
  • Go deeper into an issue and develop a meaningful learning. 
  • Weigh supportive evidence for and against each hypothesis. 
  • Turn learning to a journey of discovery. 
  • Make a purpose for their learning.

어떻게 질문이 심화 이해를 강화할 수 있는가?

How can questions reinforce deep understanding? 

Questions can help in reinforcing deep understanding when students aim at exploring a particular cognitive skill, for example: 

  • 가설 설정 Generating hypotheses: “What are the possible causes for ...?”
  • 가설 검증 Testing a hypothesis: “What if ...?” ... “Are there relationships between ... and ...?”
  • 추론 Reasoning: “How can I justify ...?”
  • 탐색 계획 수립 Developing an enquiry plan: “What are my goals? What approach should I take? What questions should I ask to make a hypothesis less likely, more likely, or excluded?”
  • 관찰 결과 해석 Interpreting findings: “How are these findings different from normal? What do they mean?”
  • 프로세스 탐색 Examining a process: “What are the consequences of ...?”
  • 설명 Explaining: “How can we explain ...?”
  • 임상 변화와 기초과학 연결 Linking clinical changes with basic sciences: “How does this relate to our knowledge about normal body ...?”
  • 비판적 평가 Critically evaluating: “What evidence do I have ...?”
  • 핵심 개념 파악 Identifying key concepts learnt: “What is the take-home message?”
  • 학습 요구 파악 Identifying learning needs: “What do I know?” ... “What do I need to know? How?”


비유 사용하기

Tip 2: Use analogy


왜 비유를 사용해야 하는가?

Why do students need to use analogy? 

잘 만든 비유는..

A well-constructed analogy: 

  • 이해를 돕고 Allows learners to better understand the different components of a difficult concept. 
  • 복잡한 것을 이해하기 단순하고 명확한 것과 비교해주고 Enables learners to take a complicated issue/concept and compare it to something simpler that listeners/readers are familiar with and clearly understand. 
  • 세 가지 제약(유사성, 구조/기능, 목적)을 다룸 Covers three constraints: similarity, structure/function, and purpose.



어떻게 비유를 통해서 의미있는 이해를 할 수 있을까?

How can analogy help learners to explore meaningful understanding? 

By using analogy, learners can: 

  • 익숙한 모델과 관련성 찾기 Identify the relationship between the familiar model and the new information they are learning. The better established this relationship is, the more learners are able to comprehend the function and structure of the new system they are studying. 
  • 실제 상황에 지식 적용 Apply knowledge learnt in different real-life situations with confidence (Busari 2000). 
  • 어려운 개념 이해, 다른 요소들이 기여하는 방식 이해 Comprehend difficult concepts and how the different components contribute to a particular function. 
  • 새로운 질문 만들기 Identify new questions in relation to the difficult concept they are learning. 
  • 유사한 문제에 적용하기 Build on knowledge learnt and use the analogy in solving similar problems.




메커니즘과 개념지도 그리기

Tip 3: Construct mechanisms and concept maps


In this process the learner organizes knowledge, integrates new knowledge into what he or she already knows, and shows commitment to meaningful learning.

Therefore, in this process, the learner deliberately seeks to organize his/her knowledge, incorporates new knowledge learnt into what he/she possesses, and demonstrates commitment to meaningful learning. The use of mechanisms and concept mapping is a good example of construction of knowledge, and integration of knowledge in an organized and logical way (Guerrero 2001; Novak 2003).



피어-튜터링에 참여하기

Tip 4: Join a peer-tutoring group


Peer tutoring greatly helps tutees' learning, and it also benefits the tutors.

While peer tutoring could significantly improve tutees' learning, it also has benefits for the tutors. For example, Sobral (2002) found that acting as a peer tutor can be an appealing and constructive educational opportunity to further students’ academic development.


왜 피어-티칭을 해야하나?

Why do students need peer teaching? 

Peer teaching can: 

  • 배움의 문화 만들기 Help in developing a culture of learning in the college. 
  • 능동적 학습 Encourage active learning where students take a deep learning approach. 
  • 협력적 학습 Enforce collaborative learning. 
  • 어려움 겪는 학생에 대한 지원 Support students struggling with their learning. 
  • 가르침과 배움의 책임이 학생에게 있는 파트너십 시스템 Provide a system of partnership in the learning process, where the teaching and learning are owned by students. 
  • 의사소통/인간관계/시간관리 등의 전이가능한(transferable) 기술 개발 Enable students to develop transferable skills such as communication, interpersonal skills, and time management. 


어떻게 피어-티칭이 학습을 촉진하나?

How can peer teaching foster students’ learning? 

Peer teaching can help in reinforcing deep understanding by: 

  • 격차 발견 Discovering gaps/inadequacies in their knowledge. 
  • 잘못 이해한 개념 발견 Identifying misconceptions and correcting them. 
  • 동기부여 Enhancing motivation to learn. 
  • 새로운 기술 습득 Learning new skills. 
  • 시간 효율적 관리 Using their time more effectively. 
  • 학습 스킬, 협력 학습, 어려운 개념 토론 Enhancing their learning skills, collaborative learning, and discussion of difficult concepts. 
  • 의사소통과 대인관계 향상 Improving their communication and interpersonal skills.



비판적 사고 기술 개발

Tip 5: Develop critical thinking skills


Critical thinking involves self-regulation, analysis, evaluation, interpretation, and conceptual and methodological assessment. More recently it has been viewed as having two main components: cognitive skills and an affective domain.

Critical thinking is a purposeful process that involves self-regulation, analysis, evaluation, interpretation, and conceptual and methodological assessment as a learning approach. Recently, critical thinking has been viewed beyond cognitive skills and it is believed that it comprises two main components: cognitive skills and affective domains of reasoning and attitude (Scheffer & Rubenfeld 2000). This is because of the complexity of the processes involved in critical thinking together with the orientation of mind during critical thinking. Such views are supported by other researchers (Simpson & Courtney 2002; Profetto-McGrath 2003).


비판적 사고의 구성요소는?

What are the components of critical thinking? 

According to Scheffer and Rubenfeld (2000), the two components of critical thinking are cognitive skills and affective dispositions, which are as follows: 

  • 인지기술 Cognitive skills: Discrimination, analyzing, predicting, logical reasoning, information seeking, applying standards, and transforming knowledge.
  • 정동적 요소 Affective dispositions: Intellectual integrity, open-mindedness, flexibility, contextual perspective, confidence, inquisitiveness, reflection, creativity, and intuition.


왜 비판적 사고 기술을 개발해야 하나?

Why do students need to develop critical thinking skills? 

Critical thinking skills enable students to: 

  • 자기조절, 분석, 해석능력 개발 Develop their self-regulation, analysis, and interpretation skills. 
  • 언급한 인지와 정동 특성 강화 Enhance their skills in cognitive and affective dispositions discussed above. 



어떻게 개발할 수 있나?

How can students develop their critical thinking? 

Critical thinking is a skill that can be developed by students gradually as they keep practicing and working on it. So the willingness of the learner to focus on such approach and work on it is vital. 

  • 단순 암기에서 비판적 사고로 학습과정 조정 Readjusting their learning approaches from memorization to critical thinking. 
  • 지속적 훈련 Continuous training and working on tasks that can reinforce students’ critical thinking. 
  • 비판적 사고 기술을 길러줄 수 있는 학습자료 활용 Using learning resources that encourage learners’ skills such as interpretation of findings, analysis, inferences, explanation, and application of knowledge. These resources may include computer-aided learning programs and interactive e-cases. 
  • 피드백, 열정을 공유하는 동료와 일하기 Receiving feedback on performance and working with colleagues who share with you a passion about exploring critical thinking in their learning. 
  • 이러한 기술이 평생학습의 일부임을 이해하기 Understanding that such skills are part of continuous lifelong learning and continuous development.


자기성찰 활용

Tip 6: Use self-reflection


Self-reflection can occur at every stage of an encounter and helps the learner understand both the self and the situation.

Self-reflection is a metacognitive process that can occur at all stages of an encounter: before, during, and after. It helps the learner to understand both the self and the situation (Azer 2008b; Sandars 2009).



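Although a recent study by Lew and Schmidt found that self-reflection brought only limited improvement in academic performance, reflection can help learners in the following ways.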

Although a recent study by Lew and Schmidt (2011) found that self-reflection caused limited improvement in students’ academic performance, there is evidence that reflection can help learners to: 

  • 자기조절과 평생학습자가 되게 함 Become self-regulated and life-long learners. 
  • 그룹 학습 과정을 리뷰하고, 그 그룹 내에서 자신의 기여를 돌아보게 함 Review their group learning process and their own personal input in their groups. 
  • Experiential learning에서와 같이 학습 향상 Improve their learning, as is the case in experiential learning (Kolb 1984). According to Kolb, there are four phases: 
    • 1단계: 경험 In the first phase, the learner has an experience. 
    • 2단계: 성찰 In the second phase, the learner reflects, which in turn leads to the third phase. 
    • 3단계: 자신의 행동/리액션 이해 In the third phase, the learner makes attempts to understand their actions or reactions to the experience. This usually leads to identification of learning needs and skills that need to be acquired before facing a similar situation. 
    • 4단계: 배운 내용 적용 In phase four, the learner applies what he/she has learnt. This four-phase cycle may be repeated several times with the purpose of improving the learner’s performance. 
  • 새로운 지식을 이전 지식과 연결 Relate new knowledge to prior understanding. 
  • 스스로의 발전상황 평가, 성취에 대한 비판적 분석 Assess their own progress in learning and critically analyze their achievements.



주제 2: 능동적 학습 마스터하기

Theme 2: Master active learning


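Active learning requires students to take part in a range of carefully designed learning activities in the curriculum. The aim is to increase engagement, enhance relevance, and improve motivation and self-directed learning.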

Active learning requires active participation of students in different learning activities that have been carefully designed in the curriculum (Gleason et al. 2011). The aim is to facilitate student engagement, enhance relevance, and improve motivation for learning and self-directed learning.



적절한 학습자원 활용

Tip 7: Use an appropriate range of learning resources


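Students need to find resources suited to their objectives, since different resources serve different purposes. Information and learning needs also change as students advance, so it is important to understand what is useful at which stage and for which purpose.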

Students need to identify resources that can help them learn and meet their objectives: different resources will be helpful for different needs. In addition, information and learning needs change as students advance; it is important to have a realistic understanding of what resources are useful at what stage, and for which purpose.


Examples include the following.

These learning resources may include e-case packages, multimedia, journal articles, review papers, textbooks, educational Web sites, YouTube videos, museum, plastinated specimen, bed-side teaching, outpatient clinic, rural services, lecture notes, and so on.


Using a suitable range of resources appropriately can help in the following ways.

The use of a wide range of appropriate learning resources could help students in: 

  • 자기평가, 장기기억 촉진 Self-testing and facilitating the process of long-term recall through repeated testing (Larsen et al. 2009). 
  • 지식을 평가하고 새로운 관심 찾기 Assessing their knowledge and what they already know and exploring new aspects of interest to them. 
  • 자기주도학습을 더 확대하기 Engaging students more efficiently to expand their self-directed learning and find more answers to their questions. 
  • 특정 시간에 배운 내용을 다른 자원에서 토론한 내용과 비교. 개념지도 작성에 도움 Enabling students to compare and contrast between what they learnt from a particular learning session and what they have discussed in other learning resources. The new knowledge could help students to construct concept maps, integrate and synthesize information. 
  • 중요한 프로세스/개념을 설명할 기회 Providing students with opportunities to explain important processes/concepts orally and in writing. 
  • 배운 내용을 다시 보고 그것이 다른 핵심 과학적 원칙에 어떻게 들어맞는지 설명 Allowing students to revisit what they were taught, organize, relate new knowledge, and explain how topics fit together, and how to relate fine details to a key scientific principle. 
  • 학습을 스스로 조절하게끔 함 Enabling students to self-regulate their learning: plan, monitor, and evaluate their own learning (Sandars & Cleary 2011). So some of their learning resources could be designed in a way that allows them to develop their self-directed learning, self-monitoring skills, and how to ask good questions as they prepare for their learning. For example, “What do I really know about this area?” “What do I need to know?” “What learning resources should I use?” “Do I need to do any further research?”


피드백을 찾아라

Tip 8: Ask for feedback


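Practice alone, without coaching and feedback, does not lead to improvement. Feedback differs from evaluation: it is an objective, nonjudgmental description of performance relative to the expected standard, and it should be given frequently to be meaningful and effective in improving performance. Feedback is for the benefit of the person receiving it.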

It has been well shown that practice alone without coaching and feedback does not lead to improvement in performance and the development of expertise (Ericsson et al. 1993). “Feedback,” which is different from “evaluation,” involves the objective, nonjudgmental description of performance, particularly in relation to the expected standard, and should take place frequently in order to be meaningful and effective in facilitating performance improvement. Feedback is for the benefit of its receiver.


Early in the course, students and teachers should hold a discussion that sets up regular, expected feedback. The discussion should settle specifics such as the following.

We suggest that, early in the course, students and teachers should engage in a discussion that sets the stage for regular, expected feedback. This discussion should address specifics, such as: 

  • 언제 피드백이 이뤄지나 when feedback will occur (e.g., at the end of each session, at the end of each week, etc.); 
  • 어떻게 기록되고 그 기록은 어떻게 사용되나 how feedback will be recorded if at all and what happens to any documentation of feedback; and 
  • 갑작스러운 필요시에 피드백을 구하거나 주는 것은 어떻게 되나 how to ask for or give feedback at any time that it may be necessary.



주제 3: 교실을 넘어선 학습

Theme 3: Practice learning beyond the classroom


새로운 문제에 지식을 적용하라

Tip 9: Apply knowledge learnt to new problems


왜 학생들은 지식을 새로운 문제에 적용해봐야 하나?

Why do students need to apply knowledge learnt to new problems? 

Applying knowledge learnt helps students in: 

  • 지식의 통합 Integrating knowledge learnt from several disciplines. 
  • 관계 탐색 Exploring relationships. 
  • 비교/분석/평가/가설설정 등등의 인지기술을 비롯한 여러 능력 향상 Practicing a number of cognitive skills such as comparing, analyzing, evaluating, hypothesizing, looking for evidence, linking basic sciences to clinical presentation, questioning, examining possible contributing factors, studying pathogenesis, explaining, identifying gaps, referring to the literature, researching several resources, interpreting findings, and designing a management plan. 
  • 실제 임상에서 기초과학의 중요성 탐색 Examining the significance of basic sciences in clinical situations and real practices. 
  • 심화학습 촉진 Enhancing deep understanding. 
  • 이해가 미진한 부분 발견 Discovering inadequacies or gaps in their understanding.

지식을 새로운 문제에 적용하는 기술을 어떻게 익힐 수 있을까?

How can students develop their skills in applying knowledge to new problems?


Students should be encouraged to review their own performance, using questions such as the following.

Students should also be encouraged to review their performance as they try to apply knowledge learnt to new problems. They may ask themselves questions such as 

  • “What did I learn from this experience?”
  • “Why did I fail to provide the correct answer?”
  • “Did I misunderstand the question?”
  • “Do I need to read and check my resources again in regard to this particular area?” and
  • “I do remember studying this part but this case is not as typical as the examples I studied. What did I learn so far?”

Applying learned knowledge to new problems is a key element in deepening understanding and improving learning skills in general.

In fact, applying knowledge learnt by students to solve new problems is one of the key elements for enhancing deep understanding and learning skills in general (Metcalfe 2009).



시뮬레이션을 이용한 연습

Tip 10: Practice learning by using simulation



실천과 봉사를 통한 학습

Tip 11: Learn by doing and service learning


Students can master learning skills by engaging in activities that are hard enough to demand mental effort but not so hard that they dampen the desire to learn.

There is evidence from research that students could master their learning skills when they engage themselves in activities that are sufficiently difficult to promote mental effort, but not so difficult that they inhibit their desire to learn (Snelgrove 2004).



왜 실천과 봉사를 통해서 배워야 하는가?

Why do students need to learn by doing and service learning? 

Learning by doing and service learning have several educational benefits including: 

  • Fostering responsibility, accountability, and caring for others. 
  • Extending students’ learning from classroom learning to community services. 
  • Enabling students to develop skills that are less likely developed in traditional modes of learning. These skills include contributing to public safety, environmental protection, and public education about common diseases and healthy living habits. 
  • Enabling learners to experience the exact meaning of learning and how service learning could add new dimensions to their learning experience.

환자로부터 배우기

Tip 12 Learn from patients


환자와 효과적으로 의사소통하고, 환자-중심적으로 접근하며, 환자와 라뽀를 쌓는 독특한 기회이다. 그러나 이들 기술의 중요성은 종종 학생들이 간과하며, 일상의 임상교육에서 덜 강조되고는 한다. 아래와 같이 진행되어야 한다. 

It is a unique opportunity to learn how to communicate effectively with patients, become patient-centered in approach, and build rapport with patients. The importance of these skills is often ignored by students and less emphasized in the day-to-day clinical teaching. Therefore, learning from patients in bedside teaching and outpatient clinics brings a number of teaching and learning opportunities (Wang-Cheng et al. 1989; LaCombe 1997). 


These can be summarized as follows:

  • 학생이 직접 경험을 하도록 Allow students to gain first-hand experience of doctor–patient relationships. 
  • 환자-중심 진료를 강화. 단순히 임상진단을 하는 것이 아니라 질병이 인간에게 미치는 영향을 다루고, 질병을 이해하며, 환자가 자신의 상태를 관리하는데 참여할 수 있게 하는 것 Enhance the development of patient-centered care. The aim is not just how to make a clinical diagnosis, but also how to address the human impact of disease, understand illness, and engage the patient to take control of their condition. 
  • 실제 환자에 대한 병력청취와 검진 역량 강화 Enhance students’ competence in taking a medical history and examining real patients. 
  • 의사가 일상적으로 겪는 문제를 경험하게 함 Experience the reality of clinical practice and challenges that could face doctors and patients in day-to-day practice (e.g., this is completely different from experience with a simulated patient).


어떻게 환자에게서 배울 수 있는가?

How can students maximize their learning from patients? 

환자에게도 도움이 되게 해야 하며, 편안하게 해줘야 한다. 서두르지 말라.

Be helpful to patients and attend to their comfort. Do not rush. 

  • 비언어적 큐(cue)의 활용 익히기 Learn how to use nonverbal cues (both giving and receiving) in your communication.
  • 환자와 대화하고, 공감하고, 개방형 질문 사용 Engage the patient in the discussion, convey empathy to the patient’s concerns, and use open-ended questions.
  • 의학전문용어는 사용 지양 Avoid medical jargon; use simple language in a direct manner.
  • 환자의 생각/감정/기대를 이야기함에 있어서 진실되고 솔직하게 Be sincere, honest, and interested in talking to the patient about their ideas, emotions, and expectations rather than just asking them to listen to their hearts or take a medical history.
  • 환자를 보기 전에 동의 구하기. 진찰이 끝나면 잠깐 쉴 수 있는 시간 주기 Ask permission from the patients before examining them. Once you complete your examination, thank them, and leave them to have some rest.
  • 문화적 교육적 차이에 대한 민감성 기르기 Be sensitive to cultural and educational differences in the way you communicate, build rapport, and express a positive attitude.
  • 동료에게 피드백 구하기 Ask a colleague to give you feedback on your communication and approaches with patients.
  • 스스로 성찰하기 Reflect on your experiences with the patients: 
    ‘‘What did you learn?’’ 
    ‘‘What type of challenges did you face?’’
    ‘‘How did you handle these challenges?’’
    ‘‘What will you do differently next time?’’ and
    ‘‘Why?’’ (Lew & Schmidt 2011).
















 2013 Jun;35(6):433-43. doi: 10.3109/0142159X.2013.775413. Epub 2013 Mar 15.

Enhancing learning approaches: practical tips for students and teachers.

Author information

  • 1Department of Medical Education, College of Medicine, King Saud University, Riyadh, Saudi Arabia. azer2000@optusnet.com.au

Abstract

BACKGROUND:

In an integrated curriculum such as problem-based learning (PBL), students need to develop a number of learning skills and competencies. These cannot be achieved through memorization of factual knowledge but rather through the development of a wide range of cognitive and noncognitive skills that enhance deep learning.

AIM:

The aim of this article is to provide students and teachers with learning approaches and learning strategies that enhance deep learning.

METHODS:

We reviewed current literature in this area, explored current theories of learning, and used our experience with medical students in a number of universities to develop these tips.

RESULTS:

Incorporating the methods described, we have developed 12 tips and organized them under three themes. These tips are (1) learn how to ask good questions, (2) use analogy, (3) construct mechanisms and concept maps, (4) join a peer-tutoring group, (5) develop critical thinking skills, (6) use self-reflection, (7) use appropriate range of learning resources, (8) ask for feedback, (9) apply knowledge learnt to new problems, (10) practice learning by using simulation, (11) learn by doing and service learning, and (12) learn from patients.

CONCLUSIONS:

Practicing each of these approaches by students and teachers and applying them in day-to-day learning/teaching activities are recommended for optimum performance.

PMID: 23496121


고등교육 프로그램 평가에 커크패트릭 4단계 평가모형 활용 (Educ Asse Eval Acc, 2010)

Adaptation of Kirkpatrick’s four level model of training criteria to assessment of learning outcomes and program evaluation in Higher Education

Ludmila Praslova







1.1 평가의 정의와 목적

1.1 Definitions and purposes of assessment


평가의 정의

Definitions of assessment


평가는 다양한 상황에서, 다양한 의미로 사용됨. 학생에게 점수를 부여하는 것, 학생의 성취도에 대한 자료를 모아서 프로그램이나 기관 단위의 성취도를 점검하는 것 등. 다양한 수준에서 이뤄진다(교실, 과목, 프로그램, 교육 일반, 기관 등)

The term assessment is used in various contexts and has somewhat different connotations. For example, it is commonly used to describe the processes used to certify individual students or even to award grades (Ewell 2001). On the other hand, for accreditation purposes, assessment refers to the collection and use of aggregated data about student attainment to examine the degree to which program or institution-level learning goals are being achieved (Ewell 2001). Thus, assessment takes place at multiple levels: the classroom, course, program, general education, and institution (Bers 2008).


Ewell은 "평가는 학생이 학업단계마다 무엇을 알고 무엇을 할 수 있는지에 대한 타당하고 신뢰도있는 근거를 수집하는 체계적 방법으로 구성되어 있으며, 학생의 학습성과에 대한 공식-기술에 의해서 좌우된다."

Ewell (2006) suggests that “Assessment comprises a set of systematic methods for collecting valid and reliable evidence of what students know and can do at various stages in their academic careers . . . governed by formal statements of student learning outcomes” (Ewell 2006, p. 10).



평가에 있어서 이해관계자

Stakeholder emphasis on assessment


평가에 있어서 이해관계자의 Interest는 전 세계적으로 높아지고 있다.

Growing stakeholder interest in assessment appears to be a global phenomenon.



기관 차원에서의 피드백으로서의 평가

Assessment as vital institutional feedback



Allen에 따르면, 외부에서 지속적으로 평가를 요구하는 것이 평가의 중요도가 높아지는 한 가지 이유이다. 그러나 더 중요한 이유는 고등교육에서 점차 '가르치는 것'보다 '배우는 것'에 초점을 두어 학생의 성과를 강조하기 때문이다. 따라서 교육 성과에 대한 평가는 외부 이해관계자의 요구를 만족시키기 위해서만 행해지는 것이 아니다. 학습에 대한 평가는 고등교육기관이 핵심 교육 미션에 대한 피드백을 받기 위한 것이다.

According to Allen (2006), an external requirement for continuous assessment is only one of the reasons that underlie the growing importance of assessment. Perhaps an even more important reason is the overall movement of Higher Education toward being learning-focused and emphasizing student outcomes, as opposed to being teaching-focused. Thus, assessment of educational outcomes is not something that should only be done to satisfy external stakeholder requirements. Assessment of student learning is also a way for Institutions of Higher Education to receive feedback regarding the effectiveness of their core educational mission.


시스템스-이론에 따르면 고등교육기관은 환경과 여러가지로 연결되어 있는 open system으로 볼 수 있다.

According to the systems theory, institutions of Higher education, like other social organizations, can be understood as open systems connected to their environment in multiple ways, including input, output, and feedback (Katz and Kahn 1966).


조직 차원이나 기관 차원에서 환경과 어떻게 상호작용하는가에 대한 정보는 피드백의 형태로 나타나며, 이는 어떤 변화가 필요한지를 찾아서 그 변화를 이루고 이를 통해 적절하게 기능하여 그 시스템 내에서 살아남기 위한 것이다.

Information regarding organizational or institutional functioning in relation to the environment, in the form of feedback, is essential to adjustment and making needed changes and thus to proper functioning and, ultimately, to the survival of the system (Katz and Kahn 1966).



1.2 평가 수행에 관한 기관 차원의 어려움

1.2 Institutional struggles with assessment


평가에 대한 기관 내적/외적 중요도에도 불구하고, 여러 대학들은 여전히 평가를 이해하고 평가결과를 교수-학습의 개선을 위해 사용하는 것에 어려움을 겪는다. 또한 학생 성과를 평가하는 기술적 측면에서도 어려움을 겪고 있어 학습준거의 명확화, 적절한 평가도구 선정 등이 그것이다.

Despite both external and internal importance of assessment, many colleges and universities still struggle with understanding assessment and using assessment results to improve learning and teaching (Bers 2008), as well as with technical aspects of assessing student outcomes, including clarification of learning criteria and selecting appropriate measures and instruments (Allen 2006; Bers 2008; Brittingham et al. 2008; Ewell 2001).



1.3 평가를 위한 접근법

1.3 Overview of proposed approach to assessment


4단계 모형은 classic framework이다.

The four level model is a classic framework for assessing training effectiveness in organizational contexts.


Alliger 등은 그것을 보다 정교화했는데, behavior criteria를 transfer criteria라고 명명한다거나, reaction을 affective reaction과 utility judgement로 나눈다거나 하는 것이 그것이다.

Alliger et al. (1997) proposed some augmentation to the framework and further refined terminology and criteria, for example, by referring to behavior criteria as transfer criteria, and by specifying affective reactions and utility judgments as subtypes of reaction criteria.



2 고등교육에서 4단계 모형의 적용

2 Adaptation of the four level model of training evaluation criteria to assessment in Higher Education



Reaction과 Learning은 internal한 것으로, 프로그램 내에서 발생한 것에 초점을 둔다. Behavior와 Result는 프로그램 이후에 발생한 것에 관심을 두므로 external 하다고 본다.

Reaction and learning criteria are considered internal, because they focus on what occurs within the training program. Behavioral and results criteria focus on changes that occur outside (and typically after) the program, and are thus seen as external criteria.


2.1 반응

2.1 Reaction criteria


Alliger는 얼마나 프로그램을 즐겼는가 affective reactions 와 얼마나 배웠다고 생각하나 utility judgments로 나누었다.

Alliger et al. (1997) proposed the distinction between trainee’s reports regarding how much they enjoyed the training (affective reactions) and how much they believe they have learned (utility judgments) within the reaction criteria.


많은 연구자들이 reaction과 다른 나머지 사이에 관련이 부족함을 지적했고, Alliger 등은 affective reaction과 다른 level은 관련이 없고, utility judgement와 다른 level은 약한 상관관계가 있다고 보여주었다.

Many researchers have pointed out the lack of relationship between reaction criteria and the other three levels of criteria (learning, behavior and results), and the meta-analytic study by Alliger et al. (1997) found no relationship between affective reactions and other levels, and only a weak relationship between utility judgments and the other levels of criteria.


그러나 많은 연구자들이 reaction 평가에 대해 유의할 것을 지적했음에도, 가장 흔히 평가하는 것으로 아직 남아있다.

However, despite the fact that many researchers caution against the use of reactions alone for the assessment of learning, reaction level criteria remain the most often assessed (Alliger et al. 1997; Arthur et al. 2003a; Dysvik and Martinsen 2008; Van Buren and Erskine 2002).



2.2 학습

2.2 Learning criteria


학습은 학습성과의 측정이며 다양한 형태의 지식검사를 하거나 훈련프로그램 직후에 수행능력이나 스킬 검사를 한다.

Learning criteria are measures of the learning outcomes, typically assessed by using various forms of knowledge tests, but also by immediate post-training measures of performance and skill demonstration in the training context (Alliger et al. 1997).


Alliger 등은 immediate knowledge, knowledge retention and behavior/skill demonstration 로 나누었지만, 큰 지지를 받지는 못하고 있다.

Alliger et al. (1997) proposed specifying immediate knowledge, knowledge retention and behavior/skill demonstration measured within training as subtypes of learning criteria, but this idea received relatively limited support.


2.3 행동

2.3 Behavioral criteria


Transfer라고도 한다.

Behavioral criteria are also referred to as transfer criteria, a terminology change proposed by Alliger et al. (1997).


조직에서 Behavioral criteria는 supervisor rating이나 수행능력의 객관적 지표에 의해서 평가한다.

In organizations, behavioral criteria are typically operationalized as supervisor ratings or objective indicators of performance such as job outputs (Alliger et al. 1997; Arthur et al. 2003a; Landy and Conte 2007).


비록 Learning과 Behavior가 개념적으로 관련되어 있을 것으로 기대할 수 있으나, 연구 결과는 이 둘 사이에 중등도 상관관계만을 보여준다. 이는 수련-후 환경이 학습한 내용이나 기술을 적용할 기회를 주지 않아서 일 수 있다. 이러한 잠재적 제약은 평가도구 설계, 자료 수집, 자료 해석에 고려되어야 한다.

Although learning criteria and behavioral criteria conceptually are expected to be related, research has found relatively modest relationship between the two (Alliger et al. 1997; Arthur et al. 2003a). This is typically attributed to the fact that post-training environments may or may not provide opportunities for the learned material or skills to be demonstrated (Arthur et al. 2003a). This potential constraint needs to be considered in design of assessment instruments, and in collection and interpretation of behavioral data.


2.4 결과

2.4 Results criteria


Results는 매우 중요하지만 동시에 평가가 매우 어렵기도 하다. 조직 수준의 세팅에서 productivity gains, increased customer satisfaction, increased employee morale following management training, or increase in profitability of organizations  등으로 나타난다. 다른 level들에 비해서 평가 빈도가 현저히 낮다. Alliger는 조직 차원의 제약이 results 단계의 자료 수집을 어렵게 하며, 따라서 sponsor들은 비현실적 기대를 해서는 안된다고 했다.

Results criteria are both highly desirable and most difficult to evaluate. In organizational settings, they are operationalized by productivity gains, increased customer satisfaction, increased employee morale following management training, or increase in profitability of organizations (Arthur et al. 2003a; Landy and Conte 2007). Results are often difficult to estimate and results criteria are used considerably less frequently than assessments of any other level of Kirkpatrick’s model. Alliger et al. (1997) caution that organizational constraints substantially limit opportunities for collecting results data and remind that sponsors of training may have unrealistic expectations with regard to results level outcomes.


고등교육에서 results criteria에는 이해관계자가 다양하게 개입되어 있다.

Results criteria in Higher Education and multiple stakeholders of education


교육으로부터 이득을 보는 적어도 두 개의 집단이 있다. a) 학생, b) 사회.

Thus, it appears that there are at least two parties that are to profit from education: a) the student, who should develop skills useful for the workplace and life in general, and b) the society, which is interested in college graduates who are competent and responsible contributors to local and global communities.


따라서 results criteria에는 광범위한 성과가 포함되어야 한다. 또한 이 대부분은 개인과 사회 모두에 이익이 되어야 한다.

Thus, results criteria in education may include a wide range of outcomes, such as 

  • alumni employment and workplace success, 
  • graduate school admission, 
  • service to underprivileged groups or work to promote peace and justice, 
  • literary or artistic work, 
  • personal and family stability, and 
  • responsible citizenship. 

Moreover, most of these outcomes benefit both the individual and society.
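
The four criteria levels described in sections 2.1 to 2.4, together with the internal/external distinction, can be sketched as a small lookup structure. This is only an illustrative sketch: the names `Scope`, `Criterion`, `FOUR_LEVELS`, and `by_scope` are hypothetical, and the example indicators are a selection taken from the passages above, not an assessment plan from the article.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    INTERNAL = "internal"  # focuses on what occurs within the program
    EXTERNAL = "external"  # focuses on changes outside (typically after) the program


@dataclass(frozen=True)
class Criterion:
    level: int
    name: str
    scope: Scope
    example_indicators: tuple  # illustrative examples drawn from the text above


# Hypothetical encoding of the four level model as summarized in this post.
FOUR_LEVELS = (
    Criterion(1, "reaction", Scope.INTERNAL,
              ("affective reactions", "utility judgments")),
    Criterion(2, "learning", Scope.INTERNAL,
              ("knowledge tests", "skill demonstration in the training context")),
    Criterion(3, "behavior", Scope.EXTERNAL,
              ("supervisor ratings", "objective indicators of performance")),
    Criterion(4, "results", Scope.EXTERNAL,
              ("alumni employment", "graduate school admission")),
)


def by_scope(scope):
    """Return the names of the criteria with the given scope, in level order."""
    return [c.name for c in FOUR_LEVELS if c.scope is scope]


internal_levels = by_scope(Scope.INTERNAL)
external_levels = by_scope(Scope.EXTERNAL)
```

Linking each indicator to a named level in this way makes it easy to see which levels an assessment plan leaves uncovered.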




Alliger, G. M., Tannenbaum, S. I., Bennett, W., Jr., Traver, H., & Shotland, A. (1997). A meta-analysis of relations among training criteria. Personnel Psychology, 50, 341–358.


Katz, D., & Kahn, R. L. (1966). The social psychology of organizations. New York: Wiley.













Adaptation of Kirkpatrick’s four level model of training criteria to assessment of learning outcomes and program evaluation in Higher Education


Ludmila Praslova


Received: 7 July 2009 / Accepted: 11 May 2010 /

Published online: 25 May 2010

Springer Science+Business Media, LLC 2010


Abstract Assessment of educational effectiveness provides vitally important feedback to Institutions of Higher Education. It also provides important information to external stakeholders, such as prospective students, parents, governmental and local regulatory entities, professional and regional accrediting organizations, and representatives of the workforce. However, selecting appropriate indicators of educational effectiveness of programs and institutions is a difficult task, especially when criteria of effectiveness are not well defined. This article proposes a comprehensive and systematic approach to aligning criteria for educational effectiveness with specific indicators of achievement of these criteria by adapting a popular organizational training evaluation framework, the Kirkpatrick’s four level model of training criteria (Kirkpatrick 1959; 1976; 1996), to assessment in Higher Education. The four level model consists of reaction, learning, behavior and results criteria. Adaptation of this model to Higher Education helps to clarify the criteria and create plans for assessment of educational outcomes in which specific instruments and indicators are linked to corresponding criteria. This provides a rich context for understanding the role of various indicators in the overall mosaic of assessment. It also provides Institutions of Higher Education rich and multilevel feedback regarding the effectiveness of their effort to serve their multiple stakeholders. The importance of such feedback is contextualized both in the reality of stakeholder pressures and in theoretical understanding of colleges and universities as open systems according to the systems theory (Katz and Kahn 1966). Although the focus of this article is on Higher Education, core principles and ideas will be applicable to different types and levels of educational programs.


Keywords: Assessment · Evaluation · Program evaluation · Higher Education · Education · Criteria

역진행교실(뒤집힌교실, 거꾸로교실..)을 만드는 12가지 팁 (Med Teach, 2015)

Twelve tips for ‘‘flipping’’ the classroom

JENNIFER MOFFETT

Department of Clinical Sciences, Ross University School of Veterinary Medicine, West Farm, St. Kitts, West Indies







flipped classroom(FlCl)은 전통적인 강의에서의 수업과 숙제 방식을 뒤집은 것이다. 학생은 우선 과목 내용을 미리 제공받고(책을 읽거나, 영상을 보거나, 팟캐스트를 듣거나) 수업시간에는 정보의 단순 제공에서 벗어나 소그룹/능동적 학습활동 등에 활용된다.

The flipped classroom describes an educational approach that reverses the traditional lecture and homework elements of a course. Students are first presented with course material in advance of class: they read a book chapter, watch a video or listen to a podcast. Class time is then freed from simple delivery of information and used for other purposes, notably small group, active learning exercises (Bishop & Verleger 2013).


FlCl의 가장 포괄적인 정의를 본다면, 이 아이디어는 전혀 새로운 것이 아니다. 기록으로만 보면 최초 기록은 1800년대부터 등장하는데, 미국 군사학교에서 General Sylvanus Thayer 가 공학수업에서 수업 전에 self-source 수업내용을 제공한 바 있다.

If you accept the loosest definition of the term ‘‘flipped classroom’’, the idea is neither new nor novel. One of the first recorded examples comes from the early 1800s, when General Sylvanus Thayer instructed engineering students at the US military academy at West Point, New York, NY, USA to self-source content prior to class (Musallam 2011).


Bishop and Verleger 은 기술의 역할을 강조하는 정의를 내렸다. "두 부분으로 이뤄진 교육 테크닉: 수업 내적으로는 상호작용적 그룹 학습활동, 그리고 수업 외적으로는 컴퓨터-기반 개별 학습"

Bishop and Verleger (2013), authors of a comprehensive survey of flipped classroom research, use a description that highlights the role of technology, calling it ‘‘an educational technique that consists of two parts: interactive group learning activities inside the classroom and direct computer-based individual instruction outside the classroom’’.


최근의 연구결과를 보면 여러 장점이 있다.

Current research evidence shows that the flipped classroom has a number of potential advantages. 

  • 개별화된 교육, 근거중심 테크닉 도입 These include increased opportunities to provide individualized education to learners and to incorporate evidence-based teaching techniques into existing courses (Kachka 2012a; Johnson 2013). 
  • 교수자 입장에서 시간의 최적화된 활용 In addition, the approach allows educators to optimize their time; flipped classrooms increase educator–student interaction time as the educator is present when students attempt to analyse and apply new knowledge (Bergmann et al. 2012; Johnson 2013). 
  • 자기주도성 함양, 스스로의 학습에 대한 책임 Educators who have used the approach say that the flipped classroom improves student self-direction and encourages students to take responsibility for their own education (Bergmann et al. 2012). 
  • 학습의 유연성 (속도에 맞는 학습) Students also report enjoying the flipped classroom, particularly the flexibility associated with being allowed to move through material at their own pace (Johnson 2013; Butt 2014).
  • 스스로 동기부여 In addition, the approach requires that students are self-motivated and take responsibility for their education. 
  • 그러나 수업 전 과제를 하지 않거나 교실 내 활동에 참여하지 않으면 Fail. It is a valid concern that the flipped classroom will not support effective learning if students fail to engage with the assigned pre-class or in-class activities (Kachka 2012b).




이미 인정받은 교육 이론과 근거-기반 기술을 활용하라

Tip 1 Use recognized educational theory and evidence-based techniques to drive your flipped classroom


FlCl의 강점이 기술의 활용에 있다는 것은 흔한 착각이다. 예컨대, 강의를 온라인으로 옮긴다거나, 비디오-녹화 발표로 바꾸는 것 등이다.

There’s a common misconception that the strength of the flipped classroom centres on the use of technology, for example moving lecture material into online, video-recorded presentations (Bergmann et al. 2012).


무슨 테크놀로지를 활용하고자 하든, 그것은 등식의 한 부분일 뿐이다. Flip 하기로 결정을 내렸다면, 교육자들은 과목 설계를 위한 필수적 요소를 알고 이를 고려해야 한다. 요구 사정, 내용과 목표 결정, 적절한 교수-평가 방법 등이 그것이다.

The technology that we decide to use, or not, is only part of the equation. When a decision to flip is made, educators should first consider the recognized essential elements of course design. These include conducting needs assessments, determining content and learning outcomes, and selecting appropriate educational and assessment methods (Lockyer et al. 2005).



FlCl의 긍정적 특징을 활용하라
Tip 2 Capitalize on the positive features of the flipped classroom

의학교육자들은 새로운 토픽/방법/사람을 어떻게 기존 과목에 통합시킬 것인지 등을 생각해보게 된다. 예컨대 FLCL 모델은 기존의 방식대로라면 시간이나 지리적 한계로 배제되었을 관련된 외부 전문가를 포함시킬 수 있다. 또 다른 장점은 교육 혁신을 위한 시간과 공간을 확보해준다는 점이다. 내용의 전달이 온라인 환경으로 옮겨간다면, 수업시간은 무수한 근거-기반 교육모델을 활용할 수 있는 장이 된다. (경험학습, TBL, PBL 등)

Medical educators are advised to reflect on how the approach may be used to integrate new topics, methods, and people onto a course, or solve other existing challenges. For example, flipped classroom models have been used successfully to involve external subject experts, who would otherwise have been excluded from being educators, due to limitations of time or geographical location (Wagner et al. 2013). Another recognized advantage of the flipped classroom is its ability to create time and space in an existing curriculum for educational innovations (Kachka 2012a). Once the delivery of content (either whole or partial) has been removed to an online environment, class time now becomes a space to introduce a wide variety of evidence-based educational models, for example experiential learning, team-based learning, and problem-based learning (Kolb & Kolb 2005; Klegeris et al. 2013; Ofstad & Brunner 2013).




수업자료를 어떻게 조직화할 것인지 결정하라

Tip 3 Decide how you want to organize your course material


가장 먼저 내려야 하는 결정은 어떻게 교육자료를 두 가지로 나눌지에 대한 것이다. 즉 어떤 것을 수업 전에 제공할 것이며, 어떤 것을 수업 중에 할 것인가에 대한 것이다. 우리는 Bloom's taxonomy와 같은 교육모델을 활용할 수 있는데, 예컨대 수업 전에는 낮은 수준(지식 습득, 이해) 그리고 수업 중에는 높은 수준(적용과 분석)과 관련된 활동을 할 수 있다.

One of the first decisions that the flipped classroom educator faces is how to divide the course material into two elements: what will be addressed prior to class and what will be addressed during class. We can use educational models such as Bloom’s taxonomy (revised) (Anderson & Krathwohl 2001) to help organize the approach. For example, pre-class activities are used to support lower levels of learner cognitive work (e.g. knowledge and comprehension) and in-class activities are used to facilitate higher levels (e.g. application and analysis).


이 단계는 어떤 수업자료를 우선적으로 제공할지를 결정하는 단계이기도 하다. FLCL 접근법에서 빠지기 쉬운 위험은 온라인에 학습내용을 '덤핑'해놓음으로써 학생들이 정보 과다에 휘말리게 하는 것이다. 과도한 양의 내용은 여러 FLCL 연구에서 조심해야 할 것으로 지적된 바 있다.

This stage of planning is also an appropriate place to consider what course material needs prioritizing. With a flipped classroom approach, educators can encounter the risk of ‘‘dumping’’ content into an online learning environment, resulting in information overload for students. Excessive content has been highlighted as a concern for students in several flipped classroom studies (Johnson 2013; Wagner et al. 2013).


이상적으로는, 수업 전 자료와 수업 중 활동은 기존의 방식에서 강의와 강의 후 숙제에 소요되는 시간과 비슷하거나 더 적어야 한다.

Ideally, the time allowed for pre-class and in-class activities in the flipped classroom should mirror, or be less than, the time used for lectures and post-lecture homework in the traditional classroom.



수업 전 활동 선정에 투자하라

Tip 4 Invest in your choice of pre-class activities


온라인 수업 전 활동을 학생들이 할 수 있으면서, 하고 싶게 만들어야 한다. 예컨대 접근가능성에 신경을 써야 하는데, 온라인으로 제공한다면, 모든 학생이 그 자료에 접속할 수 있는가를 생각해봐야 한다.

It is necessary to design online pre-class activities with which students can, and are willing to, engage (Ellaway & Masters 2008). For example, consideration needs to be given to accessibility; if course material is delivered online, do all students have access to appropriate technology and reliable internet?



FLCL에 대해서 이야기 할 때, 비디오-기반 교육을 많이 이야기하는데 교육자들은 비디오-기반 자료를 만드는 것이 부담스러울 수 있다. 그러나 다행히 Vimeo and iTunes University 와 같은 것들의 도움을 받을 수도 있다.

Much of what is written on the flipped classroom centres on the use of video-based instruction. It is, however, also recognized that educators, particularly those with time constraints or low confidence in using technology, may find the idea of creating video-based course material unappealing (Shimamoto 2012; Snowden 2012; Johnson 2013). Fortunately, there are many ways in which course content can be delivered through video, some of which require little technological expertise such as Vimeo and iTunes University.


그 외에도 Camtasia, ShowMe.  등이 있다.

Other solutions include screen casting applications, for example Camtasia, ShowMe. As with many other facets of the flipped classroom, there is no ‘‘one-size-fits-all’’ solution.


비디오-기반 교육이 유일한, 최선의 방법은 아닐 수 있다.

Video-based instruction is not the only, or necessarily ‘‘best’’, way of delivering content (Figure 1),





VLE를 활용하라

Tip 5 Utilize VLEs to best effect


예컨대 Moodle, Blackboard,  등이 있다.

Modern VLEs, for example Moodle, Blackboard, can be used to support learning


독특한 특징은 학생이 수업자료와 좀 더 능동적 방식으로 상호작용 할 수 있다는 것인데, 학습자가 현실상황과 가까운 프로젝트와 문제해결 상황에 놓임으로써 enquiry-based 학습이 가능하다는 점이다.

A specific feature is that learners can interact with core course material in a more active way, that is learners become involved in authentic projects and problem-solving situations, which hold the concept of enquiry-based learning at their centre (Berge 2002).


이에 더하여, 교육자들은 온라인 상호작용의 형태를 결정해야 한다. 유용한 질문은 다음과 같다.

In addition, educators should define what format online interaction will take. Useful questions include: 

  • 동시적/비동시적 should communication be synchronous or asynchronous? 
  • 교사의 지도/개입 수준 Do discussions require heavy, light or no moderation from educators? 
  • 토론내용의 기록/저장 여부 Are discussions saved and used in any way, for example learners given course credit for participation, and 
  • 의사소통의 방식 how is this communicated?



수업시간을 창의적이고 효과적으로 사용하라

Tip 6 Use class time creatively and effectively


대부분의 FLCL은 수업시간동안 능동적 학습활동을 촉진하는 방식으로 사용되며 다양한 방법이 제시된 바 있다. 이러한 다양성은 오히려 어떤 방법이 가장 효과적인지 결정하는 것을 어렵게 만들기도 한다.
Most flipped classrooms use class time to facilitate active learning exercises, and a wide variety of methods, for example peer support groups, case-based learning, and experiential learning, has been described in the literature. This variety makes it difficult to draw any firm conclusions on what methods work best, and this has been highlighted as an area that needs further research (Bishop & Verleger 2013).


무슨 방법을 쓰든 FLCL은 일반적으로 교사-학생 상호작용을 증진시킨다. 실제 상황에서 이것은 학생이 문제를 해결해가는 과정에서 지지를 얻고, 모호한 부분에 대한 명확한 설명을 받음을 의미하고, 교수자에게는 수업시간 활동 중에 학생들이 어떤 것을 혼란스러워하는지 실시간으로 피드백 받을 수 있다.

Whatever educational methods are adopted, the flipped classroom normally allows for increased educator–student interaction during lectures. On a practical level, this means that students can get support and clarification as they work through problems, whilst educators get real-time feedback on the in-class activities and what specific topics cause confusion for the students.




학습자의 요구에 맞춘 교육을 하라

Tip 7 Utilize the flipped classroom to tailor education to your learners’ needs


FLCL에서는 교육 테크놀로지를 활용하여 학습자의 요구에 맞춘 교육을 할 수 있다. 예컨대 온라인 상호작용 활동 (토론, 퀴즈, 컴퓨터-도움 학습 모듈)을 활용하여 학생의 참여와 이해수준에 관한 정보를 얻을 수 있다.

Educational technology can be harnessed in flipped classrooms to tailor education to learners’ needs. For example, educators can use online interactive exercises such as discussions, quizzes, and computer-assisted learning modules to gather rich information about student engagement and understanding (Cooper 2000).


예컨대 1학년 수업에서 뇌의 육안해부학에 대해서 비디오 발표를 한다고 하면, 발표 후에 온라인으로 짧은 퀴즈를 풀게 할 수 있다. 만약 대부분 영역에서 잘 하는데, 소뇌 관련 문제에서 어려움을 겪는 것으로 보인다면, 이는 교수자에게 중요한 피드백이 된다. 이를 바탕으로 수업 전 활동이나 수업 중 활동을 수정할 수 있다.

For example, a class of first year medical students view a video presentation on the gross anatomy of the brain. After the presentation, they are required to complete a short online quiz on the material covered in the presentation. A large proportion of the class performs well but has difficulty with a series of questions relating specifically to the cerebellum. This gives the educator important feedback, on the pre-class activity (‘‘Do I need to change or provide more information about the cerebellum during the online part of the course?’’) and/or the students (‘‘They need more clarity on the cerebellum. How will I address this in class?’’).


FLCL은 더 관심을 기울여야 하는 학생을 찾는데도 도움을 준다.

The flipped classroom also facilitates the identification of individual students that may need extra attention.
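
The quiz example above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a method from the article: the response format `(student, topic, is_correct)`, the function names, the sample data, and the cut-off values 0.6 and 0.5 are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def topic_accuracy(responses):
    """Return {topic: fraction of correct answers} across all students."""
    correct, total = defaultdict(int), defaultdict(int)
    for _student, topic, is_correct in responses:
        total[topic] += 1
        correct[topic] += int(is_correct)
    return {topic: correct[topic] / total[topic] for topic in total}


def student_accuracy(responses):
    """Return {student: fraction of correct answers} across all topics."""
    correct, total = defaultdict(int), defaultdict(int)
    for student, _topic, is_correct in responses:
        total[student] += 1
        correct[student] += int(is_correct)
    return {student: correct[student] / total[student] for student in total}


def flag_below(scores, threshold):
    """Return the sorted keys whose score falls below the threshold."""
    return sorted(key for key, score in scores.items() if score < threshold)


# Hypothetical pre-class quiz on the gross anatomy of the brain:
# one question per topic for each of four students.
responses = [
    ("s1", "cerebrum", True), ("s1", "brainstem", True), ("s1", "cerebellum", False),
    ("s2", "cerebrum", True), ("s2", "brainstem", True), ("s2", "cerebellum", False),
    ("s3", "cerebrum", True), ("s3", "brainstem", False), ("s3", "cerebellum", False),
    ("s4", "cerebrum", True), ("s4", "brainstem", True), ("s4", "cerebellum", True),
]

# Topics the class as a whole finds difficult (assumed cut-off: 60% correct).
weak_topics = flag_below(topic_accuracy(responses), 0.6)
# Individual students who may need extra attention (assumed cut-off: 50% correct).
struggling_students = flag_below(student_accuracy(responses), 0.5)

print(weak_topics)          # ['cerebellum']
print(struggling_students)  # ['s3']
```

The same response data serves both purposes described in this tip: the per-topic view suggests what to revisit in the pre-class material or in class, and the per-student view suggests who may need additional support.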



FLCL로 전환하는 타임라인에 신경쓰라

Tip 8 Be aware of the timelines involved with converting to a flipped classroom


FLCL로 전환하는데 있어서 가장 우려하는 것 중 하나는 이 때 필요한 시간과 작업량이다. 이는 타당하다. FLCL이 성공하려면 교수자는 새로운 기술을 익히고 적용해야 하며, 교육자료를 제시할 효과적인 방법을 찾아야 한다. 그러나 대부분의 시간투자는 초기에만 필요한 것이며 일단 익히고 나면 그 다음부터는 큰 부담이 되지 않는다.

One of the main concerns that educators have about converting to a flipped classroom is the amount of time and work involved (Snowden 2012). This is a legitimate concern; for the flipped classroom to succeed, educators require time to learn and incorporate new technologies, and devise effective ways to present course material (Shimamoto 2012; Snowden 2012). It should also be recognized, however, that most of the initial time outlay involved in flipping a classroom is once-off in nature. 


추가로, FLCL이 효과적으로 기능하면 전체적인 강의 시간과 office hour가 모두 줄게 된다.

In addition, the flipped classroom, if functioning effectively, can result in reduced overall lecture time and office hours for educators (Wagner et al. 2013).



FLCL을 제공하는 교수들에게 적절한 트레이닝을 제공하라

Tip 9 Offer training to those involved in delivering a flipped classroom course


교육 준비도는 FLCL의 성공에 중요한 요소이다. 만약 교육자들이 스스로 능력이 있다고 느끼지 못하거나 열정이 없다고 느끼면 잘 되지 않는다.

Educator ‘‘readiness’’ is an important factor in the success of a flipped classroom course; if educators do not feel capable, or enthusiastic, to flip then it is unlikely to work (Shimamoto 2012; Snowden 2012).


Education research recommends teaching instructors how to use the technology and how to apply evidence-based teaching. The training should show examples of the flipped classroom used in the instructor's own field, which builds confidence. It should also explain the key features of the flipped classroom so that educators see the value of the change.

Educational researchers advise that instructors be shown how to use new technologies and incorporate evidence-based teaching into their courses (Shimamoto 2012). Training should also provide educators with worked examples of how the flipped classroom can be applied within their own area of expertise, as this has been recognized as a confidence-building step (Shimamoto 2012). Training sessions can also be used to inform educators about the key features of the flipped classroom so that they can see value in making the change.




Prepare the students

Tip 10 Prepare your students


Learners need support too.

It is also likely that learners need support in transitioning to a flipped classroom.


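In a study of US college-level students, Strayer found that students in the flipped classroom were less satisfied with how the structure of the class connected to the learning tasks. The analysis showed that the variety of learning activities in the flipped classroom caused an unsettledness among students (a feeling of being lost) that students in traditional classes did not experience.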

In a study of US college-level students, Strayer (2007) found that: ‘‘students in the flip(ped) classroom were less satisfied with how the structure of the class oriented them to the learning tasks in the course. The analysis showed that the variety of learning activities in the flip(ped) classroom contributed to an unsettledness amongst students (a feeling of being ‘‘lost’’) that students in the traditional classroom did not experience’’.


It is a good idea to explain to learners the rationale for using the flipped classroom.

It may be useful to provide learners with the rationale for using the flipped classroom in a medical education setting,



Decide how you will evaluate the flipped classroom

Tip 11 Decide on how you will evaluate your flipped classroom approach

Most of the literature on the flipped classroom concerns student perceptions; few studies report the results of objective assessment.

Much of what appears in the literature about the flipped classroom reports on student perceptions of the approach, rather than objective assessment of student performance (Bishop & Verleger 2013).



Remember that a flip is not 'all-or-nothing'

Tip 12 Remember that a flip does not have to be ‘‘all-or-nothing’’


The flipped classroom can be applied to just one small topic. In fact, students like courses that have both traditional and flipped portions, and one study reports that a ratio of 30% flipped classroom to 70% traditional class is appropriate.

Flipped classroom techniques can be incorporated around single topics or modules; indeed there is evidence to suggest that students prefer courses that are divided into both traditional and flipped classroom portions. In one study (Wagner et al. 2013), engineering students indicated that a balance of 30% flipped classroom and 70% traditional classroom represented the optimum balance.















 2015 Apr;37(4):331-6. doi: 10.3109/0142159X.2014.943710. Epub 2014 Aug 26.

Twelve tips for "flipping" the classroom.

Author information

  • 1Department of Clinical Sciences, Ross University School of Veterinary Medicine , West Farm, St. Kitts , West Indies.

Abstract

The flipped classroom is a pedagogical model in which the typical lecture and homework elements of a course are reversed. The following tips outline the steps involved in making a successful transition to a flipped classroom approach. The tips are based on the available literature alongside the author's experience of using the approach in a medical education setting. Flipping a classroom has a number of potential benefits, for example increased educator-student interaction, but must be planned and implemented carefully to support effective learning.

PMID: 25154646


Validity: a tool for the meaningful interpretation of assessment results (Med Educ, 2003)

Validity: on the meaningful interpretation of assessment data

Steven M Downing





Validity is the evidence for supporting or refuting the meaning or interpretation attached to assessment results. Every assessment requires validity evidence, and nearly every topic in assessment relates to validity in some way. Validity is an indispensable element of assessment; without it, an assessment has little or no meaning.

Validity refers to the evidence presented to support or refute the meaning or interpretation assigned to assessment results. All assessments require validity evidence and nearly all topics in assessment involve validity in some way. Validity is the sine qua non of assessment, as without evidence of validity, assessments in medical education have little or no intrinsic meaning.


Validity is always approached in the form of a hypothesis. The interpretive meaning the assessor expects is linked to the assessment results, data are collected on the basis of this initial hypothesis, and the outcome supports or refutes the validity hypothesis. Under this concept, assessment data can be more or less valid for a specific purpose, meaning or interpretation, and this may hold only at a particular time or for a particular population. An assessment by itself can never be called 'valid' or 'invalid'; one can only say that, at a particular point in time, scientifically sound evidence exists to support or refute an interpretation of the assessment scores.

Validity is always approached as hypothesis, such that the desired interpretative meaning associated with assessment data is first hypothesized and then data are collected and assembled to support or refute the validity hypothesis. In this conceptualization, assessment data are more or less valid for some very specific purpose, meaning or interpretation, at a given point in time and only for some well-defined population. The assessment itself is never said to be ‘valid’ or ‘invalid’; rather, one speaks of the scientifically sound evidence presented to either support or refute the proposed interpretation of assessment scores, at a particular time period in which the validity evidence was collected.


The current conceptual view is that validity is a unitary concept that considers multiple sources of evidence. The sources of evidence are usually suggested logically by the intended interpretation or meaning. In the current framework all validity is construct validity, as Messick described most elegantly and as the Standards embody. In the past, validity was divided into three types: content, criterion and construct. Criterion-related validity was often subdivided into concurrent and predictive, depending on when the criterion data were collected.

In its contemporary conceptualization,1,3–14 validity is a unitary concept, which looks to multiple sources of evidence. These evidentiary sources are typically logically suggested by the desired types of interpretation or meaning associated with measures. All validity is construct validity in this current framework, described most eloquently by Messick8 and embodied in the current Standards of Educational and Psychological Measurement.1 In the past, validity was defined as three separate types: content, criterion and construct, with criterion-related validity usually subdivided into concurrent and predictive depending on the timing of the collection of the criterion data.2,15


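Why has construct validity become the only type of validity? The complex answer lies in the philosophy of science: making meaningful and logical inferences to a domain or a larger population involves countless interrelated webs of inference tied to the sampling of content.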

Why is construct validity now considered the sole type of validity? The complex answer is found in the philosophy of science8 from which, it is posited, there are many complex webs of inter-related inference associated with sampling content in order to make meaningful and reasonable inferences to a domain or larger population of interest.


The more direct answer is that nearly all assessment belongs to the social sciences, and medical education is no exception. Such assessments deal with constructs: intangible collections of abstract concepts and principles that are inferred from behavior and explained by educational or psychological theory. Educational achievement is also a construct, inferred from written tests over a well-defined domain of knowledge, oral examinations on specific problems or cases, and standardized patient examinations of history-taking and communication.

The more straightforward answer is: Nearly all assessments in the social sciences, including medical education, deal with constructs – intangible collections of abstract concepts and principles which are inferred from behavior and explained by educational or psychological theory. Educational achievement is a construct, usually inferred from performance on assessments such as written tests over some well-defined domain of knowledge, oral examinations over specific problems or cases in medicine, or highly structured standardized patient examinations of history-taking or communication skills.


Educational ability or aptitude is another familiar example of a construct. It is even more intangible and abstract than achievement, because educators and psychologists agree less about it. Tests of educational ability such as the MCAT are used heavily for medical school admission in North America, so supporting the use of the MCAT requires scientifically sound evidence from multiple sources. An important source of validity evidence is how well MCAT scores predict achievement after admission to medical school.

Educational ability or aptitude is another example of a familiar construct – a construct that may be even more intangible and abstract than achievement because there is less agreement about its meaning among educators and psychologists.16 Tests that purport to measure educational ability, such as the Medical College Admissions Test (MCAT), which is relied on heavily in North America for selecting prospective students for medical school admission, must present scientifically sound evidence, from multiple sources, to support the reasonableness of using MCAT test scores as one important selection criterion for admitting students to medical school. An important source of validity evidence for an examination such as the MCAT is likely to be the predictive relationship between test scores and medical school achievement.


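Validity requires an evidentiary chain that links the interpretation of assessment scores to the theory, hypotheses and logic that support or refute the reasonableness of the intended interpretation. Validity can never be taken for granted; it requires continually forming hypotheses, collecting and testing data, evaluating critically and inferring logically. The validity argument relates theory and empirical evidence to indicate which particular interpretations are reasonable and which are not.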

Validity requires an evidentiary chain which clearly links the interpretation of the assessment scores or data to a network of theory, hypotheses and logic which are presented to support or refute the reasonableness of the desired interpretations. Validity is never assumed and is an ongoing process of hypothesis generation, data collection and testing, critical evaluation and logical inference. The validity argument11,12 relates theory, predicted relationships and empirical evidence in ways to suggest which particular interpretative meanings are reasonable and which are not reasonable for a specific assessment use or application.


To interpret scores meaningfully, some assessments, such as achievement tests of knowledge, may need fairly straightforward evidence: content-related evidence of the adequacy of the test content, the reproducibility of scores, the statistical quality of items and the grounds for the passing score or grades. Other kinds of assessment, such as performance examinations, need different evidence.

In order to meaningfully interpret scores, some assessments, such as achievement tests of cognitive knowledge, may require fairly straightforward content-related evidence of the adequacy of the content tested (in relationship to instructional objectives), statistical evidence of score reproducibility and item statistical quality and evidence to support the defensibility of passing scores or grades. Other types of assessments, such as complex performance examinations, may require both evidence related to content and considerable empirical data demonstrating the statistical relationship between the performance examination and other measures of medical ability, the generalizability of the sampled cases to the population of skills, the reproducibility of the score scales, the adequacy of the standardized patient training and so on.



Typical sources of validity evidence, which vary with the purpose of the assessment and the intended interpretation, include the following.

Some typical sources of validity evidence, depending on the purpose of the assessment and the desired interpretation are: 

  • evidence of the content representativeness of the test materials, 
  • the reproducibility and generalizability of the scores, 
  • the statistical characteristics of the assessment questions or performance prompts, 
  • the statistical relationship between and among other measures of the same (or different but related) constructs or traits, 
  • evidence of the impact of assessment scores on students and 
  • the consistency of pass–fail decisions made from the assessment scores.


The higher the stakes of an assessment (licensure, certification and so on), the greater the need to gather validity evidence from more varied sources and to re-evaluate it continually.

The higher the stakes associated with assessments, the greater the requirement for validity evidence from multiple sources, collected on an ongoing basis and continually re-evaluated.17 The ongoing documentation of validity evidence for a very high-stakes testing programme, such as a licensure or medical specialty certification examination, may require the allocation of many resources and the contributions of many different professionals with a variety of skills – content specialists, psychometricians and statisticians, test editors and administrators. 




Sources of construct validity evidence

Sources of evidence for construct validity


According to the Standards, validity is the degree to which evidence and theory support the intended use of test score interpretations. The current Standards fully reflect the unitary view that all validity is construct validity. Here validity is a process of defining the construct carefully and gathering and assembling data and evidence to support or refute a very specific interpretation. Historically, the methods of validation and the evidence associated with construct validity rest largely on early work by Cronbach, Cronbach and Meehl, and Messick. The earliest unitary conception dates back to Loevinger's 1957 paper. Kane places validity in the context of an interpretive argument that must be established for each assessment.

According to the Standards: ‘Validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests’1 (p. 9). The current Standards1 fully embrace this unitary view of validity, following closely on Messick’s work8,9 that considers all validity as construct validity, which is defined as an investigative process through which constructs are carefully defined, data and evidence are gathered and assembled to form an argument either supporting or refuting some very specific interpretation of assessment scores.11,12 Historically, the methods of validation and the types of evidence associated with construct validity have their foundations on much earlier work by Cronbach,3–5 Cronbach and Meehl6 and Messick.7 The earliest unitary conceptualization of validity as construct validity dates to 1957 in a paper by Loevinger.18 Kane11–13 places validity into the context of an interpretive argument, which must be established for each assessment; Kane’s work has provided a useful framework for validity and validation research. 



The Standards


Five sources of evidence

The Standards1 discuss five distinct sources of validity evidence (Table 1)


Depending on the type of assessment, one source of validity evidence is often emphasized over another.

Some types of assessment demand a stronger emphasis on one or more sources of evidence as opposed to other sources and not all sources of data or evidence are required for all assessments. 

  • For example, a written, objectively scored test covering several weeks of instruction in microbiology, might emphasize content-related evidence, together with some evidence of response quality, internal structure and consequences, but very likely would not seek much or any evidence concerning relationship to other variables.
  • On the other hand, a high-stakes summative Objective Structured Clinical Examination (OSCE), using standardized patients to portray and rate student performance on an examination that must be passed in order to proceed in the curriculum, might require all of these sources of evidence.





Sources of validity evidence for example assessments


Scores by themselves have no meaning. The evidence must therefore make a logical case that the scores from a particular assessment can be interpreted in the intended way.

The scores have little or no intrinsic meaning; thus the evidence presented must convince the skeptic that the assessment scores can reasonably be interpreted in the proposed manner.



Content validity evidence

Content evidence


• Examination blueprint

• Representativeness of test blueprint to achievement domain

• Test specifications

• Match of item content to test specifications

• Representativeness of items to domain

• Logical/empirical relationship of content tested to achievement domain

• Quality of test questions

• Item writer qualifications

• Sensitivity review

For a written examination, validity evidence related to content is the most essential. It is found in the blueprint or test specifications.

For the written assessment, documentation of validity evidence related to the content tested is the most essential. The outline and plan for the test, described by a detailed test blueprint or test specifications, clearly relates the content tested by the 250 MCQs to the domain of the basic sciences as described by the course learning objectives. The test blueprint is sufficiently detailed to describe subcategories and subclassifications of content and specifies precisely the proportion of test questions in each category and the cognitive level of those questions. The blueprint documentation shows a direct linkage of the questions on the test to the instructional objectives. 


Independent content experts can judge whether the test blueprint is reasonable. The relationship between the test questions and the major learning objectives and teaching/learning activities should be clear. If most learning objectives are at the application or problem-solving level, the test questions should match that cognitive level.

Independent content experts can evaluate the reasonableness of the test blueprint with respect to the course objectives and the cognitive levels tested. The logical relationship between the content tested by the 250 MCQs and the major instructional objectives and teaching/learning activities of the course should be obvious and demonstrable, especially with respect to the proportionate weighting of test content to the actual emphasis of the basic science courses taught. Further, if most learning objectives were at the application or problem-solving level, most test questions should also be directed to these cognitive levels.


The quality of the test questions is also a source of content-related validity evidence.

The quality of the test questions is a source of content-related validity evidence. 

    • Are the MCQs based on effective item-writing principles?
      Do the MCQs adhere to the best evidence-based principles of effective item-writing?19 
    • Are the item-writers qualified as content experts?
      Are the item-writers qualified as content experts in the disciplines? 
    • Are there enough questions?
      Are there sufficient numbers of questions to adequately sample the large content domain? 
    • Are the questions worded clearly and without flaws?
      Have the test questions been edited for clarity, removing all ambiguities and other common item flaws?
    • Have the questions been reviewed for cultural sensitivity?
      Have the test questions been reviewed for cultural sensitivity? 


The SP examination raises the same content issues. If there are 10 SP cases, those 10 must be representative, for example of primary care ambulatory encounters.

For the SP performance examination, some of the same content issues must be documented and presented as validity evidence. 


For example, each of the 10 SP cases fits into a detailed content blueprint of ambulatory primary care history and physical examination skills. There is evidence of faculty content–expert agreement that these specific 10 cases are representative of primary care ambulatory cases. Ideally, the content of the 10 clinical cases is related to population demographic data and population data on disease incidence in primary care ambulatory settings. 


There should also be evidence that clinical experts jointly wrote, reviewed and revised the SP cases, including the checklists and rating scales. It also matters that the SP cases were well edited, that detailed SP training guidelines and criteria were prepared and reviewed by experts, and that experienced SP trainers implemented them.

Evidence is documented that expert clinical faculty have created, reviewed and revised the SP cases together with the checklists and ratings scales used by the SPs, while other expert clinicians have reviewed and critically critiqued the SP cases. Exacting specifications detail all the essential clinical information to be portrayed by the SP. Evidence that SP cases have been competently edited and that detailed SP training guidelines and criteria have been prepared, reviewed by faculty experts and implemented by experienced SP trainers are all important sources of content-related validity evidence.


During the SP examination, the SP portrayals must also be monitored closely so that all students experience nearly the same case. When different SPs portray the same case, they should rate students the same way.

There is documentation that during the time of SP administration, the SP portrayals are monitored closely to ensure that all students experience nearly the same case. Data are presented to show that a different SP, trained on the same case, rates student case performance about the same. Many basic quality-control issues concerning performance examinations contribute to the content-related validity evidence for the assessment.20





Response process evidence

Response process


• Student format familiarity

• Quality control of electronic scanning/scoring

• Key validation of preliminary scores

• Accuracy in combining different formats scores

• Quality control/accuracy of final scores/marks/grades

• Subscore/subscale analyses:

• Accuracy of applying pass-fail decision rules to scores

• Quality control of score reporting to students/faculty

• Understandable/accurate descriptions/interpretations of scores for students


Response process may look odd as a source of validity evidence. Here, response process means whether all sources of error related to test administration have been controlled or eliminated as far as possible.

As a source of validity evidence, response process may seem a bit strange or inappropriate. Response process is defined here as evidence of data integrity such that all sources of error associated with the test administration are controlled or eliminated to the maximum extent possible. Response process has to do with aspects of assessment such as ensuring 

  • Accuracy of responses
    the accuracy of all responses to assessment prompts, 
  • Quality control of the data flow in the assessment
    the quality control of all data flowing from assessments, 
  • Appropriateness of the method for combining various assessment scores into one score
    the appropriateness of the methods used to combine various types of assessment scores into one composite score and 
  • Usefulness and accuracy of the scores reported to examinees
    the usefulness and the accuracy of the score reports provided to examinees.


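For a written examination, it is important to document all materials about test administration, the information about the test and the instructions given to students. Documentation of every quality-control procedure used to ensure the absolute accuracy of test scores is important evidence. One example is the final key validation after preliminary scoring, which confirms the accuracy of the scoring key and removes poorly performing items from final scoring.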

For evidence of response process for the written comprehensive examination, documentation of all practice materials and written information about the test and instructions to students is important. Docu- mentation of all quality-control procedures used to ensure the absolute accuracy of test scores is also an important source of evidence: the final key validation after a preliminary scoring – to ensure the accuracy of the scoring key and eliminate from final scoring any poorly performing test items; a rationale for any combining rules, such as the combining into one final composite score of MCQ, multiple true–false and short-essay question scores.


For the SP examination, there must be data showing the accuracy of the SP ratings, along with the score calculation and reporting methods and their rationale, especially the explanatory materials on the appropriate interpretation of performance-assessment scores.

For the SP performance examination, many of the same response process sources may be presented as validity evidence. For a performance examination, documentation demonstrating the accuracy of the SP rating is needed and the results of an SP accuracy study are a particularly important source of response process evidence. Basic quality control of the large amounts of data from an SP performance examination is important to document, together with information on score calculation and reporting methods, their rationale and, particularly, the explanatory materials discussing an appropriate interpretation of the performance-assessment scores (and their limitations).


Evidence of the rationale for choosing between global ratings and checklist ratings.

Documentation of the rationale for using global versus checklist rating scores, for example, may be an important source of response evidence for the SP examination. Or, the empirical evidence and logical rationale for combining a global rating-scale score with checklist item scores to form a composite score may be one very important source of response evidence.




Internal structure evidence

Internal structure


• Item analysis data:

1. Item difficulty/discrimination

2. Item/test characteristic curves (ICCs/TCCs)

3. Inter-item correlations

4. Item-total correlations

• Score scale reliability

• Standard errors of measurement (SEM)

• Generalizability

• Dimensionality

• Item factor analysis

• Differential Item Functioning (DIF)

• Psychometric model


It concerns the statistical and psychometric characteristics.

Internal structure, as a source of validity evidence, relates to the statistical or psychometric characteristics of the examination questions or performance prompts, the scale properties – such as reproducibility and generalizability, and the psychometric model used to score and scale the assessment.


Item analysis

Many of the statistical analyses needed to support or refute evidence of the test’s internal structure are often carried out as routine quality-control procedures. Analyses such as item analyses – which compute 

  • Difficulty: the difficulty (or easiness) of each test question (or performance prompt), 
  • Discrimination: the discrimination of each question (a statistical index indicating how well the question separates the high scoring from the low scoring examinees) and 
  • Proportion of examinees choosing each option: a detailed count of the number or proportion of examinees who responded to each option of the test question, 

are completed.
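A minimal Python sketch of these three item statistics follows. The response data and answer key are invented for illustration, and the discrimination index compares the upper and lower halves of the class (a simplification; operational programmes often use other group splits or a point-biserial correlation). At least two examinees are assumed.

```python
from collections import Counter

# Hypothetical data: each row holds one examinee's chosen options for 4 MCQs.
responses = [
    ["A", "C", "B", "D"],
    ["A", "C", "B", "A"],
    ["A", "B", "B", "D"],
    ["B", "C", "D", "A"],
    ["A", "A", "C", "B"],
    ["C", "B", "A", "C"],
]
answer_key = ["A", "C", "B", "D"]


def item_analysis(responses, key):
    """Return difficulty, upper-lower discrimination and option counts per item."""
    scored = [[int(choice == k) for choice, k in zip(row, key)] for row in responses]
    totals = [sum(row) for row in scored]
    # Rank examinees by total score, then split into upper and lower halves.
    order = sorted(range(len(totals)), key=lambda i: totals[i], reverse=True)
    half = len(order) // 2
    upper, lower = order[:half], order[-half:]
    results = []
    for j in range(len(key)):
        difficulty = sum(row[j] for row in scored) / len(scored)  # proportion correct
        p_upper = sum(scored[i][j] for i in upper) / half
        p_lower = sum(scored[i][j] for i in lower) / half
        results.append({
            "difficulty": difficulty,
            "discrimination": p_upper - p_lower,
            "options": Counter(row[j] for row in responses),
        })
    return results


for number, stats in enumerate(item_analysis(responses, answer_key), start=1):
    print(number, stats)
```

A discrimination value near zero or below zero marks an item worth reviewing before final scoring.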


Reliability is an important aspect of validity evidence. Without reliability there is no validity.

Reliability is an important aspect of an assessment’s validity evidence. Unless assessment scores are reliable and reproducible (as in an experiment) it is nearly impossible to interpret the meaning of those scores – thus, validity evidence is lacking.
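One common way to estimate score reliability is an internal-consistency coefficient. The sketch below computes Cronbach's alpha, which equals KR-20 when items are scored 0/1. The article does not prescribe a particular coefficient, so treat the choice of alpha, the function name and the data as assumptions.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """scores: one row per examinee, one column per item (numeric item scores)."""
    k = len(scores[0])
    item_variances = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 0/1 item scores for six examinees on four items.
item_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(item_scores), 3))
```

The function assumes at least two items and some variation in total scores; otherwise the ratio is undefined.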


The reproducibility of the pass–fail decision is very important. If the ultimate outcome of the assessment (pass or fail) is not reproducible at a sufficiently high level, the test scores cannot be interpreted meaningfully.

In both example assessments described above, in which the stakes are high and a passing score has been established, the reproducibility of the pass–fail decision is a very important source of validity evidence. That is, analogous to score reliability, if the ultimate outcome of the assessment (passing or failing) cannot be reproduced at some high level of certainty, the meaningful interpretation of the test scores is questionable and validity evidence is compromised.
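As a rough illustration of pass–fail reproducibility, the sketch below compares the decisions from two administrations (for example, two parallel forms) and reports raw agreement and Cohen's kappa. The scores, the cut score of 60 and the two-form design are assumptions; single-administration estimates of decision consistency need more specialized methods.

```python
def decision_consistency(scores_a, scores_b, cut):
    """Return (proportion of agreeing pass-fail decisions, Cohen's kappa)."""
    pass_a = [s >= cut for s in scores_a]
    pass_b = [s >= cut for s in scores_b]
    n = len(pass_a)
    observed = sum(a == b for a, b in zip(pass_a, pass_b)) / n
    rate_a, rate_b = sum(pass_a) / n, sum(pass_b) / n
    # Agreement expected by chance if the two decisions were independent.
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    kappa = (observed - expected) / (1 - expected)  # undefined if expected == 1
    return observed, kappa


# Hypothetical scores of the same eight examinees on two parallel forms.
form_a = [72, 65, 58, 80, 49, 61, 55, 90]
form_b = [70, 59, 62, 78, 52, 66, 50, 85]
agreement, kappa = decision_consistency(form_a, form_b, cut=60)
print(agreement, round(kappa, 3))
```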


Performance assessments such as the SP examination use a special type of reliability derived from generalizability theory.

For performance examinations, such as the SP example, a very specialized type of reliability, derived from generalizability theory (GT)21,22 is an essential component of the internal structure aspect of validity evidence. GT is concerned with how well the specific samples of behaviour (SP cases) can be generalized to the population or universe of behaviours. 


GT is useful for finding the sources of error.

GT is also a useful tool for estimating the various sources of contributed error in the SP exam, such as error due to the SP raters, error due to the cases (case specificity), and error associated with examinees. As rater error and case specificity are major threats to meaningful interpretation of SP scores, GT analyses are important sources of validity evidence for most performance assessments such as OSCEs, SP exams and clinical performance examinations. 
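The simplest G study can be sketched as follows, assuming a fully crossed persons × cases design with one score per cell. Raters are not modelled here, so rater error is folded into the residual; a real SP examination would normally add a rater facet. The function name and data are illustrative only.

```python
def g_study(scores):
    """scores[p][c]: score of person p on case c (fully crossed, one score per cell).

    Returns variance components and the relative G coefficient for n_c cases.
    """
    n_p, n_c = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_p * n_c)
    person_means = [sum(row) / n_c for row in scores]
    case_means = [sum(row[c] for row in scores) / n_p for c in range(n_c)]
    ss_p = n_c * sum((m - grand) ** 2 for m in person_means)
    ss_c = n_p * sum((m - grand) ** 2 for m in case_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_total - ss_p - ss_c
    ms_p = ss_p / (n_p - 1)
    ms_c = ss_c / (n_c - 1)
    ms_res = ss_res / ((n_p - 1) * (n_c - 1))
    var_p = max((ms_p - ms_res) / n_c, 0.0)  # true differences between persons
    var_c = max((ms_c - ms_res) / n_p, 0.0)  # differences in case difficulty
    var_res = max(ms_res, 0.0)               # person-by-case interaction plus error
    denominator = var_p + var_res / n_c
    g = var_p / denominator if denominator > 0 else float("nan")
    return {"var_p": var_p, "var_c": var_c, "var_res": var_res, "g": g}


# Hypothetical checklist scores: 4 examinees x 3 SP cases.
sp_scores = [
    [7, 5, 8],
    [4, 6, 5],
    [9, 8, 9],
    [5, 3, 6],
]
print(g_study(sp_scores))
```

A large residual component relative to the person component is the numerical signature of case specificity: adding cases (a larger n_c) is then the usual way to raise the G coefficient.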


When complex statistical measurement models such as IRT are used, the measurement model itself is evidence of internal structure and construct validity, along with the factor structure, inter-item correlations and other structural characteristics.

For some assessment applications, in which sophisticated statistical measurement models like Item Response Theory (IRT) models23,24 are used, the measurement model itself is evidence of the internal structure aspect of construct validity. In IRT applications, which might be used for tests such as the comprehensive written examination example, the factor structure, item-intercorrelation structure and other internal structural characteristics all contribute to validity evidence.


Issues of bias and fairness are also important. Every assessment is given to varied groups, so statistical bias is possible. Bias analyses such as differential item functioning (DIF) and sensitivity reviews of items or performance prompts are both internal structure validity evidence.

Issues of bias and fairness also pertain to internal test structure and are important sources of validity evidence. All assessments, presented to heterogeneous groups of examinees, have the potential of validity threats from statistical bias. Bias analyses, such as differential item functioning (DIF)25,26 analyses and the sensitivity review of item and performance prompts are sources of internal structure validity evidence. Documentation of the absence of statistical test bias permits the desired score interpretation and therefore adds to the validity evidence of the assessment.


Evidence of relationship to other variables

Relationship to other variables


• Correlation with other relevant variables
• Convergent correlations - internal/external:
1. Similar tests
• Divergent correlations-internal/external
1. Dissimilar measures
• Test-criterion correlations
• Generalizability of evidence

This is the typical design of a 'validity study': comparing a new measure with an existing one.
This familiar source of validity evidence is statistical and correlational. The correlation or relationship of assessment scores to a criterion measure’s scores is a typical design for a ‘validity study’, in which some newer (or simpler or shorter) measure is ‘validated’ against an existing, older measure with well known characteristics.


Both confirmatory and counter-confirmatory evidence are sought.

This source of validity evidence embodies all the richness and complexity of the contemporary theory of validity in that the relationship to other variables aspect seeks both confirmatory and counter-confirmatory evidence. For example, it may be important to collect correlational validity evidence which shows a strong positive correlation with some other measure of the same achievement or ability and evidence indicating no correlation (or a strong negative correlation) with some other assessment that is hypothesized to be a measure of some completely different achievement or ability.
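A small Python sketch of this convergent and divergent pattern follows, using the Pearson correlation. All three score lists are invented: `similar_measure` stands for another measure of the same ability and `unrelated_measure` for a measure of a different one.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length score lists (both must vary)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five examinees.
new_test = [1, 2, 3, 4, 5]
similar_measure = [2, 4, 5, 4, 5]    # hypothesized to measure the same ability
unrelated_measure = [3, 1, 4, 2, 3]  # hypothesized to measure a different ability

convergent = pearson_r(new_test, similar_measure)
divergent = pearson_r(new_test, unrelated_measure)
print(round(convergent, 2), round(divergent, 2))
```

Validity evidence is supported when the convergent correlation is clearly higher than the divergent one, as hypothesized in advance.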


This relates to the multitrait multimethod design proposed by Campbell and Fiske.

The concept of convergence and divergence of validity evidence is best exemplified in the classic research design first described by Campbell and Fiske.27 In this ‘multitrait multimethod’ design, different measures of the same trait (achievement, ability, performance) are correlated with different measures of the same trait. The resulting pattern of correlation coefficients may show the convergence and divergence of the different assessment methods on measures of the same and different abilities or proficiencies.


For the written examination, the correlations of total and subscale scores can be examined.

In the written comprehensive examination example, it may be important to document the correlation of total and subscale scores with achievement examinations administered during the basic science courses.



Evidence of consequences

Consequences


• Impact of test scores/results on students/society

• Consequences on learners/future learning

• Positive consequences outweigh unintended negative consequences?

• Reasonableness of method of establishing pass-fail (cut) score

• Pass-fail consequences:

1. P/F Decision reliability- Classification accuracy

2. Conditional standard error of measurement at pass score (CSEM)

• False positives/negatives

• Instructional/learner consequences


Although it is included in the current Standards, this is the most debated source. It covers the impact of test scores, decisions and outcomes on examinees, and the impact on teaching and learning. The consequences of assessment for examinees, faculty, patients and society can be very large, and they can be positive or negative, whether intended or not.

This aspect of validity evidence may be the most controversial, although it is solidly embodied in the current Standards.1 The consequential aspect of validity refers to the impact on examinees from the assessment scores, decisions and outcomes, and the impact of assessments on teaching and learning. The consequences of assessments on examinees, faculty, patients and society can be great and these consequences can be positive or negative, intended or unintended.


High-stakes examinations are common in North America, and the consequences of failing them are profound. Decisions such as whether a person enters medical school or is granted the right to practice carry a high cost.

High-stakes examinations abound in North America, especially in medicine and medical education. Extremely high-stakes assessments are often mandated as the final, summative hurdle in professional education. The consequences of failing any of these examinations are enormous, in that medical education is interrupted in a costly manner or the examinee is not permitted to enter graduate medical education or practice medicine.


The same holds for specialty and subspecialty certification examinations. False positives are costly for patients, and false negatives are costly for the candidates themselves.

Likewise, most medical specialty boards in the USA mandate passing a high-stakes certification examination in the specialty or subspecialty, after meeting all eligibility requirements of postgraduate training. The consequences of passing or failing these types of examinations are great, as false positives (passing candidates who should fail) may do harm to patients through the lack of a physician’s skill and specialized knowledge or false negatives (failing candidates who should pass) may unjustly harm individual candidates who have invested a great deal of time and resources in graduate medical education.


It must be shown that the test results cause no harm, or at the very least that the good outweighs the harm.

Evidence related to consequences of testing and its outcomes is presented to suggest that no harm comes directly from the assessment or, at the very least, more good than harm arises from the assessment.



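Passing rates, judgements about the appropriateness of those passing rates (cut scores), and comparisons with the results of other examinations.

In both example assessments, sources of consequential validity may relate to issues such as 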

  • passing rates (the proportion who pass), 
  • the subjectively judged appropriateness of these passing rates, 
  • data comparing the passing rates of each of these examinations to other comprehensive examinations such as the USMLE Step 1 and so on.


The passing score, the procedure used to set it, and its statistical properties are all part of validity. How the pass–fail cut score was set, and the rationale for that method, also matter.

The passing score (or grade levels) and the process used to determine the cut scores, the statistical properties of the passing scores, and so on all relate to the consequential aspects of validity.28 Documentation of the method used to establish a pass–fail score is key consequential evidence, as is the rationale for the selection of a particular passing score method.


Other psychometric quality indicators concerning the passing score and its consequences (for both example assessments) include:

  • decision reliability: a formal, statistical estimation of the pass–fail decision reliability, or 
  • classification accuracy,29 and 
  • an SEM estimate: some estimation of the standard error of measurement at the cut score.30
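
As a rough illustration of the last indicator, the sketch below computes a standard error of measurement and uses it to flag scores that fall close to a cut score. It is not taken from the article. It assumes the classical test theory formula SEM = SD × sqrt(1 − reliability), which gives one overall SEM rather than a conditional SEM estimated specifically at the cut score, and every function name and number in it is hypothetical.

```python
import math


def standard_error_of_measurement(score_sd: float, reliability: float) -> float:
    """Classical test theory SEM: SD * sqrt(1 - reliability).

    Assumption: one overall SEM for the whole score scale, not a
    conditional SEM estimated at the cut score itself.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


def borderline_band(cut_score: float, sem: float, z: float = 1.96) -> tuple:
    """Return the (low, high) score band within z SEMs of the cut score."""
    return (cut_score - z * sem, cut_score + z * sem)


def is_borderline(score: float, cut_score: float, sem: float, z: float = 1.96) -> bool:
    """True when a score is too close to the cut score to classify confidently."""
    low, high = borderline_band(cut_score, sem, z)
    return low <= score <= high


# Hypothetical example values: score SD of 10, reliability of 0.91, cut score of 70.
sem = standard_error_of_measurement(score_sd=10.0, reliability=0.91)
band = borderline_band(cut_score=70.0, sem=sem)
```

Scores inside the band are the ones where a pass–fail decision is most likely to be a false positive or a false negative, which is why the width of the band near the cut score is relevant consequential evidence.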


Equally important consequences of assessment methods on instruction and learning have been discussed by Newble and Jaeger.31 The methods and strategies selected to evaluate students can have a profound impact on what is taught, how and exactly what students learn, how this learning is used and retained (or not) and how students view and value the educational process.









 2003 Sep;37(9):830-7.

Validity: on meaningful interpretation of assessment data.

Author information

  • 1Department of Medical Education, College of Medicine, University of Illinois at Chicago, 60612-7309, USA. sdowning@uic.edu

Abstract

CONTEXT:

All assessments in medical education require evidence of validity to be interpreted meaningfully. In contemporary usage, all validity is construct validity, which requires multiple sources of evidence; construct validity is the whole of validity, but has multiple facets. Five sources--content, response process, internal structure, relationship to other variables and consequences--are noted by the Standards for Educational and Psychological Testing as fruitful areas to seek validity evidence.

PURPOSE:

The purpose of this article is to discuss construct validity in the context of medical education and to summarize, through example, some typical sources of validity evidence for a written and a performance examination.

SUMMARY:

Assessments are not valid or invalid; rather, the scores or outcomes of assessments have more or less evidence to support (or refute) a specific interpretation (such as passing or failing a course). Validity is approached as hypothesis and uses theory, logic and the scientific method to collect and assemble data to support or fail to support the proposed score interpretations, at a given point in time. Data and logic are assembled into arguments--pro and con--for some specific interpretation of assessment data. Examples of types of validity evidence, data and information from each source are discussed in the context of a high-stakes written and performance examination in medical education.

CONCLUSION:

All assessments require evidence of the reasonableness of the proposed interpretation, as test data in education have little or no intrinsic meaning. The constructs purported to be measured by our assessments are important to students, faculty, administrators, patients and society and require solid scientific evidence of their meaning.

PMID: 14506816


Developing a global health practitioner: Time to act? (Med Teach, 2011)

JUDY MCKIMM1 & MICHELLE MCLEAN2

1Swansea University, UK, 2United Arab Emirates University, UAE





“The great majority of medical school curricula still offer almost no teaching that lets students view human health on a global scale. Yet in a future practice environment where the world is drawing closer together, understanding how conflict, poverty and environmental destruction affect health is essential for doctors.”

To consider the health of humanity on a global scale is rarely part of the medical curriculum, yet understanding the health effects of conflict, poverty and environmental damage is essential for doctors practising in our shrinking world (Anon).



Internationalisation and globalisation are terms commonly used in medical and higher education.

Globalisation and internationalisation, words commonly used in medical and higher education,


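To deliver appropriate healthcare, medical education institutions must produce doctors who can think globally while acting locally, and who can respond to the changing needs of communities and populations wherever they practise.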

Institutions should thus be producing medical graduates who can think globally but act locally to deliver appropriate healthcare and adapt to the changing needs of communities and populations, irrespective of where they practice medicine – a global health practitioner.


What should a global health practitioner need to be aware of?


A shrinking world: An interconnected global community


The world we live in is getting ever smaller.

The world in which we live is shrinking,


Today's students are the 'digital natives' of the Net Generation: they have grown up with multimedia and have instant access to information.

Our students are the ‘digital natives’ of the Net Generation, having grown up with multimedia and instant access to information (Morris & McKimm 2009). Kanter (2008) suggests that for today’s student:


Through the window of a computer, digital natives can see almost any place, connect with almost anyone and access information about almost any concept. The space they move around in, as if it were their own room, is the whole world and all of recorded history.

By looking through a computer window, they are able, instantaneously, to see almost any place, to connect to almost any person, and to access information about almost any concept... The space in which they move around, as if it were their own room, is the entire world and all recorded history (p. 115).


Kanter believes that this heightened sense of 'connectedness' leads students and graduates to deliberately seek out experiences in other cultures.

Kanter (2008) believes that it is this feeling of enhanced connectedness on a global scale – the sense of global community – that leads students and graduates to deliberately seek educational experiences to enrich their understanding of the practice of medicine in other cultures.


Global health: An international issue


From this notion of 'connectedness' came the concept of global health: health issues that cross national borders and are best addressed through co-operative action. Failing to join the fight to anticipate, prevent and ameliorate global health problems would endanger America's standing in health as well as its own health, economy and security.

Emerging from this ‘connectedness’ is the notion of global health – health issues and concerns that transcend national boundaries which are best addressed by co-operative actions (United States Institute of Medicine 1997): “The failure to engage in the fight to anticipate, prevent, and ameliorate global health problems would diminish America’s stature in the realm of health and jeopardise our own health, economy, and national security” (p. 4). 


Climate change, conflict and healthcare disparities


Environmental problems, climate change in particular, will further widen health inequalities.

Environmental issues, climate change in particular, will further widen healthcare disparities. The health consequences of climate change include 

    • compromised food security through flooding and droughts in an already sensitive agricultural sector, 
    • increased mortality from extreme weather events, 
    • water scarcity during droughts, 
    • diarrhoeal diseases during flooding and 
    • the spread of infectious diseases due to changing patterns of insect vectors (World Health Organization 2008).


Costello et al. argued that climate change will be the greatest threat to global health in the twenty-first century.

Costello et al. (2010) believe that climate change has been the greatest global health threat of the twenty-first century,


Most of the work of aid and development agencies aims to alleviate poverty and, by building a robust health infrastructure, to address key health issues such as chronic disease, communicable disease, and maternal and child health. An appropriately trained health workforce is central to this work. In many countries, such as the Pacific island nations, an 'effective' health workforce is a team of locally trained health professionals, community and indigenous health workers, and a more transitory group of health professionals trained overseas.

Much of the work of aid and development agencies focuses on alleviating poverty and establishing a robust health infrastructure and adequate resources to address key health issues such as chronic and communicable diseases and maternal and child health (WHO 2007). Central to such activities is an appropriately trained health and community workforce. In many countries (for example in the Pacific islands), an ‘effective’ health workforce comprises a team of locally trained health professionals, community and indigenous health workers as well as a more transitory group of overseas-trained health professionals (Bedford & Hugo 2008).



Global burden of disease


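Updating Murray and Lopez's Global Burden of Disease study, Mathers and Loncar predicted that by 2020 deaths from communicable diseases and deaths of children under 5 would fall, but that preventable diseases (many of them tobacco-related) would cause more deaths than HIV/AIDS, depression, ischaemic heart disease or road traffic accidents. Education and healthcare are needed to prevent these deaths.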

In their update of Murray and Lopez’ (1996) Global Burden of Disease study, Mathers and Loncar (2006) predict that in 2020 while fewer children under 5 years will die and deaths from communicable diseases will decrease, preventable diseases (many tobacco-related) will claim more lives than HIV/AIDS, depression, ischaemic heart disease or road traffic accidents. Education and access to healthcare are therefore vital for preventing such deaths (Mathers & Loncar 2006).




Global social responsibility and accountability gaining momentum


Since the 1990s, medical educators have placed growing emphasis on the social accountability of medical education.

Since the 1990s, medical educationalists have been promoting socially accountable medical education (Woollard 2006; Boelen & Woollard 2009). Social accountability has been described as:



Through their education, research and service activities, medical schools should first address the priority health concerns of their community, region or nation. These priority health concerns should be identified jointly by governments, healthcare organisations, health professionals and the public.

the obligation of medical schools to direct education, research and service activities towards addressing the priority health concerns of the community, region or nation that they are mandated to serve. The priority health concerns are to be identified jointly by governments, healthcare organizations, health professionals and the public (Boelen & Heck 1995, p. 3).


A WFME basic standard also stresses the link between medical education, medical practice and health systems.

A WFME (2003) basic standard reflects a linkage between medical education, medical practice and healthcare systems.


The WFME specifically stresses attention to local, national, regional and global contexts.

The WFME (2003) specifically states that attention should be paid to local, national, regional and global contexts.



Challenges of producing a global health practitioner


Global disparities in medical education


Twenty-six sub-Saharan African (SSA) countries have no medical school or only one, and Africa carries 24% of the world's disease burden with only 3% of the workforce. Wealthier countries also show gaps in access to medical education: only 20% of US medical students come from families in the bottom 60% of the income distribution.

That 26 sub-Saharan African countries have none or one medical school only and that Africa carries 24% of the world’s disease burden but only 3% of the global healthcare workforce (Mullen et al. 2010), highlights the disparities in health education and healthcare. Disparities in access to medical education also exist in more affluent nations – only 20% of US medical students originate from families in the lowest three quintiles (American Association of Medical Colleges 2005). 



Lack of community engagement in curricula


A recent survey of North American medical schools shows large gaps in how far community-engaged scholarship is built into mission and vision statements, promotion and tenure guidelines, and administrative structures. When many medical schools have yet to embrace the idea of serving their own local communities, how much can we expect of them in terms of responsibility to global society? ICRAM noted that at a time of rising disease burden, poverty, globalisation and innovation, academic medicine is failing to recognise its responsibility to global society.

A recent survey of North American and Canadian medical schools, however, highlights significant gaps in the integration of community-engaged scholarship into medical school mission and vision statements, promotion and tenure guidelines and administration structures (Goldstein & Bearman 2011). With many medical schools still to embrace the notion of serving their own local communities, how feasible is it to expect a global social responsibility? The International Campaign to Revitalise Academic Medicine notes that at a time of increasing health burden, poverty, globalisation, and innovation, academic medicine seems to be failing to realize its potential and global social responsibility (Clark 2005, p. 101). 



Health professions education: An international business


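To meet worldwide demand for health professionals, more and more nurses and doctors are trained in one country for work in another. In other cases, curricula or entire medical schools move overseas, the Weill Cornell medical school in Qatar being one example. But for which communities are the students trained there becoming doctors?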

The worldwide demand for healthcare professionals has culminated in many medical and nursing colleges producing graduates for other countries (e.g. India and the Philippines), whereas in other contexts, curricula have been bought or entire medical schools have been off-shored, the Weill Cornell medical school in Qatar being an example (Hodges et al. 2009). It is, however, pertinent to ask for which communities are these students being trained?




Developing tomorrow’s global health practitioner


Cultural competence


For too long, 'medical culture' has meant Western culture. This ethos is changing now that graduates need to cross national boundaries and 'think globally but act locally'. The WHO and the UN describe the right to health as including access to a culturally appropriate healthcare system. Stout and Downey hold that this includes different forms of treatment (traditional medicine, healing practices). McKimm argues that healthcare inequalities arising from cultural factors appear at several levels: societal, organisational, professional and interpersonal.

For too long, ‘medical culture’ has meant Western culture. This ethos is, however, changing with the increasing recognition that graduates need to cross cultural boundaries and to ‘think globally but act locally’ (Taylor 2003). The WHO and United Nations declarations on the right to health encompass access to a culturally appropriate healthcare system, which, for Stout and Downey (2006) includes access to different forms of treatment (such as traditional medicine or healing practices). McKimm (2011) asserts that inequalities in healthcare resulting from cultural factors may need to be addressed at many levels: societal, organization, professional and interpersonal (p. 56).


Health remains ‘social’: Advocacy


Health status is determined by the interplay of many factors: medical, political, economic, educational and environmental.

Health status is determined by the interrelationship of many factors: medical, political, economic, educational and environmental, the bases of the current global health inequalities (Evert et al. 2008; Boelen & Woollard 2009). 


Woollard holds that in the twenty-first century a 'competent doctor' must also be someone who can pass on a profound ethos of service to others, that is, an advocate of social and environmental justice. Alongside Boyer's four scholarships (teaching, discovery, integration, application), Woollard stresses the scholarship of engagement as a way to address social, civic, environmental and ethical problems.

For Woollard (2006), the twenty-first century brings the challenge of not only creating skilled and competent graduates but practitioners who are capable of transmitting a profound ethos of service to the welfare of others (p. 302) – advocates of social and environmental justice. Woollard (2006) also emphasises the importance of promoting the scholarship of engagement, aligned with Boyer’s other four scholarships (teaching, discovery, integration, application) in order to understand and address pressing social, civic, environmental and ethical problems facing communities across the world. 



A global core curriculum




Developing a global health practitioner: Key issues 



Transformation of medical education


The Global Independent Commission (GIC) described a third generation of educational reform.

A Global Independent Commission proposes a third generation of reform:



Health professionals in every country should be able to mobilise knowledge, reason critically and act ethically, so that they can take part in patient- and population-centred health systems as members of locally responsive, globally connected teams.

All health professionals in all countries should be educated to mobilize knowledge and to engage in critical reasoning and ethical conduct so that they are competent to participate in patient and population-centred health systems as members of locally responsive and globally connected teams (p. 33). 


GCSAMS urged medical schools to recognise that a sound health system is built on a solid primary healthcare approach, with primary care properly integrated with secondary and tertiary care.

The Global Consensus for Social Accountability of Medical Schools (GCSAMS 2010) recently advocated that medical schools recognise that a sound health system is founded on a solid primary healthcare approach, with proper integration of the first level of care with secondary and tertiary levels and an appropriate balance of professional disciplines to serve people’s needs.


Individual professions may each be competent, but their expertise has to come together in multiprofessional teams that deliver effective patient-centred, population-based care.

While individual professions have distinctive and perhaps complementary skills, it is imperative that this expertise coalesces such that multiprofessional teams are effective in patient-centred and population-based health care (Frenk et al. 2010). 


This is consistent with the WHO's framework on interprofessional education and collaborative practice.

This view is echoed in the WHO’s Framework for Interprofessional Education and Collaborative Practice (WHO 2010) which emphasises the need for health professions’ education to produce a practice-ready workforce able to work flexibly and collaboratively in a range of contexts, cultures and countries to improve health outcomes.


Promoting global health while meeting local needs




Collaboration and networks


(a) Academic collaborations: faculty and student exchange; student electives or in-service learning (often in developing countries); research; complementary degree and graduate programmes and education and health networks (e.g. Towards Unity for Health (TUFH), http://www.the-networktufh.org/home/index.asp; FAIMER (the Foundation for the Advancement of International Medical Education and Research), www.faimer.org). 


(b) Philanthropic organisations: e.g. the Gates and Kellogg Foundations which provide funding for health education and training initiatives (Philibert 2009). 


(c) Partnerships with communities, governments, development agencies: establishing new medical schools in partner countries; working with local communities on health projects and consultancy on aid-funded education and training projects. 




Conclusions and next steps



Woollard believes that social accountability is moving from a marginal concern of a few to its rightful place as a core issue for medical schools, but if we are to produce global health practitioners, this process must be accelerated and institutions made more accountable. Boelen and Woollard propose that educational institutions demonstrate their impact on society by adhering to the basic principles of quality, equity, relevance and effectiveness, and provide evidence of active participation in health system development. Social accountability for health professionals should be measured in three interdependent domains.

While Woollard (2006) believes that social accountability is moving from the peripheral concern of a few to its rightful place as a central issue of medical schools, if we are serious about producing global health practitioners, we need to accelerate the process and make institutions more accountable. Boelen and Woollard (2009) propose that educational institutions should be required to verify their impact on society by following basic principles of quality, equity, relevance and effectiveness and by providing evidence of active participation in health system development. Social accountability should then be measured in three interdependent domains concerning health professionals: 


  • conceptualisation (role of the institution), 
  • production (in terms of the desired professional) and 
  • utilisability (needs of society addressed). 

Woollard's hierarchy of responsible academic partnerships (municipal, local, national) should also include 'global'.

Woollard’s (2006) hierarchy of responsible academic partnerships (e.g. municipal, local, national) should also include ‘global’. 







Boelen C, Woollard B. 2009. Social accountability and accreditation: A new frontier for educational institutions. Med Educ 43:887–894.









 2011;33(8):626-31. doi: 10.3109/0142159X.2011.590245.

Developing a global health practitioner: time to act?

Author information

  • 1College of Medicine, Swansea University, Grove Building, Singleton Park, Swansea SA2 8PP, Wales, UK. j.mckimm@swansea.ac.uk

Abstract

Although many health issues transcend national boundaries and require international co-operation, global health is rarely an integral part of the medical curriculum. While medical schools have a social responsibility to train healthcare professionals to serve local communities, the internationalisation of medical education (e.g. international medical students, export of medical curricula or medical schools) makes it increasingly difficult to define it as 'local'. It is therefore necessary to produce practitioners who can practice medicine in an ever-changing and unpredictable world. These practitioners must be clinically and culturally competent as well as able to use their global knowledge and experience to improve health and well-being, irrespective of where they eventually practice medicine. Global health practitioners are tomorrow's leaders, change agents and members of effective multiprofessional teams and so need to be aware of the environmental, cultural, social and political factors that impact on health, serving as advocates of people's rights to access resources, education and healthcare. This article addresses some of the difficulties of developing global health practitioners, offering suggestions for a global health curriculum. It also acknowledges that creating a global health practitioner requires international collaboration and shared resources and practices and places the onus of social accountability on academic leaders.

PMID: 21774648





The Student Curriculum Review Team: How we catalyze curricular changes through a student-centered approach (Med Teach, 2015)

KATIE W. HSIH, MARK S. ISCOE, JOSHUA R. LUPTON, TYLER E. MAINS, SURESH K. NAYAR, MEGAN S. ORLANDO, AARON S. PARZUCHOWSKI, MARK F. SABBAGH, JOHN C. SCHULZ, KEVIN SHENDEROV, DAREN J. SIMKIN, SHARIF VAKILI, JUDITH B. VICK, TIM XU, OPHELIA YIN & HARRY R. GOLDBERG*

The Johns Hopkins University School of Medicine, USA




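Calls to reform medical education are nothing new, and neither is the motivation behind them. A review of curriculum reports from 1910 to 1993 found strikingly unchanged objectives: mastering a growing body of medical knowledge, serving the public interest, and building lifelong learning skills.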

Calls for reform in medical education are nothing new. Nor, in fact, is the impetus behind them: a review of such reports from 1910 to 1993 revealed a strikingly consistent set of stated objectives, including tackling the growing body of medical knowledge, better serving the public interest and fostering lifelong learning skills, all of which seem familiar today (Christakis 1995).


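The LCME requires every medical school to have a curriculum review process that includes student feedback. Course evaluation questionnaires often meet this requirement, but on their own they can fail to gather specific feedback that translates into actionable plans.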

Accordingly, the Liaison Committee on Medical Education mandates that all medical schools have an internal review process that involves student feedback. While course evaluation questionnaires often fulfill this requirement (Abrahams & Friedman 1996), they alone may fail to gather specific feedback that can be translated effectively into actionable plans (Amrein-Beardsley & Haladyna 2011).


Medical schools have recognised the shortcomings of these traditional methods and supplemented them with approaches that draw more input from students.

Medical schools have noticed the shortcomings of these traditional methods and modified them with supplemental approaches to solicit greater student input. 

  • Focus groups of 50 students from a class of about 300
    For example, at Wayne State University, focus groups have been added to the school’s curricular evaluation process. The school invites 50 randomly selected students from a class of approximately 300 to participate in a focus group moderated by student representatives, who then produce a formal report that is presented to course directors (Wilson et al. 2013). 
  • PBL groups choose student representatives to attend faculty-moderated focus groups
    At the University of Sydney, problem-based learning groups appoint student representatives to attend faculty-moderated focus groups; a faculty report summarizing issues raised and responses is then published internally and included in the final course review (Hendry et al. 2001). 
  • Student focus groups are held each quarter, and curriculum evaluation staff pass the findings on to course directors
    Finally, during their period of curriculum reform, the Stanford School of Medicine convened student focus groups midway through each quarter, and curriculum evaluation staff presented focus group findings to course directors while the courses being reviewed were still in progress (Fetterman et al. 2010).

These focus groups do yield meaningful, constructive input, but only part of the student body takes part and the groups are not free from faculty influence.

While these focus groups have been found to elicit meaningful and constructive input from medical students, they invite participation from only a fraction of the student body and may be susceptible to influence by faculty and staff.



What we did


SCRT (Student Curriculum Review Team)

The SCRT at the Johns Hopkins University School of Medicine aims to foster a learner-centered model of curriculum review.


Goals of SCRT


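(1) Involve all students in an active, two-way course improvement process; course evaluation questionnaires were passive and one-way.

(2) Collect this active feedback without faculty influence and leave room for productive discussion among students.

(3) Make sure student feedback is not only presented but also discussed with course directors, so that faculty and students can each voice their perspective.

(4) Document the entire SCRT process to support future curricular improvement and keep it transparent to students and faculty.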

(1) Engage the entire student body in an active, two-way course improvement process, as opposed to course questionnaires, which are more passive and one-sided. 

(2) Gather this active feedback in a manner that is not influenced by faculty and allows for productive discussion among students. 

(3) Ensure that student body feedback is not only presented but also discussed with course directors, with students and faculty both having the chance to express their viewpoints.

(4) Maintain documentation of the entire SCRT process for future curricular improvement and transparency of the process for students and faculty.



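Step zero: open SCRT membership (goal 1). SCRT is made up of students who volunteer to join at any point in their first or second year. Any student can therefore take part for as long as their schedule allows, from one hour to dozens of hours. Members are not compensated for their time. At the start of each year, SCRT members designate co-chairs.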


Step zero: establish open SCRT membership (goal 1). SCRT consists of medical students who voluntarily join the team at any point during their first or second years. This allows for the entire student body to participate for varying lengths of time, from one hour to dozens of hours over the year, depending on their availability. Members are not compensated for their time. At the beginning of each academic year, the members of SCRT designate co-chairs who delegate responsibilities and ensure completion of all necessary tasks for each course review, a process outlined below.



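Step one: gather and review course evaluation data. After each pre-clinical course, students complete the course evaluation anonymously, and the Curriculum Office passes the evaluations to SCRT members. The members who chose to review that course read the evaluations and identify recurring positive themes and areas for improvement. These themes are presented to all students at a Town Hall Meeting (THM).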

Step one: gather and review course evaluation data. After each pre-clinical course, students anonymously complete the School of Medicine’s course evaluation. The Curriculum Office then sends these completed evaluations to the SCRT members. Team members who have elected to review that course read the evaluations and identify recurrent positive themes and opportunities for improvement. These themes are presented to the student body at a Town Hall Meeting.


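Step two: generate solutions through the THM and an online survey (goals 1 and 2). The core of SCRT is engaging the wider student body to draw out constructive ideas and produce quantitative data to support them. To that end, a THM is held over lunch every 6–8 weeks and covers 2–3 courses. Courses are reviewed as close to their end as possible to reduce recall bias, while several courses are grouped together to reduce the burden on students. Every student is welcome at the THM, and faculty do not attend so that discussion stays candid. Because everyone can attend, students share responsibility for shaping their own education. Typically, one-third to two-thirds of the class attends.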


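At the start of the THM, SCRT members present the course evaluation results and list the themes drawn from them. They also point out what changed as a result of the previous SCRT review, showing that time given to SCRT has a positive effect on the curriculum and on students. Students then split into small discussion groups led by SCRT members, who guide the discussion towards possible ways to improve a specific course. These SCRT members rotate between groups during the THM so that all students discuss every course. At the end of the session, the SCRT members summarise the solutions generated.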


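To broaden representation and quantify support, the proposed solutions are put to the whole class in a short online survey. The survey consists of multiple-choice questions, mostly 'yes' or 'no', and usually one-half to two-thirds of the class responds. The data collected at this stage determine and substantiate the suggestions in the preliminary report that SCRT submits to faculty. With each step, the picture of student opinion becomes more thorough, more refined and more clearly actionable.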


Step two: generate solutions through Town Hall Meeting and online survey (goals 1 and 2). A fundamental element of the SCRT process is its engagement with the broader student body in a manner that elicits constructive ideas and generates quantitative data to support them. To accomplish this, a Town Hall Meeting is held during the lunch hour every 6–8 weeks during the academic year to discuss 2–3 courses. Every effort is made to review courses close to their completion to reduce recall bias. This is balanced with the effort to group courses together to decrease student burden. Every student is welcome to attend the Town Hall Meetings, and faculty are not present, allowing for candid discussion among students. By allowing everyone to attend, students are empowered with the responsibility for shaping their education. Typically, one-third to two-thirds of the class attends each Town Hall Meeting.


At the beginning of the Town Hall Meeting, SCRT members present a summary of course evaluation data and list the evaluation-derived themes. They also note the changes that resulted from the previous year’s SCRT review (Table 1), assuring students that their time and contribution to the SCRT process are having a positive impact on the curriculum and allowing students to put the current status of the course in context. The student body is then organized into smaller discussion groups led by SCRT members who are tasked with leading discussions on potential solutions to address a specific course’s opportunities for improvement. These designated SCRT members rotate from group to group throughout the Town Hall Meeting so that all students in attendance discuss each course. At the conclusion of the hour, the designated SCRT members summarize the solutions generated.



In order to increase student representation and quantify support, the proposed solutions are presented to the entire class in a short online survey. The survey is composed of a series of multiple choice, mostly simple “yes” or “no” questions, and generally one-half to two-thirds of the class participates in the online survey. The data collected from this stage of the evaluation process helps determine and substantiate the suggestions SCRT will include in its preliminary report to faculty. With each step of the process, an increasingly thorough, refined and actionable understanding of student opinion becomes evident.
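
The survey step can be pictured as a simple tally. The sketch below is only an illustration of how yes/no responses to one proposed solution might be turned into a participation rate and a level of support. The article does not describe its analysis in this detail, so the function name, the returned fields and the example numbers are all hypothetical.

```python
def summarize_proposal(yes_votes: int, no_votes: int, class_size: int) -> dict:
    """Summarise yes/no survey responses for one proposed course change.

    Hypothetical helper: returns the number of responses, the share of the
    class that responded, and the share of respondents who voted yes.
    """
    if class_size <= 0:
        raise ValueError("class_size must be positive")
    if yes_votes < 0 or no_votes < 0:
        raise ValueError("vote counts cannot be negative")
    responses = yes_votes + no_votes
    if responses > class_size:
        raise ValueError("more responses than students in the class")
    participation = responses / class_size
    # Support is measured among respondents only, not the whole class.
    support = yes_votes / responses if responses else 0.0
    return {"responses": responses, "participation": participation, "support": support}


# Made-up example: 70 of 120 students respond, 49 of them in favour.
summary = summarize_proposal(yes_votes=49, no_votes=21, class_size=120)
```

Reporting participation alongside support makes it clear how much of the class a given level of agreement actually represents when the suggestion goes into the preliminary report.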



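Step three: implement changes through the course director meeting and the Student Assessment and Program Evaluation (SAPE) report (goals 3 and 4). Using the quantitative and qualitative data from the first two steps, SCRT members write a preliminary report that sets out the positive themes in student feedback and possible solutions. With this report as the basis for discussion, SCRT members meet the course directors, the Dean of the Curriculum and the SCRT faculty advisor. This meeting is distinctive to SCRT: faculty discuss students' ideas with students, which allows a productive dialogue that standard course evaluations and focus groups cannot. Because SCRT members convey the views of the whole class rather than personal opinions, they can also speak with less fear of repercussions from faculty. The SCRT faculty advisor acts as a bridge between student representatives and course faculty and can mediate if tension arises.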


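After the discussion, SCRT updates the preliminary report with the outcomes of the meeting, including unresolved issues and action items the course directors agreed to consider. The report and meeting notes are circulated among the attendees and then made available to all students on an internal website. The final SCRT report goes to the SAPE committee, an existing formal committee that evaluates each course every 1–2 years. The SAPE committee draws on the SCRT report when preparing recommendations to the Educational Policy and Curriculum Committee, chaired by the Vice Dean of Education.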


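This process lets SCRT bring about change in two ways: (1) directly, through the meeting with course directors, who can make changes before the course next runs, and (2) indirectly, through SAPE's final recommendations, which may involve broader administrative measures.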


Step three: implement changes via course director meeting and Student Assessment and Program Evaluation report (goals 3 and 4). Using quantitative and qualitative data collected from the first two steps, SCRT members create a preliminary report that outlines positive themes from student feedback and potential solutions that target the identified opportunities for improvement. Using this preliminary report as a launchpad for discussion, SCRT members meet with the course directors, the Dean of the Curriculum and the SCRT faculty advisor. This meeting is a unique aspect of SCRT that allows faculty to discuss ideas with students, enabling a productive dialogue that cannot be accomplished through standard course evaluations and focus groups. In addition, because SCRT members are sharing the viewpoints of their class rather than individual opinions, they are able to speak honestly without fearing repercussion from faculty. The SCRT faculty advisor, who serves as a crucial bridge between student representatives and course faculty, can also moderate any tension that may arise. 


Following the discussion, SCRT updates the preliminary report with notes from the meeting, including unresolved issues and action items that course directors agreed to consider. The report and notes are circulated among meeting attendees and then are made available to all students on an internal website. The final SCRT report is sent to the Student Assessment and Program Evaluation (SAPE) Committee, a pre-existing faculty-led committee that formally evaluates each course every 1–2 years. The SAPE committee includes the SCRT report in its preparation for recommendations to the Educational Policy and Curriculum Committee, chaired by the Vice Dean of Education. This process is outlined in Figure 1. In addition, an example showing the sequence of implementing a change to a course is provided in Figure 2.


This process allows SCRT to effect change in two ways: (1) directly through a meeting with course directors who can implement changes before the next iteration of the course and (2) indirectly through SAPE’s final recommendations, which may include broader administrative measures.



교과목 개선: SCRT의 최종 목표는 교육과정 전체에 걸쳐 학생 기반의 변화를 이끌어내는 것이다. Table 1은 실제로 이 과정을 통해서 바뀐 것들의 리스트다.


Course improvements. SCRT’s ultimate goal is to advocate for student-supported changes throughout the curriculum to improve student learning and satisfaction. Table 1 includes a representative list of changes made by course directors following the 2012–2013 SCRT reviews.



이후 방향 

Next steps


SCRT의 결과를 보면, 이 과정이 매우 잘 받아들여지고 있음이 확인된다. 도입된 이후 - 비록 선택사항이었음에도 - 모든 전임상 과목의 책임교수가 SCRT대표단을 만났다.

A review of SCRT suggests that the program is well-received: since its inception, directors from 100% of preclinical courses have met with SCRT representatives, even though it is optional to do so.


2013년 가을, 과목 책임교수에게 간단한 설문을 해보았다. 75%(15명)가 '매우 도움이 되었다', 25%(5명)가 '어느 정도 도움이 되었다'라고 응답했다. '중립', '별 도움이 안 됨', '매우 도움이 안 됨'에 응답한 사람은 없었다. 모든 과목 책임교수가 SCRT가 지속되어야 한다고 응답했다.

We administered a brief survey to course directors in the Fall of 2013 in an initial attempt to quantify the usefulness of SCRT. We asked “How helpful did you find the SCRT process as an addition to the unedited student course evaluations?” and 75% (n = 15) indicated that it was “very helpful”, while the remaining 25% (n = 5) responded “somewhat helpful”; no course directors answered “neutral”, “somewhat unhelpful” or “very unhelpful”. All of the course directors responded that the SCRT process should be continued.


학생 역시 SCRT의 효과에 대해서 44%가 매우 도움이 된다, 31%가 어느 정도 도움이 된다, 21%가 중립, 4%가 별 도움이 안 됨에 응답했고, 매우 도움이 안 됨에 응답한 학생은 없었다.

Students were also surveyed in November 2013 on the impact of SCRT. When asked “How helpful have you found the SCRT process?” 23 of 52 students (44%) responded “very helpful”, 16 (31%) said “somewhat helpful”, 11 (21%) said “neutral”, 2 (4%) said “somewhat unhelpful” and no students selected “very unhelpful”.




Christakis NA. 1995. The similarity and frequency of proposals to reform US medical education. Constant concerns. JAMA 274(9):706–711.


Schumacher DJ, Englander R, Carraccio C. 2013. Developing the master learner: Applying learning theory to the learner, the teacher, and the learning environment. Acad Med 88(11):1635–1645.


Wilson MW, Morreale MK, Wainea E, Balon R. 2013. The focus group: A method for curricular review. Acad Psychiatry 37(4):281–282.
















Med Teach. 2015 Nov;37(11):1008-12. doi: 10.3109/0142159X.2014.990877. Epub 2014 Dec 23.

The Student Curriculum Review Team: How we catalyze curricular changes through a student-centered approach.

Author information

  • The Johns Hopkins University School of Medicine, USA.

Abstract

Student feedback is a valuable asset in curriculum evaluation and improvement, but many institutions have faced challenges implementing it in a meaningful way. In this article, we report the rationale, process and impact of the Student Curriculum Review Team (SCRT), a student-led and faculty-supported organization at the Johns Hopkins University School of Medicine. SCRT's evaluation of each pre-clinical course is composed of a comprehensive three-step process: a review of course evaluation data, a Town Hall Meeting and online survey to generate and assess potential solutions, and a thoughtful discussion with course directors. Over the past two years, SCRT has demonstrated the strength of its approach by playing a substantial role in improving medical education, as reported by students and faculty. Furthermore, SCRT's uniquely student-centered, collaborative model has strengthened relationships between students and faculty and is one that could be readily adapted to other medical schools or academic institutions.

PMID: 25532595


"내가 의대에 적합한걸까?" 1학년 학생들의 확신결여 현상에 대한 이해(Med Teach, 2015)

“Am I cut out for this?” Understanding the experience of doubt among first-year medical students

RHIANON LIU1, JOSEPH CARRESE1,2,3, JORIE COLBERT-GETZ4, GAIL GELLER1 & ROBERT SHOCHET1 

1Johns Hopkins University School of Medicine, USA, 2Johns Hopkins Bayview Medical Center, USA, 3Johns Hopkins Berman Institute of Bioethics, USA, 4University of Utah School of Medicine, USA




의과대학생들이 느끼는 정신적 고통(distress)의 수준은 높은 편인데, 수련 과정에서 흔히 우울, 탈진, 공감능력 상실의 형태로 나타난다. 의과대학생의 distress는 자퇴를 생각하는 것부터 자살사고까지 여러 부정적 결과와 관련된다.

Medical students experience high rates of distress, often taking the form of depression, burnout, and loss of empathy over the course of medical training (Compton et al. 2008; Hojat et al. 2009; Dyrbye et al. 2011a). Medical student distress is further associated with negative personal consequences, ranging from thoughts of dropping out to suicidal ideation (Dyrbye et al. 2010c).


의과대학생이 겪는 distress의 원인 중 충분히 연구되지 않은 것이 '의심(doubt)'의 출현이다. 의과대학생들이 노출되는 스트레스 요인에는 사회적 지지로부터 떨어져 지내는 생활, 학자금 부채, 수면 부족, 과도한 학습시간, 인간이 겪는 고통과 죽음에 대한 대면 등이 있다.

One source of medical student distress that has not been adequately studied is emergence of doubt. Medical students are exposed to a range of stressors that include living away from social supports, financial debt, lack of sleep, long hours of study, and encountering human suffering and death (Compton et al. 2008).


연구 설계

Study design


mixed-methods study를 수행하였다. 

We conducted a mixed-methods study involving a survey and focus groups examining the phenomenon of doubt among first-year medical students at the Johns Hopkins University School of Medicine (JHUSOM) in June, 2012. 


We asked students to answer 13 questions embedded in an annual, online advising program survey: 

  • nine questions about doubt were developed based on literature review of medical student well-being, and 
  • four questions reflecting other measures of distress from a validated well-being index (Table 1) (Dyrbye et al. 2010b, 2011b). 


For the focus groups, we created a semi-structured interview guide based on literature review and expert opinion. We tested the guide in a pilot focus group, then revised it prior to use. One study team member (R.L.) served as the focus group facilitator. Questions in the survey and interview guide addressed types of doubt, coping with doubt, and impact of doubt. The distress questions on the survey addressed burnout, depression, stress, and loss of empathy.



학업생활을 하면서..
1. 나는 의과대학이 나에게 옳은 선택이었는지에 대한 의심을 한 적이 있다.
2. 나는 JHUSOM이 나에게 맞는 의과대학이었는지에 대한 의심을 한 적이 있다.
3. 나는 내가 의과대학 학업환경에서 성공할 수 있을지에 대한 스스로의 능력에 대한 의심을 한 적이 있다.
4. 나는 의과대학에서의 학업 외 다른 생활에 대해서 의심을 한 적이 있다.
5. 의과대학생활에 대한 의심으로 인해서 내 스스로의 목적이 뭔지 의문을 가지게 되었다.
6. 의과대학생활에 대한 의심으로 인해서 내가 누구인지 의문을 가지게 되었다.

의과대학생활에 대해 의심이 생겼을 때...
7. 건강한 형태로 대응하고자 노력한다.
8. JHUSOM의 문화는 내가 그러한 의심을 표현하는 것을 주저하게 한다.
9. 그런 경우에 어떤 도움을 받을 수 있는지 잘 모르겠다.

스트레스 문항
10. 의과대학이 정서적으로 냉담한 사람이 되도록 만든다는 걱정을 합니까?
11. 의과대학에서 탈진을 경험합니까?
12. 지난 몇 달 간, 기분이 저하되거나, 우울하거나, 희망이 없다는 기분을 느낀 적이 있습니까?
13. 지난 몇 달 간, 모든 것이 감당하지 못할 정도로 쌓여있다는 기분을 느낀 적이 있습니까?




자료 분석

Data analysis

처음 세 문항을 가지고 두 그룹으로 분류, 이후 logistic regression 시행

For the survey items, we dichotomized students based on their responses to the first three items (Table 1, questions 1–3). Students who responded “agree” or “strongly agree” to at least two of these items were classified into the moderate/high doubt group, and the remaining students into the low/no doubt group. We then used logistic regression to compare the likelihood of these groups “agreeing” or “strongly agreeing” with statements about coping with doubt and impact of doubt (Table 1, questions 5–9), and to compare the likelihood of these groups experiencing the four types of distress (Table 1, questions 10–13).
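
The grouping rule and the group comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response labels, the helper names `doubt_group` and `odds_ratio`, and the eight toy students are all hypothetical, and the authors' actual software and model specification are not given in the source. With a single binary predictor, the odds ratio from a logistic regression equals the cross-product ratio of the 2×2 table, so the sketch computes that ratio directly instead of fitting a model.

```python
# Hypothetical sketch: labels, helper names and data are invented for illustration.
AGREE = {"agree", "strongly agree"}


def doubt_group(first_three_responses):
    """Classify one student from their answers to Table 1, questions 1-3."""
    n_agree = sum(1 for r in first_three_responses if r.strip().lower() in AGREE)
    return "moderate/high" if n_agree >= 2 else "low/no"


def odds_ratio(groups, outcomes):
    """Odds of a yes/no outcome in the moderate/high group relative to the low/no group."""
    a = b = c = d = 0
    for group, outcome in zip(groups, outcomes):
        if group == "moderate/high":
            if outcome:
                a += 1  # moderate/high doubt, outcome present
            else:
                b += 1  # moderate/high doubt, outcome absent
        elif outcome:
            c += 1      # low/no doubt, outcome present
        else:
            d += 1      # low/no doubt, outcome absent
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when a comparison cell is empty")
    return (a * d) / (b * c)


# Toy data (invented): answers to questions 1-3, plus one yes/no distress item.
students = [
    (("agree", "strongly agree", "neutral"), True),
    (("agree", "agree", "agree"), True),
    (("strongly agree", "disagree", "agree"), True),
    (("agree", "agree", "disagree"), False),
    (("neutral", "disagree", "agree"), True),
    (("disagree", "disagree", "disagree"), False),
    (("neutral", "neutral", "strongly disagree"), False),
    (("agree", "neutral", "neutral"), False),
]
groups = [doubt_group(answers) for answers, _ in students]
outcomes = [outcome for _, outcome in students]
print(groups.count("moderate/high"), "of", len(groups), "students in the moderate/high group")
print("odds ratio:", odds_ratio(groups, outcomes))
```

A fitted logistic regression would additionally give a confidence interval and a p-value for this ratio, which the sketch does not attempt.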


포커스그룹 자료 분석

For the focus groups, one author (R.L.) transcribed the audio-recordings and then four members of the study team (R.L., R.S., J.C., G.G.) independently coded the transcripts. Each transcript was read by at least two readers, and coded using an editing style of analysis (Miller 1999). We iteratively reviewed our codes to identify major themes.


양적 결과

Quantitative Results


1번 문항에 대해서는 46%가, 2번 문항에 대해서는 39%가, 3번 문항에 대해서는 51%가 스스로 의심을 표했다.

Forty-six percent (51/112) of students doubted (agreed or strongly agreed) whether medical school was the right choice for them, 39% (44) doubted whether JHUSOM was the right choice, and 51% (57) doubted their ability to succeed in the academic environment of medical school.


이 세 문항에 대한 응답에 기초하여 20%는 고의심, 29%는 중등도 의심, 22%는 저의심, 29%는 무의심 집단으로 구분되었다. 즉 중등도/고의심 집단이 49%, 저의심/무의심 집단이 51%였다.

Based on response patterns for these three items, 20% (23) experienced high doubt, 29% (32) moderate doubt, 22% (25) low doubt, and 29% (32) no doubt. In sum, 49% (55) experienced moderate/high doubt, while 51% (57) experienced low/no doubt.


저의심/무의심 집단에 비해서, 중등도의심/고의심 집단은 기분 저하, 우울, 절망감과 정서적 냉담(emotional hardening)을 경험할 가능성이 두 배 높았다.

Compared to those with low or no doubt, students with moderate or high doubt were twice as likely to experience being down, depressed, or hopeless and to experience emotional hardening.






Qualitative Results


Student responses were categorized into three broad themes: types of doubt, ways of coping with doubt, and impact of doubt (Table 5).


확신결여의 유형 Types of doubt


내가 정말 의사가 되고 싶은가? Do I want to become a doctor?


내가 의사가 될 만큼 유능한가? Am I capable of becoming a doctor?


의심에 대처하는 자세 Coping with doubt


의심의 영향 Impact of doubt




고찰


'의심'의 경험과 그것을 관리하는 것이 의과대학생의 웰-빙에 중요한 요인이다. 1학년 학생 중 doubt의 유병률이 매우 높다는 것이 놀라운 결과로서, 거의 절반이 중등도- 고- 의심 상태임을 확인할 수 있었고, 이는 주로 의과대학 진로에 대한 자신의 열망이나 능력에 대한 것이었다. 더 나아가서 고의심 집단은 스스로의 정체성이나 목적에 대해서도 의심하는 경향이 높았다.

Our results suggest that the existence of doubt and its management are indeed important components of medical student well-being. A striking finding from this study is the high prevalence of doubt among first-year medical students. Nearly half of these students reported moderate to high levels of doubt, largely related to uncertainty about their desire or ability to pursue a career in medicine. Furthermore, students who experienced higher levels of doubt were more likely to question their sense of identity and purpose.


질적, 양적 결과를 조합하면 의과대학생이 받는 스트레스를 '의심'의 한 현상으로 이해하는 것도 가능한데, 이 때 의심은 catalyst나 mediator, 혹은 독자적인 형태로 존재할 수 있다

The combination of quantitative and qualitative results linking doubt to other forms of distress enriches our understanding of medical student distress by including the phenomenon of doubt, either as a catalyst or mediator of known forms of distress, or as a distinct form of distress.


연구결과는 개인의 목적이나 자기자신이 누군가에 대한 의심이 만연해 있음을 지적하는데, 의과대학이 개인적 성장과 전문직정체성형성(PIF)에 결정적 시기임이 보여진 바 있다. Cohen 등은 PIF에 부정적 영향을 줄 수 있는 훈련과정으로서 높은 기대, 지식/술기 부족에 대한 내적 공포, 배제적이고 위계적 문화, 고통과 죽음을 직면하는 정서적 무게감 등을 꼽았다. 이번 포커스그룹에서 학생들은 이러한 이슈에 대해서 비슷한 이야기를 많이 했다. 의학교육계가 정체성형성에 초점을 둔다면, 학생이 의심과 관련해서 이를 인식하고 극복하고자 하는 노력에 관심을 기울이는 것이 유용할 수 있다.

Study participants indicated that pervasive doubt led them to question both their personal purpose and their sense of who they were. Previous work has shown that medical school is a critical time for personal development and professional identity formation (PIF) (Cohen et al. 2009; Holden et al. 2012). Cohen et al. identified several aspects of the training process that may negatively influence PIF, including pressures like high expectations and internal fear of inadequate knowledge or skills, the exclusive and hierarchical culture of medicine, and the emotional weight of facing suffering and death (Cohen et al. 2009). Our focus group participants raised many of these same issues, supporting the idea that doubt is an important factor in medical students’ PIF. As the medical education community increases its training focus on identity formation, adding students’ perceptions and struggles with doubt to the discourse may be useful (Jarvis-Selinger et al. 2012).


포커스그룹에서는 의심의 긍정적 측면과 파괴적 결과가 모두 나타났다. 긍정적 측면은 의심을 어떻게 다루어야 하는지 의과대학 초기에 경험할 수 있어서 미래의 진료 상황에서도 비슷한 불확실성에 대응할 수 있다는 것이다.

Focus group participants described both helpful and destructive consequences to their experiences with doubt. On the positive side, learning how to manage a sense of doubt early in medical school could prepare students to deal with future uncertainty in the context of patient care


이번 연구에서 우려되는 점은 학생들이 의심을 품고 있을 때 탈진이나 우울에 빠질 수 있다는 것이다.

A concerning finding of our work, however, is that students felt harboring doubt could lead to burnout and depression.


그렇다면 중요한 질문은, 학생들이 의심에 압도되거나 낙담하지 않고 이를 건설적으로 다루며 성장할 수 있도록 어떻게 도울 것인가이다.

A critical question, then, is how to help students manage their doubt constructively, enhancing their growth in the face of inevitable uncertainty, rather than letting it overwhelm and discourage them.


Dunn 등이 연구한 의과대학생의 모델에서는 '대응력 원천(coping reservoir)'를 제안했다. 각 학생은 각각의 개인적 원천이 있는데 성격/기질/대응방식에 따라 달라진다. 이 원천은 건강하거나 불건강한 대응방법으로 인해 채워지거나 비워질 수 있다. 더 나아가서 스트레스와 같은 요인은 원천을 고갈되게 할 수 있고, 사회적 지지는 이것을 채워준다. 이 모델은 본 연구와 일치하는데, 학생들이 말한 '고갈 요인'은 많은 경우 의심의 형태로 존재했으며, '충전 요인'은 의심을 대처하는 효과적인 방법이었다. 의과대학생과 의사의 회복탄력성이 강조되고 있다. 의심을 의과대학에서 받는 스트레스의 중요한 요인으로 바라볼 때 학생들의 탈진을 예방하고 회복탄력성을 길러줄 수 있을 것이다.

One model of medical student well-being by Dunn et al. (2008) introduces the idea of a coping reservoir. Each student has an individual reservoir, determined by personal traits, temperament, and coping style. The reservoir can then be filled or drained by healthy or unhealthy coping methods. Furthermore, other factors like stress can drain the reservoir, while social support can fill it. This model corresponds to the processes described by students in our study. Students described many of the “depleting factors” as types of doubt, and many of the “replenishing factors” as helpful ways of coping with doubt. There is increasing recognition of the importance of physician resilience and of training medical students to be resilient (Epstein & Krasner 2013; Nedrow et al. 2013; Zwack & Schweitzer 2013). Addressing doubt as an important component of medical student distress may help educators guide students towards resilience rather than burnout during a grueling training process and challenging career.





Glossary 


Coping reservoir: 

A term used to describe the positive and negative strategies used to cope with stress and how they interact. Dunn LB, Iglewicz A, Moutier C. 2008. A conceptual model of medical student well-being: Promoting resilience and preventing burnout. Acad Psychiatr 32(1):44–53.


Identity formation: 

The development of the distinct personality of an individual regarded as a persisting entity in a particular stage of life; a person’s mental representation of who he or she is in which individual characteristics are possessed and by which a person is recognised or known. Erikson EH. 1950. Childhood and society. New York: W. W. Norton. Josselson R. 1987. Finding herself: Pathways to identity development in women. San Francisco: Jossey-Bass.





Jarvis-Selinger S, Pratt DD, Regehr G. 2012. Competency is not enough: integrating identity formation into the medical education discourse. Acad Med 87(9):1185–1190.











Med Teach. 2015 Dec;37(12):1083-9. doi: 10.3109/0142159X.2014.970987. Epub 2014 Oct 16.

"Am I cut out for this?" Understanding the experience of doubt among first-year medical students.

Abstract

PURPOSE:

Existing research shows that medical students experience high levels of distress. The purpose of this study was to understand how medical students experience doubt, and how doubt relates to distress.

METHODS:

A mixed-methods study was conducted among first-year students at the Johns Hopkins University School of Medicine in June 2012. Students answered survey questions and participated in focus groups about doubt and other forms of distress.

RESULTS:

Ninety-four percent (112) of students responded to the survey, with 49% reporting a moderate or high degree of doubt. Compared to those reporting no or low doubt, students with moderate/high doubt were significantly more likely to question their purpose and identity, struggle to cope with doubt, and experience depression and emotional hardening. Twenty-eight percent of students (34/112) participated in focus groups to explore their doubt, and three themes emerged: types of doubt, ways of coping with doubt, and impact of doubt.

CONCLUSIONS:

Doubt is highly prevalent among first-year medical students, affects their identity and purpose, and has positive and negative consequences. Doubt among medical students merits awareness and further study, as it may be an important mediator of students' emerging identity and sense of well-being.

PMID: 25319402


신뢰도: 평가 데이터의 재생산가능성(Med Educ, 2004)

Reliability: on the reproducibility of assessment data
Steven M Downing

 

 

 

 

 

신뢰도란 무엇인가? 가장 간편한 정의는 신뢰도란 평가자료, 평가점수 등이 시간이나 상황이 달라져도 재생산가능한 정도를 의미하는 것이다. 이 정의는 data의 재생산에 관한 것이므로 validity와 마찬가지로 reliability란 평가 결과의 특성이지 평가도구 그 자체의 특성이 아니다. Feldt and Brennan은 이렇게 말했다. "평가대상자의 수행능력의 일관성 혹은 비일관성을 정량화하는 것이 신뢰도 분석의 핵심이다"

What is reliability? In its most straightforward definition, reliability refers to the reproducibility of assessment data or scores, over time or occasions. Notice that this definition refers to reproducing scores or data, so that, just like validity, reliability is a characteristic of the result or outcome of the assessment, not the measuring instrument itself. Feldt and Brennan5 suggest that: “Quantification of the consistency and inconsistency in examinee performance constitutes the essence of reliability analysis.” (p 105)

 

 

평가자료의 일관성

THE CONSISTENCY OF ASSESSMENT DATA

 

따라서 신뢰도란 타당도의 필요조건이나 충분조건은 아니며, 모든 평가에서 신뢰도는 타당도 근거에 대한 주요 원천이다. 신뢰도가 충분치 않다면 그 자료는 해석이 불가능(uninterpretable)한데, 신뢰도가 낮은 평가로부터 얻은 자료에는 random error가 차지하는 비중이 크기 때문이다.

Thus, reliability is a necessary but not sufficient condition for validity6 and reliability is a major source of validity evidence for all assessments.7,8 In the absence of sufficient reliability, assessment data are uninterpretable, as the data resulting from low reliability assessments have a large component of random error.


이론적으로, 신뢰도는 classical measurement theory (CMT)에서 총 variance 중 true variance의 비율로 정의된다. (신뢰도계수는 상관관계 계수처럼 해석되기 때문에, 신뢰도를 true score와 observed score의 상관계수의 제곱이라고 생각해도 정확하다.) [관찰점수 = 진점수 + 측정의 무작위 오차]라는 기본적 정의로부터 시작하여, 약간의 통계적 가정을 더하면 평가의 신뢰도나 재생산가능성을 추정하는 데 흔히 사용되는 모든 공식을 유도할 수 있다. 이상적인 세계에서는 error term이 없을 것이고 모든 관찰점수가 언제나 진점수와 정확히 일치할 것이다.

Theoretically, reliability is defined in classical measurement theory (CMT) as the ratio of true score variance to total score variance. (As reliability coefficients are interpreted like correlation coefficients, it is also accurate to think of reliability as the squared correlation of the true scores with the observed scores.5) Starting from the basic definitional formula, X = T + e (the observed score is equal to the true score plus random errors of measurement), and making some statistical assumptions along the way, one can derive all the formulae commonly used to estimate reliability or reproducibility of assessments. In the ideal world there would be no error term in the formula and all observed scores would always be exactly equal to the true score (defined as the long-run mean score, much like μ, the population mean score).
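
The two equivalent readings of reliability given above (a variance ratio, and a squared correlation between true and observed scores) can be checked with a small simulation. The sketch below is only illustrative: the function name `reliability_estimates`, the normal distributions, and the standard deviations (10 for true scores, 5 for error) are assumptions chosen so that the theoretical reliability is 100 / (100 + 25) = 0.80. In real assessment data true scores are never observed, which is why the estimation methods in the following sections exist.

```python
import math
import random
import statistics


def reliability_estimates(n=20000, sd_true=10.0, sd_error=5.0, seed=0):
    """Simulate X = T + e and return (variance ratio, squared correlation of T with X)."""
    rng = random.Random(seed)
    true = [rng.gauss(50.0, sd_true) for _ in range(n)]
    observed = [t + rng.gauss(0.0, sd_error) for t in true]  # X = T + e

    # Reading 1: true score variance divided by total (observed) score variance.
    variance_ratio = statistics.pvariance(true) / statistics.pvariance(observed)

    # Reading 2: squared Pearson correlation between true and observed scores.
    mean_t, mean_x = statistics.fmean(true), statistics.fmean(observed)
    cov = sum((t - mean_t) * (x - mean_x) for t, x in zip(true, observed))
    ss_t = sum((t - mean_t) ** 2 for t in true)
    ss_x = sum((x - mean_x) ** 2 for x in observed)
    r = cov / math.sqrt(ss_t * ss_x)
    return variance_ratio, r ** 2


ratio, r_squared = reliability_estimates()
print(f"variance ratio: {ratio:.3f}, squared correlation: {r_squared:.3f}")
```

Both printed values should sit close to the theoretical 0.80, differing from it only by sampling error.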

 

 

성취평가의 신뢰도(검사지의 신뢰도)

RELIABILITY OF ACHIEVEMENT EXAMINATIONS

 

검사점수의 재생산성에 대해서 흔히 사용되는 것은 Cronbach alpha계수나  Kuder-Richardson formula 20 (KR 20)로 흔히 추정할 수 있는 내적일관성이라는 개념이다. 이 내적일관성 신뢰도의 논리는 직관적이고 간단하다. 이 공식들의 통계적인 유도방식은 시험-재시험 개념으로부터 시작한다.

The approach typically utilised to estimate the reproducibility of test scores in written examinations employs the concept of internal consistency, usually estimated by the Cronbach alpha9 coefficient or Kuder-Richardson formula 20 (KR 20).10 The logic of internal test consistency reliability is straightforward and intuitive. The statistical derivation of these formulae starts with the test-retest concept,

 

이 시험-재시험 개념이 대부분의 신뢰도 추정의 토대이긴 하나, 시험-재시험 방식의 연구설계는 거의 없으며, 있더라도 실제 상황에서 시행한다는 것은 어렵다.

While this test-retest concept is the foundation of most of the reliability estimates used in medical education, the test-retest design is rarely, if ever, used in actual practice, as it is logistically so difficult to carry out.

 

다행히, 측정통계학자들이 한 차례의 시험으로도 시험-재시험 조건에서의 신뢰도 추정방법을 만들었는데, 그 논리는 검사결과를 반으로 나누는 것이다. 검사 결과를 무작위로 둘로 나눠서 시험-재시험 재생산가능성을 추정하는 것이다. 그러나 이러한 신뢰도는 오직 검사결과의 절반에 대한 것이며, 전체 검사의 신뢰도 추정을 위해서는 Spearman-Brown prophecy formula를 이용해서 추가적인 계산을 해야 한다.

Happily, measurement statisticians sorted out ways to estimate the test-retest condition many years ago, from a single testing.11 The logic is: the test-retest design divides a test into 2 random halves. Further, the correlation of the scores from the 2 random half tests approximates the test-retest reproducibility of the examination scores. (Note that this is the reliability of only half of the test and a further calculation must be applied, using the Spearman-Brown prophecy formula, in order to determine the reliability of the complete examination.12)
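
The split-half logic above can be sketched as follows. This is a minimal illustration, not the procedure of any particular software package: the function names and the seeded random split are assumptions, and the Spearman-Brown step uses the standard form k·r / (1 + (k − 1)·r), with k = 2 for stepping a half-test correlation up to the full test.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length score lists (undefined for constant scores)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r, length_factor=2.0):
    """Predicted reliability when a test is lengthened by length_factor."""
    return length_factor * r / (1.0 + (length_factor - 1.0) * r)


def split_half_reliability(item_scores, seed=0):
    """item_scores: one row per examinee, one column per item."""
    n_items = len(item_scores[0])
    order = list(range(n_items))
    random.Random(seed).shuffle(order)           # random assignment of items to halves
    half_a, half_b = order[: n_items // 2], order[n_items // 2:]
    totals_a = [sum(row[i] for i in half_a) for row in item_scores]
    totals_b = [sum(row[i] for i in half_b) for row in item_scores]
    r_half = pearson(totals_a, totals_b)         # reliability of half the test
    return r_half, spearman_brown(r_half)        # stepped up to the full test


# A half-test correlation of 0.50 corresponds to a full-test estimate of about 0.67.
print(round(spearman_brown(0.5), 3))
```

Because the result depends on which random split is drawn, different seeds give somewhat different estimates; averaging over all possible splits is the idea behind coefficient alpha.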

 

또 다른 통계적 도출방법은 모든 가능한 방식으로 검사결과를 두 개로 나누는 것이다. 이는 Cronbach’s alpha coefficient 에서 사용하는 것인데, Cronbach alpha는 polytomous data에서 사용하는 것으로 dichotomous data에서 사용하는 KR20에 비해서 보다 일반화된 형태라고 할 수 있다.

A further statistical derivation (making a few assumptions about the test and the statistical characteristics of the items) allows one to estimate internal consistency reliability from all possible ways to split the test into 2 halves: this is Cronbach’s alpha coefficient, which can be used with polytomous data (0, 1, 2, 3, 4, …n) and is the more general form of the KR 20 coefficient, which can be used only with dichotomously scored items (0, 1), such as typically found on selected-response tests.
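
The relationship between the two coefficients can be illustrated with a short sketch. It uses the standard textbook forms, alpha = k/(k − 1) · (1 − Σ item variances / total score variance) and KR 20 = k/(k − 1) · (1 − Σ p·q / total score variance); the function names and the five-examinee dataset are invented for illustration. For items scored 0 or 1 the item variance is p·q, so the two functions should agree.

```python
import statistics


def cronbach_alpha(item_scores):
    """item_scores: one row per examinee, one column per item (any numeric scoring)."""
    k = len(item_scores[0])
    item_vars = [statistics.pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = statistics.pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


def kr20(item_scores):
    """Same layout, but items must be scored 0 or 1."""
    n, k = len(item_scores), len(item_scores[0])
    p = [sum(row[i] for row in item_scores) / n for i in range(k)]  # proportion correct
    total_var = statistics.pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1.0 - sum(pi * (1.0 - pi) for pi in p) / total_var)


# Invented toy data: 5 examinees by 4 dichotomous items.
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(toy), 3), round(kr20(toy), 3))
```

The same `cronbach_alpha` function accepts polytomous ratings (0, 1, 2, 3, 4, …), whereas `kr20` is only meaningful for 0/1 scoring.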

 

지필검사에 대한 높은 내적일관성 신뢰도 추정은 이 검사가 나중에 다시 시행되어도 같은 결과가 반복될 것임을 시사한다.

A high internal consistency reliability estimate for a written test indicates that the test scores would be about the same, if the test were to be repeated at a later time.

 

 

평가자 자료의 신뢰도 (평가자의 신뢰도)

RELIABILITY OF RATER DATA

 

 

사람이 평가자로 들어가는 평가나, 사람의 평가가 자료의 일차 원천인 경우, 신뢰도 혹은 일관성에 대한 관심은 그 평가자에게 쏠린다. 임상상황에서의 평가나 구두평가의 재생산가능성의 가장 큰 위협은 개개의 평가자가 일관되지 못한 것, 혹은 다수 평가자간 재생산가능성이다. (그러나 대부분의 설계에서, 평가자는 item/case에 nested되어있거나 confounded되어 있기 때문에, item이나 case라는 컨텍스트를 배제하고 순수하게 평가자한테서만 기인한 에러를 추정하기는 불가능하다)

For all assessments that depend on human raters or judges for their primary source of data, the reliability or consistency of greatest interest is that of the rater or judge. The largest threat to the reproducibility of such clinical or oral ratings is rater inconsistency or low interrater reproducibility. (Technically, in most designs, raters or judges are nested or confounded with the items they rate or the cases, or both, so that it is often impossible to directly estimate the error associated with raters except in the context of items and cases.)

 

이러한 경우 평가척도를 활용한 평가에서 내적일관성(alpha)에 대한 관심보다는 평가자간 신뢰도가 신뢰도 추정에 있어서 더욱 중요하다.

The internal consistency (alpha) reliability of the rating scale (all items rated for each student) may be of some marginal interest to establish some communality for the construct assessed by the rating scale, but interrater reliability is surely the most important type of reliability to estimate for rater-type assessments.

 

평가자간 신뢰도를 추정하기 위한 여러 방법이 있다. 

There are many ways to estimate interrater reliability, depending on the statistical elegance desired by the investigator. 

  • 가장 단순한 방법은 일치도를 %로 나타내는 것인데, 간편히 쓰기는 좋지만 논문에 쓰기는 부적절하다.
    The simplest type of interrater reliability is percent agreement, such that for each item rated, the agreement of the 2 (or more) independent raters is calculated. Percent-agreement statistics may be acceptable for in-house or everyday use, but would likely not be acceptable to manuscript reviewers and editors of high quality publications, as these statistics do not account for the chance occurrence of agreement.
  • Kappa는 우연에 의해서 일치할 가능성을 고려한 것이며, 2명의 독립적 평가자에 대해서 종종 사용되는 방법이다. phi coefficient도 유사한 상관계수이지만, 우연히 일치할 가능성을 보정하지 않아서 과대추정하는 경향이 있다.
    The kappa13 statistic (a type of correlation coefficient) does account for the random-chance occurrence of rater agreement and is therefore sometimes used as an interrater reliability estimate, particularly for individual questions, rated by 2 independent raters. (The phi14 coefficient is the same general type of correlation coefficient, but does not correct for chance occurrence of agreement and therefore tends to overestimate true rater agreement.)
  • 가장 우아한 방법은 일반화가능도이론을 활용한 분석이다. 잘 설계되기만 하면 GT는 모든 관심 변인에 대한 variance component 를 알 수 있다.
    The most elegant estimates of interrater agreement use generalisability theory (GT) analysis.2–4 From a properly designed GT study, one can estimate variance components for all the variables of interest in the design: the persons, the raters and the items.
  • GT만큼 우아하지는 않지만, 가장 활용하기 좋은 방법은 ICC이다. ICC는 (GT와 마찬가지로) ANOVA를 활용한 것이며, 특정 요인과 관련된 variance를 추정해준다. 평가자간 신뢰도 분석에 ICC를 사용하는 것의 강점은 흔히 사용가능한 통계소프트웨어로 계산이 된다는 것이며, n명의 평가자에 대한 실제 평가자간 신뢰도를 계산해주며, 종종 큰 관심의 대상이 되는 한 평가자의 신뢰도도 계산할 수 있다는 것이다. 또한 결측치도 manage할 수 있다.
    A slightly less elegant, but perhaps more accessible method of estimating interrater reliability is by use of the intraclass correlation coefficient.15 Intraclass correlation uses analysis of variance (ANOVA), as does generalisability theory analysis, to estimate the variance associated with factors in the reliability design. The strength of intraclass correlation used for interrater reliability is that it is easily computed in commonly available statistical software and it permits the estimation of both the actual interrater reliability of the n-raters used in the study as well as the reliability of a single rater, which is often of greater interest. Additionally, missing ratings, which are common in these datasets, can be managed by the intraclass correlation.
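
Three of the estimates listed above can be sketched in plain Python. The assumptions should be read carefully: the function names and toy ratings are invented, kappa is shown in its two-rater (Cohen) form, and the intraclass correlation is shown in its simplest one-way ANOVA form with complete data. The source does not say which ICC model it has in mind, and this sketch does not handle the missing ratings mentioned above. Generalisability theory analysis is not attempted here.

```python
def percent_agreement(rater_a, rater_b):
    """Share of items on which two raters gave the same rating."""
    return sum(x == y for x, y in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    categories = set(rater_a) | set(rater_b)
    p_chance = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    return (p_observed - p_chance) / (1.0 - p_chance)


def icc_oneway(ratings):
    """ratings: one row per examinee, one column per rater, no missing values.

    Returns (single-rater reliability, reliability of the mean of all k raters).
    """
    n, k = len(ratings), len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    average = (ms_between - ms_within) / ms_between
    return single, average


# Invented pass (1) / fail (0) ratings of ten performances by two raters.
a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
b = [1, 1, 0, 0, 1, 0, 0, 1, 1, 0]
print(percent_agreement(a, b), round(cohen_kappa(a, b), 2))

# Invented scores for three examinees, each rated by the same two raters.
print(icc_oneway([[1, 2], [2, 3], [3, 4]]))
```

In the toy pass/fail data the raters agree on 80% of performances, yet kappa is noticeably lower because half of that agreement would be expected by chance, which is the point made about percent agreement above.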

 

 

수행능력 평가의 신뢰도 (OSCE와 SP)

RELIABILITY OF PERFORMANCE EXAMINATIONS: OSCES AND SPS

 

병동에서의 평가는 실제 상황에서 이러한 술기를 평가하려 하지만 (통제되지 않는 변수가 많고 표준화가 부족하여 신뢰도가 낮아지는 경향이 있다), SP 시험이나 OSCE는 이러한 술기를 보다 표준화되고 통제된 방식으로 평가할 수 있다.

While ward-type evaluations attempt to assess some of these skills in the real setting (which tends to lower reliability due to the interference of many uncontrolled variables and a lack of standardisation), simulated patient (SP) examinations and objective structured clinical examinations (OSCEs) can be used to assess such skills in a more standardised, controlled fashion.


신뢰도 분석시에 수행능력 검사에서 특히 고려해야 할 점이 있다. 이 경우 평가문항(item)이 평가사례(case)에 nested되어 있기 때문에, 신뢰도분석의 단위는 case가 되어야 하며 item이 되어서는 안된다. 모든 신뢰도 분석에 있어서 공통된 한 가지 가정은, 각각의 item이 'locally independent'하다는 것이며, 이것이 의미하는 바는 모든 item의 점수가 다른 item에 대해서 논리적으로 독립적이어야 한다는 것이다. 한 세트 내에 nested된 item의 경우(예컨대 OSCE, SP, Key feature, MCQ의 testlet)는 모두 이 local independence 가정에 위배되는 것이다. 따라서 case set이 신뢰도 분석의 단위가 되어야 한다. 실제 예를 들어보면 20개 스테이션으로 된 OSCE를 시행하며, 각 스테이션에 5개 item이 있다면, 신뢰도 분석은 20개의 OSCE점수를 대상으로 해야지, 100개의 item을 대상으로 하면 안된다. 20개 OSCE점수로부터 나온 결과는 100개 item으로부터 나온 것보다 분명 낮을 것이다.

Performance examinations pose a special challenge for reliability analysis. Because the items rated in a performance examination are typically nested in a case, such as an OSCE, the unit of reliability analysis must necessarily be the case, not the item. One statistical assumption of all reliability analyses is that the items are locally independent, which means that all items must be reasonably independent of one another. Items nested in sets, such as an OSCE, an SP examination, a key features item set16 or a testlet17 of multiple choice questions (MCQs), generally violate this assumption of local independence. Thus, the case set must be used as the unit of reliability analysis. Practically, this means that if one administers a 20-station OSCE, with each station having 5 items, the reliability analysis must use the 20 OSCE scores, not the 100 individual item scores. The reliability estimate for 20 observations will almost certainly be lower than that for 100 observations.
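
In practice, using the case as the unit of analysis means collapsing each station's items into one station score before any reliability coefficient is computed. The sketch below shows that step under simple assumptions: items are stored in station order, every station has the same number of items, a station score is the sum of its items, and the helper names and toy data (2 stations of 2 items, rather than 20 stations of 5) are invented.

```python
import statistics


def case_scores(item_scores, items_per_case):
    """Collapse item-level rows into case-level rows by summing each case's items."""
    collapsed = []
    for row in item_scores:
        if len(row) % items_per_case != 0:
            raise ValueError("row length must be a multiple of items_per_case")
        collapsed.append(
            [sum(row[i:i + items_per_case]) for i in range(0, len(row), items_per_case)]
        )
    return collapsed


def cronbach_alpha(scores):
    """Coefficient alpha, treating each column of scores as one unit of analysis."""
    k = len(scores[0])
    unit_vars = [statistics.pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = statistics.pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(unit_vars) / total_var)


# Invented checklist data: 4 examinees, 2 stations with 2 items each.
items = [
    [1, 1, 1, 1],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
]
stations = case_scores(items, items_per_case=2)
print(stations)                              # one column per station
print(round(cronbach_alpha(stations), 3))    # alpha computed over stations, not items
```

For the 20-station example in the text, the same call with `items_per_case=5` would turn each examinee's 100 item scores into the 20 station scores that the analysis should use.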

 

 

평가에서 신뢰도 계수를 어떻게 활용할 수 있을까?

HOW ARE RELIABILITY COEFFICIENTS USED IN ASSESSMENT?

 

한 가지 실제 활용 방식은 SEM을 계산하는데 사용하는 것이다.

One practical use of the reliability coefficient is in the calculation of the standard error of measurement (SEM). The SEM for the entire distribution of scores on an assessment is given by the formula:12

SEM = standard deviation × √(1 − reliability)

 

 

이 SEM은 신뢰구간을 계산하는데 사용할 수 있다.

This SEM can be used to form confidence bands around the observed assessment score, indicating the precision of measurement, given the reliability of the assessment, for each score level.

 

 

 

신뢰도는 어느 정도나 되어야 하는가? 

HOW MUCH RELIABILITY IS ENOUGH?

 

매우 high stake인 경우 0.9는 되어야 한다고 하며(예컨대 면허나 자격증 시험과 같이 평가대상자와 사회에 미치는 영향이 지대한 경우), moderate stake (학기말 고사, 연말고사)의 경우 0.8~0.89, low stake (수업시간의 평가)에서는 0.7~0.79 등과 같다.

If the stakes are extremely high, the reliability must be high in order to defensibly support the validity evidence for the measure. Various authors, textbook writers and researchers offer a variety of opinions on this issue, but most educational measurement professionals suggest a reliability of at least 0.90 for very high stakes assessments, such as licensure or certification examinations in medicine, which have major consequences for examinees and society. For more moderate stakes assessments, such as major end-of-course or end-of-year summative examinations in medical school, one would expect reliability to be in the range of 0.80–0.89, at minimum. For assessments with lower consequences, such as formative or summative classroom-type assessments, created and administered by local faculty, one might expect reliability to be in the range of 0.70–0.79 or so.

 

 

신뢰도 계수의 절대값보다는 평가대상자에 대한 위양성 혹은 위음성 판정에 따른 결과가 훨씬 중요하다.

The consequences on examinees of false positive or false negative outcomes of the assessment are far more important than the absolute value of the reliability coefficient.

 

pass/fail 결정의 재생산가능성을 추정하는 한 방법은 pass/fail reproducibility index를 계산하는 것인데, 이는 어느 정도나 confidence할 수 있는가에 대한 지수이다. 0에서 1 사이의 값으로 나타나며, 이것을 해석할 때는 동일한 pass / fail결정이 재시험에서도 이루어질 것인가에 대한 가능성이다. 일반화가능도이론으로도 커트라인 점수에 대한 측정 정밀도를 계산할 수 있다.

One method of estimating this pass/fail decision reproducibility was presented by Subkoviak20 and permits a calculation of a pass/fail reproducibility index, indicating the degree of confidence one can place on the pass/fail outcomes of the assessment. Pass/fail decision reliability, ranging from 0.0 to 1.0, is interpreted as the probability of an identical pass or fail decision being made upon retesting. Generalisability theory also permits a calculation of the precision of measurement at the cut score (a standard error of measurement at the passing score), which can be helpful in evaluating this all-important accuracy of classification.

 

평가자료의 해석에 있어서 신뢰도가 낮은 경우에 일어날 결과는 무엇일까? Wainer and Thissen은 Table 1과 같은 결과를 제시했다.

What are some of the practical consequences of low reliability of the interpretation of assessment data? Wainer and Thissen21 discuss the expected change in test scores, upon retesting, for various levels of score reliability (Table 1).

 

신뢰도가 낮을 경우 재시험 상황에서 예상할 수 있는 점수의 변화폭이 상당히 크다. 예컨대 신뢰도가 0.5라면..

Expected changes in test scores upon retesting can be quite large, especially for lower levels of reliability. Consider this example: a test score distribution has a mean of 500 and a standard deviation of 100. If the score reliability is 0.50, the standard error of measurement equals 71.

 

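The 95% confidence interval for a student who scored 575 is 575 ± 139, so the plausible scores on retesting range from 436 to 714. This is a very wide range, and this level of reliability is not uncommon (especially in rater-based or performance examinations). Even at a reliability of 0.75, scores can vary by up to 98 points.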

Thus, a 95% confidence interval for a student scoring 575 on this test is 575 ± 139. Upon retesting this student, we could reasonably expect 95 of 100 retest scores to fall somewhere in the range of 436–714. This is a very wide score interval, at a reliability level that is not uncommon, especially for rater-based oral or performance examinations in medical education. Even at a more respectable reliability level of 0.75, using the same data example above, we would reasonably expect this student’s scores to vary by up to 98 score points upon repeated retesting. The effect of reliability on reasonable and meaningful interpretation of assessment scores is indeed real.
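
The figures in this example can be reproduced with a short calculation. The sketch below assumes the classical test theory relationship SEM = SD × sqrt(1 − reliability) and a normal-theory multiplier of 1.96 for the 95% interval; the function names are illustrative and do not come from the article.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def confidence_interval(score, sd, reliability, z=1.96):
    """Symmetric interval around an observed score (z = 1.96 for 95%)."""
    half_width = z * standard_error_of_measurement(sd, reliability)
    return score - half_width, score + half_width


if __name__ == "__main__":
    # Values from the example in the text: SD = 100, observed score = 575.
    for reliability in (0.50, 0.75):
        sem = standard_error_of_measurement(100, reliability)
        low, high = confidence_interval(575, 100, reliability)
        print(f"reliability={reliability:.2f}  SEM={sem:.1f}  "
              f"95% CI={low:.0f} to {high:.0f}  half-width={1.96 * sem:.0f}")
```

With a reliability of 0.50 the SEM rounds to 71 and the interval to 436–714; with 0.75 the SEM is 50 and the half-width is 98 points, matching the values quoted above.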

 



 

Improving reliability

IMPROVING RELIABILITY OF ASSESSMENTS

 

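There are ways to improve reliability. The most important is to use a sufficiently large number of test items, raters and cases. A common cause of low reliability is using far too few items, cases or raters. Items and prompts should be written clearly so that they cause no confusion, and content experts should review them thoroughly. Use cases or items of medium difficulty. If items are too easy or too hard, nearly all students get them right or wrong, very little information is gained about student achievement, and reliability will be low.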

There are several ways to improve the reliability of assessments. Most important is the use of sufficiently large numbers of test questions, raters or performance cases. One frequent cause of low reliability is the use of far too few test items, performance cases or raters to adequately sample the domain of interest. Make sure the questions or performance prompts are clearly and unambiguously written and that they have been thoroughly reviewed by content experts. Use test questions or performance cases that are of medium difficulty for the students being assessed. If test questions or performance prompts are very easy or very hard, such that nearly all students get most questions correct or incorrect, very little information is gained about student achievement and the reliability of these assessments will be low. (In mastery-type testing, this will present different issues.)
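
The effect of adding items, cases or raters can be illustrated with the Spearman-Brown prophecy formula. The excerpt does not name this formula; it is used here as a standard approximation that assumes the added items are parallel to the existing ones, and the function names are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`.

    Assumes the added items are parallel to the existing ones.
    """
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


def length_factor_needed(current, target):
    """Factor by which a test must be lengthened to reach `target` reliability."""
    return (target * (1.0 - current)) / (current * (1.0 - target))


if __name__ == "__main__":
    # Illustrative starting point: a short assessment with reliability 0.50.
    for k in (2, 3, 4):
        print(f"{k}x length -> predicted reliability "
              f"{spearman_brown(0.50, k):.2f}")
    for target in (0.80, 0.90):
        print(f"target {target:.2f} -> lengthen by a factor of "
              f"{length_factor_needed(0.50, target):.1f}")
```

Under these assumptions, an assessment with a reliability of 0.50 would need roughly four times as many items to reach 0.80 and nine times as many to reach 0.90, which is why too few items, cases or raters is such a common cause of low reliability.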

 

If possible, obtain results in advance, for example through a pretest.

If possible, obtain pretest or tryout data from assessments before they are used as live or scored questions. However, it is possible to bank effective test questions or performance cases in secure item pools for reuse later.

 

 

 

 

 

  

 

 

 


 

 2004 Sep;38(9):1006-12.

Reliability: on the reproducibility of assessment data.

Author information

  • 1Department of Medical Education, College of Medicine, University of Illinois at Chicago, 808 South Wood Street, Chicago, IL 60612-7309, USA. sdowning@uic.edu

Abstract

CONTEXT:

All assessment data, like other scientific experimental data, must be reproducible in order to be meaningfully interpreted.

PURPOSE:

The purpose of this paper is to discuss applications of reliability to the most common assessment methods in medical education. Typical methods of estimating reliability are discussed intuitively and non-mathematically.

SUMMARY:

Reliability refers to the consistency of assessment outcomes. The exact type of consistency of greatest interest depends on the type of assessment, its purpose and the consequential use of the data. Written tests of cognitive achievement look to internal test consistency, using estimation methods derived from the test-retest design. Rater-based assessment data, such as ratings of clinical performance on the wards, require interrater consistency or agreement. Objective structured clinical examinations, simulated patient examinations and other performance-type assessments generally require generalisability theory analysis to account for various sources of measurement error in complex designs and to estimate the consistency of the generalisations to a universe or domain of skills.

CONCLUSIONS:

Reliability is a major source of validity evidence for assessments. Low reliability indicates that large variations in scores can be expected upon retesting. Inconsistent assessment scores are difficult or impossible to interpret meaningfully and thus reduce validity evidence. Reliability coefficients allow the quantification and estimation of the random errors of measurement in assessments, such that overall assessment can be improved.

PMID: 15327684 [PubMed - indexed for MEDLINE]


Comparison of Situational and Behavior Description Interview Questions for Higher-Level Positions (Personnel Psychology, 2001)

COMPARISON OF SITUATIONAL AND BEHAVIOR DESCRIPTION INTERVIEW QUESTIONS FOR HIGHER-LEVEL POSITIONS


ALLEN I. HUFFCUTT Department of Psychology Bradley University
JEFF A. WEEKLEY Kenexa

WILLI H. WIESNER, TIMOTHY G. DEGROOT Department of Psychology McMaster University
CASEY JONES Kenexa

 

 

 

 

 

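Pulakos and Schmitt hypothesized that situational interviews (SI) are less effective than behavior description interviews (BDI) for higher-level positions. To evaluate their hypothesis, we analyzed two new structured interview studies. Both involved selection for higher-level positions and matched SI and BDI questions written to assess the same job characteristics. The results show that SI is less predictive of performance for these positions. Moreover, although the SI and BDI questions were matched to assess the same job characteristics, their correspondence was very low, and BDI ratings were related to Extroversion. Possible reasons for the lower effectiveness of SI are discussed.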

Based on a study of federal investigative agents, Pulakos and Schmitt (1995) hypothesized that situational interviews are less effective for higher-level positions than behavior description interviews. To evaluate their hypothesis we analyzed data from 2 new structured interview studies. Both of these studies involved higher-level positions, a military officer and a district manager respectively, and had matching SI and BDI questions written to assess the same job characteristics. Results confirmed that situational interviews are much less predictive of performance in these types of positions. Moreover, results indicated very little correspondence between situational and behavior description questions written to assess the same job characteristic, and a link between BDI ratings and the personality trait Extroversion. Possible reasons for the lower situational interview effectiveness are discussed.

 

 


 


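SI and BDI are the two most popular formats for modern structured interviews. In an SI, applicants are asked how they would respond to hypothetical job situations. SI is grounded in goal-setting theory, which assumes that intentions (goals) are the immediate precursor of actions.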

Situational and behavior description interviews have emerged as the two most popular formats for constructing modern structured interviews (Campion, Palmer, & Campion, 1997; Harris, 1989). In a situational interview (SI) applicants are given hypothetical job situations and asked to indicate how they would respond (Latham, Saari, Pursell, & Campion, 1980). Situational interviews are grounded in goal-setting theory, particularly in that intentions (i.e., goals) are the immediate precursor of a person’s actions (Latham, 1989). In a behavior description interview

 

In a BDI, applicants are asked about relevant past experiences; BDI rests on the premise that past behavior is the best predictor of future behavior.

(BDI) applicants are asked to relate actual incidents from their past relevant to the target job (Janz, 1982). Behavior description interviews are grounded in the premise that the past is the best predictor of the future (Janz, 1989).

 

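However, the study by Pulakos and Schmitt points to a potential threat to the validity scenario above. They developed an SI and a BDI for a fairly complex job and found, in a sample of 216, that the correlations with performance evaluations were -0.02 for the SI and 0.32 for the BDI. What is particularly important in this study is their hypothesis that SI may not be effective for higher-level positions; if true, this has considerable implications for the science and practice of structured interviewing.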

However, a study by Pulakos and Schmitt (1995) suggests a possible caveat to the above validity scenario. They developed both situational and behavior description interviews for a fairly complex position, a federal investigative agent. In a sample of 216 incumbents (108 for each format), the correlations with performance evaluations were -0.02 for the SI and 0.32 for the BDI. What is particularly important about this study is their hypothesis that situational interviews may not be as effective for higher-level positions as they are for lower-level positions. If true, this has very important implications for the science and practice of structured interviewing.

 





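So why would SI be less effective than BDI in interviews for higher-level positions? The first explanation is that SI questions are too simple for these candidates. However, analysis of the means and standard deviations shows that this was not the case in any of the three studies; if anything, the SI questions had slightly better properties as items than the BDI questions. A second possible explanation is that responses to SI questions are harder to rate for higher-level positions. The interrater reliability data make this unlikely as well. A third explanation is that BDI questions are valid because they capture job performance in the current or a recent position. The data from Pulakos and Schmitt's study and from our second study do not support this hypothesis. Another possibility is that SI and BDI assess different constructs. Our results show that, at least for higher-level positions, SI and BDI questions written to assess the same job characteristics do not correspond well. This low correspondence is an important finding that has not been reported elsewhere in the interview literature. One implication is that SI and BDI should not be regarded as interchangeable methods of measurement. It is better to view them as separate testing devices that assess different constructs. A second implication is that, for higher-level positions, whatever constructs BDI captures are more predictive of performance than those captured by SI.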

So why would situational interviews be less effective than behavior description interviews for higher-level positions? The first possible explanation is that SI questions are just too simple for higher-level positions. Analysis of the means and standard deviations suggest that this was not the case in any of the three studies. Rather, it was not uncommon for the SI questions to have slightly better properties than the BDI questions. The second possible explanation is that responses to SI questions are more difficult to rate with higher-level positions. Analysis of interrater reliability data in all three studies again suggests that this was not the case. The third possible explanation is that BDI questions are valid because they capture job performance in either the current or a recent position, an explanation which is particularly viable in concurrent designs. Data available in Pulakos and Schmitt (1995) and in our second study (both of which were concurrent) does not support this idea either. Another possible explanation is that SI and BDI ratings tend to capture different constructs. Our results strongly suggest that SI and BDI questions written to assess the same job characteristics do not tend to correspond, at least not for higher-level positions. This lack of correspondence is an important finding, one we are unaware of anywhere else in the interview literature. One implication of this finding is that situational and behavior description formats probably should not be considered as alternate methods of measurement. Rather, it might be more appropriate to view them as separate testing devices, ones which for the most part capture different constructs. A second implication is that whatever constructs BDI questions tend to capture for higher-level positions are more predictive of performance than whatever constructs SI questions tend to capture.

 

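Finally, there is a methodological issue concerning SI validity that should be mentioned. In Pulakos and Schmitt's study, some candidates tried to think through every possible contingency, while others gave only superficial responses. Because the latter answers were still correct, candidates who answered through more complex thinking did not necessarily receive higher ratings, even though they arguably showed themselves to be better candidates. The implication of Pulakos and Schmitt's observation is that the SI scoring system is better suited to jobs of lower complexity. An examination of the benchmark answers in SI studies shows that SI scoring focuses strictly on what action the candidate would take, not on why or how they would take it. A focus on overt actions may be perfectly adequate for lower-level positions. For higher-level positions, however, how candidates arrived at a particular action and why they chose it can be as important as the action itself.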

Last, there is a methodological issue related to SI validity in higher-level positions that warrants mention. During the Pulakos and Schmitt (1995) study it was observed that some candidates thought through every possible contingency when answering the SI questions and other applicants gave more superficial responses. Because the latter answers were still essentially correct, candidates engaging in more complex thought did not necessarily receive higher ratings even though a case could be made that they represented better job candidates. The implication of Pulakos and Schmitt’s (1995) observation is that the standard SI scoring system may be better suited for jobs of lower complexity. An examination of the benchmark answers provided as examples in several SI studies illustrates the tendency for SI scoring to be based strictly on what overt action candidates would take (e.g., Campion et al., 1994, Latham & Saari, 1984), not on why they would take that action or how they arrived at that action. A focus upon overt actions may be perfectly adequate (and even preferred) for lower-level positions. But for higher-level positions knowing how candidates arrived at a particular action and why they chose that action is often just as important as the action itself.


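It is also important that interviewers are not allowed to probe responses to SI questions, even though probing is where information about why a candidate chose an action is most likely to emerge. There is not yet direct evidence that the SI scoring system is the problem; further research is needed.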

It is also important to point out that interviewers typically are not allowed to probe responses to SI questions, and probing is where information related to why they choose a particular action would be most likely to emerge. Admittedly we do not have direct evidence at the present time that the standard SI scoring system is the culprit. Nonetheless, this issue and its implications are important enough to warrant investigation of modifications to the SI scoring system in future research.

 


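In summary, the results of this study make the following contributions. Most importantly, they support Pulakos and Schmitt's hypothesis that SI does not work as well for higher-level positions. Taking their original study together with our two new studies, we can recommend that interview developers exercise caution when using SI for higher-level positions. This recommendation carries particular weight because all three studies directly compared SI and BDI. In addition, ratings on SI and BDI questions showed very low correspondence. Interestingly, the one study that compared SI and BDI for a lower-level position found much higher correspondence. Finally, the association between BDI ratings and Extroversion scores suggests that verbal presentation skills may have had a considerable influence.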

In summary, results of this investigation contribute to the interview literature in several ways. Probably the most important contribution is they support Pulakos and Schmitt’s (1995) hypothesis that situational interviews do not tend to work as well for higher-level positions. Based on the combined results of their original study and our two new studies, the formal recommendation can now be made that interview developers should exercise considerable caution when using the standard SI format for higher-level positions. What makes this recommendation particularly viable is that all three of these studies involved direct comparison of situational and behavior description interviews for the same position. In addition, our results suggest a strong lack of correspondence between SI and BDI questions written to assess the same job characteristics in higher-level positions. What is interesting is that the one published study which involved a direct comparison of SI and BDI validity for the same lower-level position found a much higher correspondence (Campion et al., 1994). Finally, our results suggest an association between BDI ratings and Extroversion scores, which may point to a larger influence from verbal presentation skills.

 

 

 

 

 

 


COMPARISON OF SITUATIONAL AND BEHAVIOR DESCRIPTION INTERVIEW QUESTIONS FOR HIGHER-LEVEL POSITIONS

  1. ALLEN I. HUFFCUTT1,*, 
  2. JEFF A. WEEKLEY2, 
  3. WILLI H. WIESNER3,
  4. TIMOTHY G. DEGROOT3 and
  5. CASEY JONES2

Article first published online: 7 DEC 2006

DOI: 10.1111/j.1744-6570.2001.tb00225.x

Personnel Psychology


Volume 54, Issue 3, pages 619–644, September 2001

