Analyzing the difference between self-assessment and self-monitoring (Adv in Health Sci Educ, 2011)

Exploring the divergence between self-assessment and self-monitoring

Kevin W. Eva • Glenn Regehr






Many models of professional self-regulation require practitioners to recognize the limits of their own skills and to take responsibility for addressing them through CPD activities. Contrary to this expectation, however, much of the self-assessment literature questions whether self-regulating professionals can carry out this process effectively. The authors have previously argued that the construct of self-assessment as represented in the self-regulation literature is itself problematic. In this paper, two studies examine the relationship between self-assessment and self-monitoring.

  • Self-assessment: a global judgment of one’s ability in a particular domain
  • Self-monitoring: a moment-by-moment awareness of the likelihood that one maintains the skill/knowledge to act in a particular situation

Many models of professional self-regulation call upon individual practitioners to take responsibility both for identifying the limits of their own skills and for redressing their identified limits through continuing professional development activities. Despite these expectations, a considerable literature in the domain of self-assessment has questioned the ability of the self-regulating professional to enact this process effectively. In response, authors have recently suggested that the construction of self-assessment as represented in the self-regulation literature is, itself, problematic. In this paper we report a pair of studies that examine the relationship between self-assessment (a global judgment of one’s ability in a particular domain) and self-monitoring (a moment-by-moment awareness of the likelihood that one maintains the skill/knowledge to act in a particular situation). 


These studies show that although the correlation between performance and self-assessment is low, performance correlates strongly with several measures of self-monitoring.

These studies reveal that, despite poor correlations between performance and self-assessments (consistent with what is typically seen in the self-assessment literature), participant performance was strongly related to several measures of self-monitoring including: 

  • the decision to answer or defer responding to a question, 
  • the amount of time required to make that decision to answer or defer, and 
  • the confidence expressed in an answer when provided. 


This divergence helps in understanding the cognitive mechanisms underlying both self-monitoring judgments and self-assessments, and in understanding how education and learning efforts might be better directed.

This apparent divergence between poor overall self-assessment and effective self-monitoring is considered in terms of how the findings might inform our understanding of the cognitive mechanisms yielding both self-monitoring judgments and self-assessments and how that understanding might be used to better direct education and learning efforts.







Doubts have been raised about whether individuals can self-assess effectively.

There is now a well-established literature that raises doubts about the capacity of individuals to effectively self-assess either personal (Dunning et al. 2004) or professional (Gordon 1991; Boud 1995; Davis et al. 2006) areas of relative strength and weakness. Increasingly it is being recognized that self-assessment as ‘‘a process of personal reflection based on an unguided review of practice and experience for the purposes of making judgments regarding one’s own current level of knowledge, skills, and understanding as a prequel to self-directed learning activities that will improve overall performance and thereby maintain competence’’ (Eva and Regehr 2007, p. 81) is inherently flawed.





















Adv Health Sci Educ Theory Pract. 2011 Aug;16(3):311-29. doi: 10.1007/s10459-010-9263-2. Epub 2010 Nov 30.

Exploring the divergence between self-assessment and self-monitoring.

Author information

  • 1University of British Columbia, Vancouver, BC, Canada. kevin.eva@ubc.ca

Abstract

Many models of professional self-regulation call upon individual practitioners to take responsibility both for identifying the limits of their own skills and for redressing their identified limits through continuing professional development activities. Despite these expectations, a considerable literature in the domain of self-assessment has questioned the ability of the self-regulating professional to enact this process effectively. In response, authors have recently suggested that the construction of self-assessment as represented in the self-regulation literature is, itself, problematic. In this paper we report a pair of studies that examine the relationship between self-assessment (a global judgment of one's ability in a particular domain) and self-monitoring (a moment-by-moment awareness of the likelihood that one maintains the skill/knowledge to act in a particular situation). These studies reveal that, despite poor correlations between performance and self-assessments (consistent with what is typically seen in the self-assessment literature), participant performance was strongly related to several measures of self-monitoring including: the decision to answer or defer responding to a question, the amount of time required to make that decision to answer or defer, and the confidence expressed in an answer when provided. This apparent divergence between poor overall self-assessment and effective self-monitoring is considered in terms of how the findings might inform our understanding of the cognitive mechanisms yielding both self-monitoring judgments and self-assessments and how that understanding might be used to better direct education and learning efforts.

PMID: 21113820 [PubMed - indexed for MEDLINE] PMCID: PMC3139875


Rethinking the globalisation of PBL: how culture challenges self-directed learning (Med Educ, 2012)

Rethinking the globalisation of problem-based learning: how culture challenges self-directed learning

Janneke M Frambach,1 Erik W Driessen,1 Li-Chong Chan2 & Cees P M van der Vleuten1







Education methods reflect culture and ideology. At a time when education methods are continuously spreading and being shared around the world, it is important to examine the cross-cultural implications of this notion.

It is generally acknowledged that education methods reflect cultural and ideological values.1–3 Addressing the cross-cultural implications of this notion is increasingly urgent in view of the continuing dissemination of education methods around the globe.


Counterarguments that the assumption of shared values across cultures may be false have largely been ignored. Globalisation has brought standardisation of education methods, which are used across cultures with little regard for cultural differences. Research outside medical education has shown cross-cultural differences in students' learning and in their preferences for educational approaches. Consequently, the cultural origins of a supposedly 'international' education method may make it unsuitable for other cultural contexts.

Counterarguments that the assumption of shared values across cultures may be false seem to be largely ignored.8 Driven by ideological or other motives, the globalisation movement promotes the standardisation of education methods and practices across cultures, apparently with little regard for cultural differences.9,10 Research outside medical education has revealed differences between cultures in students’ learning and preferences for educational approaches.11–13 Consequently, the cultural origin of a supposedly ‘international’ educational approach may compromise its suitability for other cultural contexts.3


Rooted in Western culture, student-centred, problem-based methods may not be truly international, and their suitability for various non-Western cultures has been questioned. Gwee and Khoo noted Asian cultural attitudes that may fit poorly with the educational principles of PBL, but also attitudes that might narrow this gap. The few empirical studies of the cross-cultural applicability of PBL have reported positive views among students and staff, but have also noted problems and differences from Western practice. Most of these studies were limited to the implementation phase of PBL or shortly thereafter, or were conducted in single institutions or regions.

Rooted in Western culture, student-centred, problem-based methods may not be of a truly international nature3,14 and their compatibility with non-Western cultures has been questioned.15 Gwee5 and Khoo16 pointed to Asian cultural attitudes that might be difficult to reconcile with the educational principles of PBL, but also noted attitudes that might mitigate this discrepancy. The few empirical studies into the cross-cultural applicability of PBL reported positive views among students and staff,6,7,17,18 but also noted problems and assumed differences with Western practice.17–19 Most of these studies were limited to the implementation phase of PBL or shortly thereafter and to single institutions, countries or regions, mainly in Asia.


The present study examines how cultural factors affect self-directed learning (SDL), one of PBL's main educational principles. This principle is said to rest strongly on 'Western ideals of democracy, individualism and egalitarianism'.

The present study investigates whether and how cultural factors affect one of PBL’s main educational principles: self-directed learning (SDL).20 It has been argued that this principle relies strongly on ‘Western ideals of democracy, individualism and egalitarianism’.21 It is defined here as: 


The preparedness of a student to engage in learning activities defined by the student rather than by a teacher. 'Preparedness' includes not only motivation but also the appropriate skilled behaviour. Thus, a self-directed learner has an intrinsic need to acquire knowledge that is not dictated by teachers. In addition, the student must master appropriate information-seeking skills, that is, knowing where and how to find the information resources that will meet that need.

‘…the preparedness of a student to engage in learning activities defined by himself rather than by a teacher. ‘‘Preparedness’’ must be understood as having both a motivational aspect and involving skilled behaviour. Thus, an accomplished self-directed learner experiences an intrinsic need to acquire knowledge, not dominated by requirements set by his teachers. In addition, he has mastered the appropriate information seeking skills, that is: he knows where and how to find information resources that would fulfil his need.’22


Culture refers to the shared motives, values and beliefs that bind the members of a collective together.

Culture is defined as the shared motives, values, beliefs and identities of members of collectives.13 


Socio-cultural theorists hold that humans are continuously shaped by their environment as they internalise its norms and characteristics, and that they in turn influence and transform their environment by externalising their own ideas and values.

Socio-cultural theorists state that humans are continuously influenced and shaped by their environment as they ‘internalise’ its norms and characteristics.26,27 Conversely, humans influence and transform their environment by ‘externalising’ their inner ideas and values.27




Data collection

The semi-structured interviews lasted 1 hour on average and were audio-recorded and transcribed verbatim. Oral and written informed consent was obtained. The participants received a symbolic gift. Purposive sampling ensured the inclusion of male and female students, students from different PBL groups and from the first and third years of training. 


The researchers were briefly introduced at the start of the tutorials and did not participate in sessions. Documents about the implementation and application of PBL were obtained from the key persons. The researchers kept journals in which they recorded additional contextual information. They also reported personal perspectives to create awareness of potential researcher bias. To enhance the trustworthiness of the data, a member check was conducted by asking a sample of the participants to indicate agreement with and comment on a report of preliminary results. The comments were integrated with the data.


Using the thematic approach of template analysis,28 a succession of coding templates, consisting of hierarchically structured themes, were applied to the data (Fig. 1).


28 King N. Using templates in the thematic analysis of text. In: Cassel C, Symon G, eds. Essential Guide to Qualitative Methods in Organizational Research. London: Sage Publications 2004;256–70.










Uncertainty and tradition in the Middle East 


Middle Eastern students expressed more feelings of uncertainty as a cultural factor compared with Dutch and Hong Kong students. Their uncertainty and difficulties in adapting to SDL were related to sharp contrasts between PBL and their prior educational experiences. Rather than feeling motivated, many students felt lost and unable to find appropriate information to address their learning objectives. Uncertainty was related to experiences of traditional, teacher-centred secondary education, but also to a culturally determined focus on tradition. Middle Eastern respondents referred to their society’s respect for the ‘old ways’ and wariness regarding innovations. As they became used to PBL, however, their attitudes changed significantly. Students came to support the principle of SDL and information seeking became less problematic, although students still felt PBL was not easy and wanted more guidance:



Having experienced information searching and self- study in secondary school, Dutch students had less difficulty in adapting to SDL. In addition, Dutch culture places less value on tradition. Although the Dutch students were less uncertain, they required time to develop information-seeking skills and they generally preferred tutors who provided clear guidance. A particular problem for them concerned determination of the depth and breadth of the knowledge to be attained. Dutch, Hong Kong and, particularly, Middle Eastern students tried to cope with uncertainty and independence by asking senior students for advice and materials. Although it reduced insecurity, this strategy discouraged them from depending upon themselves for their learning:


Hybridism and hierarchy in Hong Kong 


From the outset, finding information was less difficult for Hong Kong students. As topics of tutorials were also covered in lectures in the hybrid curriculum, identifying learning needs and developing information-seeking skills were less relevant to Hong Kong students. They showed little awareness that PBL was intended to foster SDL. Whereas the lectures covered the basic sciences, the Hong Kong tutorials focused more on clinical reasoning skills. By contrast, the Dutch and Middle Eastern students had to rely on tutorials for most of their knowledge. The Hong Kong students often felt the tutorials repeated the content of lectures, which some appreciated as providing a useful opportunity for revision and a chance to apply their knowledge to a clinical case, but others considered a waste of valuable study time:



Some Hong Kong students were anxious about multiple interpretations that might come up during tutorial discussions because these created uncertainty about the ‘truth’ and they were hesitant about trusting their peers’ statements. This reflected their experience of a teacher-centred secondary education, as well as the culture of a hierarchical society in which knowledge and authoritative statements about the ‘truth’ are expected to come from professors or experts who represent persons of higher status. Students were not used to having to rely entirely on themselves for their learning. They also attached greater value to tutorials that were facilitated by expert clinicians rather than by non-experts. Although Dutch students also preferred facilitation by expert tutors, they were comfortable relying on their peers. The impact of hierarchy was also evident in the Middle Eastern school and manifested in students’ experiencing of anxiety about the requirement to search independently for the ‘truth’. However, by Year 3, student anxiety in the Middle East school had abated, whereas student anxiety in Hong Kong showed little difference between the years:



Achievement and assessment across cultures 


Middle Eastern and Hong Kong students characterised themselves and their respective societies as competitive and described themselves as striving for success and to be the best. They felt pressured to pass examinations and rank among the top students:


Dutch students were also examination-focused, although their responses during interviews suggested a lower level of culture-related focus on achievement and success compared with the other two cohorts. The general feeling among the three groups of students was that they valued PBL only for its contribution to their examination preparation. This depended on examination content. In Hong Kong, examination content was mainly determined by lectures. In the Middle Eastern and Dutch schools, it depended more on PBL tutorials. However, particularly in the Middle Eastern school, the inclusion of additional topics caused students to concentrate on these predetermined additional topics and their lecture notes more than on identifying and addressing their individual learning needs. Even if they supported and understood the principle of SDL, achievement and assessment took priority, directing their attention and efforts away from SDL to exam- ination content:




Discussion


Uncertainty, tradition, hierarchy and an emphasis on achievement are generally more prominent in non-Western than in Western cultures. The incongruity between PBL and non-Western cultures may arise from this, complicating the straightforward transfer of PBL to those cultural contexts.

Uncertainty, tradition, hierarchy and achievement have often been identified as more prominent in non-Western than in Western cultures.29–31 This suggests a certain incongruity between PBL and non-Western cultures, which complicates the straightforward transfer of PBL to such cultural contexts.


Cultural factors did not explain everything, however; other contextual factors also played a role.

However, cultural factors clearly do not explain all of the discrepancies in findings between the respective contexts. Several contextual factors, such as a 

    • traditional, teacher-centred secondary education, 
    • a hybrid curriculum and 
    • examination content not covered during PBL sessions 

further complicated students’ development of SDL skills. 


For example, Hong Kong secondary education offers little opportunity for SDL, and students become teacher-dependent.

For example, the secondary school education system in Hong Kong is very much based on knowledge acquisition and rote learning to pass examinations. Because teachers and recommended textbooks serve as the main sources of information, there is little opportunity for SDL. Therefore, it is not surprising that current Hong Kong medical students remain dependent on teachers and lectures for their learning. However, this may change in the future in response to education reform taking place in Hong Kong high schools, which emphasises SDL by students as a major educational goal.


This supports earlier findings that SDL and other PBL skills are strongly shaped by the context in which PBL is applied. SDL does not arise automatically when PBL is introduced; carefully planned and focused efforts are needed to create a conducive environment. Indeed, placing Year 1 students in the independent learning environment of PBL without adequate guidance can make them, in order to survive, overly dependent on tutors, predetermined learning objectives and rote learning. In our study, students across the three cultures showed similar behaviour patterns, albeit to different degrees: trying to reduce uncertainty, consulting senior students, asking tutors for guidance, and focusing on examination content.

Our findings support earlier comments that the development of SDL and other PBL skills depends heavily on the context in which PBL is applied.21 Research suggests that SDL does not occur automatically when PBL is implemented. Carefully considered and focused efforts are needed to shape a propitious context.32,33 In fact, exposing Year 1 students to the independent learning environment of PBL without providing them with adequate guidance may, rather than promoting the development of SDL skills, cause them to become severely dependent on tutors, predetermined learning objectives and on rote learning in order to ‘survive’.32,33 This is supported by our findings that students across three different cultures, albeit to different degrees, mentioned similar behaviours, needs and preferences with regard to alleviating uncertainty, consulting senior students, asking for tutor guidance and focusing on examination content.


One possible solution is to strike a balance between using SDL as a means to PBL and treating it as a goal of PBL. Gradual exposure to SDL, with relatively strong guidance and support in the first year, may ultimately foster greater SDL skills. This is particularly relevant where secondary education is teacher-centred, no hybrid approach is used, and uncertainty, tradition, hierarchy and achievement are culturally emphasised.

A possible solution might be to strike a balance between using SDL as a means to PBL and perceiving it as an end of PBL.33 Gradual exposure to SDL, with relatively strong guidance and support in the first year, might ultimately yield the development of more SDL skills.33 Our findings suggest that this is particularly relevant in contexts in which secondary education is teacher-centred, no hybrid approach is followed, and the cultural factors of uncertainty, tradition, hierarchy and achievement are valued highly.


Nevertheless, students in all three cultures appeared to increasingly internalise the principle of SDL as they moved from Year 1 to Year 3.

Despite the challenges, however, students across the three cultures increasingly internalised the principle of SDL as they moved from Year 1 to Year 3. 

  • The Middle Eastern students made substantial progress from initial uncertainty to preparedness to determine their own learning activities and find relevant information. 
  • Because of their pre-university experiences in SDL, the progress of the Dutch students was less marked, but still noticeable. 
  • Hong Kong students seemed to quickly adapt to the PBL learning environment and to develop clinical reasoning skills, but were less stimulated to develop SDL skills in terms of determining learning objectives and consulting different information sources. 

These findings are consistent with previous studies showing that SDL skills develop as students become accustomed to the PBL process. Thus, although PBL may not transfer straightforwardly across cultures, it would also be wrong to conclude that it cannot be applied across cultural contexts.

These findings would appear to be consistent with those of studies reporting that SDL skills develop naturally as students become used to the PBL process and curriculum.33,34 Thus, although PBL may not be cross-culturally applicable in a straightforward way, it would be wrong to conclude that it cannot be applied across cultural contexts as practice continues to prove.


A more fundamental question is whether PBL should have been globalised in the first place. Medical education worldwide does need reform, but whether one solution should be applied everywhere is debatable. At present, student-centred methods originating in the West represent the 'international' standard, yet this study suggests that their cross-cultural applicability is questionable. Rather than taking on the cultural challenge of adopting student-centred, problem-based methods, it may be wiser to find or create alternatives that best fit the local context. Given the current movement toward 'international standards', this is not easy, but the growing influence of Asia and developments elsewhere may change the future landscape of medical education.

A more fundamental question is whether PBL should be globalised in the first place. It is true that medical education worldwide is in need of reform, but whether one solution should be applied to all contexts is debatable. Currently, student-centred methods originating in Western culture seem to represent an ‘international’ standard. Yet, as this study confirms, the cross-cultural applicability of these methods may be questionable. Rather than taking on the cultural and contextual challenge of adopting student-centred, problem-based methods, it might be wiser for medical educationalists to rise to the challenge of exploring or creating alternatives that best fit their particular context. Given the current movement towards the development of ‘international standards’, this is a major challenge indeed. However, the rising influence of the Asian region and rapid developments in other parts of the world may imply changes in the future landscape of medical education.


11 Tweed RG, Lehman DR. Learning considered within a cultural context: Confucian and Socratic approaches. Am Psychol 2002;57 (2):89–99. 


12 Li J. Mind or virtue: Western and Chinese beliefs about learning. Curr Dir Psychol Sci 2005;14 (4):190–4. 


13 Joy S, Kolb DA. Are there cultural differences in learning style? Int J Intercult Relat 2009;33 (1):69–85.


8 Hodges BD, Segouin C. Medical education: it’s time for a transatlantic dialogue. Med Educ 2008;42 (1):2–3. 


9 Karle H, Christensen L, Gordon D, Nystrup J. Neo-colonialism versus sound globalisation policy in medical education. Med Educ 2008;42 (10):956–8. 


10 Hodges BD, Maniate JM, Martimianakis MA, Alsuwaidan M, Segouin C. Cracks and crevices: globalisation discourse and medical education. Med Teach 2009;31 (10):910–7.


31 Al Kadri HM, Al-Moamary MS, Magzoub ME, Roberts C, van der Vleuten CPM. Students’ perceptions of the impact of assessment on approaches to learning: a comparison between two medical schools with similar curricula. Int J Med Educ 2011;2:22–52.




Med Educ. 2012 Aug;46(8):738-47. doi: 10.1111/j.1365-2923.2012.04290.x.

Rethinking the globalisation of problem-based learning: how culture challenges self-directed learning.

Author information

  • 1Department of Educational Development and Research, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands. j.frambach@maastrichtuniversity.nl

Abstract

CONTEXT:

Medical schools worldwide are increasingly switching to student-centred methods such as problem-based learning (PBL) to foster lifelong self-directed learning (SDL). The cross-cultural applicability of these methods has been questioned because of their Western origins and because education contexts and learning approaches differ across cultures.

OBJECTIVES:

This study evaluated PBL's cross-cultural applicability by investigating how it is applied in three medical schools in regions with different cultures in, respectively, East Asia, the Middle East and Western Europe. Specifically, it investigated how students' cultural backgrounds impact on SDL in PBL and how this impact affects students.

METHODS:

A qualitative, cross-cultural, comparative case study was conducted in three medical schools. Data were collected through 88 semi-structured, in-depth interviews with Year 1 and 3 students, tutors and key persons involved in PBL, 32 observations of Year 1 and 3 PBL tutorials, document analysis, and contextual information. The data were thematically analysed using the template analysis method. Comparisons were made among the three medical schools and between Year 1 and 3 students across and within the schools.

RESULTS:

The cultural factors of uncertainty and tradition posed a challenge to Middle Eastern students' SDL. Hierarchy posed a challenge to Asian students and achievement impacted on both sets of non-Western students. These factors were less applicable to European students, although the latter did experience some challenges. Several contextual factors inhibited or enhanced SDL across the cases. As students grew used to PBL, SDL skills increased across the cases, albeit to different degrees.

CONCLUSIONS:

Although cultural factors can pose a challenge to the application of PBL in non-Western settings, it appears that PBL can be applied in different cultural contexts. However, its globalisation does not postulate uniform processes and outcomes, and culturally sensitive alternatives might be developed.

© Blackwell Publishing Ltd 2012.

PMID: 22803751 [PubMed - indexed for MEDLINE]



The SDLRS: a factor analysis (Med Educ, 2005)

The Self-Directed Learning Readiness Scale: a factor analysis study

J Dennis Hoban, Sonya R Lawson, Paul E Mazmanian, Al M Best & Hugo R Seibel





SDL is important in medicine. The ABMS, the RCPSC and the WFME all describe lifelong and self-directed learning as professional characteristics that should be evaluated during training.

Self-directed learning is important in medicine where knowledge is continuously changing and a wide range of patient problems is a constant.1 The American Board of Medical Specialties, The Royal College of Physicians and Surgeons of Canada, and the World Federation for Medical Education describe life-long and self-directed learning as professional characteristics that should be evaluated in the training of physicians.2–4


In the late 1970s, Guglielmino developed an instrument to measure readiness for SDL. She grounded its content validity in a Delphi study and, using this inductive approach to conceptualise readiness for SDL, offered the following definition.

In the late 1970s, Guglielmino developed an instrument to measure self-directed learning readiness.5 A Delphi study was conducted to gain expert consensus on the characteristics of the self-directed learner. In the case of the SDLRS, ‘a foundational effort to ensure content validity involved basing the items on a Delphi panel’s perception of the characteristics of an individual with a high level of readiness for self direction in learning’6 (p. 213). Using this inductive approach to conceptualise self-directed learning readiness, she offered a definition of the self-directed learner: 


    • one who exhibits initiative, independence, and persistence in learning; 
    • one who accepts responsibility for his or her own learning and views problems as challenges, not obstacles; 
    • one who is capable of self-discipline and has a high degree of curiosity; 
    • one who has a strong desire to learn or change and is self-confident; 
    • one who is able to use basic study skills, organize his or her time and set an appropriate pace for learning, and to develop a plan for completing work; 
    • one who enjoys learning and has a tendency to be goal-oriented.5 (p. 73). 


Guglielmino developed items reflecting these qualities to produce the original 41-item version, and principal component analysis (PCA) with varimax rotation yielded an 8-factor structure.

Guglielmino developed items that related to these qualities and produced her original version of the SDLRS consisting of 41 items. She tested it on a group of 307 US high school juniors and seniors, college undergraduates, and adults in continuing education courses. She reported that principal component analysis (PCA) with varimax rotation yielded an 8-factor structure. She labeled these factors: 

(1) openness to learning opportunities; 
(2) self-concept as an effective learner; 
(3) initiative and independence in learning; 
(4) informed acceptance of responsibility for one’s own learning; 
(5) love of learning; 
(6) creativity; 
(7) future orientation; and 
(8) ability to use basic study skills and problem solving skills.5 Later she modified the SDLRS to include 58 items.6 


Although widely used, the SDLRS has also been criticised. Field questioned its validity, and Bonham questioned its construct validity, noting that a low SDLRS score could mean two things: (1) a dislike for learning, or (2) a preference for having one's learning directed by others.

The SDLRS has been widely used but also criticised. Field questioned the validity of the scale as a measure of readiness for self-directed learning.7 Responding to Field’s criticism, Guglielmino wrote that readiness in the scale title is ‘a measure of an individual’s current level of readiness to engage in self-directed learning ‘‘with the implication that this level may change’’’8 (p. 236). Bonham also questioned the construct validity of the SDLRS, suggesting that low scores on the SDLRS could mean two things: a dislike for learning or one’s preference for his or her learning to be directed by another.9 Brockett and Hiemstra stated, ‘…the evidence is rather convincing that early concerns raised about certain items of the scale are warranted’10 (p. 73). 


Subsequent research yielded mixed results regarding what the SDLRS actually measures.

Subsequent empirical work yielded mixed results regarding what the SDLRS actually measured. 

  • Mourad and Torrance administered the 58-item SDLRS to a random sample of 684 K-12 students enrolled in a programme for gifted children at the University of Georgia.11 Their PCA suggested an 8-factor model; they concluded, ‘more studies are needed to validate the scale using different samples’ (p. 102). 
  • Field’s SDLRS study involved 244 adult students in Sydney, Australia.7 Using common factor analysis he identified 4 factors but then concluded that the scale measures ‘a construct that is homogeneous’7 (p. 138). The single construct was love and enthusiasm for learning.
  • Bligh12 conducted an SDLRS study with medical trainees (n = 216) during their medical education preparation in the UK. His PCA yielded 3 major factors: enthusiasm for learning, positive self-concept as a learner, and orientation to learning. 


Unlike Field's study, all of the other studies mentioned here used PCA. PCA and factor analysis share similarities but can produce different results, so it is difficult to compare Field's analysis with the others. Indeed, McCune pointed out with respect to Field's study that the results of PCA and common factor analysis should differ. Researchers often use PCA when conducting EFA, and we regard Field's study too as exploratory in nature. The studies described here sought to identify the structure underlying the SDLRS, but PCA is not designed for that purpose.

With the exception of Field’s study, all the investigations reported herein employed PCA. While PCA and factor analysis methods may bear some similarities, they tend to produce different results. Thus it is challenging to compare Field’s factor structure with the others’ component structures. Indeed McCune criticised Field along these lines when she wrote, ‘Also, Field should realise that if Guglielmino used principal component analysis while he used common factor analysis, their results should differ’13 (p. 245). Researchers often use PCA when conducting exploratory factor analysis (EFA). We believe even Field’s study was exploratory in nature, though McCune offered a competing view when she wrote, ‘He states that ‘‘eight factors are sought’’ which I assume is his attempt to portray his analysis as a confirmatory factor analysis’13 (p. 245). PCA reduces a large set of items into smaller components and accounts for all of the variance among the items, but the studies we reported using PCA were trying to identify underlying structures of the SDLRS. PCA is not designed for that purpose. Preacher and MacCallum explain the distinction between PCA and exploratory factor analysis.14 


PCA yields observable composite variables (components), which account for a mixture of common and unique variance (including random error). The distinction between common and unique sources of variance is not recognised in PCA, and no attempt is made to separate unique variance from the factors being extracted. Thus, components in PCA are conceptually and mathematically quite different from factors in EFA.14 (p. 20). 
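Preacher and MacCallum's distinction can be made concrete with a small sketch. The Python snippet below is an illustration only (not from any of the studies discussed; the data are simulated and all names are hypothetical): it fits both PCA and a common-factor model to the same items and prints the loadings, which generally differ because PCA does not separate unique from common variance.

```python
# Illustrative sketch only: fit PCA and a common-factor model to the same
# simulated items and compare loadings. Data and names are hypothetical.
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(0)
n_respondents, n_items = 500, 10

# Items driven by one common factor plus item-specific (unique) variance
common = rng.normal(size=(n_respondents, 1))
weights = rng.uniform(0.4, 0.8, size=(1, n_items))
items = common @ weights + rng.normal(scale=1.0, size=(n_respondents, n_items))

pca = PCA(n_components=2).fit(items)
fa = FactorAnalysis(n_components=2).fit(items)

# PCA components absorb unique variance as well as common variance, whereas
# the factor model estimates loadings for the common variance only, so the
# two sets of loadings generally differ.
print("PCA loadings:\n", pca.components_.T.round(2))
print("Factor loadings:\n", fa.components_.T.round(2))
```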


West and Bentley analysed the SDLRS using CFA. They concluded that an orthogonal solution was inadequate and identified six correlated factors, of which factor 3 consisted almost entirely of reverse-scored items. They also conjectured that the first-order factors might be subsumed under a single higher-order factor.

West and Bentley examined the underlying SDLRS measurement model using confirmatory factor analysis (CFA).15 Their study was conducted with 439 K-12 Tennessee teachers and administrators. The analysis concluded that an orthogonal solution to the SDLRS measurement model was not adequate. More importantly, they reported that a highly correlated 6-factor model best described the underlying theory. The 6 factors were: (1) love of learning; (2) self-confidence as a learner; (3) openness to challenge; (4) inquisitive nature; (5) self-understanding; and (6) acceptance of responsibility for learning. Interestingly, factor 3 contained mostly reverse scored items; yet, the possibility of reverse scored methods variance was not reported. Finally, they conjectured that the first order factors could be subsumed under a single factor characterising a higher order structure.


What can we learn from these studies? First, nearly every exploratory study suggests a factor made up of reverse-scored items, yet no study fully explains it. Second, almost all researchers used EFA; CFA allows a second-order factor that accounts for relationships among the first-order factors.

What did we learn from the literature? First, nearly every exploratory study concluded that there was a component ⁄ factor that consisted of reverse scored items, including Guglielmino’s5 original analysis. These items included negative statements about a high self-directed learner or a positive statement about a low self-directed learner. Not 1 author fully explained this phenomenon. Second, all researchers used EFA techniques except West and Bentley who used CFA methods. CFA allows the investigator to specify a second order factor to account for relationships among first order factors. In addition, it provides the researcher with information for disentangling random and systematic (method) error variance.16 For example, reverse scored items may be investigated as a source of method variance.


To study the psychometric nature of the SDLRS, we proceeded as follows.

To study the psychometric nature of the SDLRS for entering medical students, we collected SDLRS data from 972 students and conducted an EFA study with half the data and confirmed the model it produced with a CFA study using the other half of the data. Preacher and MacCallum emphasised the need for making good decisions in the process of conducting exploratory factor analysis.14 They were particularly concerned about using PCA, retaining components with eigenvalues greater than one, and using varimax rotation; a bundle of procedures affectionately termed ‘Little Jiffy’.14 This potentially limited approach was used in most of the previous SDLRS factor analysis studies we reviewed. We followed Preacher and MacCallum’s guidelines14 in our factor analysis.
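As a rough sketch of this split-sample design (the sample size below matches the one reported, but the data are simulated stand-ins, not the authors' responses), the sample can be divided at random into an exploratory half and a held-out confirmatory half:

```python
# Minimal sketch of splitting the sample: one random half for EFA, the
# held-out half for CFA. `responses` is simulated stand-in data.
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
responses = pd.DataFrame(rng.integers(1, 6, size=(972, 58)),
                         columns=[f"item{i+1}" for i in range(58)])

efa_half = responses.sample(frac=0.5, random_state=42)  # exploratory subsample
cfa_half = responses.drop(efa_half.index)               # confirmatory subsample
print(len(efa_half), len(cfa_half))
```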


Seventeen items are reverse scored.

Seventeen of the items are reverse scored. According to Guglielmino, wording was reversed in some of the items to prevent the response set of acquiescence (agreeing with all the items).5




Exploratory factor analysis – method 


'If the researcher does not know how the factors are related to each other, there is no reason to assume that they are completely independent; it is therefore safer to use an oblique rotation.'

We used Principal Axis Factor Analysis (SPSS 11.0 for Windows) to extract factors. To decide what factors to retain we used the scree plot, results from previous studies, and our comfort with the extracted factors. We decided to use an oblique rotation (Promax) based upon the West and Bentley study and Preacher and MacCallum’s argument, ‘…if the researcher does not know how the factors are related to each other, there is no reason to assume that they are completely independent. It is almost always safer to assume that there is not perfect independence, and to use oblique rotation instead of orthogonal rotation’14 (p. 26). Individual loadings of 0.30 or greater were used in the factor designation. Extracted factors were examined and named based on an analysis of the items loading on each factor. Cronbach α was used to estimate the internal consistency of the items constituting a factor.
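A hedged sketch of this pipeline in Python (using the factor_analyzer package in place of SPSS; the data below are simulated stand-ins, and the choice of four factors simply mirrors the result reported later) might look like this:

```python
# Sketch: principal-axis factoring with an oblique (Promax) rotation, a 0.30
# loading threshold, and Cronbach's alpha per factor. Simulated data only.
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Internal-consistency estimate for the items assigned to one factor."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(1)
sdlrs = pd.DataFrame(rng.integers(1, 6, size=(486, 58)),
                     columns=[f"item{i+1}" for i in range(58)])

fa = FactorAnalyzer(n_factors=4, rotation="promax", method="principal")
fa.fit(sdlrs)
loadings = pd.DataFrame(fa.loadings_, index=sdlrs.columns)

for factor in loadings.columns:
    # Items are assigned to a factor when their loading is 0.30 or greater
    assigned = loadings.index[loadings[factor].abs() >= 0.30]
    if len(assigned) > 1:
        print(f"factor {factor}: {len(assigned)} items, "
              f"alpha = {cronbach_alpha(sdlrs[assigned]):.2f}")
```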







Exploratory factor analysis – results


The fifth factor consisted entirely of reverse-scored items.

Interestingly, the fifth factor measured all reverse scored items. This phenomenon was also reported by Guglielmino5, Field7, and Mourad and Torrance.11 Guglielmino noted her PCA showed that all items in factor 1 of her original 41 item version were negative statements.5 She speculated that ‘it is possible that the factor also includes an avoidance of agreement with negative statements’5 (p. 61). Field7 and Mourad and Torrance11 also reported a factor in which items were phrased so that they had to be reverse scored. Mourad and Torrance offered 2 explanations for this factor: (1) ‘an attitude toward negative statements’ and (2) ‘preference for complex and ambitious situations’11 (p. 99). Field labelled the factor containing all negatively worded items as ‘facility with negatively phrased items’7 and argued it was not related to readiness for self-directed learning.


The reverse-scored phenomenon has been reported repeatedly.

Because the reverse scored phenomenon has been repeatedly reported in the literature we decided it was worth analysing in the confirmatory phase of our study.




Confirmatory factor analysis – method


Using LISREL 8.5417 a series of confirmatory factor analyses was performed to further examine the measurement model underlying the SDLRS. First, the 58 items were trimmed to 41 in order to obtain items that loaded on only one factor in the specified model. Items were retained if: (a) they had factor loadings ≥ 0.30; and (b) their secondary loadings were < 0.30. This resulted in the deletion of 17 items.



We hypothesised a series of three models (Models A, B, C) based on the logic provided by Anderson and Gerbing.21


Model A is a 4-factor confirmatory model that represents the substantive factors derived from the previous EFA.


Model B is a 5-factor confirmatory model that represents 4 correlated substantive factors and an orthogonal reverse coding method factor that loads on reverse scored items from all substantive factors.



Next, Model C, a confirmatory higher order model was developed and evaluated. This model includes a higher order factor that is presumed to account for the 4 first order factors examined in Model A, and like Model B, includes a reverse scoring factor.
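To make the model-comparison logic concrete, here is a minimal sketch in lavaan-style syntax using the Python package semopy (an assumption on my part; the authors used LISREL 8.54). The toy example uses nine simulated items and three first-order factors rather than the paper's 41 items, and it omits the orthogonal reverse-coding method factor of Models B and C, which would require additional covariance constraints:

```python
# Toy analogue of the nested CFA comparison: correlated first-order factors
# (cf. Model A) versus a higher-order factor accounting for them (cf. Model C).
# Data, item names and factor names are simulated/hypothetical.
import numpy as np
import pandas as pd
import semopy

rng = np.random.default_rng(2)
n = 400
g = rng.normal(size=(n, 1))                             # higher-order factor
firsts = 0.7 * g + rng.normal(scale=0.7, size=(n, 3))   # three first-order factors
loadings = np.kron(np.eye(3), np.array([[0.8], [0.7], [0.6]]))  # 9 x 3
items = firsts @ loadings.T + rng.normal(scale=0.6, size=(n, 9))
data = pd.DataFrame(items, columns=[f"i{k+1}" for k in range(9)])

model_a = semopy.Model("""
F1 =~ i1 + i2 + i3
F2 =~ i4 + i5 + i6
F3 =~ i7 + i8 + i9
""")
model_a.fit(data)

model_c = semopy.Model("""
F1 =~ i1 + i2 + i3
F2 =~ i4 + i5 + i6
F3 =~ i7 + i8 + i9
G  =~ F1 + F2 + F3
""")
model_c.fit(data)

# Fit statistics (chi-square, CFI, RMSEA, etc.) for comparing the two models
print(semopy.calc_stats(model_a))
print(semopy.calc_stats(model_c))
```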




Confirmatory factor analysis – results








Guglielmino took an inductive approach to developing the SDLRS. She used a Delphi study and stated explicitly that characteristics rated desirable, necessary or essential were considered for inclusion. She also noted that a one-to-one correspondence between Delphi characteristics and items was not possible because situational and attitudinal items were desired. In our study, the SDLRS did not appear to fully measure the situational and attitudinal characteristics that Guglielmino specified.

Guglielmino used an inductive approach to develop the SDLRS. She made clear, ‘Results of the Delphi survey were used as a guideline in the construction of items for the scale. Characteristics which emerged from the survey with a rating of desirable, necessary, or essential were considered for inclusion.’5 (p. 37). She explained, ‘A one-to-one correspondence between SDLRS items and characteristics selected by the Delphi survey was not possible, since situational and attitudinal items were desired.’5 (p. 38). In our study, the SDLRS instrument did not fully measure these characteristics or situational and attitudinal constructs that Guglielmino specified. 


Our EFA identified four factors. McCune cautioned that factor analysis can yield different results depending on the sample, so we were not surprised that our results differed somewhat from those reported for other populations and other medical students, especially as many of the other studies used PCA.

Our EFA study produced 4 acceptable factors. Since McCune cautioned that factor analysis studies of the same instrument could yield different results depending on the samples used,13 we were not surprised that our EFA factors varied somewhat from the structures reported for other medical students12 and other populations,5,7,11,15 especially since many of our reviewed studies used PCA.


Several studies have reported a factor consisting only of reverse-scored items, but none explored it further. The SDLRS contains 17 such items. A growing body of evidence shows that reverse-coded items can weaken internal consistency reliability and impair interpretation of the factor structure.

While several studies5,7,11 noted the presence of a factor comprising all or mostly reverse scored items, we learned that the reverse coding method variance had not been considered or further explored. The SDLRS contains 17 items that are reverse scored. A growing body of evidence concludes that reverse-coded items may weaken the internal consistency reliability of test scores22–24 and impair interpretation of the factor structure.25–27


Marsh suggested that the easiest way to eliminate the effects of negatively worded items is to use only positively worded items. Although this runs counter to the recommendations of measurement experts, the method-variance problems associated with reverse-coded items clearly limit the SDLRS.

Marsh suggests that ‘the easiest way to eliminate the effects of negatively worded items is to use only positively worded items…’29 (p. 817). Although this approach runs counter to the recommendations of measurement experts, the problem of method variance associated with reverse coded items suggests a limitation of including reverse scored items in the SDLRS.


What can we conclude? Guglielmino clearly worked carefully to develop the instrument, yet as a measure of SDL readiness it appears to fall short. According to our study, the SDLRS measures the following four things:

What can we conclude from our study of the SDLRS? We acknowledge Guglielmino’s efforts to develop a practical instrument for measuring self-directed learning readiness. Her carefully constructed approach to generating the instrument appears appropriate; yet, the SDLRS apparently falls short of measuring characteristics that Guglielmino determined were associated with self-directed learning. With our two samples, the SDLRS measured the respondents’ perceptions of how often they felt positively about: 

  • (1) learning being a tool for life; 
  • (2) their self-confidence in their abilities and skills for learning; 
  • (3) taking responsibility for their own learning; and 
  • (4) their curiosity. 


Although it makes intuitive sense that good self-directed learners would possess these characteristics in abundance, there is little reason to believe that such perceptions predict SDL behaviour. Guglielmino only identified characteristics of self-directed learners through her original Delphi study; she offered no theoretical basis for SDL or for readiness.

While it makes intuitive sense that self-directed learners would perceive themselves as holding these characteristics in abundance, there is no reason to believe that these perceptions would predict self-directed learning behaviour. Guglielmino identified only characteristics of self-directed learners from her original Delphi study; she offered no theory of self-directed learning or of readiness.


Although research and conferences on SDL have proliferated over the past 25 years, the SDLRS has not changed. In medicine, accounts of SDL must also address the conditions that influence it. Students and doctors are not simply being asked to acquire new learning skills; many are being asked to rethink how they view themselves, their work, and their continuing professional development. We agree with Baveye that the study of SDL should shift from simply measuring perceptions of SDL to measuring observable SDL endeavours.

Even as conferences and books exploring self-directed learning proliferated during the past 25 years, there appears to be no change in the SDLRS. In considering self-directed learning in medicine, one must account for the current research as well as conditions that influence the performance of self-directed learning. Newer studies extend from the neurobiology of aging and the role of cognition in making practice changes30 to the social psychology of the physician’s changing work environment, the importance of physicians’ peers, and the accountability schemes and financial incentives built into medical practice.31 Students and practitioners are not being asked merely to take on new skills or to adjust their attitudes toward learning. Many are being asked to rethink the way they see themselves, their work, and their ongoing professional development. We agree with Baveye32,33 who suggests that the study of self-directed learning should be reoriented to an entirely new direction, away from simple measures of perceptions of self-directed learning to observed self-directed learning endeavours, apropos of the 21st century.










Med Educ. 2005 Apr;39(4):370-9.

The Self-Directed Learning Readiness Scale: a factor analysis study.

Author information

  • 1Virginia Commonwealth University School of Medicine, PO Box 980565, Richmond, VA 232-0565, USA. jdhoban@vcu.edu

Abstract

BACKGROUND:

The practice of medicine demands that its physician practitioners are self-directed, life-long learners. The Self-Directed Learning Readiness Scale (SDLRS) intends to measure adults' readiness to engage in self-directed learning.

PURPOSE:

The present study assesses the underlying factor structure of the SDLRS for a sample of entering medical students.

METHODS:

Over a period of 6 years, 972 first year medical students at the Virginia Commonwealth University School of Medicine completed the SDLRS. To summarise the inter-relationships among variables, a principal axis factor analysis with oblique rotation was used on the 58 SDLRS items. A series of confirmatory factor analyses using LISREL 8.54 was performed to further examine the measurement model underlying the SDLRS.

RESULTS:

A 4-factor confirmatory model representing 4 correlated substantive factors and a reverse coding method factor fits these data well.

CONCLUSIONS:

Medical educators should hold limited expectations of the SDLRS to measure medical students' readiness to engage in self-directed learning. The definitions and theoretical assumptions that inform readiness for self-directed learning should be reconsidered. Alternative approaches to studying self-directed learning should be explored.

PMID: 15813759 [PubMed - indexed for MEDLINE]


The relationship between self-regulated learning and academic performance in medical school (Med Teach, 2015)

Self-regulated learning and academic performance in medical education

SUSANNA M. LUCIEER1, LAURA JONKER2,3, CHRIS VISSCHER2, REMY M. J. P. RIKERS4,5 & AXEL P. N. THEMMEN1,6





The medical profession must maintain high standards in a changing society. To benefit from CME, doctors must define their own learning needs, set goals, and choose appropriate learning activities; in short, doctors must be self-regulated learners, meaning they are behaviourally, metacognitively and motivationally proactive in their learning process.

The medical profession has to ensure that high standards in providing patient care are repeatedly being met in the context of a rapidly and constantly changing medical world (Brydges & Butler 2012; Bjork et al. 2013). This means that medical doctors have to stay updated with the developments in their field of expertise and have to maintain their competencies (Greveson & Spencer 2005; Artino et al. 2012; Brydges & Butler 2012; Premkumar et al. 2013). To be able to benefit and choose from the many opportunities of continuous medical education, medical doctors have to define their own learning needs, set personal goals and engage in the most appropriate learning activities (Lycke et al. 2006; Brydges et al. 2012; Premkumar et al. 2013). In short, medical doctors have to be self-regulated learners, which means that they have to be behaviorally, meta-cognitively and motivationally proactive in their learning process (Zimmerman 1986; Wolters 1998; Jonker et al. 2010).


According to Ertmer and Newby, self-regulated learners are able to do the following. Other researchers have highlighted the motivational components of self-regulated learning, arguing that these competencies are of little value if the learner is not motivated to use them, and therefore added two subcomponents: effort and self-efficacy.

According to Ertmer and Newby (1996), self-regulated learners are individuals who are able to 

      • plan their study behavior, 
      • monitor their progress, 
      • reflect upon, and 
      • evaluate the entire learning process. 


Other researchers also highlighted the importance of motivational components in self-regulated learning (Hong & O’Neil 2001; Sitzmann & Ely 2011). They argued that one may be 

      • able to plan, 
      • monitor, 
      • reflect upon, and 
      • evaluate his or her learning behavior, 

but that these competencies are of little value when one is not motivated to employ them. Therefore, they added two subcomponents of motivation to the concept of self-regulated learning, i.e., effort and self-efficacy. 

      • Effort is crucial to reach the goals self-regulated learners have set, and 
      • self-efficacy is important since one needs to have trust in his or her own potential in order to complete a task (Hong & O’Neil 2001; Sitzmann & Ely 2011).


Unfortunately, self-regulated learning skills are not always emphasised during medical school. Some studies show that students' self-regulated learning improves during medical school, yet some graduates feel poorly prepared to self-regulate.

Unfortunately, self-regulated learning skills are not always emphasized during medical school (Artino et al. 2012). While studies showed that students do develop self-regulated learning skills during medical school (Loyens et al. 2008), some graduates feel uncertain and unprepared to do so (Artino et al. 2012). Therefore, it is important to investigate to what extent medical students’ self-regulated learning skills change during their education.



Self-regulated learning has been shown to be one of the best predictors of academic performance. It is a proactive learning process of setting goals and developing effective learning strategies, which transforms mental abilities into academic skills such as goal setting, strategy development, and monitoring progress and effectiveness. Knowing how to monitor one's learning and how to control and adapt learning behaviour is seen as a requirement for being a truly effective learner. Although some research suggests that self-regulated learning skills are not strictly necessary for high achievement, self-regulated learners have been shown to be more effective learners.

It has also been shown that self-regulated learning is one of the best predictors of academic performance (Pintrich & Degroot 1990). Self-regulated learning is viewed as a proactive learning process that is used to set learning goals and develop effective strategies for learning (Zimmerman 2008). This process helps people to transform mental abilities into academic skills, such as setting goals, developing learning strategies, and monitoring the progress and effectiveness of their learning (Zimmerman 2002, 2008). Knowing how to monitor the progress of your learning and how to control and adapt your learning behavior is seen as a requirement for being a truly effective learner (Ertmer & Newby 1996; Bjork et al. 2013). Although research suggests that it is not necessary to use self-regulated learning skills for high achievement (Ablard & Lipschultz 1998), it has been shown that self-regulated learners are more effective learners (Nota et al. 2004; Toering et al. 2009) who get more out of their potential (Zimmerman 1986) and attain higher grades during high school (Nota et al. 2004) and in college (Ablard & Lipschultz 1998).



In this study, the Self-Regulation of Learning Self-Report Scale (SRL-SRS) is used. This questionnaire contains six subscales: 

    • planning, 
    • monitoring, 
    • evaluation, 
    • reflection, 
    • effort, and 
    • self-efficacy

following the theories of Ertmer and Newby (1996) and Hong and O’Neill (2001).





Setting



Participants



Instruments

The Self-Regulation of Learning Self-Report Scale (SRL-SRS) was used to investigate the students’ level of self-regulated learning. The SRL-SRS contains 50 items on a 4- or 5-point Likert scale, depending on the subsection of the questionnaire. Following the theory described by Ertmer and Newby (1996) and Hong and O’Neill (2001), the questionnaire comprises six subscales of original English-language questionnaires: planning, monitoring, evaluation, reflection, effort, and self-efficacy. An example of a question in the subscale monitoring is: ‘‘While making an assignment, I check my progress,’’ and an example from the subscale effort is: ‘‘I keep trying to finish my assignment, even when I find the assignment extremely difficult’’. The questionnaire has been compiled and validated in a Dutch study (Toering et al. 2012). The questionnaire was originally created for high school students. Therefore, in this study, minor changes were made in a few questions, e.g., the term homework was replaced by study assignments.
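As an illustration of how such subscale scores are typically derived (the item-to-subscale mapping below is invented for illustration; in practice the published SRL-SRS key would be used), each subscale can be scored as the mean of its items:

```python
# Sketch of scoring questionnaire subscales as item means. The mapping of
# items to subscales here is hypothetical, not the SRL-SRS key.
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
answers = pd.DataFrame(rng.integers(1, 6, size=(949, 50)),
                       columns=[f"q{i+1}" for i in range(50)])

subscales = {
    "planning":   [f"q{i}" for i in range(1, 10)],
    "monitoring": [f"q{i}" for i in range(10, 18)],
    "effort":     [f"q{i}" for i in range(18, 28)],
    # evaluation, reflection and self-efficacy would be mapped analogously
}
scores = pd.DataFrame({name: answers[cols].mean(axis=1)
                       for name, cols in subscales.items()})
print(scores.describe().round(2))
```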



Measurements of academic performance



Data analysis 


Data were analyzed with the use of IBM SPSS AMOS version 18.0 (SPSS, Inc., Chicago, IL) and IBM SPSS Statistics version 21.0 (SPSS, Inc., Chicago, IL). Confirmatory factor analysis and Cronbach’s alpha were used to investigate whether the constructs of the questionnaire fitted the model and to measure the internal consistency of the factors. A one-way ANOVA was performed to compare the level of self-regulated learning skills of the first and third-year medical students; a p value of <0.05 was considered significant. For the subscale reflection, Welch F was calculated since equal variances could not be assumed. Effect sizes, eta squared, were calculated, where 0.01, 0.06 and 0.14 indicate a small, medium, and large effect, respectively (Cohen 1988; Lakens 2013). The correlations between the self-regulated learning skills and the measures of academic performance were calculated with Pearson correlations and multinomial logistic regression analysis. Here, given the multiple comparisons, a more conservative p value of <0.01 was considered significant.
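A hedged sketch of these steps in Python follows (the actual study used SPSS/AMOS; the data below are simulated stand-ins and all variable names are hypothetical):

```python
# Sketch: one-way ANOVA with eta squared, a Pearson correlation, and a
# multinomial logistic regression of GPA group on subscale scores.
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 300
df = pd.DataFrame({
    "year": rng.choice([1, 3], size=n),
    "monitoring": rng.normal(3.0, 0.5, size=n),
    "effort": rng.normal(3.2, 0.5, size=n),
    "gpa": rng.normal(7.0, 0.8, size=n),
})
df["gpa_group"] = pd.qcut(df["gpa"], q=4, labels=False)  # GPA quartile as category

# One-way ANOVA comparing Year 1 and Year 3 on a subscale, plus eta squared
g1 = df.loc[df.year == 1, "monitoring"]
g3 = df.loc[df.year == 3, "monitoring"]
f_stat, p_val = stats.f_oneway(g1, g3)
grand = df["monitoring"].mean()
ss_between = len(g1) * (g1.mean() - grand) ** 2 + len(g3) * (g3.mean() - grand) ** 2
ss_total = ((df["monitoring"] - grand) ** 2).sum()
print(f"F = {f_stat:.2f}, p = {p_val:.3f}, eta^2 = {ss_between / ss_total:.3f}")

# Pearson correlation between a subscale and GPA
r, p = stats.pearsonr(df["monitoring"], df["gpa"])
print(f"r = {r:.2f}, p = {p:.3f}")

# Multinomial logistic regression of GPA group on the subscale scores
X = sm.add_constant(df[["monitoring", "effort"]])
result = sm.MNLogit(df["gpa_group"], X).fit(disp=False)
print(f"McFadden pseudo R^2 = {result.prsquared:.3f}")
```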




Results


Validation of the questionnaire





Change of self-regulated learning skills






Correlation with academic performance









Self-regulated learning explained only a small proportion of the variance in GPA.

Multinomial logistic regression analyses showed that self-regulated learning skills explained a small proportion of the variance in GPA among first-year medical students (R² = 0.086, Model χ²(18) = 1592.612, p < 0.001), as well as some of the variance of the third-year students (R² = 0.105, Model χ²(18) = 38.735, p = 0.003).


Comparing Year 1 and Year 3, self-regulated learning skills changed little during medical school.

Concerning the first question, we hypothesized that students’ self-regulated learning skills would change during medical school. However, we found that the levels of most self-regulated learning skills did not differ between the first and third year at medical school, except reflection, which was higher in the third year.


This may reflect a ceiling effect, since only the best students are admitted to medical school.

It is however possible that, since only the best students are accepted for medical school (Razack et al. 2012), these students already score relatively high at entrance, and therefore show little development of self-regulated learning during medical school itself (i.e. ceiling effect).


People commonly assume that learners do not need to be taught how to learn or how to manage their own learning behaviour.

In addition, most people have a strong assumption that children and adults do not need to be taught how to learn and how to manage their learning behavior (Bjork et al. 2013). 


Research showed that self-regulated learning skills can be taught, but they have to be specifically emphasized (Zimmerman 1989; Hong & O'Neil 2001).

Further, people often have a flawed mental model of how they learn and remember (Bjork et al. 2013) and tend to overestimate their self-regulated learning skills (Zimmerman 2008), especially when they do not have knowledge of the criteria and standards of good performance (Kostons et al. 2012). It is possible that first-year students overestimated their use of self-regulated learning skills more than third-year students, and thus reported a higher use of self-regulated learning skills.

This study shows that some of the variation in performance can be explained by students' self-regulated learning skills, but a large part remains unexplained.

This study confirmed that some variation in performance could be explained by the students’ self-regulated learning skills, both in the first-year and in the third-year, but a large part of the variation remained unexplained. 


'Effort' was also related to performance.

Effort was also related to first-year academic performance. According to Hong and O'Neil (2001), effort is necessary to actually use the other self-regulated learning skills one possesses. Effort is crucial to reach the goals a learner has set (Hong & O'Neil 2001) and is required to persist on difficult tasks (Pintrich & Degroot 1990; Hong & O'Neil 2001).


With regard to monitoring, the lowest level was reported not by the students with the lowest GPA but by those in the tier just above. This may reflect the Kruger-Dunning effect: very poorly performing learners rarely monitor their learning and therefore do not even notice that they are not doing so.

With regard to monitoring, it was not the students with the lowest GPA who reported the lowest level, but those with the second lowest GPA. This could be the result of the so-called Kruger–Dunning effect; poorly performing learners rarely monitor their learning and consequently are unlikely to notice that they are not doing so (Ertmer & Newby 1996; Kruger & Dunning 1999; Langendyk 2006; Kostons et al. 2012). 


In the third year, only effort was related to performance.

In the third year of medical school, only effort was to some extent related to performance differences.



The cross-sectional design is a limitation, but it is deemed acceptable because the groups are comparable in age and gender, the sample is large, the response rates are comparable, and all students attended the same school, so it is reasonable to assume that all students change in a similar way.

One notable limitation of this study is the use of a cross-sectional design, while a longitudinal design would have been more appropriate. Still, a cross-sectional design is deemed acceptable since the groups are comparable in age and gender, the sample size is large, the response rate is comparable, and all students attended the same medical school and followed the same curriculum. It is therefore appropriate to assume that all students will change in a similar way (William & Darity 2008). 





 2015 Aug 27:1-9. [Epub ahead of print]

Self-regulated learning and academic performance in medical education.

Author information

  • 1a Institute of Medical Education Research Rotterdam , Erasmus MC, The Netherlands .

Abstract

CONTENT:

Medical schools aim to graduate medical doctors who are able to self-regulate their learning. It is therefore important to investigate whether medical students' self-regulated learning skills change during medical school. In addition, since these skills are expected to be helpful to learn more effectively, it is of interest to investigate whether these skills are related to academic performance.

METHODS:

In a cross-sectional design, the Self-Regulation of Learning Self-Report Scale (SRL-SRS) was used to investigate the change in students' self-regulated learning skills. First and third-year students (N = 949, 81.7%) SRL-SRS scores were compared with ANOVA. The relation with academic performance was investigated with multinomial regression analysis.

RESULTS:

Only one of the six skills, reflection, significantly, but positively, changed during medical school. In addition, a small, but positive relation of monitoring, reflection, and effort with first-year GPA was found, while only effort was related to third-year GPA.

CONCLUSIONS:

The change in self-regulated learning skills is minor as only the level of reflection differs between the first and third year. In addition, the relation between self-regulated learning skills and academic performance is limited. Medical schools are therefore encouraged to re-examine the curriculum and methods they use to enhance their students' self-regulated learning skills. Future research is required to understand the limited impact on performance.

PMID:
 
26313552
 
[PubMed - as supplied by publisher]



프로페셔널리즘의 변화: 깨진 거울에 비친 모습 (Med Teach, 2015)

The changing face of professionalism: Reflections in a cracked mirror

TREVOR GIBBS

AMEE, UK




가변성은 삶의 법칙이다. 어떤 두 얼굴도 서로 같지 않고, 어떤 두 신체도 같지 않다. 또한 어떤 두 사람도 우리가 질병이라 부르는 비정상적인 조건에서 동일하게 반응하거나 행동하지 않는다. William Osler

Variability is the law of life, and as no two faces are the same, so no two bodies are alike, and no two individuals react alike and behave alike under the abnormal conditions which we know as disease. William Osler (1849–1919)


William Osler가 한 이 말은 의학교육과 학생들에게도 동등하게 적용될 수 있다. 두 명의 환자가 절대로 동일하지 않은 것처럼, 두 명의 학생, 두 가지의 교육 및 직업환경도 절대로 동일하지 않다.

Much of what the renowned clinician William Osler was quoted to have said regarding clinical medicine can be equally applied to medical education and its students; just as no two patients are exactly alike, neither are two students nor two educational or vocational environments.


Koehn과 Swick은 모든 의과대학생들이 문화적역량을 갖춰야 할 뿐만 아니라 초국가적인 역량 역시 갖춰야 한다고 주장했다. 즉, 국가간 관계에서 유도된 기술, 여러 문화를 아우르는 심리학과 여러 문화권 사이의 의사소통능력 등을 갖추어서 변화하는 사회에 적용시킬 수 있어야 한다는 것이다. 그들은 우리가 가르치는 것(문화적 역량)과 문화에 따라서 역량이 얼마나 다양하게 바뀔 수 있는가(문화 이외의 다른 요인들에 의해서 영향을 받은 역량)의 차이를 transnational competency라 명명하였다.

Koehn and Swick (2006) described the need for all medical students to not only become culturally competent but also to become transnation- ally competent as well; the need to produce students having a comprehensive set of skills, derived from international relations, cross-cultural psychology and intercultural commu- nication, equipping themto work in and adapt to our changing society. They describe the difference between what we teach – cultural competence and how various competencies change as a result of culture, with what happens when other factors other than culture affect those competencies – transnational competency.


프로페셔널리즘은 새로운 것이 아니며 히포크라테스 시절부터 존재해왔다. 이 당시에는 판단, 윤리, 특히 지혜가 핵심적 특징이었고, 매우 강한 실증주의적 관점과 함께 발전해왔다.

Professionalism is not new; its origins are firmly based in the early Hippocratic era, when judgement, ethics and particularly wisdom were its key features, and it developed with a strongly positivist stance.


전문직업성에 관한 더 포괄적인 정의는 Epstein과 Hundert가 한 것으로 아래와 같다.

A more expansive definition of professionalism is provided by Epstein and Hundert (2002): ‘‘Professional competence is the habitual and judicious use of communication, knowledge, technical skills, clinical reasoning, emotions, values, and reflection in daily practice for the benefit of the individual and community being served’’. 



Hilton과 Southgate는 현대사회에서의 의료 전문직업성은 환자를 돌보는 것에 있어서 필요한 광범위한 자질을 포괄하고 있으며, 전통적인 의미의 전문성을 함양하고, 자율성을 가지고, 자기조절을 하는 것 이상이다"라고 했으며, 전문직업성이 역동성을 가지면서, 사회의 변화에 따라 적용해나갈 수 있는 것임을 암시했다.

Hilton and Southgate (2007, p. 265) recognized that ‘‘Medical professionalism in today’s society requires the exhibition of a range of qualities deployed in the service of patients, rather than more traditionally defined aspects such as mastery, autonomy and self-regulation’’, alluding to the need for professionalism to be a dynamic entity, capable of adaptation to a changing world.


조금 더 일찍이, Schon은 진료에 대한 이론과 현실 사이의 이분성을 이야기하면서, "의사들이 연구에서 도출한 이론과 기술을 효과적으로 활용할 수 있는 높고 단단한 기반과, 온갖 상황이 혼란스럽고 난장판인 질퍽질퍽한 저지대"라고 묘사했다. 또한 우리가 가르치는 것과 우리가 노출되는 현실의 차이를 언급하였다. 우리가 명료하다고 가르치는 'learning organization'은 현실이라는 탁한 물 속에서 언제나 적용가능한 것은 아니다. 우리가 전문직업성에 대해서 가지고 있는 관점을 지지해주는 성찰적 거울은 우리가 실제로 발을 딛고 있는 현실에 의해서 깨어지고, 그 금(crack)으로 인해 상이 왜곡된다.

Somewhat earlier, it was Schön who spoke of a dichotomy between our use of theory to inform practice and the reality of the real world, when he described the evocative image of: ‘‘...the high, hard ground where practitioners can make effective use of research-based theory and technique, and the swampy lowland where situations are confusing ‘‘messes’’ incapable of technical solution’’ (Schön 1983, p. 42). He too was describing the difference between what we teach and what we are exposed to in reality. What we see and teach in the clarity pervading the learning organization is not always applicable in the murky waters of the real world; what we see in the reflective mirror that supports our views of professionalism is frequently distorted by the cracks caused by the situations in which we find ourselves.







 2015 Sep;37(9):797-8. doi: 10.3109/0142159X.2015.1054797. Epub 2015 Jun 17.

The changing face of professionalism: Reflections in a cracked mirror.

Author information

  • 1a AMEE , UK.


시뮬레이션 교육에서 방향이 제시된 자기조절학습 vs 교수자 지도의 학습(Med Educ, 2012)

Directed self-regulated learning versus instructor-regulated learning in simulation training

Ryan Brydges,1 Parvathy Nair,2 Irene Ma,3 David Shanks2 & Rose Hatala2




자기조절학습은 다음과 같이 정의된다. 많은 의료전문직 관련 기관들은 SRL을 비전과 미션에 포함시키고 있으며 이 때 SDL이나 평생학습과 같은 용어를 사용한다. SRL과 SDL이 여러 공통점이 있지만, 의학교육연구에서 이 둘은 구분되어 사용된다.

Self-regulated learning has been defined as a process involving ‘self-generated thoughts, feelings and actions that are planned and cyclically adapted to the attainment of personal goals’.15 Many medical professional agencies have included SRL in their visions and mission statements, using terms like ‘self-directed learning’ (SDL) and ‘lifelong learning’.16 Although SRL and SDL have many similarities,17 medical education researchers have used the terms differently: 

    • SDL is often invoked when a learning environment is designed to promote autonomous learning (e.g. problem-based learning), whereas (학습환경이 자율적 학습을 촉진하도록 설계된 것)
    • SRL is mentioned when the focus is on understanding the mechanisms of autonomous learning in order to identify how best to support learners when they engage in SRL.18 (자율적 학습의 메커니즘 이해에 초점을 두고 있으며, 이를 통해서 어떻게 SRL을 하는 학습자를 가장 잘 지원할 수 있는가를 알기 위한 것)


IRL과 DSRL을 비교하였다.

We compared this traditional approach of instructor-regulated learning (IRL) with what we call ‘directed self-regulated learning’ (DSRL).




절차 Procedure


참가자들은 baseline 설문지를 작성하였다.

Participants completed a baseline questionnaire on their previous experience with LP training (with both simulation and real patients) and reported their baseline confidence in performing LP on an 11-point Likert scale.


DSRL의 프로토콜 Protocol for directed self-regulated learning


IRL의 프로토콜 Protocol for instructor-regulated learning


네 명의 평가자가 global rating scale (GRS)와 checklist (CL)을 사용하여 pre, post, retention test를 평가하였다. 각 비디오에 대해서 두 명이 평가하였고, 공평하게 분배했다. GRS의 validity, reliability는 psychometric study로 supported 된다. 우리는 기존에 validity와 reliability가 확보된 CL을 가지고 그것의 변형된 버전을 만들었다. 일부 상관없는 문항을 삭제하였다. 비록 이러한 일부 문항의 삭제가 전문가 합의과정을 통해서 결정되었지만 validity를 다시 평가하지는 않았다. 대신 ICC를 계산하였다.

Four trained, blinded expert raters (authors RH, IM, PN, DS) used a global rating scale (GRS) (Appendix S1, online) and a procedural checklist (CL) to independently evaluate participants’ videotaped performances on the pre-test, post-test and retention test. Two raters evaluated each video; the rating load was distributed equally amongst the four raters. The concurrent validity, construct validity and reliability of the GRS are supported by a psychometric study on procedural skills.24 We created a modified version of a CL with demonstrated content validity and reliability,11 which consisted of 26 major and 44 minor actions. We removed some items from the original CL that we deemed as non-applicable to our simulation scenario, which shortened the overall CL, but maintained an emphasis on major actions; specifically, we included 21 major actions and 14 minor actions (Appendix S2). Although the removal of items was based on expert consensus, we did not re-evaluate the validity of our modified CL; however, we did assess inter-rater reliability using the intra-class correlation coefficient (ICC).
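(참고) 두 명의 평가자가 각 비디오를 평가한 long-format 자료에서 ICC로 평가자간 신뢰도를 구하는 과정을 보여주는 간단한 스케치이다. 원 연구의 실제 코드가 아니며, pingouin 라이브러리를 사용했고 파일명과 열 이름(video, rater, grs_score)은 설명을 위한 가정이다.

```python
# Minimal illustrative sketch (not the study's code): computing the intra-class
# correlation coefficient for inter-rater reliability from long-format ratings,
# where each videotaped performance is scored by two raters. The file name and
# column names ('video', 'rater', 'grs_score') are assumptions for illustration.
import pandas as pd
import pingouin as pg

ratings = pd.read_csv("lp_video_ratings.csv")  # hypothetical: one row per rater x video

icc = pg.intraclass_corr(data=ratings,
                         targets="video",      # the videotaped performance being rated
                         raters="rater",       # which rater produced the score
                         ratings="grs_score")  # global rating scale score
print(icc[["Type", "Description", "ICC", "CI95%"]])
```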



평가자 훈련 Rater training


Training protocol

Prior to rating the videotaped performances, the four raters engaged in training. The training protocol involved the rating of three randomly selected videos, a meeting to discuss disagreements, the rating of another three randomly selected videos and a meeting to discuss these, followed by a final rating and discussion of four randomly chosen videos.


여기서 사용한 10개의 비디오는 ICC에서 계산시 포함하지 않았음.

We did not use the ratings of those 10 videos in the ICC reliability calculation, but did include them in the remaining analyses. During the training, the four raters developed a shared understanding of how to use the GRS and CL. One significant training outcome was the raters’ mutual definition of resident competence as ‘a resident being capable of performing LP in the clinical context with direct supervision’.


통계 분석

For these two performance variables, we assessed group differences between the DSRL and IRL groups on the pre-test and post-test, and in the pre-test, post-test and retention test scores using two separate repeated-measures analyses of variance (ANOVAs) with test as the within-subjects factor and group as the between-subjects factor. We also computed Pearson correlation coefficients to assess the relationship between participants’ self-reported confidence and the performance variables at the pre- and post-tests. Finally, we used separate independent samples t-tests to examine group differences in participants’ self-reported LP experience, total number of LPs performed or observed, total time with an instructor, and total practice time. We used Tukey’s honestly significant differences (HSD) test and calculated the bias-corrected Hedges’ g effect size for the appropriate post hoc comparisons. All data are reported as mean ± standard error (SE) and the alpha level was set at p < 0.05.
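(참고) 위에 기술된 분석(혼합설계 반복측정 ANOVA, 자신감-수행 상관, Hedges' g)을 대략적으로 따라가 보는 스케치이다. 역시 원 논문의 실제 코드가 아니며, 파일명과 열 이름은 설명을 위한 가정이다.

```python
# Minimal illustrative sketch (not the authors' code) of the analysis described
# above: a mixed-design ANOVA with 'test' (pre/post/retention) as the
# within-subject factor and 'group' (DSRL vs IRL) as the between-subject factor,
# a confidence-performance Pearson correlation, and a bias-corrected Hedges' g.
# File and column names are assumptions for illustration only.
import pandas as pd
import pingouin as pg
from scipy import stats

scores = pd.read_csv("lp_performance_long.csv")  # hypothetical long-format data

# Mixed (split-plot) repeated-measures ANOVA
aov = pg.mixed_anova(data=scores, dv="grs_score",
                     within="test", subject="participant", between="group")
print(aov.round(3))

# Pearson correlation between self-reported confidence and post-test performance
post = scores[scores["test"] == "post"]
r, p = stats.pearsonr(post["confidence"], post["grs_score"])
print(f"confidence vs post-test GRS: r = {r:.2f}, p = {p:.3f}")

# Bias-corrected Hedges' g for a post hoc group comparison on the retention test
ret = scores[scores["test"] == "retention"]
g = pg.compute_effsize(ret.loc[ret["group"] == "DSRL", "grs_score"],
                       ret.loc[ret["group"] == "IRL", "grs_score"],
                       eftype="hedges")
print(f"Hedges' g (DSRL vs IRL at retention) = {g:.2f}")
```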








본 연구에 대한 설명 중 하나는, 교수자가 연습 중에 제공한 피드백이 학습에 큰 도움이 된다고 여기는 것과 관련되어있다. 그러나 안타깝게도 motor learning에 관한 수십년의 연구에서 동시적 피드백(concurrent feedback)이 오히려 장기 학습에 해롭다는 결과가 나왔다. 전형적인 형태는, 연습 중에는 그것이 도움이 되는 것으로 보이지만, 피드백이 불가능한 상황인 retention 평가에서는 급격한 하락이다.

One explanation for our findings is associated with the provision by instructors of feedback during practice, which is commonplace in many domains and is considered a potent aid to learning. Unfortunately, decades of motor learning research have shown that concurrent feedback can be detrimental to long-term learning.26 Typically, findings show a benefit during practice, resulting in rapid skill acquisition, followed by poor performance on retention tests when the feedback is unavailable.27


연습하는 중간에 제공되는 동시적 피드백의 촉진적 특성은 무시하기 어려우며 IRL이 의학에서 왜 그렇게 널리 쓰이는지를 설명해준다. 그러나 이러한 직관에도 불구하고 우리의 결과는 performance-learning paradox를 지지하는 근거를 하나 더 추가했을 뿐이다. (즉각적인 이득이 나타나나, 이것이 장기적인 학습을 반영하지는 못한다는 것) 이러한 결과가 직관에는 반하는 것일 지 모르나, 수많은 연구들이 장기적 학습에 있어서 '바람직한 어려움'의 활용을 지지한다.

The facilitating nature of concurrent feedback during practice is difficult to ignore and probably explains how pervasive IRL is in medicine. Despite these intuitions, our results add to evidence supporting the performance–learning paradox, which refers to the common finding that immediate performance benefits (as demonstrated in post-test scores) do not always reflect long-term learning (as demonstrated in retention test scores).28 Counterintuitive as this paradox may be, a wealth of evidence supports the use of ‘desirable difficulties’ to depress immediate performance and improve long-term learning.28,29


두 번째 개념틀은 자기모니터링이다. Eva와 Regehr는 자기모니터링을 '자신이 특정 상황에서 필요한 기술과 지식을 유지할 수 있는 가능성에 대한 순간순간의 판단'이라고 정의한다.

A second conceptual framework for interpreting the findings comes from research on self-monitoring. Eva and Regehr31 defined self-monitoring as ‘a moment-by-moment awareness of the likelihood that one maintains the skill or knowledge to act in a particular situation’.


본 연구에서 DSRL그룹은 여러 시뮬레이터 모델 사이에서 점차 발전해나가며 자신의 능력과 준비도를 자기모니터링할 수 있다.

In the present study, the DSRL group’s opportunity to progress between simulator models may have led participants to self-monitor their ability and ‘readiness’ frequently.


교수들이 사용할 수 있는 시간은 제한적이고 이것이 흔히 simulation-based training 도입의 장애로 여겨진다. 연구자들은 교육효율성을 표준 성과 변인으로 고려해볼 필요가 있다.

Given that limited faculty staff time is a frequently cited barrier to the implementation of simulation-based training,33 researchers may benefit from considering training efficiency (e.g. use of faculty staff time) as a standard outcome variable.


그러나 총 hands-on simulator time으로 계산하면 이 주장은 전혀 반대가 된다. (DSRL에서 hands-on time이 훨씬 많았음)

However, when we consider the total hands-on simulator time, the argument is reversed.


특히 DSRL을 위해서는 더 많은 자원이 필요하고, 어떻게 이러한 자원 필요량의 균형을 맞출 것인지가 중요한 문제일 것이다.

Further, more resources (e.g. LP trays and equipment) were needed for the DSRL than the IRL group. Determining how to balance these resource concerns represents a major challenge for simulation researchers.








 2012 Jul;46(7):648-56. doi: 10.1111/j.1365-2923.2012.04268.x.

Directed self-regulated learning versus instructor-regulated learning in simulation training.

Author information

  • 1Department of Medicine, University of Toronto, Toronto and The Wilson Centre, University Health Network, Toronto, Ontario, Canada. ryan.brydges@utoronto.ca

Abstract

OBJECTIVES:

Simulation training offers opportunities for unsupervised, self-regulated learning, yet little evidence is available to indicate the efficacy of this approach in the learning of procedural skills. We evaluated the effectiveness of directed self-regulated learning (DSRL) and instructor-regulated learning (IRL), respectively, for teaching lumbar puncture (LP) using simulation.

METHODS:

We randomly assigned internal medicine residents in postgraduate year 1 to either DSRL ('directed' to progress from easy to difficult LP simulators during self-regulated learning) or IRL (in groups of four led by an instructor). All participants practised for up to 50 minutes and completed a pre-test, post-test and delayed (by 3 months) retention test on the simulator. Pairs of blinded trained experts independently rated all videotaped performances using a validated global rating scale and a modified version of a validated checklist. Participants provided measures of LP experience and self-reported confidence. We analysed the pre-post (n = 42) and pre-post-retention performance scores (n = 23) using two separate repeated-measures analyses of variance (anovas) and computed Pearson correlation coefficients between participants' confidence and performance scores.

RESULTS:

Inter-rater agreement was strong for both performance measures (intra-class correlation coefficient > 0.81). The groups achieved similar pre-test and post-test scores (p > 0.05) and scores in both groups improved significantly from the pre- to the post-test (p < 0.05). On retention, a significant interaction (F(2,42) = 3.92, p = 0.03) suggests the DSRL group maintained its post-test performance, whereas that in the IRL group dropped significantly (p < 0.05). Correlations between self-reported confidence and post-test performance were positive and significant for the DSRL group, and negative and non-significant for the IRL group.

CONCLUSIONS:

Both IRL and DSRL led to improved LP performance immediately after practice. Whereas the IRL group's skills declined after 3 months, the DSRL group's performance was maintained, suggesting a potential long-term benefit of this training. Participants in the DSRL group also developed a more accurate relationship between confidence and competence following practice. Further research is needed to clarify the mechanisms of self-regulated learning and its role in simulation contexts.

© Blackwell Publishing Ltd 2012.

PMID:
 
22691145
 
[PubMed - indexed for MEDLINE]


층층이 쌓인... 복잡한 학습환경에서의 자기조절 (Med Educ, 2012)

Layers within layers … self-regulation in a complex learning environment

Rukhsana W. Zuberi






학습자가 동등한 지위의, 협력적 목표를 지닌, 그룹 내 접촉에 호의적인 사회적 규범을 지닌 환경에 지속적으로 노출되는 것이 중요함을 강조한다.

The articles1–3 correctly emphasise the value of repeated contact among individuals in a context in which equal status, cooperative goals and social norms favour inter-group contact."


또한 facilitation, guidance, support and role-modelling 가 중요하다.

The papers also repeatedly define the importance of facilitation, guidance, support and role-modelling by faculty members."


롤모델의 중요성은 아무리 강조해도 지나치지 않으며, 롤모델은 스스로 무의식적, 의식적 편견을 극복하기 위해 노력해야 한다.

The importance of role models cannot be overstated and, therefore, the role models themselves must explore and overcome their own unconscious and conscious biases."


세 개의 논문이 행동을 바꾸기 위한 교육전략을 제시했지만 거기에 깔린 이론을 확인해 볼 필요가 있다. 예컨대, 교육방법은 자기효능감이 발달함에 따라서 조절해나가야 하며, 학습자의 노력에 의해 관리할 수 있는 과제를 주어야 한다. 만약에 너무 위압적이면 학생은 자신감을 상실하고 비난하게 될 것이며, 너무 사소하면 학생은 성취감을 느끼지 못할 것이다. social cognitive theory에서 제시한 세 가지의 자기-반응적 요인은 어떻게 개개인이 자신에게 요구되는 수준과 현재 수행능력의 차이를 다루는지 예측하게 해준다.

(i) perceived self-efficacy to achieve the standards; 

(ii) behavioural self-reaction to sub- standard performance, and 

(iii) readjustment of personal stan- dards." 


For this reason, although the three articles1–3 published in this issue did survey educational strategies to change behaviour, the theories underpinning these remain to be explored. For example, it is broadly recognised that pedagogy must be moderated by the development of self-efficacy4,5 and by giving tasks that can, with effort, be managed by the learner. If the opportunities for practice are overwhelming, students lose confidence and embark on the blame game, but if the tasks are trivial, they gain no sense of achievement. Three ‘self-reactive factors’ in the social cognitive theory of Bandura and Cervone6 predict how an individual will handle the discrepancy detected between his or her performance and expected standards. These factors are: (i) perceived self-efficacy to achieve the standards; (ii) behavioural self-reaction to sub-standard performance, and (iii) readjustment of personal standards."



Grain-size 분석과 관련해서, Bandura는 자기조절은 그것의 하위 기능은 자기모니터링으로부터 시작된다고 했다. 자기모니터링은 자기진단, 자기동기부여, 변화를 필요로 하는 행동과의 시간적 근접성, 변화에 대한 동기, 어떤 행동이 자신이 속한 문화나 기관으로부터 어느 정도 가치를 인정받는지 등을 포함한다.

In support of the grain-size analysis mentioned in the paper on self-regulation,1 Bandura7 has stated that self-regulation is initiated by the self-monitoring sub-function. Self-monitoring includes self-diagnosis, self-motivation, temporal proximity to the behaviour that requires change (the importance of recency), motivation to change, and an understanding of whether the behaviour to be acquired is valued or not by the institution or the culture (in terms of the hidden or informal curriculum)."





 2012 Jan;46(1):7-8. doi: 10.1111/j.1365-2923.2011.04169.x.

Layers within layers … self-regulation in a complex learning environment.

PMID:
 
22150189
 
[PubMed - indexed for MEDLINE]


해부실습: 가이드가 적고 자기주도적인 학습 vs 엄격한 가이드와 스테이션 기반 학습의 비교 (Anat Sci Educ, 2012)

Loosely-guided, self-directed learning versus strictly-guided, station-based learning in gross anatomy laboratory sessions.

Kooloos JG , de Waal Malefijt MC, Ruiter DJ, Vorstenbosch MA.







연구질문

(1) do strictly-guided gross anatomy laboratory sessions lead to higher learning gains than loosely-guided experiences? and 

(2) are there differences in the recall of anatomical knowledge between students who undergo the two types of laboratory sessions after weeks and months? "


연구방법

The design was a randomized controlled trial.


해부학 지식은 12개 구조물의 이름을 실습 직후, 1주 후, 5주 후, 8달 후에 걸쳐 측정함으로써 진행되었다.

The recall of anatomical knowledge was measured by written reproduction of 12 anatomical names at four points in time: immediately after the laboratory experience, then one week, five weeks, and eight months later. "


SGG가 LGG보다 네 차례 모두에서 더 높은 점수를 받았다.

The strictly-guided group scored higher than the loosely-guided group at all time-points. Repeated ANOVA showed no interaction between the results of the two types of laboratory sessions (P = 0.121) and a significant between-subject effect (P ≤ 0.001). Therefore, levels of anatomical knowledge retrieved were significantly higher for the strictly-guided group than for the loosely-guided group at all times"



 2012 Nov-Dec;5(6):340-6. doi: 10.1002/ase.1293. Epub 2012 May 31.

Loosely-guided, self-directed learning versus strictly-guided, station-based learning in gross anatomy laboratory sessions.

Author information

  • 1Department of Anatomy, Radboud University Nijmegen Medical Centre, The Netherlands. j.kooloos@anat.umcn.nl

Abstract

Anatomy students studying dissected anatomical specimens were subjected to either a loosely-guided, self-directed learning environment or a strictly-guided, preformatted gross anatomy laboratory session. The current study's guiding questions were: (1) do strictly-guided gross anatomy laboratory sessions lead to higher learning gains than loosely-guided experiences? and (2) are there differences in the recall of anatomical knowledge between students who undergo the two types of laboratory sessions after weeks and months? The design was a randomized controlled trial. The participants were 360 second-year medical students attending a gross anatomy laboratory course on the anatomy of the hand. Half of the students, the experimental group, were subjected without prior warning to station-based laboratory sessions; the other half, the control group, to loosely-guided laboratory sessions, which was the course's prevailing educational method at the time. The recall of anatomical knowledge was measured by written reproduction of 12 anatomical names at four points in time: immediately after the laboratory experience, then one week, five weeks, and eight months later. The strictly-guided group scored higher than the loosely-guided group at all time-points. Repeated ANOVA showed no interaction between the results of the two types of laboratory sessions (P = 0.121) and a significant between-subject effect (P ≤ 0.001). Therefore, levels of anatomical knowledge retrieved were significantly higher for the strictly-guided group than for the loosely-guided group at all times. It was concluded that gross anatomy laboratory sessions with strict instructions resulted in the recall of a larger amount of anatomical knowledge, even after eight months.

Copyright © 2012 American Association of Anatomists.

PMID:
 
22653816
 
[PubMed - indexed for MEDLINE]


문화에 따라 달라지는 SDL (Med Educ, 2012)

The culturally sculpted self in self-directed learning

Gary Poole






SDL에서 'self'라는 단어는 학습에 대한 책임과 그 성과에 대한 책임의 위치(locus)를 의미한다.

The word ‘self’ in self-directed learning (SDL) indicates a locus for both the responsibilities related to learning and the outcomes of that learning."


문화권에 따라서 학습에 대한 책임은 개인에게 있을 수도 있고 집단에 존재할 수도 있다.

responsibility may be located in the individual or in the collective."


학생들은 아래와 같은 질문들을 던져봐야 한다.

Students must think about themselves and ask some important questions. Can I effectively evaluate my own performance? Do I believe I can learn independently? Am I independently motivated to learn? Can I stay organised?


이들 질문에 대답을 하기 위해서는 일종의 'constructive narcissism'이 필요하다 (건설적 자아도취). 학생은 자신에 대해서 생각하는 것이 '이기적'인 것이 아니라, 필요한 것이라고 생각해야 한다. 실제로 자신을 성찰의 대상으로 포함하는 자기성찰 과정은 성공의 필수적 요소로 여겨진다. 이러한 종류의 자기성찰의 특성은 문화와 개인에 따라 달라진다.

The effort required to answer these questions calls for a kind of ‘constructive narcissism’. One must believe that thinking about the self is not ‘self-ish’, but necessary. Indeed, the process of reflection that includes the self as an object of that reflection is considered central to successful SDL.3 The nature of this sort of reflection will vary across cultures and individuals.4,5"



더 나아가 우리가 스스로에 대해서 생각하는 방식은 자기 자신 내에서의 비교와 다른 사람과의 비교를 모두 필요로 한다. 아시아와 중동의 학생들은 서양 학생들보다 더 경쟁적이라는 것을 보여주는데, 여기서 '경쟁적'이라는 단어는 자기자신을 상대방과 비교함으로써 정의내리는 경향을 말한다. 경쟁적인 학생들은 외부의, 상대적인 데이터 (시험성적과 같은) 정보를 필요로 한다. PBL에서 그러한 학생들은 지속적으로 자신의 지식을 다른 사람들의 지식과 비교한다.

Furthermore, the way we think about ourselves may well involve the making of comparisons, both within one’s self and between self and others. The authors’ data indicate that students in Asian and Middle Eastern contexts are more competitive in nature than their Western counterparts.1 I would argue that ‘competitive’ refers to a tendency to define one’s self in comparative terms. Competitive students require external, comparative data from things like examinations. In problem-based learning (PBL) settings, such students may continually compare their knowledge with that of other group members."


학생들에게 SDL을 원하느냐는 질문에, SDL이 PBL부터 자기주도적 연구 프로젝트까지 다양하고 일반적인데 서양의 의과대학에서 답은 당연히 '그렇다'일 것이다. 그러나 Frambach는 서양의 이러한 가치가 모든 곳에서 유효하다는 가정은 조심해야 한다고 주장한다. 그들은 이렇게 말한다. '학생 중심, 문제 중심의 방식을 적용하는데 맞닥뜨리는 문화적, 맥락적 어려움과 싸우느니, 차라리 그 맥락에 맞는 대안을 새롭게 만들거나 찾아내는데 노력을 들이는 편이 현명하다.'

In Western medical schools, in which SDL opportunities that range from PBL to self-directed scholarly projects are common, the answer would be yes. In their concluding remarks, however, Frambach et al.1 caution against the assumption that Western values can be universally applied. They state: ‘Rather than taking on the cultural and contextual challenge of adopting student-centred, problem-based methods, it might be wiser for medical educationalists to rise to the challenge of exploring or creating alternatives that best fit their particular context.’1"



Frambach J, Driessen E, Chan L-C, van der Vleuten CMP. Rethinking the globalisation of problem-based learning: how culture challenges self-directed learning. Med Educ 2012;46:738–47."







 2012 Aug;46(8):735-7. doi: 10.1111/j.1365-2923.2012.04312.x.

The culturally sculpted self in self-directed learning.

Author information

  • 1Centre for Health Education Scholarship, Faculty of Medicine, University of British Columbia, Jim Pattison Pavilion North, VGH 910 West 10th Avenue, Suite 3300, Vancouver, British Columbia V5Z 1M9, Canada. gary.poole@ubc.ca


자기조절학습을 다룬 의학교육연구에 대한 자성적 분석 (Med Educ, 2012)

A reflective analysis of medical education research on self-regulation in learning and practice

Ryan Brydges & Deborah Butler





의학에서 자기조절에 관한 연구는 그 역사가 길다.

The study of self-regulation in medicine has a long and rich history.


자기조절에 대한 연구는 의료전문직으로서 필요한 요건의 한 가지로서 관심이 집중되어왔고, 의료전문직이란 자기조절을 하는 전문가로서 지속적인 전문성 개발을 적극적으로 이뤄내서 최소한의 역량을 유지할 수 있어야 하기 때문이다. 유사하게 의학을 배워나가는 학습자들도 의학교육에 유능한, 자기조절적 학습자로서 참여할 것이 기대된다.

Correspondingly, attention has also focused on self-regulation as a requirement of medical professionals, who, as part of a self-regulating profession, are expected to identify and willingly engage in ongoing professional development activities that serve to maintain a minimum level of learning and competence.2,3 Similarly, in preparation for joining medical practice, medical trainees are also expected to engage in medical education as capable, self-regulating learners.4


SRL은 '스스로 생각, 감정, 행동을 만들어서 자신의 목표를 달성할 수 있도록 계획하고 주기적으로 적용해나가는 것'으로 정의되어왔다.

Self-regulated learning (SRL) has been defined classically as: ‘self-generated thoughts, feelings and actions that are planned and cyclically adapted to the attainment of personal goals.’6


임상실습을 도는 학생을 상상해보자.

Firstly, let us consider a medical student engaged in her first clinical clerkship.


학생은 이 기간동안 주로 그전 학습환경에서 사용해온 학습전략을 지속적으로 활용하나, 여러 근거들을 종합하면 임상환경에서 그러한 전략들은 대체로 효과적이지 못하다.

Evidence suggests that during this transition she will persist in using learning strategies (e.g. memorisation) that may have worked in previous settings, yet are not as effective in the clinical environment.8,9


이 학생은 도움을 요청해야 할 때가 언제인지 적절한 판단을 내림으로써 유능하게 보일 수도 있지만, 동시에 환자의 안전도 담보해야 한다. 동시에 임상경험을 통해서 학습 진행을 관리해야 한다. 학생은 학습, 봉사, 의료의 우선순위가 서로 상충한다는 것을 곧 깨닫게되며, 자신의 시간관리를 스스로 해야한다.

She must make careful judgements about when to ask for assistance so that she can appear competent, but preserve patient safety. Simultaneously, she must manage her learning about and through her practice experiences. She quickly realises that learning, service and patient care can be conflicting priorities, and that she must manage her time largely on her own.9


학생은 길을 잃은 느낌을 받기 시작하고, 로테이션 시스템과 임상의 위계질서라는 업무 구조를 탓하게 되며, 병원에서 보내는 시간보다 교과서를 읽는 데서 더 많이 배우는 것처럼 느끼게 된다.

She may start to feel lost and blame the work structure in the form of the rotational system and clinical hierarchy for making her feel as if she learns more from reading a textbook than she does from her time at the hospital.9


둘째로, 임상의사는 지속적으로 전문성을 함양하여 역량을 유지하고 최신 지식을 유지하면서 자기조절을 하게 된다. 그러나 자기조절에 대한 주위의 기대치는 임상 교사와 같은 그들이 맡은 또 다른 역할에서도 드러난다.

Secondly, practising clinicians self-regulate when they engage in ongoing professional learning to maintain competencies and stay up to date. However, expectations for self-regulation also emerge within other of their roles, such as that of the clinical teacher.11,12


학계든 지역사회든 모든 환경에서 의사는 교수자로서 자기조절(피교육자에 대한 교육을 어떻게 할 것인가)와 평생학습에 대한 자기조절(어떤 참고자료를 찾아봐야 하며 언제 조언을 구해야 하는가)의 균형을 유지해야 한다.

This holds whether he works in an academic or community care setting. In all settings, he must balance self-regulating his practice as teacher (e.g. in how he educates his trainees and patients) and self-regulating his lifelong learning (e.g. in how he makes decisions about what to look up and when to consult with colleagues).


이 두 가지 사례는 우리가 의학교육에서 자기조절에 대해 생각하는 방식을 묘사한다.

These two examples illustrate current ways of think- ing about self-regulation in medical education.


어떤 영향력은 학습자 외부에 있고, 어떤 영향력은 학습자에 내재되어 있다.

Some influences are external to learners, while other influences are associated with the learner.


이러한 사례를 기반으로 보면, 의학교육에서 자기조절이란 다양한 내적, 외적 영향력에 주의를 기울일 것이 요구된다.

Given these examples, it would seem that understanding self-regulation in medical education requires attending to a range of internal and external influences.5



Figure 1 A model of self-regulation (adapted from Butler et al.5; see also Cartier and Butler16)




개개인이 학습환경에 가져오는 것

What individuals bring to the learning context

그러나 개개인이 자기조절에 대해 접근하는 방식은 학습환경을 곧바로 반영하는 것은 아니다. 개개인은 다양한 오랜 기간에 걸쳐서 발전되어온 지식, 신념, 행동, 역사, 경험 등을 특정 환경에 가져온다. 성공적인 학부 의과대학생은 의과대학에 처음 입학했을 때는 자신감에 넘쳤을 수 있지만 머지 않아 자기주도적 교육과정의 낯선 요구에 겁을 먹을 것이다.

However, an individual’s approach to self-regulation is not a direct reflection of context. Individuals bring to contexts a variety of knowledge, beliefs and emotions that have developed over time through their history and experiences and that emerge in particular settings. Successful undergraduate learners, for example, might be confident when entering medical school, but then may be rattled by the unfamiliar demands of a self-directed curriculum.19


자기조절 관련 행동의 사이클

A cycle of self-regulation in action



역사적, 사회적, 문화적 세팅에서의 자기조절

Self-regulation in historically, socially and culturally situated settings

문화적 규약과 교육적 행동, 개개인의 해석과 열망으로부터 자기조절이 어떻게 드러나는지를 보여준다.

This research underlines the importance of considering how observed patterns in 

      • self-regulation (e.g. the pattern demonstrated by a medical trainee who fails to ask for help when he needs it to ensure patient safety) 
    • emerge from 
      • complex interactions between cultural norms (e.g. expectations to act independently), 
      • pedagogical practices (e.g. feedback) and 
      • individuals’ interpretations and aspirations (e.g. the desire to assume the identity of a doctor).



SRL을 요구하거나 촉진하는 학습환경

Learning environments that demand or foster SRL


또 다른 연구에서 어떻게 교육환경이 자기조절을 촉진하는지 알아보았다. 

In another group of studies, medical education researchers have focused on how pedagogical environments might foster self-regulation.26 An important example can be found in the rich history of articles that invoke the term ‘self-directed learning’ (SDL). Mapped on to our theoretical framework, one contribution of this line of research is that it attends specifically to how environments can be designed to provide opportunities for and expect self-regulation,13 such as by using tools like PBL and life-long learning modules.27 Evidence to support such SDL initiatives is encouraging, if still in its early stages.


White의 연구를 보면, 학생들에게 SRL을 요구하는 것은 궁극적으로는 도움이 된다. SRL은 SR을 기대하거나 가능하게 하는 환경에서부터 촉진될 뿐만 아니라, 그러한 환경에서 자기조절을 지원하고, 개개인에게 어떻게 SRL을 할 수 있는지 알려주는 것으로부터도 촉진된다.

As we considered the study by White19 (and other similar studies) in relation to emerging strands of research, our analysis suggested that, although it is ultimately beneficial for students to experience demands for self-regulation, either in an early PBL experience or when transitioning to clerkship, students in both groups may have been similarly disoriented and challenged at the moment when learning expectations changed (i.e. in PBL in Year 1 or in the clerkship in Year 3), and may have benefited from support to navigate those changes.13 This observation is grounded in evidence which suggests that self-regulation is fostered not only by establishing environments that afford or expect self-regulation, but also by supporting self-regulation in those contexts and by assisting individuals to learn how to self-regulate their learning. Medical educators are starting to study self-regulation from this perspective by drawing on the seminal work of Irby12,28 to consider how to manage trainees’ progression toward independent learning and practice.10,11


self-directed learning에서 중요한 것은 그것을 너무 문자 그대로 해석하지 않는 것이다.

A key implication is that the term ‘self-directed learning’ should not be interpreted too literally.31


어떻게 자기조절이 co-regulated practice의 형태로 guide될 수 있으며, support될 수 있는지를 강조하고자 한다.

Thus, we join others in calling for greater emphasis on how self-regulation can be guided and supported as a form of co-regulated practice (i.e. practice that is shaped by context and by others).13,29–32


개개인이 학습환경에 가져오는 것

What individuals bring to the learning context


의학교육연구는 어떻게 이전 지식, 신념, 감정이 수행능력을 좌우하며, 왜 SRL이 항상 이상적으로 작동하지 않는지에 대해 설명해주었다.

Medical education research has examined how prior knowledge, beliefs and emotions mediate performance and can account in part for why self-regulation may not always unfold in an ideal manner.


종합하면, 이전 경험에서의 지식과 헌신이 새로운 환경과 상호작용하여 self-regulation에 영향을 준다.

Collectively, these studies show that knowledge and commitments from prior experience interact with context (e.g. clarity of expectations) to influence self-regulation.8,32


자기조절행동의 사이클

A cycle of self-regulation in action


최근까지 자기평가에 대한 연구는 학습자들이 얼마나 자기 지식의 한계를 잘 판단하며 그 정보를 가지고 전문성개발의 가이드로 삼는지에 대한 것이었다.

Until recently, most self-assessment research has focused on measuring how well trainees or practitioners can judge the extent or limits of their knowledge and use that information to guide their professional development.37


그러나 최근의 연구는 분석의 입자 크기에 보다 집중하고 있다. 

However, recent research has paid closer attention to the grain size of analysis,39 which has important implications for understanding the potential for trainees and doctors to adequately self- assess.


자기평가의 이전 연구는 개개인이 자신들의 경험을 축적하여 총괄적 자기평가를 할 수 있다는 가정에 이뤄졌지만, 보다 최근의 연구는 어떻게 주어진 SRL 사이클에서 실제 수행 중에 self-monitor 혹은 self-assess를 할 것인가에 집중되어 있다. self-monitor에 대한 최근의 두 가지 연구는 특정 순간에 self-monitor를 하는 것이 global self-assess보다 더 정확하고 민감하다는 것을 보여준다.

That is, previous study of self-assessment assumed that individuals are capable of aggregating across experiences to generate global self-assessments which will spur professional learning. By contrast, more recent work has focused on how an individual self-monitors or self-assesses within a situated SRL cycle during practice (Fig. 1). Two recent studies of the self-monitoring process suggest that individuals are more accurate and sensitive when self-monitoring in the moment than they are in making global self-assessments.35,37



의학교육의 복잡성

Complexity of medical education


첫째로, 우리는 자기조절에 대한 분석이 학습과 진료의 복잡성을 고려해야 한다고 본다.

Firstly, we recommend that any analysis of self- regulation in medical education recognise and take into account the complexity of learning and practice.



자기조절에 대한 개념

Conceptualising self-regulation


앞에서 제시한 프레임워크를 사용하는 것이 다양한 SRL 관련 연구를 조화롭게 엮어주는 것을 확인했다.

We found that a benefit of using our integrative framework in our analysis was that it enabled us to connect different lines of research on self-regulation in a coordinated and coherent manner.


입자 크기에 대한 관점의 유용성

Grain size: a useful perspective


자기조절에 대한 연구로부터 우리는 연구자들이 grain size에 관심을 두어야 함을 주장하고자 한다. grain size란 연구자가 연구를 위해 선택하는 분석 혹은 디테일의 수준이다.

Building from research in self-assessment, we recommend that researchers attend to the grain size of analyses of self-regulating processes. Grain size refers to the level of detail or analysis one selects for study. 

      • For example, a fine-grained analysis might focus on instances of self-regulating processes as they play out in context (such as when a medical trainee interacting with a patient with diabetes decides that his expertise is exceeded and that he needs help from a supervisor), 
      • whereas a more global analysis might ask for a generalised assessment of knowledge relevant to practice (such as when the same trainee rates how much he knows about diabetes to guide his learning).



자기조절에 대한 지지

Supporting self-regulation


두 가지 주요한 가정에서 탈피해야 한다. 한 가지는 독립적으로 완수할 수 있는 학습활동을 설계하는 것이 내용 영역의 학습과 자기조절의 개발 모두를 달성할 수 있다는 것이며, 둘째는 자기조절이라는 것이 온전히 학습자 내부에서 진행되는 것이고 따라서 교수는 거의 할 일이 없다는 생각이다.

Our analysis of research suggests two major assumptions to be avoided in the study of self-regulation in medical education. The first is the assumption that designing an activity so that it can be completed independently is sufficient to promote both learning of a content domain and the development of self-regulation. The second is the assumption that self-regulation is an activity conducted entirely within the learner and, consequently, that faculty members play little or no role in supporting self-regulation.13


의학교육자들은 SRL을 할 수 있는 환경을 설계하는 것 뿐만 아니라 자기조절 process에 대한 지원을 제공해야 한다.

Crucially, then, medical educators must assume responsibility, not just for designing environments that afford the opportunity for self-regulation, but also for providing support for the self-regulating processes.



또한 그러한 support가 얼마나 다양한 형태로, 그리고 다양한 자원으로부터 올 수 있는가를 말하고자 한다.

We also recommend learning from education research (in medicine and elsewhere) that has shown how support for professional learning can come in many forms, such as by facilitating, prompting, modelling or explaining, and from many sources, such as text, video, online modules, peers and instructors.


자기조절능력의 향상은 내용 전문가로서 발전하는 것과 함께, 그리고 그 한 부분으로서 도달할 수 있다. 

Indeed, improvements in self-regulation can emerge alongside and as part of the development of content expertise. Contexts and forms of support that foster explicit attention to learning expectations and processes as part of content area instruction have the potential to support both concept mastery and self-regulation.42 Ideally, one learns about content by self-regulating learning, whereas one builds knowledge about self-regulation via the experience of learning. More subtly, descriptions of the development of ‘discernment’ suggest that self-regulation as applied in a new area improves as a trainee acquires expertise.22


한 분야에 대해서 내용에 대한 지식과 효과적인 자기조절은 서로 협력적으로 발전해나갈 수 있다.

Content knowledge and effective self-regulation in a given area can develop progressively and in tandem (i.e. bootstrapping can occur).




45 Eva KW, Regehr G. Exploring the divergence between self-assessment and self-monitoring. Adv Health Sci Educ Theory Pract. 2011;16(3):311–29.





 2012 Jan;46(1):71-9. doi: 10.1111/j.1365-2923.2011.04100.x.

A reflective analysis of medical education research on self-regulation in learning and practice.

Author information

  • 1Centre for Health Education Scholarship, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada. ryan.brydges@utoronto.ca

Abstract

OBJECTIVES:

In the health professions we expect practitioners and trainees to engage in self-regulation of their learning and practice. For example, doctors are responsible for diagnosing their own learning needs and pursuing professional development opportunities; medical residents are expected to identify what they do not know when caring for patients and to seek help from supervisors when they need it, and medical school curricula are increasingly called upon to support self-regulation as a central learning outcome. Given the importance of self-regulation in both health professionseducation and ongoing professional practice, our aim was to generate a snapshot of the state of the science in medical education research in this area.

METHODS:

To achieve this goal, we gathered literature focused on self-regulation or self-directed learning undertaken from multiple perspectives. Then, with support from a multi-component theoretical framework, we created an overarching map of the themes addressed thus far and emerging findings. We built from that integrative overview to consider contributions, connections and gaps in research on self-regulation to date.

RESULTS AND CONCLUSIONS:

Based on this reflective analysis, we conclude that the medical education community's understanding about self-regulation will continue to advance as we: (i) consider how learning is undertaken within the complex social contexts of clinical training and practice; (ii) think of self-regulation within an integrative perspective that allows us to combine disparate strands of research and to consider self-regulationacross the training continuum in medicine, from learning to practice; (iii) attend to the grain size of analysis both thoughtfully and intentionally, and (iv) most essentially, extend our efforts to understand the need for and best practices in support of self-regulation.

© Blackwell Publishing Ltd 2012.

PMID:
 
22150198
 
[PubMed - indexed for MEDLINE]


의대 과정동안 자기주도적학습 능력이 향상될까 줄어들까? (Acad Med, 2013)

Does Medical Training Promote or Deter Self-Directed Learning? A Longitudinal Mixed-Methods Study

Kalyani Premkumar, MBBS, MD, MSc (Med Ed), PhD, Punam Pahwa, PhD,

Ankona Banerjee, MSc, Kellen Baptiste, MD, Hitesh Bhatt, MSc, and Hyun J. Lim, PhD




Campbell 등은 SDL을, 짜여진 학습 프로그램에 참여하는 것에서부터 Maslow가 말한 '자기실현적 개인'과 같은 고도로 자기주도적인 학습자가 스스로 시작하고 스스로 계획하는 활동에 이르는 행동의 연속으로 정의했다. Hammond와 Collins는 SDL을 학습자가 주도권을 쥐고 다른 사람의 지원과 협력 하에 진행하는 프로세스라고 묘사했다. 그러나 이 개념을 가장 잘 담아낸 것은 Knowles의 정의로, 개개인이 주도권을 쥐고, 다른 사람의 도움이 있든 없든, 스스로의 학습요구를 진단하고, 학습목표를 설정하고, 인적·물적 학습자원을 찾고, 적절한 학습전략을 선택·도입하며, 스스로의 학습결과를 평가하는 과정이라는 것이다.

Campbell et al1 define SDL as behaviors that range from participation in programmed learning to the self-initiated, self-planned activities of such highly directed self-learners as Maslow’s self-actualizing individuals.2,3 Hammond and Collins4 describe SDL as a process in which learners take the initiative, with the support and collaboration of others. But we believe the concept of SDL is best captured by Knowles5 as a process in which individuals take the initiative, with or without the help of others, in diagnosing their learning needs, formulating learning goals, identifying human and material resources for learning, choosing and implementing appropriate learning strategies, and evaluating learning outcomes."



왜 의학에서 SDL이 중요한가? Why is SDL important in medicine?"

SDL은 의학 분야에서 매우 강조되어왔는데, 예컨대 CanMEDS에서 scholar role을 보면, 의사들이 일생에 걸친 자기성찰적 학습을 해야 한다. 또한 ABMS와 WFME는 SDL을 의학교육에서 반드시 평가해야 하는 특징으로 보았다.

SDL has been increasingly emphasized in the medical field. For instance, the scholar role of the CanMEDS 2005 Physician Competency Framework emphasizes SDL by requiring physicians to demonstrate a lifelong commitment to reflective learning,9 and both the American Board of Medical Specialties and the World Federation for Medical Education include SDL as a characteristic that should be evaluated during medical education.10,11"


ACGME 역시 practice-based learning and improvement를 여섯 개의 핵심 역량 중 하나로 보았으며, 추가적으로 CME 역시 의사들이 자기주도적으로 스스로의 학습요구, 목표, 학습활동 선택 등을 잘 할 수 있다는 가정하에 이뤄지는 것이다.

The Accreditation Council for Graduate Medical Education, too, identifies practice-based learning (a form of SDL) and improvement as one of its six core competencies.12 In addition, continuing medical education for physicians is based on the assumption that physicians are self-directed leaders who can accurately predict their own learning needs, set goals, engage in appropriate learning activities, and regularly and accurately assess the outcomes.13,14"




SDL의 역량 

Competencies of SDL"


SDL의 역량은 여러 사람이 묘사했는데, 학습 격차에 대한 이해, 자신과 타인에 대한 평가, 성찰, 정보 관리, 비판적 사고, 비판적 평가 등이 있다.

The competencies of SDL have been described by many.5,15,16 They include proficiency in assessment of learning gaps, evaluation of self and others, reflection, information management, critical thinking, and critical appraisal.15"


이런 것들이 '자율성'을 의미하는 것 같지만, SDL은 동료 및 선생님들과 서로 정보를 교환하는 것을 포함한다.

Although all these characteristics seem to indicate autonomy, SDL involves interaction with peers and teachers to exchange information.17"


SDL이 교육될 수 있는가? Can SDL be taught?"


SDL은 종종 연속체로 묘사되곤 하는데, 한 극단에는 완전히 교사에게 의존적인 학생이, 다른 극단에는 완전히 자기주도적인 학생이 있다. 후자는 자신이 학습할 내용을 독립적으로 결정하고, 자원을 찾고, 문제를 해결하고, 평가할 수 있다. 그러나 한 상황에서 고도로 자기주도적인 사람도 새롭고 익숙하지 않은 환경에서는 훨씬 덜 자기주도적이 될 수 있다.

SDL is often described as a continuum, present in all individuals to some extent, with those who are least self-directed being totally dependent on the teacher for learning.5,16 The other end of the spectrum is the totally self-directed learner, who independently determines what is to be learned, identifies the resources, solves problems, and evaluates. It should be noted that a person who is highly self-directed in a particular situation may be very much less self-directed in a new and unfamiliar context.18"


Grow는 SDL의 단계 모델을 만들면서, 학습자를 자기주도적 학습의 네 단계 중 하나에 있다고 보았다.

Grow,16 in his staged SDL model, describes the learner, at any given time or learning situation/ context, to be in one of four stages: 

      • dependent (stage 1), 
      • interested (stage 2), 
      • involved (stage 3), and 
      • self-directed (stage 4)."

교육자들은 학습자가 어느 단계에 있는가를 진단하고, 더 높은 단계로 나아갈 수 있도록 준비시켜주어야 한다.

Educators need to diagnose the learner’s stage of self-direction and prepare the learner to advance to higher stages."


Grow에 따르면, 의존적으로 학습하는 것은 교수-학습의 학생중심이라는 원칙에 위배되는 것이지만, 일시적으로 의존적 관계를 촉진하는 것이 학습자를 더 높은 단계로 이끄는 것에 장애가 되지는 않는다.

According to Grow, although learning in a dependent mode goes against the principle of student-centered styles of teaching and adult learning, there is nothing demeaning or destructive in promoting temporarily dependent relationships as long as the purpose is to advance learners to higher stages."



SDL 준비도 측정 

Measuring SDL readiness"


SDL readiness를 측정할 수 있는 몇 가지 도구가 있다. SDLRS, OCLI, Ryan's, Fisher 등의 SDL readiness scale 등이다.

There are several instruments that have been developed to measure SDL readiness.15,19–23 The more widely used instruments are Guglielmino’s15,24 Self-Directed Learning Readiness Scale (SDLRS), Oddi’s19,20 Continuing Learning Inventory (OCLI), Ryan’s23 ability and importance scores, and Fisher and colleagues’18 SDL readiness scale."



SDLRS는 자기기입식 설문지로 다음의 구인을 평가한다.

The SDLRS consists of a self-report questionnaire of 58 questions and is one of the most common instruments used to assess SDL readiness.24 It measures eight constructs: 

• Openness to learning opportunities 

• Self-concept as an effective learner 

• Initiative and independence in learning 

• Informed acceptance of responsibility to one’s own learning 

• Love of learning 

• Creativity 

• Positive orientation to the future 

• Ability to use basic study and problem- solving skills"




질적분석 

Qualitative


연구 후반부에 인터뷰를 실시하였고, 여섯 명으로부터 반구조화 면접을 수행했다.

At the end of the study, all instructors in the medical school were invited to participate in interviews. Six instructors volunteered, and we conducted semistructured interviews with them."


녹취된 인터뷰와 포커스그룹은 세 명의 coder에 의해서 independently 분석되었고, 반복적인 토론을 통해서 합의달성

The recorded interviews and focus group discussions were transcribed. The transcriptions were independently examined for common themes by three coders (K.P., A.B., and another qualitative research expert). Repeated discussions were held until agreement among coders was attained."




입학 당시 SDL 준비도

SDL readiness and age at admission 


연령이 높아질수록 SDLRS 점수에 긍정적 효과가 있었다. 종합적으로, 더 나이를 먹은 학생이 더 점수가 높았다.

Age was considered to be a continuous variable in this study, and an increase in one year of age at admission had an overall positive impact on SDLRS scores. Overall, students who were older had significantly higher scores (P = .002) than did younger students."



SDL과 premedical university education 기간

SDL and years of premedical university education"



SDL 준비도의 변화

Changes in SDL readiness of medical students from admission to graduation."


입학 후 1년이 지났을 때 SDLRS점수에 유의미한 하락이 있었다. 입학시 점수보다 지속적으로 낮았다.

There was a significant drop (P < .001) in SDLRS scores in all cohorts one year after admission. The scores of all cohorts continued to be significantly lower than those at admission throughout training and at graduation.



SDL을 촉진하는 현 교육과정 내의 활동들

Activities in the current curriculum that promote SDL "

SDL에 배정된 시간(특히 1학년)은 가치 있게 여겨졌으며, 교육과정의 한 부분으로서 1학년 때는 매일 오후 몇 시간이 SDL에 할당되었다. 그러나 지침이나 관리 없이 시간만 주는 것은 오히려 SDL을 저해하는 것으로 여겨졌다.

The instructors and students identified some of the activities within the current curriculum that promote SDL: The time allocated for SDL—especially in the first year—was considered valuable. As part of the curriculum, a few hours of every afternoon in year 1 is earmarked for SDL. However, it was felt that being given time, without direction and monitoring, deterred SDL."



SDL 촉진/악화 요인

Factors that facilitate/deter SDL 

두 가지 주제가 드러났다. 하나는 학습 환경과 관련된 것이고 다른 하나는 평가에 관련된 것이다.

Two themes clearly emerged: one related to the learning culture and environment and the other related to assessment."


SDL이 교육과정 전체에 걸쳐서 주요한 주제가 되어야 한다. 한 두 개의 과목만으로는 안된다.

Instructors felt that SDL has to be a theme throughout the curriculum, not introduced in just one or two courses."


SDL은 guide process로 여겨졌으며 SDL의 기술들은 교육될 수 있다

SDL was thought to be a guided process, and skills in SDL had to be taught."



SDL이 학년이 올라가면서 낮아지는 것에 대해서 교사나 학생 모두 전혀 놀랍지 않다는 분위기였다.

Towards the end of the focus group/ interviews, students and instructors were told that preliminary findings of this study seemed to indicate that SDL readiness was decreasing with increasing years. Both groups were not surprised at the findings:"










SDLRS scores, adult population, other health professions, other medical schools."


SDL readiness and gender."


SDL readiness and age. 


나이를 더 먹은 학생이 어린 학생보다 유의미하게 높았으며 이는 Reio와 Davis, Kell과 Van Deursen의 결과와 일치한다. (50세까지는 SDL이 향상됨) 학습에 대한 선호는 과거의 학습경험과 학습과정을 통제할 수 있다는 자신감에서 나온다.

Older students had significantly higher scores than younger students. This is consistent with the findings of Reio and Davis38 and Kell and Van Deursen41 and lends further support that SDL has a positive developmental trajectory until the 50s, consistent with SDL theory.42,43 The learning preference has been attributed to previous learning experience and confidence in controlling the learning process.44 "


SDL readiness and premedical training"


이는 premedical training에 관한 Harvey 등의 연구 결과와 대조되는데, 그 연구에서는 premedical education의 최종 수준이 높을수록 SDL 점수가 높은 유의한 경향이 나타났다.

This finding is in contrast to that of Harvey et al,39 who found a significant positive trend in SDL scores (using SDLRS, the Oddi Continuing Learning Inventory [OCLI], and Ryan’s [ability] scores) associated with the highest level of premedical education achieved (undergraduate only, master’s, or doctoral)."



SDLRS scores with increased medical training."

SDL점수가 1학년 말에 유의미하게 하락했다.

Our findings indicate that SDL readiness scores decreased significantly at the end of one year."


이러한 연구 결과는 U of Toronto의 결과와도 비슷하다.

our findings are similar to the findings of researchers at the University of Toronto Faculty of Medicine, who did a cross-sectional study on first- and second-year medical students (N = 280). Of the three instruments that they used (SDLRS, OCLI, and Ryan), the scores obtained with Ryan’s instrument showed a decrease with more training.39"






이런 차이는 어떻게 설명가능할까?

How can these differences be explained?"


이러한 차이는 어떻게 설명할 수 있을까? Knowles는 학습을 교수주도과 자기주도를 양측에 둔 연속체로 묘사하였다(pedagogical - androgogical). 이 연속체는 학습자가 학습에 대하여 얼마나 통제권을 가지고 있으며, 학습목표를 달성하기 위하여 필요한 평가와 전략에 대해 얼마나 자유도를 가지는지에 의해 달라진다. 예컨대 특정한 학습 영역에서 자기주도적이 되기 위해서는 학습자는 특정 수준 이상의 지식을 가지고 있어야 한다. SDL에 대해 얼마나 준비가 되어있느냐는 학생마다 매우 다르고, 이는 학생마다 경험이 다르기 때문이다. 학생이 지식이 적을 때는 높은 수준의 구조화가 필요하다. SDL에 대한 준비도와 틀에 짜여진 교육시간에 대한 선호도와는 분명한 역의 상관관계가 있다. 학부 교육과정동안 학생들은 과도한 정보를 주입당하는데, 학생들은 그 시간동안 자기들의 스스로의 흥미에 대해 공부할 시간이 거의 없다. 따라서 의대 기간동안 SDLRS가 떨어지는 것이 놀라운 것은 아니다.

How can the differences in SDLRS scores with training be explained? Knowles5 describes learning as a continuum with teacher-directed (pedagogical) learning at one end and self-directed (androgogical) at the other. This continuum can be explained in terms of how much control the learner has over learning and the amount of freedom given to evaluate and implement strategies to achieve learning goals. In medicine, the learning environment tends to keep students in the pedagogical end of the spectrum. For instance, to be self-directed in a specific content area, a person must possess a certain level of knowledge. The readiness for SDL is variable in any given student population, as each student enters medicine with a different academic background. When students have a low level of knowledge, they prefer a high degree of structure. It has been shown that there is a definite inverse correlation between SDL readiness and student preference for structured teaching sessions.47,48 Throughout undergraduate training, students are overloaded with information. Moreover, the competencies required are well defined by regulatory bodies, and there is very little time for students to pursue their own interests. As our faculty note, it is therefore not surprising that SDLRS scores decrease with training."


Knowles에 따르면 학생과 교사 모두 SDL기반 교육과정 도입에 필요한 스킬을 갖출 필요가 있다.

According to Knowles,5 both teachers and students have to possess the skills necessary for the implementation of an SDL-based curriculum."


    • role of facilitators"
    • learning environment that is collaborative rather than competitive."
    • diagnose learning needs"
    • help learners diagnose their own needs.5"
    • Training plays a key role"


학생과 교사의 준비가 부족한 것 외에 다른 SDL의 장애물로는 Shokar는 "전문성, 교육과정, 법, 조직 내규, 외부 제약, 시간 제약, 학습할 내용 등으로부터 오는 제한"이라고 했다.

Apart from lack of teacher and student preparation, other factors may serve as barriers to SDL. Shokar et al50 list some of these barriers as “restrictions imposed by professional, curricular, legal, and institutional requirements, statutory educational regulations, time constraints, and the need to ensure that specific content is covered.”"


모든 상황에 SDL이 적용 가능한 것은 아님을 염두에 둘 필요가 있다. 학생이 그 주제에 대한 과거 경험이 매우 적거나, 학습의 초점이 과목 그 자체보다 내용(구체적 학습목표)에 있을 때 SDL은 적절하지 않을 수 있다. 학부의학교육에서는 - 특히 1학년에서는 - 대부분의 학생이 기초 지식이 거의 없으므로, 학생들이 교사 의존적이 되고 더 구조화된 교육을 요구하는 것은 이해할 만하다.

One must also remember that not all situations are applicable for SDL. SDL may not be appropriate in situations where the student is new or has very little previous experience of the subject and when the focus of learning is on the content (e.g., specific learning objectives) rather than the subject itself.47 In undergraduate medical training— especially in the first year—most students have very little foundational knowledge. It is therefore understandable that students are more teacher-dependent and require that the education program be more structured."


SDL에 영향을 주는 요인들의 복잡성을 고려할 때, SDL을 표현하기 위한 여러 모델이 제안되어 왔다: Garrison의 모델, Brockett과 Hiemstra의 모델, Candy의 모델.

Given the complexity of factors that influence SDL, a number of models have been proposed to represent SDL. 


    • Garrison’s51 model focuses on three psychological constructs: self- monitoring (cognitive responsibility), self-management, and motivation (see Figure 2). 
      • Self-monitoring refers to the ability of learners to monitor both their cognitive and metacognitive processes. 
      • Self-management focuses on goal setting, use of resources, and external support for learning. 
      • Motivation has two dimensions: entering and task motivation. 
        • Entering motivation is what compels the learner to participate in the learning process, whereas 
        • task motivation is what keeps the learner on task and persisting in the learning process. To promote SDL, each of these constructs needs to be addressed."
    • Brocket and Hiemstra’s52 model of SDL (see Figure 2) focuses on learner control of responses to a situation even if there is no control over the circumstances; the model considers SDL and learner self-direction as two dimensions, with personal responsibility connecting the two. To facilitate SDL, focus on promoting personal responsibility is required."
    • Candy’s53 SDL model (see Figure 2) illustrates two interacting laminated (layered) domains. One dimension relates to the amount of control within an institutional setting, with one end of the continuum showcasing teacher control and the other learner control. The second dimension relates to amount of control over informal learning: autodidaxy. In this model, one needs to help the organization and its teachers choose appropriate strategies based on the content and level of knowledge of students, and to facilitate movement of students along the continuum."


여러 갈래의 전략이 필요하다. Francom은 학생이 SDL 궤도를 따라 나아가도록 교수자가 도울 수 있는 네 가지 원칙을 요약하였다.

From the different SDL models it can be deduced that a multipronged approach has to be taken to promote SDL in students. Francom54 summarizes four principles instructors can use to help students move along the SDL trajectory: 

• Match the level of SDL required in learning activities to student readiness 

• Progress from teacher to student direction over time 

• Support the acquisition of subject matter knowledge and SDL skills together 

• Have adults practice SDL in the context of learning tasks"



Hiemstra는 학생이 자신의 학습에 대한 책임을 지도록 하기 위해 교육자가 맡아야 할 여섯 가지 기본 역할을 제시하였다.

Hiemstra55 describes six foundational roles that instructors need to take on to enable students to adopt personal responsibility for their learning: 

• Content resource (sharing expertise and experiences using various forums) 

• Resource locator (locating and sharing various resources to meet student needs)"

• Interest stimulator (arranging for resources that maintain student interest in the subject, e.g., games, discussions; guest presentations) 

• Positive attitude generator (through positive reinforcement; prompt, useful feedback) 

• Creative and critical thinking stimulator (through study groups; journal writing; logs; simulation; role-play) 

• Evaluation stimulator (learner evaluation and promotion of self-evaluation)"




Figure 2 Three models of self-directed learning (SDL).51–53 See the text for a discussion of these models. Used with permission."











 2013 Nov;88(11):1754-64. doi: 10.1097/ACM.0b013e3182a9262d.

Does medical training promote or deter self-directed learning? A longitudinal mixed-methods study.

Author information

  • 1Dr. Premkumar is curriculum consultant and faculty development specialist, and associate professor, Department of Community Health and Epidemiology, College of Medicine, University of Saskatchewan, Saskatoon, Saskatchewan, Canada. Dr. Pahwa is professor, Department of Community Health and Epidemiology, University of Saskatchewan, Saskatoon, Saskatchewan, Canada. Ms. Banerjee was a third-year master's student, Department of Community Health and Epidemiology, University of Saskatchewan, Saskatoon, Saskatchewan, Canada, at the time this article was written. Dr. Baptiste was a fourth-year medical student, College of Medicine, University of Saskatchewan, Saskatoon, Saskatchewan, Canada, at the time this article was written. Mr. Bhatt is biostatistician, University of Alberta, Edmonton, Alberta, Canada. When this article was written, he was biostatistician, Department of Community Health and Epidemiology, University of Saskatchewan, Saskatoon, Saskatchewan, Canada. Dr. Lim is professor, Department of Community Health and Epidemiology, University of Saskatchewan, Saskatoon, Saskatchewan, Canada.

Abstract

PURPOSE:

The School of Medicine, University of Saskatchewan curriculum promotes self-direction as one of its learning philosophies. The authors sought to identify changes in self-directed learning (SDL) readiness during training.

METHOD:

Guglielmino's SDL Readiness Scale (SDLRS) was administered to five student cohorts (N = 375) at admission and the end of every year of training, 2006 to 2010. Scores were analyzed using repeated-measurement analysis. A focus group and interviews captured students' and instructors' perceptions of self-direction.

RESULTS:

Overall, the mean SDLRS score was 230.6; men (n = 168) 229.5; women (n = 197) 232.3, higher than in the average adult population. However, the authors were able to follow only 275 students through later years of medical education. There were no significant effects of gender, years of premedical training, and Medical College Admission Test scores on SDLRS scores. Older students were more self-directed. There was a significant drop in scores at the end of year one for each of the cohorts (P < .001), and no significant change to these SDLRS scores as students progressed through medical school. Students and faculty defined SDL narrowly and had similar perceptions of curricular factors affecting SDL.

CONCLUSIONS:

The initial scores indicate high self-direction. The drop in scores one year after admission, and the lack of change with increased training, show that the current educational interventions may require reexamination and alteration to ones that promote SDL. Comparison with schools using a different curricular approach may bring to light the impact of curriculum on SDL.

PMID: 24072133 [PubMed - indexed for MEDLINE]


자기주도학습 - 개념과 맥락의 중요성 (Med Educ, 2005)

Self-directed learning – the importance of concepts and contexts

G C Greveson & J A Spencer




두 논문 모두, 교육적 접근법과 이니셔티브가 효과적일지 평가하려면 교육자가 그 기저의 개념을 정확히 이해해야 함을 강조한다. Candy는 다양한 교육철학에 기반하여, 이념적인 것부터 도구적인 것까지, SDL을 바라보는 개념적으로 상이한 여러 관점이 있으며 이것이 세팅에 따라 실천에 서로 다른 함의를 가질 수 있다고 말한다. Miflin 등에 따르면, SDL 개념에 기반한 graduate medical course를 도입할 때 교사와 학생들이 이 개념에 대해 서로 다른 해석을 가지고 있어 어려움이 있었다.

Both papers highlight that it is crucial that educators understand concepts underpinning educational approaches and initiatives, in order to evaluate whether they are likely to be effective. Candy suggested there are several conceptually distinct ways of viewing SDL, based on varying educational philosophies, from the ideological to the instrumental, which may have different implications for practice in different settings.3 For example, Miflin et al. reported the difficulties in implementing a graduate medical course based on the idea of SDL when there were many different interpretations of the concept amongst the teachers and students involved.4"


의학교육자들은 학습자들이 평생의 업에 걸쳐서 자신의 학습을 관리할 수 있도록 SDL을 도입할 것을 요구받는다.

Medical educators are exhorted to adopt SDL with the principal aim of producing learners who can manage their own learning throughout their careers."


그러나 Coffield는 너무 오랫동안 평생학습이 근거도 없이, 연구도 부족하고, 이론도 없는 채로 비판받지도 않아왔다고 지적한다. SDL에 대해서도 마찬가지다.

Yet Coffield caustically claimed that for too long life-long learning has remained an evidence-free zone, under-researched, under-theorised, unencumbered by doubt and unmoved by criticism .5 The same could probably be said of SDL."


Canday와 Schimidt는 한 상황에서 SDL을 강조하는 것이 다른 상황으로도 전이될 수 있는가에 대한 의문을 가져왔다.

Both Candy3 and Schmidt6 ques- tioned whether encouraging SDL processes in one context would enable self-management or learner control to be transferred to other learning contexts."


Norman은 자기주도적 학습자들이 자기평가에도 효과적이어야 하지만, 연구 근거는 많은 보건전문직이 이것을 잘 하지 못한다는 것으로 나타난다.

Norman draws attention to the fact that self-directed learners need to be effective at self-assessment, yet research evidence suggests that many health professionals (pre- or post-qualification) have difficulties with this.7"


Schmidt는 전문가 양성에 있어서 SDL 기술이 지나치게 강조되어왔음을 지적한다.

Schmidt put forward similar arguments for his claim that the importance of SDL skills in professional practice had been overemphasised.6"


인지주의자들은 학습에 있어서 개인적 특성을 강조하지만 Candy는 자기주도적 학습이란 집단성보다 개인성을 강조하는 형태로 잘못 인도되어 왔는데, 사실 지식과 학습은 본질적으로 학습자를 다른 학습자와의 관계 속에 두는 것이다 라고 주장했다.

Cognitivists stress the private and individual nature of learning. However, Candy claims: 'The term self-direction has misled many into elevating the individual above the collective – but the nature of knowledge and learning inherently puts learners in relationship with others.'3"











 2005 Apr;39(4):348-9.

Self-directed learning--the importance of concepts and contexts.

PMID: 15813753 [PubMed - indexed for MEDLINE]


의과대학생들은 임상실습에서 어떻게 자기주도적인 방법으로 학습할까? Design-based research (Med Educ, 2005)

How can medical students learn in a self-directed way in the clinical environment? Design-based research

Tim Dornan,1 Judy Hadfield,1 Martin Brown,2 Henny Boshuizen3 & Albert Scherpbier4






영국의 GMC는 의과대학생에게 자신의 학습 방향을 스스로 결정하라고 하면서 의학교육을 새로운 트랙에 올려놓았다. 성인학습자 원리에 따르면 - 이 권고안의 근간이 되는 - 학습자와 그의 열망은 교수-학습 공식에서 중요한 위치를 갖는다. 그러나 '자기주도학습'이란 말은 단일한, 일반적으로 동의되는 의미를 가지고 있지 않는다. GMC는 스스로의 학습요구를 조직하고 관리하는 것을 의미하였으며, 성인학습자 원리를 처음 주장한 Knowles는 SDL을 스스로의 동기를 찾고 교사와 보다 동등한 관계를 갖는 것을 의도하였다.

The UK General Medical Council (GMC) set medical education off on a new track when it called for medical students to direct their own learning.1 Adult learning principles,2 which underpin that recommendation, have served a useful role in putting the learner and his or her individual aspirations back in the teaching–learning equation.3 However, the term self-direction does not have a single, generally agreed meaning. The GMC meant 'organising and managing one's own learning needs'.1 Knowles, who first articulated adult learning principles, meant finding one's own motivation and having a more equal relationship with teachers.2"


PBL에서의 자기주도적 행동 성향이 임상교육현장으로 잘 전이되지 않는 것은 (1)낯섦, 복잡성, 권력관계, 정서적 요인 등이 젊은 학습자들로 하여금 임상환경에서 자발적인 행동을 어렵게 하며, (2)외부의 방향제시가 필요하다는 결론에 이르게 한다.

The limited transfer of self-directed behaviour from PBL to placement learning6 led us to hypothesise that: (1) unfamiliarity, complexity, power differentials and emotional factors make it hard for young learners to act autonomously, and (2) more external direction is needed in the clinical environment."


iSUS는 학습 관리 시스템으로서, 교육과정 목표를 알고 있고, 각 학생의 목표 대비 자기보고 진도를 지속적으로 추적하여 전체 동료 그룹과 비교하며, 관련된 학습 기회(signup)를 제안하고 그 기회에 접근하도록 도와준다. 예컨대 학생이 신장·요로계 질환을 더 학습해야 한다고 판단하면, 투석 전(predialysis) 클리닉을 관련 학습 기회로 찾아 그 자리에 참석하도록 예약할 수 있다. 여기까지 7번의 마우스 클릭과 몇 초면 충분하다. iSUS는 그 자리에 1명의 학생만 참석 가능하다는 것을 알고, 다른 시간표가 겹치는 학생의 예약을 막아준다. 이전 학생들의 피드백은 선택의 길라잡이가 된다. 학생이 다음 번 로그인하면 iSUS는 그 클리닉에서의 경험에 대해 코멘트하도록 요청한다.

Intelligent signup system – iSUS. The learning management system, described in detail elsewhere,12 is intelligent in that it knows the curriculum objectives, keeps track of each student's self-reported progress towards them, compares progress against the whole peer group, suggests signups or other relevant experiences, and helps access them. A student might, for example, see they need to learn more about renal/urological disease, particularly chronic renal failure, identify the predialysis clinic as a relevant learning opportunity, and reserve a place to attend it. That would take 7 mouse-clicks and a few seconds. iSUS would know there was space for only 1 student to attend, and restrict booking to someone who had no other timetabled activity. Feedback from previous students would be available online to guide the choice. The next time the student logged on, the system would ask for a free text comment about their experience at the clinic."
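
아래는 iSUS에 기술된 예약 제한 로직(정원 1명, 다른 시간표와 겹치면 예약 불가)을 개념적으로만 보여주는 최소한의 파이썬 스케치이다. 클래스와 함수 이름(SignupSlot, book 등)은 모두 설명을 위해 가정한 것이며, 실제 iSUS의 구현이나 API와는 무관하다.

```python
from dataclasses import dataclass, field

# Illustrative sketch only — not the real iSUS implementation.
@dataclass
class SignupSlot:
    activity: str                     # e.g. "predialysis clinic"
    objective: str                    # curriculum objective the slot addresses
    capacity: int = 1                 # only one student may attend, as described for iSUS
    booked_by: list = field(default_factory=list)

    def book(self, student_id: str, has_timetable_clash: bool) -> bool:
        """Reserve the slot if space remains and the student has no other timetabled activity."""
        if has_timetable_clash or len(self.booked_by) >= self.capacity:
            return False
        self.booked_by.append(student_id)
        return True

slot = SignupSlot("predialysis clinic", "renal/urological disease")
print(slot.book("student_A", has_timetable_clash=False))  # True: place reserved
print(slot.book("student_B", has_timetable_clash=False))  # False: the single place is taken
```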


질적연구방법

Textual materials were coded, back-referenced to the complete transcripts, assembled into an evolving interpretation, and reduced to a single narrative using NVivo software (QSR, Doncaster, Victoria, Australia). Three types of statement were identified: 

      • 1 Statements describing the behaviour of a student on a specific occasion in a practice setting, coded as performing , observing , discussing or being taught and subcoded as active ( I did ), passive ( I was directed ) or mutual ( We did ). 
      • 2 General descriptions of learning were coded for the degree of autonomy expressed by the student, and sources of guidance or support. 
      • 3 Conceptualisations of self-direction. "


한 명의 연구자가 일차 코딩을 수행하고, 다른 세 명의 연구자가 자료를 읽으며 그 코딩의 편향, 누락, 반증 진술을 찾아내었다. 최종 결론은 세 가지 statement 카테고리를 삼각측량함으로써 도출하였다.

One researcher carried out the primary coding. Three others read the material and sought bias in his coding, omissions, and disconfirmatory statements. The final conclusions were arrived at by triangulation between the 3 categories of statement, representative examples of which are cited."


1. 임상환경에서 학생의 행동유형

1. Behaviour of students in the clinical environment 

1.a. Performing 

1.b. Observing"

1.c. Discussing 

1.d. Being taught 


2. 학습에 대한 일반적 묘사

2. General descriptions of learning 

2.a. Learning autonomously 

2.b. Learning with support 

2.b.1a. Staff behaviours"

2.b.1b. Guidance"

2.b.1c. Facilitation"

2.b.1d. Feedback"

2.c. Being told what to learn"


3. 자기주도학습의 개념

3. Conceptualisations of self-direction 

3.a. Definition of self-direction"

3.b. The value of self-direction"

3.c. Prerequisite conditions for self-direction"

3.c.1a. Objectives"

3.c.1b. Student attributes"

3.c.1c. The transition into self-direction"

3.c.1d. Support by teachers"



자기주도란 '운전석에 앉는 것', 즉 자신의 학습에 책임을 지는 것이었다. 때로는 교육과정의 부족한 부분을 보충하는 것일 수도 있지만, 이는 비효율적인 학습 방식으로 인식되었다. 한 응답자는 direction(명확한 목표를 갖는 것)을 motivation(direction이 선행되어야 가능한 것)과 구분하기도 했다. 자율이란 '무엇을' 학습할 것인가가 아니라 '어떻게' 학습할 것인가를 선택하는 것이었다. 학습의 다양성을 인정하면서도 응답자들은 핵심 목표를 다루고 싶어했다.

Self-direction involved being in the driving seat, or being responsible for one's own learning. At times, it could be a compensation for deficiencies in the course, but that was seen as an inefficient way of learning (3.a.1). One respondent distinguished direction – having clear objectives – from motivation – for which direction was a prerequisite (3.a.2). Autonomy meant being able to choose how, rather than what, to learn (3.a.3). Whilst valuing diversity of learning, respondents were concerned to cover core objectives."



교육과정 평가를 거치며 우리는 학생들의 행동에 대한 실망에서 벗어나 '자기주도' 개념 자체를 의문시하는 태도로 옮겨가게 되었다. 우리 학생들은 학습의 개념화와 행동 차원에서 능동적이었지만, 교사가 지지해줄 때 가장 능동적이고 동기부여되었다. 이때의 지원은 여러 형태를 띠었다.

Curriculum evaluation has moved us from disappointment in students' behaviour to a more questioning attitude towards self-direction. Our students were active in their actions and conceptualisations of learning. However, they were most active and motivated when teachers supported them. Support took several forms:"


    • organisational: opening up learning opportunities, particularly those that involved students in patient care; 
    • pedagogic: suggesting objectives or methods, training skills, giving feedback, explaining con- cepts, and 
    • affective: giving permission, helping students through the transition to a more independent learning style, nurturing, and placing demands."


본 연구는 문제바탕학습 방법이 왜 임상교육으로 자동적으로 전이되지 않는가를 설명해준다. 방향제시와 동기부여의 원천, 그리고 그들의 관계가 PBL과 임상교육에서 차이가 있었다. 튜터가 관리하는 PBL 그룹에서 불확실성은 그들로 하여금 지적인 호기심을 유발하고 학습목표로 이끈다.

This study confirms and helps explain our previous observation that problem-based methods do not transfer automatically to the clinical environment.6 The sources of direction and motivation, and their interrelationship, are different in PBL and place- ment learning. Uncertainty in a tutored PBL group in a seminar room motivates students by generating epistemic curiosity 14 and leads them to learning objectives."



임상환경은 세미나실보다 훨씬 더 위협적이다. 여기서 불확실성은 동기를 부여하기보다 오히려 동기를 깎아먹으며, 지지적인 의사와의 사회적 상호작용이 학생들이 동기를 얻는 데 필요한 방향을 제시해주었다. 불행하게도, PBL과 임상실습 간의 교차(crossover)가 없다는 점은 임상실습에서 의도된 학습성과가 무엇인지를 교사와 학생 모두에게 매우 불명확하게 만든다. iSUS는 동기부여 퍼즐의 잃어버린 한 조각을 채워주는 것으로 보인다.

The clinical environment is much more threatening than the seminar room. Thus, uncer- tainty is more of a demotivator than a motivator. Social interaction with a supportive practitioner gave our students the direction they needed to become motivated. Unfortunately, the lack of crossover between problem-based and placement learning leaves our teachers and learners very unclear about the intended learning outcomes of their clinical placements.6 iSUS, it seems, can provide the missing piece of the motivational jigsaw."



본 연구와 다른 연구들은 자기주도적 학습이 임상환경에서의 기본적 전문성 교육에는 적용하기 어려움을 보여준다. 또한 Knowles 자신조차도 나중에는 정서적 지지가 중요함을 강조한 바 있다. 다른 연구자들은 자기주도적 행동과 주변 환경과의 상호작용을 강조한다. 이는 교육에 대한 인지적 이해(cognitive perception)에서는 잘 드러나지 않았던 것이다. Miflin 등은 교사가 임상교육의 필수적 조건임을 찾아낸 바 있다. PBL에 cognitive foundation을 제시한 Schmidt조차도 나중에는 임상환경에서 진정으로 자기주도적이 되기 얼마나 어려운지, PBL에서 임상환경으로 전이되는 것이 얼마나 어려운지 강조하기도 했다.

Our own and others' research leads us to suggest that self-direction, as literally applied by many of our teachers, is inapplicable to basic professional education in the clinical environment. It is noteworthy that Knowles himself re-emphasised the place of affective support in late publications.15 Other writers have emphasised the interaction between self-directed behaviour and contextual factors,16 which are less emphasised in cognitive conceptions of education17 than new social theories.18 Miflin and colleagues found, like us, that the teacher is a vital condition for clinical learning.5 Their analysis anticipated, and has strong similarities with, our own critique of self-direction. Schmidt, who gave the problem-based method its cognitive foundation, later reaffirmed its benefits but recognised how difficult it is to develop truly self-directed behaviours and transfer them from PBL to the clinical environment.19"



우리는 임상교육을 PBL과 상보적일 수는 있어도 서로 별개의 것으로 바라보아야 한다는 결론에 이르렀다. PBL에서는 동료그룹 내에서라도 자율이 핵심이지만, 임상교육에서는 지지적 참여가 핵심 조건이 된다.

We conclude that placement learning should be seen as separate from, although complementary to, PBL. Whereas autonomy (albeit within a peer group) is key to PBL, supported participation is a core condition for placement learning."


적절한 지원이 있으면 의과대학생들은 매우 능동적인 학습자가 될 수 있으며, Harden 등은 이를 직무기반학습 이라는 새로운 관점에서 보았다.

Harden et al. approached the same problem from a different angle when they devised task-based learning .20"











 2005 Apr;39(4):356-64.

How can medical students learn in a self-directed way in the clinical environment? Design-based research.

Author information

  • 1Hope Hospital, University of Manchester School of Medicine, Stott Lane, Salford, Manchester M6 8HD, UK. Tim.Dornan@Manchester.ac.uk

Abstract

AIM:

This study aimed to establish whether and under what conditions medical students can learn in a self-directed manner in the clinical environment.

METHOD:

A web-based learning management system brought 66 placement students, in a problem-based learning (PBL) medical curriculum, into closer touch with their clinical learning objectives and ways of achieving them. Free response comments from 16 of them during the 7 weeks they used it, transcripts of group discussions before and after the period of use, and responses from all 66 students to a questionnaire were analysed qualitatively.

RESULTS:

Students were rarely fully autonomous or subservient. They valued affective and pedagogic support, and relied on teachers to manage their learning environment. With support, they were motivated and able to choose how and when to meet their learning needs. The new system was a useful adjunct.

CONCLUSIONS:

Self-direction, interpreted literally, was a method of learning that students defaulted to when support and guidance were lacking. They found "supported participation" more valuable. Learning in the clinical environment was a social process with as many differences from, as similarities to, PBL.


학부 교육과정이 학생의 SDL에 미치는 영향 (Acad Med, 2003)

Effect of an Undergraduate Medical Curriculum on Students’ Self-Directed Learning 

Bart J. Harvey MD, PhD, Arthur I. Rothman, EdD, and Richard C. Frecker, MD, PhD





1992년, U of Toronto는 전통적인 강의중심의 교육과정을 소그룹의, 문제중심의, SDL을 장려하는 교육과정으로 개편하였다. 개편한 교육과정은 Barrows가 묘사한 "학생이 경험을 통해 자극받고, 그들이 배우는 내용이 미래에 그들이 감당할 책임과 어떻게 연결되는지 깨닫고, 높은 수준의 학습동기를 유지하고, 전문가적 태도의 중요성을 깨닫는" 것을 목표하였다. 우리는 교육과정 변화가 학생의 SDL에 어떻게 영향을 주는지 보고자 했다.

In 1992, at least in part to address this curricular goal and to enhance students' self-directed learning (SDL), the University of Toronto Faculty of Medicine revised its conventional, lecture-based medical curriculum into a "hybrid," replacing much of the curriculum time devoted to large-group didactic lectures with small-group, problem-based, and SDL opportunities. The resulting curricular changes were designed and implemented to achieve the ideals Barrows describes: "That students would be stimulated by the experience, would see the relevance of what they were learning to their future responsibilities, would maintain a high degree of motivation for learning, and would begin to understand the importance of responsible professional attitudes."4 We undertook this study to begin to learn whether the curricular revision enhanced students' SDL (and, ultimately, their abilities as effective lifelong learners)."



총 280명의 학생이 선정되었으며, 네 개 학년 각각에서 70명씩 무작위로 선택하였다. 이 숫자를 선택한 이유는, 학급당 60명의 응답자(응답률 85%)가 있으면 네 학년 간 5%의 차이를 α = .05 수준에서 탐지할 수 있는 검정력 0.80이 확보되기 때문이다.

Participants. A total of 280 students, 70 from each of the four years of the undergraduate medical curriculum, were randomly selected from the school's population of 700. We chose this number because calculations showed that 60 respondents per class (85% response rate) would provide a study power of .80 to detect (α = .05) a 5% difference among the four years.23"
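
위 표본수·검정력 산정이 대략 어떤 계산인지 감을 잡기 위한 스케치이다. 네 학년을 비교하는 일원분산분석을 가정했으며, "5% 차이"에 대응시킨 효과크기(cohens_f)는 설명을 위해 임의로 가정한 값이므로 Harvey 등이 실제로 사용한 산정 방식과는 다를 수 있다.

```python
from scipy import stats

# Illustrative one-way ANOVA power calculation; the effect size is an assumed
# stand-in for "a 5% difference", not the value used by Harvey et al.
k = 4              # curriculum years being compared
n_per_group = 60   # expected respondents per class
alpha = 0.05
cohens_f = 0.20    # assumed effect size (illustrative only)

n_total = k * n_per_group
df1, df2 = k - 1, n_total - k
noncentrality = cohens_f ** 2 * n_total
f_crit = stats.f.ppf(1 - alpha, df1, df2)                  # critical F under H0
power = 1 - stats.ncf.cdf(f_crit, df1, df2, noncentrality)
print(f"approximate power under these assumptions: {power:.2f}")
```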


네 가지 SDL 요소에 기반하여 Ryan은 간략한 두 파트로 된 설문지를 개발·시행하였으며, 이를 통해 학생들이 인식하는 SDL의 중요도와 자기주도적 학습자로서의 능력을 평가하게 하였다.

Guided by these four SDL components, Ryan6 developed and administered a brief, two-part questionnaire to assess students’ perceptions concerning the importance of SDL and their abilities as self-directed learners."


두 개의 가장 널리 사용되는 SDL 척도는 SDLRS와 OCLI이다. 타당도와 신뢰도에 대한 여러 연구가 되어있다.

The two most widely recognized, extensively used, and validated instruments for measuring SDL capability and readiness13–15 are Guglielmino's Self-Directed Learning Readiness Scale (SDLRS)16 and Oddi's Continuing Learning Inventory (OCLI).17,18 Several assessments of the reliability and validity of the OCLI and SDLRS have been conducted,14 including dissertations reporting positive associations between instrument scores and SDL activity."



Ryan의 설문지는 SDL의 네 가지 요소 각각에 대해 중요도와 자신의 능력을 0(낮음)부터 6(높음)까지 평정하게 되어 있다. SDLRS는 58개의 문항으로 되어 있으며 5점 척도로 응답하게 되어 있고, 총점은 58점에서 290점까지이다. OCLI는 24개의 문항으로 되어 있으며 7점 척도로 응답하게 되어 있고, 총점은 24점에서 168점까지이다.

Ryan’s questionnaire asks respondents to consider the four identified components of SDL and rate each, from low (0) to high (6), on its importance and their ability with the component.6 The SDLRS contains 58 statements (e.g., “I learn several new things on my own each year”) with five-point responses, ranging from “Almost never true of me; I hardly ever feel this way” to “Almost always true of me; there are very few times when I don’t feel this way.” Total scores range from 58 (least ready for SDL) to 290 (most ready).16 The OCLI contains 24 statements (e.g., “I work more effectively if I have freedom to regulate myself”) with seven- point responses, ranging from “Strongly Disagree” to “Strongly Agree.” Total scores range from 24 (least characteris- tic of self-directed learners) to 168 (most characteristic).17,18 Brockett and Hiemstra,14 in their review of the use and validity of the SDLRS and OCLI, concluded that both are well-accepted measures of SDL."
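
SDLRS와 OCLI의 총점 범위가 문항 수와 응답 척도에서 바로 계산된다는 점을 보여주는 간단한 산술 스케치이다(위 발췌에 기술된 수치만 사용하였다).

```python
def score_range(n_items: int, scale_min: int, scale_max: int) -> tuple[int, int]:
    """Minimum and maximum total score for a summed Likert-type instrument."""
    return n_items * scale_min, n_items * scale_max

print("SDLRS:", score_range(58, 1, 5))   # (58, 290) — matches the range quoted above
print("OCLI: ", score_range(24, 1, 7))   # (24, 168)
```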









여학생과 남학생의 SDL 점수는 비슷했다. SDLRS, OCLI, Ryan ability 점수는 나이가 많을수록, 그리고 의과대학 입학 전(premedical) 교육 수준이 높을수록 유의하게 높았다. 그러나 두 요인을 모두 포함한 다변량 선형회귀 분석에서는 premedical education만이 유의한 요인으로 남았다.

Women and men had similar SDL scores. SDLRS, OCLI, and Ryan ability scores increased significantly by age (data not shown) and highest level of premedical education achieved (undergraduate only, masters, or doctoral). However, only premedical education remained significant in a multivariate linear regression including both factors."


SDLRS, OCLI, Ryan importance 점수가 학년간 유의하게 달랐지만, 지속적인 경향은 Ryan importance에서만 확인되었다. 

Although a significant between-year difference was found for the SDLRS (p = .028), OCLI (p = .011), and Ryan importance (p = .021), a significant trend by year was only evident for Ryan importance scores (p = .007). This trend, however, indicated a decrease in perceived SDL importance by curricular year."



SDL이 학년 간 차이가 컸지만, 교육과정을 진행함에 따라서 지속되는 경향은 없었고 1학년이 가장 높고, 2학년에서 가장 낮은 경향이 있었다.

Although significant interyear SDL differences were found, SDL scores did not follow a trend consistent with progression through the curriculum, with the first and second years having consis- tently the highest and lowest scores, respectively."


우리는 네 개의 SDLRS 문항이 학년에 따라 경향성을 갖는 것을 확인하였다. 그러나 이 모든 문항이 학년이 올라가면서 SDL이 감소하는 것으로 나타났다.

we found four, albeit different, SDLRS items suggesting a trend by curricular year. These, however, all indicated decreasing SDL into senior years."


본 연구 결과가 교육과정이 SDL을 촉진하지 않는다는 것을 보여주나 다른 해석도 봐야 한다.

Although the results of our study suggest that the curriculum does not foster students’ SDL, alternative explanations should be considered."


평가도구들은 SDL 변화를 보여주기에 충분히 민감한가? 두 가지를 보면 그러하다. (1) 모든 세 가지 평가도구가 유사한 결과를 보여준다. 즉, SDL이 premedical education 수준이 높을수록 높아지는 것이다. (2) 세 척도의 유사성이 우연히 나타났을 가능성은 낮다. 또한 응답률이 거의 일정했으므로 응답 편향의 가능성도 낮다.

the instruments sufficiently sensitive to detect SDL progress? Two factors suggest that they are: (1) all three instruments provide similar findings, with each able to detect the significant increasing trend in SDL associated with higher levels of premedical education; and (2) the similarity of results for each of the three measures of SDL ability (i.e., SDLRS, OCLI, and Ryan ability) suggests that the significant results observed are unlikely to have occurred by chance (i.e., as a result of the multiple comparisons conducted). Further, response bias is not a likely explanation for the study's inability to detect a year-by-year SDL trend because the response rates across the four years were uniformly high—in excess of 85%."


추가적으로 고려할 점은, 이것이 단면연구라는 점이다. 

An additional consideration, how- ever, is the study’s cross-sectional design. Although this design is more efficient than a longitudinal approach, actual changes in SDL are not measured in the same groups of students over time. Instead, the cross-sectional design assumes the comparability of the four classes. Although the admission proce- dures and curriculum were similar for each of the four years, the failure to detect SDL progress over the curriculum could be the result of unmeasured differences between two or more of the classes. 







 2003 Dec;78(12):1259-65.

Effect of an undergraduate medical curriculum on students' self-directed learning.

Author information

  • 1Department of Public Health Sciences, Institute of Medical Science, Ontario Institute for Studies in Education, Toronto, Canada.

Abstract

PURPOSE:

Lifelong, self-directed learning (SDL) has been identified as an important ability for medical graduates. To evaluate the effect of the University of Toronto Faculty of Medicine's revised undergraduate medical curriculum on students' SDL, a cross-sectional study was conducted.

METHOD:

A questionnaire package was mailed to 280 randomly selected students, 70 from each of the four years of the curriculum. The package contained the two most widely recognized, extensively used, and validated instruments of SDL (Guglielmino's 58-item Self-Directed Learning Readiness Scale and Oddi's 24-item Continuous Learning Inventory) and Ryan's two-part Self-Assessment Questionnaire. An identification number and sociodemographic questions were included with the questionnaires. Data analysis was completed using chi-square for differences of proportions, analysis of variance for differences between means, and linear regression for trends.

RESULTS:

A total of 250 (89.3%) complete questionnaire packages were returned. No significant trend in SDL was evident by curriculum year, and similar SDL levels were observed for women and men. However, a significant positive trend in SDL was found with the highest level of premedical education achieved (undergraduate only, masters, or doctoral). Further, students' perceptions concerning the importance of SDL decreased according to year in the curriculum.

CONCLUSION:

This study found no evidence that students' self-reported SDL is positively influenced by the current undergraduate medical curriculum at the University of Toronto Faculty of Medicine.

PMID: 14660430 [PubMed - indexed for MEDLINE]


보건의료인 교육에서 자기주도학습(SDL)의 효과: systematic review (Med Educ, 2010)

The effectiveness of self-directed learning in health professions education: a systematic review

Mohammad H Murad,1,2 Fernando Coto-Yglesias,3 Prathibha Varkey,1 Larry J Prokop4 &

Angela L Murad2






과연 SDL이 효과가 있는가에 대해 답하고자 할 때 어려운 점은 SDL을 정의하는 것의 어려움과 SDL 기반의 교육과정이 매우 이질적이라는 것이다. 1975년 Malcolm Knowles는 가장 흔하게 인용되면서 가장 포괄적인 SDL의 정의를 다음과 같이 내렸다. 

"SDL은 개개인이 이니셔티브를 쥐고, 다른 사람의 도움이 있거나 없는 환경에서 스스로의 학습 요구를 파악하고 목표를 설정하고 학습에 필요한 인적 자원과 물적 자원을 찾아서 적절한 학습 전략에 도입하고, 그 결과를 평가하는 것을 말한다.'

The main challenges to answering this question involve the difficulty of defining SDL and the heterogeneity of SDL-based curricula. In 1975, Malcolm Knowles provided one of the most commonly cited and comprehensive definitions of SDL: 'SDL is a process in which individuals take the initiative, with or without the help of others, in diagnosing their learning needs, formulating goals, identifying human and material resources for learning, choosing and implementing appropriate learning strategies, and evaluating learning outcomes.'4"


Knowles는 SDL의 여러 주요 요소에 대해서 말했다.

    • the educator should be a facilitator of learning and not a content source; 
    • learners should be involved in identifying their learning needs, objectives and resources, and 
    • learners should be involved in implementing the learning process, should commit to a learning contract and should evaluate the learning process.4


Knowles described several essential components of SDL: the educator should be a facilitator of learning and not a content source; learners should be involved in identifying their learning needs, objectives and resources, and learners should be involved in imple- menting the learning process, should commit to a learning contract and should evaluate the learning process.4"


다양한 교육적 인터벤션은 SDL의 일부 요소들을 공유하며, 아주 제한된 요소만 가지고 그렇게 불리기도 한다.

Numerous educational interventions share some elements of SDL and are often labelled as such,"


그러나 학습자들이 진정으로 자기주도성을 갖추기 위해서는 Knowles가 내린 정의의 다른 요소들도 포함되어야 한다.

Yet, for learners to be truly self-directed, some of the other components contained in Knowles’ definition4 should be incorporated in the learning process"


학습자들은 42%에서는 무작위 배정되었지만, 무작위 배정의 세부 사항이나 질적 평가를 위한 연구의 다른 특징들은 제대로 보고되지 않은 경우가 많다.

Learners were randomly allocated to SDL in 25 studies (42%); however, details about randomisation and other study characteristics needed for quality assessment were poorly reported."


분석한 연구 논문들의 목록

A detailed description of included studies is presented in Table S1."


40개의 연구가 지식 영역, 9개 연구가 술기 영역, 5개 연구가 태도 영역

Forty studies reported outcomes in the knowledge domain, nine studies reported outcomes in the skills domain, and five reported outcomes in the attitudes domain."


상호작용화된 컴퓨터 모듈 사용이 13개 연구, 나머지는 비-상호작용 (책, 시청각자료) 자료 활용

Learning resources included interactive computer- ised modules in 13 studies (22%) and non-interactive (reading materials, audiovisual resources) in the remaining studies."


지식 영역에서는 중등도의 향상이, 태도와 술기 영역에서는 통계적으로 유의미하지 않은 매우 작은 향상이 있었음.

When data were pooled in meta-analysis and outcomes compared with those of traditional teaching methods, SDL was associated with a moderate increase in the knowledge domain (SMD 0.45, 95% CI 0.23–0.67; I2 = 92%) (Fig. 2), a trivial and non-statistically significant increase in the skills domain (SMD 0.05, 95% CI −0.05 to 0.22; I2 = 2%) (Fig. 3), and a non-significant increase in the attitudes domain (SMD 0.39, 95% CI −0.03 to 0.81; I2 = 91%) (Fig. 4)."
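
본문에 나오는 pooled SMD, 95% CI, I² 값이 어떤 계산에서 나오는지 보여주기 위한 랜덤효과 메타분석(DerSimonian–Laird)의 최소 스케치이다. 아래의 연구별 SMD와 표준오차는 설명을 위해 임의로 만든 가상의 값이며, 이 리뷰의 실제 데이터나 분석 코드가 아니다.

```python
import numpy as np

def pool_random_effects(smd, se):
    """DerSimonian-Laird random-effects pooling of standardised mean differences.

    Returns the pooled SMD, its 95% CI, and I^2 (%). Illustrative sketch only."""
    smd = np.asarray(smd, dtype=float)
    var = np.asarray(se, dtype=float) ** 2
    w = 1.0 / var                                   # fixed-effect weights
    fe = np.sum(w * smd) / np.sum(w)                # fixed-effect pooled estimate
    q = np.sum(w * (smd - fe) ** 2)                 # Cochran's Q
    df = len(smd) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)                   # between-study variance
    w_star = 1.0 / (var + tau2)                     # random-effects weights
    pooled = np.sum(w_star * smd) / np.sum(w_star)
    se_pooled = np.sqrt(1.0 / np.sum(w_star))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, ci, i2

# Hypothetical per-study values, not the review's data
print(pool_random_effects(smd=[0.6, 0.3, 0.5, 0.2], se=[0.15, 0.20, 0.10, 0.25]))
```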



학습자들이 learning resource 선택에 참여한 경우, 그렇지 않은 경우에 비해 지식 영역에서 더 큰 향상이 있었다(통계적으로 유의한 상호작용).

There was a statistically significant interaction suggesting that when learners were involved in choosing learning resources, they made larger improvements in the knowledge domain."


SDL 인터벤션의 기간과 effect size 사이에는 상관관계가 없었다. 

In meta-regression, there was no correlation between the observed effect size and the length of SDL intervention (p = 0.64) or the time interval between the completion of the intervention and outcome assessment (p = 0.14)."


SDL은 학습자들이 학습자원 탐색에 참여했을 때 더 효과적이었다.

In addition, SDL seemed to be more effective when learners were involved in identifying their learning resources."


Knowles는 자기주도적 학습자라면 교육자와의 상담을 통해 자신의 학습 스타일과 교육과정 목표에 가장 잘 맞는 학습 방법과 자원을 정해야 한다고 주장하였다. 예컨대 인지적 목표는 written resource나 panel discussion을 통해서, 행동적 목표는 role-play나 case-based learning을 통해서, 정신운동(psychomotor) 목표는 role-play와 simulation을 통해서 가장 잘 달성될 수 있다. 유사하게, 자기주도적 학습자는 자신의 학습 유형에 가장 잘 부합하는 학습 방법(예: 시각적 학습자는 비디오 기반 방법)을 선택할 수 있어야 한다.

Knowles suggested that learners who are self-directed should consult with educators and determine the methods and resources that best fit their learning style and the curriculum objectives.4 For example, cognitive objectives can be achieved using written resources or panel discussions; behavioural objectives can be attained using role-play and case-based learning, and psychomotor objectives are best fulfilled by role-play and simulation. Simi- larly, self-directed learners should have the ability to choose the learning method that suits their individual learning styles (e.g. a visual learner may choose a video-based method, etc.)."



이렇게 이질적인 분석결과를 다음과 같이 설명할 수 있다. (1)SDL의 잇점은 학습자가 학습 방법, 전략, 자원을 선택할 때 가장 잘 나타난다. (2)초보 학습자보다 advanced 학습자가 SDL로부터 더 많이 배운다 (3)전공(학과)에 따라서 SDL 의 기대 효과가 다르다.

We were able to partially explain the heterogeneity by finding that: (i) the benefit of SDL increases when learners are involved in choosing their learning methods, strategies and resources, a key component that defines SDL according to Know- les;4 (ii) advanced learners may benefit more from SDL compared with less advanced learners, and (iii) learner type (discipline) may also affect the anticipated benefits of SDL (nurses had a larger SMD compared with other health professionals)."


이 리뷰에서 추론할 수 있는 것은 SDL이 기존의 교육방법과 거의 비슷한 수준으로 효과적이라는 것이다. SDL은 특정 상황에서 추천되어왔으며(성인학습자, advanced 학습자, 학교 시설과 교사에 대한 접근이 어려운 상황), 학습할 내용이 많을 때 보조적인 방법으로 사용되어왔다. SDL은 비용-효과적이기도 하다.

Inference from this review implies that SDL is likely to be as effective as traditional learning methods. Self-directed learning has been suggested in certain settings (e.g. for adult learners and advanced learners, and in contexts in which access to academic institutions or teachers is limited) and as a supplemental method of learning when learning content is large.6 It is also plausible that SDL is cost-effective."




SDL에 대한 다양한 묘사가 존재하지만 Knowles가 아마 가장 포괄적이고 가장 빈번하게 인용될 것이다. 그 정의는 하지만 타당화되지 않았으며, 교육자들은 교육과정을 개발하면서 Knowles가 말한 요소 일부 중 그들의 환경에 활용가능하고 관련성이 있어 보이는 부분만 포함시킨다.

Although several descriptions of SDL exist, that by Knowles4 is perhaps the most comprehensive and most frequently cited. This definition has not been necessarily validated; thus, educators developing SDL curricula have incorporated some of the elements described by Knowles as they have deemed relevant and feasible in their learning environments"


현재까지는, 우리의 판단은 SDL은 기존의 교육방법에 비해서 세 가지 영역 모두에서 비슷하게 효과적이라는 것이다.

At present and according to our findings, we believe that SDL in health professions education is at least as effective as traditional learning in all three domains."


SDL은 지식 영역에서 더 효과적인 것으로 보인다. 우리는 교육자들이 SDL 교육과정을 도입할 때 다음을 고려해야 한다.

Self-directed learning may preferentially be more effective in the knowledge domain. We recommend that educators embarking on developing SDL curricula for learners in health professions should: 

    • involve learners in choosing learning resources and strategies to enable them to find the most appropriate resources to fit their individual learning styles as well as the overall learning objective; 
    • consider SDL as an effective strategy for more advanced learners (e.g. those in the later years of medical school or residency and doctors in practice), and 
    • consider SDL particularly when the learning outcome falls in the knowledge domain."




 2010 Nov;44(11):1057-68. doi: 10.1111/j.1365-2923.2010.03750.x.

The effectiveness of self-directed learning in health professions education: a systematic review.

Author information

  • 1Division of Preventive, Occupational and Aerospace Medicine, Mayo Clinic, Rochester, Minnesota 55905, USA. Murad.mohammad@mayo.edu

Abstract

OBJECTIVES:

Given the continuous advances in the biomedical sciences, health care professionals need to develop the skills necessary for life-long learning. Self-directed learning (SDL) is suggested as the methodology of choice in this context. The purpose of this systematic review is to determine the effectiveness of SDL in improving learning outcomes in health professionals.

METHODS:

We searched MEDLINE, EMBASE, ERIC and PsycINFO through to August 2009. Eligible studies were comparative and evaluated the effect of SDL interventions on learning outcomes in the domains of knowledge, skills and attitudes. Two reviewers working independently selected studies and extracted data. Standardised mean difference (SMD) and 95% confidence intervals (95% CIs) were estimated from each study and pooled using random-effects meta-analysis.

RESULTS:

The final analysis included 59 studies that enrolled 8011 learners. Twenty-five studies (42%) were randomised. The overall methodological quality of the studies was moderate. Compared with traditional teaching methods, SDL was associated with a moderate increase in the knowledge domain (SMD 0.45, 95% CI 0.23–0.67), a trivial and non-statistically significant increase in the skills domain (SMD 0.05, 95% CI −0.05 to 0.22), and a non-significant increase in the attitudes domain (SMD 0.39, 95% CI −0.03 to 0.81). Heterogeneity was significant in all analyses. When learners were involved in choosing learning resources, SDL was more effective. Advanced learners seemed to benefit more from SDL.

CONCLUSIONS:

Moderate quality evidence suggests that SDL in health professions education is associated with moderate improvement in the knowledge domain compared with traditional teaching methods and may be as effective in the skills and attitudes domains.

© Blackwell Publishing Ltd 2010.










"나는 절대 프로 축구선수는 못 될거야" - 자기평가의 오류들 (J Contin Educ Health Prof. 2008)

“I’ll Never Play Professional Football” and Other Fallacies of Self-Assessment

KEVIN W. EVA, PHD; GLENN REGEHR, PHD





자기평가라는 용어가 사용되는 다양한 맥락 각각은 '자기평가'라는 용어의 사용을 정당화해줄지도 모르지만, 종합적으로 보면 그렇게 다양한 개념들을 하나의 이름 아래 두는 것은 교육자와 이론가 모두에게 혼란과 갈등의 원인이 될 뿐이다.

While each of these contexts may, individually, be a justifiable use of the term self-assessment, collectively, couching such very different concepts under a single label can be a significant source of confusion and conflict for educators and theoreticians alike."


Self-assessment라는 용어가 여러 커뮤니티에 수출되어 사용됨에 따라서 모든 것을 포괄하는 용어가 되었고, 결국 "어떻게도 정의되지 않는" 것이 되어버렸다.

It is important, however, to recognize that when such terms are exported to the larger community, they run the risk of becoming so all-encompassing as to include everything and, therefore, ultimately define nothing."


Lingard와 Haber는 "우리가 쓰는 용어가 우리가 가질 수 있는 사고의 폭을 열어주기도 하고 제한하기도 한다"

Lingard and Haber have stated that “the language we use both makes possible and constrains the thoughts we can have.”4"




Self-Assessment Versus Self-Directed Assessment Seeking"


보건의료인에게 있어 자기평가가 CPD cycle에서 매우 중요하다는 것은 일반적으로 잘 알려진 사실이다. 이는, '자기조절에 능한 전문직'의 원형이 계속적 교육 활동에 대한 길잡이로서 정기적으로 자신의 약점을 찾아내는 것이기 때문이며, 이를 통해서 현실에서의 격차를 좁혀나가는 것이다. 이러한 점에서 '자기평가'란 종종 은연중에 개인적이고, 누군가 지도해주지 않는 와중에 이뤄지는 성찰과정으로 여겨진다. 예컨대 이러한 개념은 Colliver가 말한 "니 점수를 맞춰봐" 형식의 자기평가 연구 모델에 부합하는 것이다. 이러한 연구 결과의 결과는 '자기평가 점수는 대체로 정확하지 못하다'라는 결과를 반복해서 생산해냈다.

It seems generally well accepted in the health professions that self-assessment is a key step in the continuing professional development cycle. That is, the archetype of the self-regulating professional is seen as one who regularly self-identifies areas of professional weakness for the purposes of guiding continuing education activities that will overcome these gaps in practice.6 In this construction, self-assessment is often (implicitly or otherwise) conceptualized as a personal, unguided reflection on performance for the purposes of generating an individually derived summary of one's own level of knowledge, skill, and understanding in a particular area. For example, this conceptualization would appear to be the only reasonable basis for studies that fit into what Colliver has described as the "guess your grade" model of self-assessment research,7 the results of which form the core foundation for the recurring conclusion that self-assessment is generally poor.8"


이러한 "지도받지 않는, 내적으로 생성되는" 자기평가에 대한 구조는 Boud가 말한 "자기평가란 고립되고 개인적인 활동이 아니라 동료와, 교사와 다른 정보원을 동반하게 된다"와 대비된다.

This "unguided, internally generated" construction of self-assessment stands in stark contrast to the model put forward by Boud, who argued that "the phrase self-assessment should not imply an isolated or individualistic activity; it should commonly involve peers, teachers, and other sources of information."9"



Boud가 묘사한 자기평가는 '바깥을 바라보고' '외부적으로 피드백을 찾으며' '외부로부터 정보를 찾고' '이러한 외부 정보원 평가에 활용하여' '수행능력의 향상을 이뤄내는 것'이다. 이러한 측면에서 자기평가는 자신을 평가하는 교육학적 전략 그 이상이며, 한 사람이 마스터해야하는 능력이 아니라, 길러야하는 습관에 가깝다.

The conceptualization of self-assessment as enunciated in Boud’s description would appear to involve a process by which one takes personal responsibility for look- ing outward, explicitly seeking feedback and information from external sources, then using these externally generated sources of assessment data to direct performance improvements. In this construction, self-assessment is more of a pedagogical strategy than an ability to judge for oneself; it is a habit that one needs to acquire and enact rather than an ability that one needs to master."


자기평가의 정확성에 대한 근거에서 나타나는 것은 분명하다. '우리는 그것을 잘 못한다' 그러나 '자기주도적 평가 탐색'의 습관이 가르쳐질 수 있는 것인지, 그리고 그것이 다양한 맥락에 걸쳐서 적용가능한지, 아니면 이러한 활동을 의도적으로 교육학적으로 포함시키는 것이 과연 바람직한지는 알지 못한다.

While the evidence pertaining to the accuracy of self-assessment as an ability is robust and clear—we do not do it well—there appears to be little research that directly tests whether or not the habit of self-directed assessment seeking can be taught in a manner that leads the learner to apply the habit cross-contextually, or whether intentionally engaging in this sort of activity is pedagogically advantageous."


재미있는 사실은 자기평가라는 단어가 원래는 자기 기입형 다지선다형 시험과 같은 자기주도적 평가를 촉진하기 위한 문헌들로부터 의학교육계에 들어오게 되었다는 사실이다.

It is interesting to note that the phrase self-assessment originally made its way into the medical education lexicon by virtue of papers that were promoting self-directed assessments such as self-administered multiple-choice question exams;10"



Self-Assessment Versus Reflection"


인간은 스스로에 대한 총괄평가 결과를 내리는 것을 잘 못한다는 것은 여러 근거로부터 드러난 것 뿐만 아니라, 사람은 원래 이러한 형태의 자기평가를 잘 못하도록 태어난 것이기도 하다. 그 이유로는 여러 인지적 이유(정보 무시, 기억 편향), 사회생물학적 이유(긍정적 전망을 하도록 적응됨), 사회적 이유(동료와 상관으로부터 언제나 적절한 피드백을 받는 것은 아님) 등이 있다.

We, along with many others, have argued (and continue to believe) that the evidence reveals not merely that humans are poor at producing self-generated summative assessments of their own performance or ability, but that humans are actually predisposed to being poor at this form of self-assessment. There are cognitive reasons (eg, information neglect and memory biases),11 sociobiological reasons (it being adaptive to maintain an optimistic outlook),12 and social reasons (eg, not always receiving adequate feedback from peers and supervisors)13"



그러나 사람이 자기평가를 잘 못한다는 결과가 수행능력에 대한 성찰이 무의미한 활동이라는 것을 의미하는 것은 아니다.

The conclusion that humans do not self-assess well, however, should in no way imply that reflection on performance is a useless activity."



즉, 자기성찰은 '왜 환자의 건강상태가 이러한 방식으로 악화되는가'를 이해하는 것, 혹은 '왜 어떤 사회적 관계가 특별히 성공적이었는가'를 이해하는 것 등의 활동이다. 이러한 "왜" 질문에 대한 답을 찾는 것은 교육학적 전략으로 매우 효과적임이 밝혀졌고, 세상에 대한 이해와 스스로 그것을 구성하는 방식에 도움이 된다. 이러한 방식으로 정신적 에너지를 재투자하는 경향의 차이가 진정한 전문가와 경험만 많은 비전문가를 결정짓는 요인이 된다. 그러나 다시 한번 강조하는데, 이러한 형태의 자기성찰적 행동 - 세상을 더 잘이해하기 위한 - 은 자기평가에 대한 능력과 동일하게 평가될 수 없다. '왜' 질문을 효과적인 방식으로 하는 것이 반드시 스스로의 지식과 능력 수준을 아는 것을 필요로 하지는 않기 때문이다.

Thus, reflection involves activities such as trying to understand why a patient's health state deteriorated in the way it did or why a social interaction went particularly well. Exploring these sorts of "why" questions may very well prove to be an effective pedagogical strategy that can lead to better understanding of both the world and the adequacy of one's own personal constructions of it. Certainly the expertise literature would suggest that the tendency to reinvest mental energy in this way is a defining determinant of who achieves true expert status in any given field and who evolves into an "experienced non-expert."15 Again, however, promoting those sorts of reflective behaviors—aimed at understanding the world better—should not be considered the same as promoting self-assessment as a mechanism for judging personal competence. Asking "why" in an effective manner does not require insight into one's own level of knowledge or abilities,"




Self-Assessment Versus Self-Monitoring"



누군가는 자신의 약점을 인지할 수도 있다. 그러나 여전히 그러한 인식이 더 포괄적인 자기개념에 영향을 주지 않을 수도 있다. 부족한 수행능력에 대한 원인을 "~만 아니었으면 되었을텐데"와 같은 방식으로 깎아내리는 것은 매우 자연스러운 현상이다. 그러나 그러한 식으로 우리의 즉각적 수행능력의 효과성에 대해 인식하는 것으로는 그것들이 모여서 정확한 자기평가를 이루는 것을 보장해주지 않는다.

One may recognize weaknesses in performance, in the moment, but still not have those observations impact upon one's broader self-concept. It is easy (and typically automatic) to find reasons to discount negative performances by saying things like "If only [insert favourite excuse here] hadn't occurred."18 The end result is that a series of moments in which we are aware of the effectiveness of our immediate performance does not guarantee that those moments will be aggregated to generate an accurate self-assessment overall."


한 의사가 자기의 역량을 넘어서는 특정한 상황을 인지하는 능력은 환자 안전의 중요한 결정요인이 된다. 그리고 이것은 그 의사가 지식과 술기의 격차를 효과적으로 극복해나가기 위해서 더 포괄적인 차원에서의 지속적 교육을 받을 수 있느냐보다 중요하다.

A physician's ability to recognize when a particular situation is beyond his boundaries of competence is likely to be a greater determinant of patient safety than whether or not that physician is able to determine which broad-based continuing education activities would most effectively fill his gaps in knowledge/skill."



자기평가와 자기주도적 평가 탐색을 비교하였다. 또한 자기평가와 자기성찰을 비교하였고, 마지막으로 자기평가와 자기모니터링을 비교하였다.

In this section, we have contrasted self-assessment (an ability) with self-directed assessment seeking (a pedagogical strategy) as potential mechanisms for determining one's areas of strength and weakness, we have contrasted self-assessment (an ability) with reflection (a pedagogical strategy) as potential mechanisms for improving one's understanding of the world, and we have contrasted self-assessment (an ability) with self-monitoring (an immediate contextually relevant response to environmental stimuli) as potential mechanisms for determining the need to recruit additional resources to facilitate performance in particular situations."


The Illusion of Personal Accuracy in Self-Assessment


"자기평가를 잘 하지 못한다"라는 말을 들으면 가장 먼저 보이는 반응은 "'걔네'는 왜 그렇데?"이다. 한발 더 나아가서 만약에 "우리"가 "그들"로 하여금 자기평가를 잘 하게 만들 수 있을 것이다라는 믿음을 갖기에 이른다. 이러한 반응은 무척 팽배해서 역설적으로 대부분의 사람들이 자신의 자기평가능력이 평균 이상이라고 생각하고 있다.

The most common response to the findings that self-assessment is poor appears to be bewilderment at how "they" can be so bad, with a concomitant belief that if "we" can just get "them" to self-assess as well as we do, then everything will be okay. This reaction is sufficiently pervasive that, ironically, the majority of people think they are above average in self-assessment ability.19"


최근 자기평가에 대한 강연 뒤의 리셉션에서, 한 동료가 자신은 자기평가를 잘 한다면서, 자기가 절대 프로 축구선수는 되지 못하리라는 것을 아는 것이 바로 자신의 한계를 잘 안다는 증거라고 말했다. 이러한 주장은 자기평가에 대해 사람들이 흔히 빠지는 세 가지 함정을 보여준다.

At a recent reception following a talk on self-assessment, a colleague suggested that his self-assessment was fine—after all, he knew he was never going to be a professional football player, so he clearly knew his limitations. This sort of claim raises three issues that highlight some of the pitfalls regarding thinking about self-assessment."


첫 번째로, 객관적으로 관찰 가능한 결과가 있는 신체적 기술의 질을 판단하는 것은 인지적 능력이나 덜 객관적인 신체 기술을 판단하는 것과는 중요한 차이가 있다. 신체적 수행의 질을 판단하는 데 필요한 정신 과정은(흔히 감각을 통해 전달되는 외부 정보에서 비롯되므로) 그 수행을 실제로 해내는 데 필요한 내적 정신 과정과 별개이기 때문이다.

First, judging the quality of physical skills for which there is an objectively observable outcome is probably importantly different from judging cognitive aptitudes or less objective physical skills in that, unlike cognitive aptitudes, the mental processes required to judge the quality of physical performances are different (often derived from external information conveyed via the senses) from the internal mental processes required to enact the performance.20"


두 번째로, 여기에는 논리적 허점이 있다. 자세히 말하자면, 세상을 2×2 표로 본다고 하면, 실제로 잘 해내는 활동과 실제로는 잘 못하는 활동이 있고, 여기에 스스로 잘 한다고 생각하는 활동과 못 한다고 생각하는 활동이 교차한다.

To elaborate, we might think of the world as a 2 × 2 table in which there are some activities at which we thrive and others at which we perform poorly, crossed with some activities we think we do well and others we think we do poorly."



마지막으로, 이 특별한 상황은 극단적 상황에서의 추론이며, 흔히 오류를 범하기 쉬운 전략이다.

Finally, this particular example involves a process of reasoning from extreme examples, another erroneous rhetorical strategy."



안타깝게도, 이러한 개인 수준의, 오류 투성이의 자기 과신은 교육자로서의 우리가 스스로 하여금 자신의 강점과 약점을 효과적으로 찾아내는 것이 가능하다고 믿게 만든다. 그 결과 우리 교육자들은 해답의 일부가 되기는 커녕 문제의 일부가 되고 있다.

Un- fortunately, it is this personal, flawed self-confidence in our own self-assessment ability that has led us as educators to perpetuate the myth that the effective self-identification of strengths and weaknesses is even possible. As a result, we educators have not only failed to be part of the solution, we have actually been part of the problem."


Truths About Self-Assessment


우리의 뇌는 근본적으로 자신의 능력에 대해서 과도하게 긍정적인 태도를 갖도록 설계되어 있다.

Rather, the tendency to be overly opti- mistic about one’s abilities is a fundamental property of the way our brains are wired.25"


우리의 수행능력이 나쁠 때는 외부 환경을 비난하기가 쉽다.

When we have a poor performance, usually it is easy to find a way to blame external circumstances.18"


그러나 '과도하게 긍정적인 편이 우리에게 유리하다'는 일반 원칙은 확고하다. Gilbert가 다른 맥락에서 쓴 것처럼, "진화는 이러한 정신 과정을 우리 모두의 뇌에 허락도 구하지 않고 설치했다는 점에서 Microsoft Windows Award를 받을 만하다."

we are better off being overly optimistic is a robust one. As Gilbert has written in another context, "Evolution deserves the Microsoft Windows Award for installing these mental processes in every one of us without asking permission."28"

문제는 단순이 이러한 정신 차원의 문제가 존재한다는 것 뿐만 아니라, 사람들은 그것이 작동한다는 것조차 모른다는 점이다.

The problem is not that these mental phenomena exist. It is that people do not appreciate that they are active."


온타리오의 Physician Review and Enhancement Program은 주 내 가정의학과 의사들의 지속적인 역량을 평가해왔는데, '불충분한 역량'의 두 번째로 강력한 예측인자가 '고립되어 일하는 것'임을 보고하였다.

The Physician Review and Enhancement Program in Ontario, charged with evaluating and assessing family physicians' ongoing competence within the province, have reported that the second best predictor of incompetence is working in isolation.29"


답은 분명하다. 자기평가라는 것은 절대로 스스로 개발할 수 있는 일반적 기술이 아니다.

The evidence is clear and overwhelming: self-assessment is not and will never be a generic skill that one can develop."


우리는 이러한 부정확한 자기평가가 '우리'의 문제이며 '그들'의 문제가 아님을 다시 강조하고 싶다.

We wish to reemphasize that the inadequacy of self-assessment must be viewed as a “we” problem rather than a “they” problem."


자기성찰 연습은 자신이 충분히 잘한다는 것을 확인하는 데 목적이 있는 것이 아니라, 자신이 세상을 어떻게 이해하고 있으며 그 이해를 어떻게 넓혀 미래의 수행에 도움이 되게 할 것인가를 파악하는 데 목적이 있다.

The focus of these exercises is not to determine that one is great or at least good enough, but rather to determine how one understands the world and how one might increase this understanding to the benefit of future performance."


"의사들은 자기평가를 얼마나 잘 하는가?" "어떻게 자기평가능력을 향상시킬 수 있는가?" "어떻게 자기평가 능력을 측정할 수 있는가?"와 같은 연구질문은 폐기되어야 한다. 대체로 연구자들이 자기평가 능력을 향상시켰다고 할 때를 보면 Ward가 묘사한 오류에 흔히 빠져있다. 집단 수준의 상관관계에 지나치게 빠져있거나, 실제 점수와 한 사람이 평가한 자기평가 점수 사이의 차이를 정확도의 척도로 보는 것이다. 대신 우리는 세 가지 분야의 연구를 해야 한다.

We believe that research questions that take the form of "How well do various practitioners self-assess?" "How can we improve self-assessment?" or "How can we measure self-assessment skill?" should be considered defunct and removed from the research agenda. Usually when researchers claim to have improved self-assessment they have fallen prey to one of the fallacies described by Ward et al. (eg, placing undue faith in group-based correlations or mistaking distance of individual guesses from true scores as a sensible measure of self-assessment accuracy).30 Instead, we see three potential programs of research that parallel the three concepts we have distinguished from self-assessment in the first section of this article."


이런 것을 염두에 둔다면, 주요 주제는 역량을 어떻게 유지할 것인지, 그리고 어떻게 CPD를 유지할 것인지이다. Schon의 용어를 따르자면 자신의 행동에 대한 성찰은 '자기주도적 평가 탐구'의 습관을 기르고 외부의 피드백 소스를 흡수하여 자신의 장점과 단점을 인지하는 능력을 개발하는 것이다.

With this in mind, we would argue that the predominant concern regarding the professional issues of maintenance of competence and continuing professional development (in Schön's terms, issues of reflection on practice)31 should focus upon developing habits of self-directed assessment seeking and upon understanding factors that influence our ability to absorb these external sources of feedback in developing a coherent self-awareness of our strengths and weaknesses."


중요한 것은 한 사람이 그 자신의 점수를 정확히 예상할 수 있느냐가 아니라, 자신의 점수를 알았을 때 무슨 행동을 하느냐이다.

The relevant issue is not whether a person can predict his score on a test, but what he does with the information when he finds out his score."



Self-directed assessment seeking에 대해서 우리가 해야 할 질문은 이런 것이다. 

Thus, we should be asking questions like "What forms of external data would help individuals recognize areas that require updating?" "How can we collect and deliver these data in a meaningful form?" "How can we convince people to believe this feedback and incorporate it into their self-concept?" and, more generally, "How can we get people to act on externally derived information?""


    • “What forms of external data would help individuals recognize areas that require updating?” 
    • “How can we collect and deliver these data in a meaningful form?” 
    • “How can we convince people to believe this feedback and incorporate it into their self-concept?” and, more generally, 
    • How can we get people to act on externally derived information?”


Self-reflective exercises에 관해서 우리가 물어야 하는 질문은 이런 것이다.


    • “Does engaging in self-reflection result in improved performance” could parallel the emerging literature that reveals the pedagogical benefits of externally derived assessment strategies (eg, multiple-choice tests of knowledge).32,33 
    • More sophisticated questions could address 
      • (a) whether or not sharing one’s self-reflections with peers, a tutor, or a mentor is necessary to elicit full advantage of the activity and 
      • (b) whether or not developing the habit of self-reflection in one context tends to transfer readily to maintaining that habit in novel contexts or at variable stages of one’s career


Self-Monitoring에 관해서 우리가 물어야 하는 질문은 이런 것이다. 

Thus, we should be asking questions like “Do individuals show behavioral indications of slowing down/help seeking when they reach the boundaries of their knowledge/abilities in their moment-to-moment interactions with patients?” “What cues (external or internal) initiate such slowing down processes?” “Does the initiation of these processes impact upon the appropriateness of the care provided?” and “How best can the skills associated with slowing down and help seeking be taught?”"


    • “Do individuals show behavioral indications of slowing down/help seeking when they reach the boundaries of their knowledge/abilities in their moment-to-moment interactions with patients?” 
    • “What cues (external or internal) initiate such slowing down processes?” 
    • “Does the initiation of these processes impact upon the appropriateness of the care provided?” and 
    • “How best can the skills associated with slowing down and help seeking be taught?”






 2008 Winter;28(1):14-9. doi: 10.1002/chp.150.

"I'll never play professional football" and other fallacies of self-assessment.

Author information

  • 1Department of Clinical Epidemiology and Biostatistics, Program for Educational Research and Development, McMaster University, Hamilton, Ontario, Canada. evakw@mcmaster.ca

Abstract

It is generally well accepted in health professional education that self-assessment is a key step in the continuing professional development cycle. While there has been increasing discussion in the community pertaining to whether or not professionals can indeed self-assess accurately, much of this discussion has been clouded by the fact that the term self-assessment has been used in an unfortunate and confusing variety of ways. In this article we will draw distinctions between self-assessment (an ability), self-directed assessment seeking and reflection (pedagogical strategies), and self-monitoring (immediate contextually relevant responses to environmental stimuli) in an attempt to clarify the rhetoric pertaining to each activity and provide some guidance regarding the implications that can be drawn from making these distinctions. We will further explore a source of persistence in the community's efforts to improve self-assessment despite clear findings from a large body of research that we as humans do not (and, in fact, perhaps cannot) self-assess well by describing what we call a "they not we" phenomenon. Finally, we will use this phenomenon and the distinctions previously described to advocate for a variety of research projects aimed at shedding further light on the complicated relationship between self-assessment and other forms of self-regulating professional development activities.

PMID:
 
18366120
 
[PubMed - indexed for MEDLINE]


OSCE 스타일의 시험에서 평가자의 판단이 대조효과에 영향을 받을까? (Acad Med, 2015)

Are Examiners’ Judgments in OSCE-Style Assessments Influenced by Contrast Effects?

Peter Yeates, MClinEd, PhD, Marc Moreau, MD, and Kevin Eva, PhD





불행하게도, 판단-기반 평가는 psychometric weakness를 안고 있다. 그리고 이러한 문제는 reformulation이나 training으로도 충분히 극복하지 못해온 것이 사실이다.

judgment-based assessments are susceptible to a raft of psychometric weaknesses1–3 that have not been satisfactorily resolved through either reformulation4,5 or training.6–8"


평가자의 오류들: 평가자 인지('assessor cognition')에 대한 연구에서, 평가자들은 상대적으로 독특하고 개인별로 특유한 수행능력 이론('우수한 수행능력을 구성하는 것이 무엇인가에 대한 개인적 믿음')을 가지고 있으며, (공인된 평가기준이 아닌) 자기 자신의 임상능력을 평가틀(frame of reference)로 활용한다. 평가자들은 자신의 평가가 자신의 감정(예: 인색하게 보이기 싫음), 특정 피교육자와의 과거 경험, 그리고 조직의 문화에 의해 영향을 받는다는 것을 인식하고 있다. 더 나아가서 평가자의 판단은 관찰한 것을 넘어선 추론에 의해 영향을 받기도 하는데, 여기에는 피교육자의 문화, 교육, 동기에 대한 지레짐작까지 포함된다. 이런 연구 결과 외에도, 평가자들은 행동 준거에 따라서 평가하라는 지침에도 불구하고 피교육자의 수행수준을 다른 피교육자와 비교하는 경향을 보인다.

“assessor cognition” assessors appear to possess relatively unique, idiosyncratic performance theories (personal beliefs about what constitutes good performance)9,10 and may use their own clinical abilities as their frame of reference (rather than recognized assessment standards) when judging the performance of trainees.11,12 Assessors perceive that their judgments are influenced by their own emotions (e.g., not wanting to feel mean), their prior experiences of particular trainees’ performance, and their institutional culture.12 Further, assessors’ judgments appear to be frequently guided by inferences that go beyond their observations, including presumptions about a trainee’s culture, education, or motivation.12,13 Additional to these findings, an exploratory investigation10 suggested that assessors, despite instructions to judge against a behavioral standard, showed a tendency to make judgments by comparing trainees’ performance against other trainees’.


요약하면, 역량에 대한 이해라는 것은 본질적으로 상대적인 것이라 할 수 있다.

in essence, it indicates that their understanding of competence may be inherently comparative (i.e., norm referenced).17


학생들이 특정한 순서 없이 OSCE 서킷에 들어간다면, 한 수험생의 수행과 그 직전 수험생의 수행 사이의 관계는 무작위여야 한다. 그 결과 아주 큰 데이터셋을 분석하면 특정 수험생의 점수와 그 전 수험생들의 점수 간에는 아무런 상관관계도 없어야 한다. 정적 상관은 assimilation effect를 의미하며, 부적 상관은 contrast effect를 의미한다. 

When students are not entered into an OSCE circuit in any particular order, the relationship between a performance and its predecessor should be random. Consequently, when a large dataset is examined, no relationships should exist between the scores of performances and their predecessors. A positive relationship between successive performances would indicate an assimilation effect, whereas a negative relationship would indicate a contrast effect. The first dataset was drawn from the 2011 United Kingdom Foundation Programme Office (UKFPO) Clinical Assessment. The second dataset was drawn from the 2008 Multiple Mini Interview (MMI) that was used for selection into the University of Alberta Medical School.


'스테이션-서킷'이란 특정 서킷의 특정 스테이션에서 수험생들이 받은 점수들로 구성된다. 스테이션의 난이도나 평가자의 엄격한 정도가 분석에 영향을 주는 것을 방지하기 위해서 모든 점수를 해당 스테이션-서킷의 평균을 중심으로 한 z score로 변환하였다. 수행능력 기반 평가 점수는 총 5,288건의 수험생 관찰에 대해 수집되었다. 모든 점수의 평균과 중간값은 비슷했고, skewness와 kurtosis는 -1과 1 사이로, 모수적 분석에 충분할 만큼 정규분포에 가까웠다.

A “station-circuit,” therefore, comprised the scores for the candidates at an individual station for an individual circuit of the exam. To prevent station difficulty and/or rater stringency from influencing the analyses, we transformed every candidate score into a z score centered around the mean of its station-circuit. Performance-based assessment data were available for 5,288 candidate observations (see Table 1). All scores’ mean and median values were similar, with skewness and kurtosis values between −1 and 1, indicating that data were adequately normal for parametric analysis.
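아래는 위에 기술된 분석 절차(스테이션-서킷별 z-점수 변환 후, 현재 점수를 직전 수험생들의 점수 및 직전 세 명의 평균과 선형회귀로 연관짓는 것)가 어떤 식으로 구현될 수 있는지를 보여주는 최소한의 파이썬 스케치이다. 데이터 구조와 열 이름(circuit, station, order, score)은 설명을 위해 임의로 가정한 것이며, 논문의 실제 코드나 데이터 형식이 아니다.

```python
import pandas as pd
from scipy import stats

# 설명용 가상 데이터: 한 스테이션을 두 서킷이 돌고, 각 서킷에 6명의 수험생이 있다고 가정
df = pd.DataFrame({
    "circuit": [1]*6 + [2]*6,
    "station": [1]*12,
    "order":   list(range(1, 7)) * 2,                 # 수험 순서
    "score":   [7.0, 6.5, 8.0, 5.5, 6.0, 7.5,
                6.0, 7.0, 5.0, 8.0, 6.5, 7.0],
})

# 1) 스테이션-서킷별 z-점수 변환: 스테이션 난이도/평가자 엄격성의 차이를 제거
df["z"] = df.groupby(["station", "circuit"])["score"].transform(
    lambda s: (s - s.mean()) / s.std(ddof=1))

# 2) 직전 수험생 점수(n-1, n-2, n-3)와 직전 세 명의 평균(AvN1-3) 계산
df = df.sort_values(["station", "circuit", "order"])
g = df.groupby(["station", "circuit"])["z"]
for k in (1, 2, 3):
    df[f"prev{k}"] = g.shift(k)
df["avg_n1_3"] = (df["prev1"] + df["prev2"] + df["prev3"]) / 3

# 3) 현재 z-점수를 직전 세 명의 평균과 선형회귀:
#    기울기가 음수이면 contrast effect, 양수이면 assimilation effect를 시사
sub = df.dropna(subset=["avg_n1_3"])
result = stats.linregress(sub["avg_n1_3"], sub["z"])
print(f"slope = {result.slope:.3f}, r = {result.rvalue:.3f}")
```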





어떤 관계가 나타나더라도 분석 중 생긴 artifact가 아니라는 것을 확실히 하기 위해서 Excel을 활용하여 Monte Carlo Simulation을 수행하였다. 20개의 무작위 숫자를 생성한 뒤 동일한 분석을 하였다.

To ensure that any relationship observed was not an analytic artifact generated from the way in which scores were compared with preceding scores, we used Microsoft Excel (Microsoft Corporation, Redmond, Washington) to run a Monte Carlo simulation. Twenty random numbers were produced and then used to calculate the same “preceding candidate” metrics as described above (n-1, n-2, etc.).
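아래는 이 검증의 논리를 보여주기 위한 최소한의 몬테카를로 스케치이다(원 연구는 Excel을 사용했지만, 여기서는 설명을 위해 파이썬을 가정한다). 무작위로 생성한 점수열에 대해 동일한 '직전 수험생' 지표를 계산했을 때 현재 점수와의 관계가 0 근처에 분포해야 하며, 그렇지 않다면 비교 방식 자체가 인위적 관계를 만들어내고 있다는 의미가 된다.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, n_candidates = 1000, 20        # 논문과 같이 20개의 무작위 숫자를 반복 생성
slopes = []
for _ in range(n_sims):
    scores = rng.standard_normal(n_candidates)
    current, preceding = scores[1:], scores[:-1]   # n 대 n-1 쌍
    slopes.append(stats.linregress(preceding, current).slope)

# 점수가 순수하게 무작위라면 기울기의 평균은 0 근처여야 한다
print("mean slope:", round(float(np.mean(slopes)), 4))
print("95% interval:", np.percentile(slopes, [2.5, 97.5]).round(3))
```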


첫 번째 질문은 (실험실 연구에서 보고된) contrast effect가 실제 고부담 시험에서도 나타나는가였다. 현재 점수는 직전 수험생들의 점수와도, 직전 세 명의 평균 점수와도 부적 관계를 보였으며, 특히 직전 세 명의 평균과의 관계가 어느 한 명의 이전 수험생 점수와의 관계보다 더 강하게 나타났다.

We initially queried whether the previously demonstrated contrast effects15,16 would also occur outside the laboratory. The findings suggest that examiners may be susceptible to contrast effects in real-world situations despite the formality of high-stakes exams and despite explicit behavioral guidance. Notably, the observed relationships were stronger when current scores were related to the average of three preceding scores relative to when they were related to any individual preceding performance.





이론적으로, 이는 우리가 기존에 주장한 바를 지지하는데, 보건의료인력 교육에서 준거지향평가에 의지한다고 표방함에도 불구하고, 고도로 훈련받고 충분한 자원이 제공되는 평가에 참여하는 평가자조차 판단의 기준이 될 역량에 대한 진정으로 '고정된(fixed)' 감각이 없을 수 있다는 것이다.

Theoretically, this sustains our prior assertions15,16 that, despite our espoused reliance on criterion-based assessments within health professional education,19 highly trained and well-resourced examiners may still lack any truly fixed sense of competence against which to judge when making assessment decisions.


평가자들은 특정 수험생을 평가할 때 단순히 가장 최근 수험생과 대조하는 것이 아니라 그 전에 접한 수험생들의 수행능력을 통합하여 기준을 정하는 것으로 보인다. 이는 행동을 설명한 문구나 훈련에도 불구하고 평가자들은 과거의 예제들을 바탕으로 비교 판단을 하기 위한 mental database를 축적한다는 생각과 일치한다. 이런 효과가 시험의 맨 끝까지 나타난다는 점에서 이것은 워밍업 단계의 현상이 아니며, 따라서 한두 개의 초기 스테이션을 제외한다고 사라질 수 있는 것이 아니다. 가장 강한 부적 상관은 특정 수험생과 그보다 4, 5, 6번째 앞서 평가된 수험생들의 평균 사이에서 나타났는데, 이는 평가자가 초반부에 접한 케이스가 기대치를 설정하는 데 특히 의미를 갖는다는 초기효과(primacy effect)를 시사한다.

This suggests that assessors mentally amalgamate previous performances to produce a performance standard to judge against rather than simply contrasting with the most recent performance. This is consistent with the idea that, despite the availability of behavioral descriptors and training, examiners accumulate a mental database of past exemplars against which they make comparative judgments. The persistence of the effect near the end of the exam indicates that it is not a “warm-up” phenomenon, and so cannot be counteracted by discarding one or two initial stations. That the strongest negative relationships were observed between later candidates and the average of students that preceded them by four, five, and six places suggests a primacy effect in these exemplar comparisons, in that the early cases that examiners see may be particularly meaningful in setting expectations.


과거 실험실 연구들에서는 대조효과가 여기서 나타난 것보다 더 컸는데, 몇 가지 해석이 가능하다. 첫 번째로 실험실 상황(조작된 상황)에서는 참가자들이 지속적으로 강한, 양방향의 조작(manipulation)에 노출된다. 한편 AvN4-6이 가장 큰 영향을 줬다는 것은 실질적으로 중요한 시사점을 갖는데, 확인된다면 비디오 기반의 예제를 활용하여 모든 평가자에게 표준화된 초기 비교기준(comparator)을 만들어 줄 수 있다는 것이다.

The preceding laboratory studies showed contrast effects that were larger than those observed in this study, explaining up to 24% of observed score variance in mini-CEX scores. A number of explanations are possible. First, in the laboratory context, participants were consistently exposed to a strong, bidirectional manipulation (either seeing very good or very poor performances prior to intermediate performances). That AvN4-6 showed the largest influence has particularly important implications for practical strategies to overcome the biases observed in this line of research. If confirmed, it would suggest potential benefits in using video-based exemplars to create a standardized set of initial exemplar comparators for all examiners.











 2015 Jan 27. [Epub ahead of print]

Are Examiners' Judgments in OSCE-Style Assessments Influenced by Contrast Effects?

Author information

  • 1P. Yeates is clinical lecturer in medical education, Centre for Respiratory Medicine and Allergy, Institute of Inflammation and Repair, University of Manchester, and specialist registrar, Respiratory and General Internal Medicine, Health Education North West, Manchester, United Kingdom. M. Moreau is assistant dean for admissions, Faculty of Medicine and Dentistry, and professor, Division of Orthopaedic Surgery, University of Alberta, Edmonton, Alberta, Canada. K. Eva is senior scientist, Centre for Health Education Scholarship, and professor and director of educational research and scholarship, Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada.

Abstract

PURPOSE:

Laboratory studies have shown that performance assessment judgments can be biased by "contrast effects." Assessors' scores become more positive, for example, when the assessed performance is preceded by relatively weak candidates. The authors queried whether this effect occurs in real, high-stakes performance assessments despite increased formality and behavioral descriptors.

METHOD:

Data were obtained for the 2011 United Kingdom Foundational Programme clinical assessment and the 2008 University of Alberta Multiple Mini Interview. Candidate scores were compared with scores for immediately preceding candidates and progressively distant candidates. In addition, average scores for the preceding three candidates were calculated. Relationships between these variables were examined using linear regression.

RESULTS:

Negative relationships were observed between index scores and both immediately preceding and recent scores for all exam formats. Relationships were greater between index scores and the average of the three preceding scores. These effects persisted even when examiners had judged several performances, explaining up to 11% of observed variance on some occasions.

CONCLUSIONS:

These findings suggest that contrast effects do influence examiner judgments in high-stakes performance-based assessments. Although the observed effect was smaller than observed in experimentally controlled laboratory studies, this is to be expected given that real-world data lessen the strength of the intervention by virtue of less distinct differences between candidates. Although it is possible that the format of circuital exams reduces examiners' susceptibility to these influences, the finding of a persistent effect after examiners had judged several candidates suggests that the potential influence on candidate scores should not be ignored.

PMID:
 
25629945
 
[PubMed - as supplied by publisher]


커크패트릭 평가의 네 단계를 열어줄 일곱 개의 열쇠(Performance Improvement, 2006)

Seven Keys to Unlock the Four Levels of Evaluation

by Donald L. Kirkpatrick






First Key: Analyze Your Resources

To do this, you must answer the following questions:

• Does your job consist of only one function—evaluating training programs—or does it include other and perhaps more important duties and responsibilities of planning the curriculum and teaching?

• How large a staff do you have for evaluation?

• How much of your budget can you spend on evaluating programs?

• How much help and cooperation can you get from other departments such as Human Resources or sales if you are evaluating sales training programs?

• How much support and help can you get from line managers if you are training their subordinates in programs such as Leadership Development for Supervisors?




Second Key: Involve Your Managers

If you are going to be effective in evaluating programs, you need to have your managers’ encouragement and support


1. Ask for their input in deciding on subject content. George Odiorne, in one of his books, made the following statement: “If you want people to support your decisions, give them a feeling of ownership.”


2. Get your managers to establish an encouraging climate regarding the program.


3. Ask for their help in evaluating the program. Levels 3 (Behavior) and 4 (Results) require this help. You can evaluate Levels 1 (Reaction) and 2 (Learning) without involving managers because you have control over these two levels. But Levels 3 and 4 are typically influenced by factors beyond those within your control.




Third Key: Start at Level 1 (Reaction) and Continue Through Levels 2, 3, and 4 as Resources Permit


Some trainers or HPT professionals refer to Level 1 as “happiness ratings” or “smile sheets,” and I agree! That’s exactly what they are. They measure the reaction of the participants to the program. But those trainers also claim that these evaluations are not of much value. I disagree.


In business, industry, and government, there is a slight difference. First, they may not pay for the program, and the existence of the program doesn’t depend on their attendance. But you can be sure that they will be telling somebody—perhaps even their boss—whether they thought the program was worthwhile.


And when the reactions are positive, the chances of learning are improved.



Fourth Key: Evaluate Reaction

Here are the guidelines for evaluating Reaction:

1. Decide what you want to find out—make a list of items to which you want the reaction of the participants (i.e., subject content, leader’s effectiveness, schedule, audiovisual aids, handouts, case studies, facilities, meals, etc.).

2. Design a form that will quantify reaction. The most common form consists of a five point scale: either Excellent, Very Good, Good, Fair, and Poor; or Strongly Agree, Agree, Neutral, Disagree, and Strongly Disagree. The objective is to get as much information as possible in the shortest period of time. Participants are not eager to spend time writing the answers to questions.

3. Provide the opportunity for written comments. End your reaction sheet with the question, “What would have improved the program?”

4. Get 100% immediate response. When the participants leave the program, have them put their completed reaction sheet on the back table. Do not tell them to fill out the form and send it back, or do not ask them to email their reactions. If you do either of these, you will not get enough responses to represent the entire group.

5. Be sure you get “honest” answers. Tell participants you want their honest answers and do not ask them to sign the form.

6. Establish an acceptable standard for their combined reaction and tabulate the forms to see if you achieved or exceeded that standard.
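아래는 위 가이드라인 2와 6(5점 척도로 반응을 수량화하고, 집계 결과를 미리 정한 허용 기준과 비교하는 것)이 어떤 식으로 계산될 수 있는지를 보여주는 최소한의 파이썬 스케치이다. 항목 이름, 응답 값, 기준치(4.0)는 모두 설명을 위해 가정한 것이다.

```python
# 5 = Excellent ... 1 = Poor 로 수집했다고 가정한 예시 응답
responses = {
    "subject content":        [5, 4, 4, 5, 3, 4],
    "leader's effectiveness": [4, 4, 5, 5, 4, 3],
    "facilities":             [3, 3, 4, 2, 3, 4],
}
STANDARD = 4.0   # 가정: 사전에 정해 둔 허용 기준(평균 4.0 이상)

for item, ratings in responses.items():
    mean_rating = sum(ratings) / len(ratings)
    verdict = "기준 충족" if mean_rating >= STANDARD else "기준 미달"
    print(f"{item}: 평균 {mean_rating:.2f} ({verdict})")
```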



Fifth Key: Evaluate Learning

Here are the guidelines for evaluating Learning:

1. Measure before and after knowledge, skills, and attitudes.

2. Use a form the participants can complete for evaluating knowledge and attitude change.

3. Use a performance test for evaluating skills.

4. Get 100% response.

5. For knowledge and attitudes, design a test that measures what you want them to know and the attitudes you want them to have at the end of the program. 

6. A question that usually arises about the pretest and posttest is whether the same form can be used or if “Form A” and “Form B” should be developed. There are too many problems when you try to develop a “Form A” and “Form B” that will cover the same knowledge and attitudes. So use the same form.
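아래는 가이드라인 1(프로그램 전후의 지식 측정)과 6(동일한 시험지 사용)을 예시로 구현해 본 최소한의 파이썬 스케치이다. 점수 값은 설명을 위해 가정한 것이며, 전-후 차이가 우연으로 보기 어려운지를 확인하는 대응표본 t-검정을 덧붙였다.

```python
from scipy import stats

# 동일한 시험지를 프로그램 전(pre)과 후(post)에 실시했다고 가정한 예시 점수
pre  = [12, 15,  9, 14, 11, 13, 10, 16]
post = [16, 18, 12, 15, 14, 17, 13, 19]

gains = [b - a for a, b in zip(pre, post)]
print("평균 향상 폭:", sum(gains) / len(gains))

# 전-후 점수 차이에 대한 대응표본 t-검정
res = stats.ttest_rel(post, pre)
print(f"paired t = {res.statistic:.2f}, p = {res.pvalue:.3f}")
```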



Sixth Key: Evaluate Behavior

Here are the guidelines for evaluating Behavior:

1. Measure on a before-and-after basis if practical. If this is not practical, the alternative is to measure after the program and ask, “What changes in behavior have occurred since you attended the program?”

2. Allow time for behavior change to take place. 3 months or 6 months after the program, or maybe never. The best compromise seems to be 3 months after the program.

3. Use a patterned interview or written survey asking the same questions of all respondents. One important question to include is, “Do you plan to change your behavior in the future?”

4. Decide who will be polled. For example, the following options are possible:

• The participants

• The bosses of the participant

• The subordinates of participants

5. Based on the fact that some participants have not changed their behavior but did answer positively the question, “Do you plan to change your behavior in the future?” repeat the research after 3 more months.



Seventh Key: Evaluate Results

Here are the guidelines for evaluating Results:

1. Measure on a before-and-after basis.

2. Allow time for results to develop—perhaps 6 months or a year.

3. Repeat at appropriate times.

4. Use a control group if practical. A “control” group is individuals who did not attend the program. An “experimental” group is the participants.








Seven keys to unlock the four levels of evaluation

  1. Donald L. Kirkpatrick

Article first published online: 10 AUG 2006

DOI: 10.1002/pfi.2006.4930450702

Copyright © 2006 International Society for Performance Improvement

평가에 관한 교수개발: 역량중심교육과정의 잃어버린 고리(Acad Med, 2011)

Faculty Development in Assessment: The Missing Link in Competency-Based Medical Education

Eric S. Holmboe, MD, Denham S. Ward, MD, PhD, Richard K. Reznick, MD, Peter J. Katsufrakis, MD, MBA, Karen M. Leslie, MD, Vimla L. Patel, PhD, Donna D. Ray, MD, and Elizabeth A. Nelson, MD





가장 처음 도입된 역량바탕 프레임워크는 CanMEDS로 1990년대 중반 도입되었다. 이후 ACGME는 레지던트와 펠로우에 대한 일반역량을 개발 및 도입하였다.

One of the first competency-based frameworks to be introduced was CanMEDS in the mid-1990s.14 The Accreditation Council for Graduate Medical Education followed with the development and introduction of the general competencies framework for residency and fellowship in 2001.15



CBME는 복잡한 상황과 맥락에 따른 성과에 좌우되기 때문에 피교육자가 다음 단계로 진행하기에 정말 준비가 되었는가를 평가하기 위한 강건한 평가법과 평가절차가 반드시 필요하다. 그 결과, CBME의 도입시기부터 의학교육자들은 평가도구의 성배를 찾아다녔다.

CBME, because it is driven by complex situational and context-dependent outcomes, requires robust assessment and evaluation processes to determine whether a trainee is truly prepared to enter the next stage of his or her career. As a result, since the inception of CBME, medical educators have been seeking the holy grail of evaluation tools.



그러나 다양한 평가법과 도구들도 비판적으로 학생을 관찰하고, 질문하고, 그들의 수행능력을 실제 환자가 있는 현장에서 판단하는 교수를 대체하지는 못한다. 시험이나 표준화환자로 측정된 피교육자의 능력 또는 역량이 실제 근무지 기반 수행능력으로 '전이'됨을 확인하는 것은 반드시 교수의 책임이 된다. 또한 발달의 궤적을 강조하는 CBME는 더 빈번하고, 더 적시이며, 형성적(formative)이고 실제 상황에서 이루어지는 평가에 의존해야 하며, '대리지표'로서의 총괄평가에 대한 의존은 줄여야 하기 때문이다.

However, these methods and tools cannot replace the importance of faculty who are enabled to critically observe, question, and judge trainee performance in actual patient care situations.24 Ensuring that a trainee’s capability or competence, as measured by exams and standardized patients, translates, or “transfers,” into actual work-based performance with patients and families is an essential faculty responsibility.25 Because of its emphasis on developmental trajectories, CBME requires more frequent, timely, formative, and authentic assessment and less dependence on “proxy,” summative assessments.10



전문성의 개발, 그리고 고립된 자기평가의 오류에 대한 근거들은 이러한 관점을 지지한다. 예컨대 SP만을 이용하여 피교육자를 평가하는 것은 비용이 비쌀 뿐만 아니라, 더 중요하게는 정기적으로 지속적 피드백을 주지 못한다.

This perspective is supported by evidence from work in the development of expertise and the perils of isolated selfassessment. For example, exclusively using standardized patients to judge whether a trainee was acquiring competence in clinical skills would not only be expensive but, more important, would not provide the learner with regular and ongoing feedback



또한 많은 문헌이 대부분의 의사들이 자신의 강점과 약점을 외부 자료나 피드백 없이는 발견하지 못한다는 것을 보여준다.

Furthermore, a substantial body of literature clearly demonstrates that most physicians cannot determine their own strengths and weaknesses without external data and feedback.28 ***




평가자로서의 교수: 도전과 기회

Faculty as Evaluators: Challenges and Opportunities


분절된 학습환경

The fractured learning environment

현재 의과대학 교수는 주로 임상현장에서 피교육자와 근무하게 되고, 이 말은 외래, 클리닉, 병동, 수술장, ICU 등의 microsystem을 의미한다. 이런 임상현장은 근무지기반 훈련과 평가가 이루어지는 맥락이라 할 수 있다.

At present, medical faculty work with trainees primarily in clinical units, referred to by some as microsystems, such as an ambulatory clinic or officebased setting, a hospital ward, a surgical suite, an intensive care unit, or other such sites.29 These clinical units are the context for work-based training and assessment.


피교육자가 제대로 기능하지 않는 임상 microsystem에서 진료하면서 임상진료, 질 향상, 시스템 기반 진료에 대한 역량을 효과적으로 습득할 수 있으리라고는 생각하기 어렵다.

It is hard to conceive that trainees can effectively acquire competency in clinical care, quality improvement, or systems-based practice if they practice in poorly functioning clinical microsystems.



입원환자에 대해 보면, 너무 많은 교수들이 그들의 교육하고 평가하는 장소에 아주 잠시만 머무를 뿐이다. 예를 들면 내과나 소아과 교수는 입원 환자 회진을 2주~4주 정도만 돌게 된다. 이러한 로테이션 구조는 전문과 수련 문화에 너무 깊이 뿌리내리고 있지만, 이러한 microsystem에서의 로테이션이 학습자의 역량에 대한 평가에 대한 교수의 능력에 어떤 영향을 주는지 잘 알지 못한다.

In the inpatient setting, too many faculty are transients in the very clinical units where they teach and assess. For example, faculty in internal medicine and pediatrics often rotate on inpatient clinical services for just two to four weeks. This rotational structure is deeply ingrained within these specialty training cultures, yet we know little about how rotating through these microsystems affects the faculty’s ability to accurately assess competence of their learners.32



현재 의학교육 시스템에서 환자 경험과 시간의 불연속성은 지속적 평가와 피드백을 어렵게 한다. Hirsh 등은 이러한 '연속성'을 의학교육의 "조직 원리"로 강조해야 한다고 주장한다.

This lack of continuity in both patient experience and time with faculty for trainees in the current medical education system makes longitudinal assessment and feedback very difficult. Hirsh and colleagues34 argued for the importance of continuity as an “organizing principle” for medical education



분절된 학습환경과 연속성의 부족에 더하여, 교수들은 동료 교수에게 편견을 심어줄까 걱정하여 피교육자에 대한 정보를 전달하는 "feed forward"를 상당히 주저한다. 그러나 그 결과는 정보의 공유를 통한 피교육자의 발달 및 의미있는 실행계획 수립이 아니라 '매번 새로 시작하는' 평가의 악순환일 뿐이다.

Compounding the fractured learning environment and lack of continuity is the substantial reluctance on the part of faculty to “feed forward” information to their colleagues about trainees over fear of “biasing” the receiving faculty.36,37 However, the end result is a perpetual cycle of “starting over” with assessment instead of using the shared information for the trainee’s development and creation of meaningful action plans.



교수들은 시스템의 과학을 이해해야 하며, 어떻게 학제간 팀에서 효과적으로 일하는가를 알아야 하며, 자율성에 대해서 전통적 관점이 아니라 보다 '관계중심적' 관점으로 초점을 옮겨야 한다. '관계적 자율성'이란 인간은 서로 연결되어 있으며 상호의존적이고, 자율성이라는 것은 사회적으로 구성되며 다른 사람에 의해서 주어지는 것임을 의미한다.

To do this, faculty must understand the science of systems and how to work effectively in interdisciplinary teams, and they must move away from traditional views to a more relational view of autonomy. Relational autonomy recognizes that human agents are interconnected and interdependent, meaning that autonomy is socially constructed and must be granted by others.42,43



미래의 교수개발은 시스템 요인이 어떻게 교육과 진료의 질에 영향을 주는지, 그리고 시스템 기반 진료환경에서 피교육자의 역량을 평가하기 위해 교수들이 어떤 준비가 되어야 하는가를 다뤄야 한다.

Future faculty development will need to incorporate training about how system factors affect the quality of both teaching and patient care, and also how faculty must be prepared to assess their trainees’ competencies in systems-based practice



몇 가지 이유로 외래환경이 입원환자보다 지속적 평가와 피드백에 더 적합하다.

For several reasons, the outpatient setting holds potentially more promise than inpatient settings for longitudinal assessment and feedback for most specialties.44 

First, many trainees in specialties such as internal medicine, family medicine, and pediatrics work with a stable group of faculty preceptors who can observe these trainees over time.24 

Second, because trainees often have their own panel of patients, assessment methods such as a medical record audit can be combined with reflection guided by faculty.45 

Finally, as so much of medicine has moved into the outpatient setting, it follows logically that more training and assessment should occur here as well.





전통적 평가 역할 

Traditional assessment roles

예측가능한 미래에 두 가지의 전통적 역할이 계속 중요할 것이다. (1) 지식과 임상추론을 탐색(probing)하기 위한 질문, (2) 면담, 신체진찰, 상담, 기타 의사소통 기술 및 술기를 판단하기 위한 직접 관찰

For the foreseeable future, two traditional faculty roles in assessment will continue to be essential: (1) questioning to probe knowledge and clinical reasoning and (2) direct observation to judge the clinical skills of medical interviewing, physical examinations, counseling, and other communication skills as well as procedural skills.



교수들은 추론 과정을 강조하는 질문 기술을 개발해야 한다.

Faculty need to develop the skills to ask questions that emphasize the reasoning process and incorporate key findings and lessons from a growing body of evidence from research on cognition.46,47 Practical approaches exist to help faculty acquire these skills.46,48 These questioning skills apply equally well to the evaluation of procedural skills. ***



비록 교수들이 피교육자의 수행능력을 비판적으로 정확하게 관찰할 수 있어야 하지만, 일부 연구결과는 교수들이 피교육자의 부족한 점을 정확히 찾아내지 못한다는 것을 보여준다.

Although faculty need to be critical and accurate observers of trainee performance, limited published research demonstrates that faculty frequently fail to identify deficiencies in trainees’ clinical skills.24,49–51



물론 교수들은 기본적인 psychometric 및 quality properties에 대한 평가가 이루어진 도구만을 활용해야 하며, 최근의 체계적 고찰은 최소한의 질적 기준을 만족하는 소수의 관찰도구들을 정리했다. 그러나 평가서식을 재설계하는 것이 평가 점수 분산의 10%까지만을 설명한다는 점에서, 의학교육자들은 이제 교수들을 더 효과적으로 관찰하고 평가하도록 훈련시키는 방법의 개발로 관심을 옮겨야 한다.

To be sure, faculty should only use tools that have been evaluated for basic psychometric and quality properties, and a recent systematic review identified a small group of observation tools that meet minimal quality criteria for use.55 However, given that the redesign of evaluation forms only explains up to 10% of the variance in ratings,56 medical educators must now shift their attention to developing more effective methods to train faculty in observation and assessment.



추가적으로, 교수와 프로그램이 숫자에만 기반한 평정척도에서 벗어나도록 도와야 한다. CBME는 기술적(descriptive), 즉 '질적' 평가에 더 크게 의존하게 될 것이기 때문이다.

In addition, we must help faculty and programs move away from rating scales based on just numbers, as CBME will require a greater reliance on descriptive or “qualitative” assessment.57



교수들은 숫자 평정이란 피교육자에 대한 종합적 판단을 합성하여 나타내는 과정에 지나지 않는다는 것을 인식해야 한다. 궁극적으로 평가도구는 그 도구를 사용하는 사람의 수준만큼만 좋은 것이다.

Faculty need to recognize that numeric ratings are nothing more than a process to synthesize and then represent a composite judgment about a trainee. Ultimately, evaluation tools are only as good as the individuals using them;



Albanese 등은 교육 커뮤니티와 기관이 어떻게 교수개발을 구성해야 하는가에 대한 유용한 프레임워크를 제시한다.

Along those lines, recent work by Albanese and colleagues60 provides a useful framework about how the educational community and institutions might structure faculty development activities using an integrated systems model (ISM).


• Changes in assessment and supervision that are also mission critical for the institution and help to build system “reserve” will be more likely implemented.

• The further a faculty member moves along the stages of change, the higher the likelihood of adoption that can also produce individuals more likely to become champions for the change. 

• Enlisting the assistance of respected educational faculty to help implement the change helps to promote broader and more rapid uptake by other faculty.

• Helping faculty mentally picture how the change in the educational program will affect and improve their own educational practices will also assist in the adoption of new knowledge and skills.



Assessment by faculty must be grounded in the principles of CBME


CBME에서는 준거와 발달을 기반으로 한 평가가 필요하다. 발달적 용어로 준거를 정의해야 하는데 이는 흔히 milestone이나 benchmark라고 명명되고, 교수나 프로그램 관리자들로 하여금 피교육자들이 적절한 '궤적'상에 있는가를 판단할 수 있게 한다.

CBME requires assessment be criterion based and developmental. Defining the criteria in developmental terms, commonly called milestones or benchmarks, allows faculty and program directors to determine whether the trainee is on an appropriate “trajectory.”62



Milestone은 교육과정과 평가의 청사진이 될 수 있다.

Milestones, in effect, can become the blueprint for curriculum and assessment.62



다양한 연구로부터 평가의 가장 큰 문제 중 하나는 교수들이 적절한 수행능력이 어느 정도인지에 대한 동의가 부족하다는 것이다. 이와 같은 교수진 내에서의 합의의 부족은 신뢰도와 타당도의 가장 큰 적이다. 또한 피교육자들에게 교수로부터 받는 이질적인 평가와 피드백에 대한 부당한 짐을 지우게 된다.

Multiple studies highlight that one of our biggest and most refractory problems in assessment is the lack of agreement among faculty about what constitutes satisfactory performance across competencies regardless of the competency framework.20,54 This lack of agreement among faculty is a major threat to the reliability and validity of decisions about trainee competence.54,56 In addition, it places an unfair burden on trainees to make sense of the disparate ratings and feedback they receive from faculty.



궁극적으로 교수들은 피교육자를 다른 교수에게 넘길 때 의미있는 수행능력 자료를 제공하는 것에 대한 두려움을 벗어나야 한다. 이는 특히 현재 우리의 로테이션 모델에서 중요하다. forward feeding 없이는 피교육자들은 피상적, 비구체적 평가와 피드백에 머물고 말 것이다.

Ultimately, faculty must become less fearful of providing meaningful performance data—including strengths and developmental needs— about the trainee during educational handoffs.36,37 This is especially important in our current rotational model of training—without “forward feeding” of information, trainees may end up in a perpetual cycle of superficial, nonspecific assessment and feedback..




Assessment requires competent faculty 

교수의 임상역량은 효과적 평가를 위해서 중요한 요소이지만, 아직까지 이 부분은 충분한 관심을 받지 못하고 있다. 여러 프로그램들은 교수들이 - 비록 아주 높은 수준은 아니더라도 - 교육과 평가에 충분한 지식/술기/태도 역량을 갖추었음을 전제로 하고 있다. 그러나 우리는 여러 학생들과 전공의가 임상 술기에 상당히 부족한 부분이 있음을 알고 있고, 따라서 나중에 이들이 교수가 되었을 때 중요한 부분이 부족할 수 있다는 것이 그다지 놀라운 사실만은 아니다. 여러 문헌들이 이를 뒷받침하는데, 심장 청진기술에 대한 연구에서 교수들은 3학년 학생들보다 딱히 더 나은 기술을 가지고 있지 않았다. 다른 연구에서는 가정의학의사, 인턴, 외과의사들이 informed decision making skill이 부족함을 보여줬다.

Clinical competence of faculty is a crucial component of effective assessment, yet this issue has received little attention to date. Programs operate on the assumption that faculty possess sufficient, if not high, levels of knowledge, skills, and attitudes in the competencies they are responsible for teaching and assessing. We have known for some time that numbers of students and residents graduate with significant deficiencies in clinical skills,24 so it might not be surprising that those who later become faculty may possess important deficiencies in clinical skills. A growing body of literature supports this concern. For example, a study of cardiac auscultation skills found that faculty were no more skilled than third-year medical students.68*** Another study highlighted substantial deficiencies in informed decision-making skills among family medicine physicians, internists, and surgeons,69***



이러한 결과가 시사하는 바는 CBME를 위한 교수개발이 평가에 대한 훈련과 함께 임상술기 훈련도 포함해야 한다는 점이다. 또한 교수개발은 21세기 진료에 필요한 '새로운' 역량에 대한 훈련도 포함해야 한다. 대부분의 교수들은 이러한 역량에 대한 어떠한 공식적 교육도 받은 적이 없다. 그 결과 피교육자가 그러한 역량을 학습할 때 교수도 같이 학습하게 되며, 보다 협력적인(collaborative) 교수훈련 모델이 필요한 것이다.

The implication of these findings is that CBME-focused faculty development will need to incorporate clinical skills training with training in assessment. In addition to improving the clinical skills of faculty, faculty development will also need to incorporate training in the “new” competencies crucial to 21st century practice: evidence-based practice using point-of-care clinical decision support and information; health information technology; teamwork; care coordination; systems functionalities; advocacy; and contextaware professionalism, to name a few. The majority of faculty working today never received formal training in any of these competencies.29 In effect, there are a number of new competencies that faculty will need to learn as their trainees learn them, necessitating more collaborative models of faculty training.



한 명의 교수가 모든 부분에 전문가가 되어야 한다는 것도 아니다.

This is not to say that a single faculty member need be an expert in all competencies; rather, trainees should be taught and evaluated by those individuals that truly possess the highest level of knowledge and skill in the domain of interest, and those individuals may not be physicians.




Faculty as coach and mentor in assessment

대부분의 피교육자는 감독자 없이 진료를 하게 된다.

Ultimately, the majority of trainees will graduate from their programs and enter unsupervised practice.


포트폴리오는 스스로에 대한 평가를 할 수 있게 도와주는 강력한 도구이다.

Portfolios are a potentially powerful tool for engaging trainees in their own assessment.71



Next Steps: Preparing Faculty for the CBME Era

교수개발이 CBME의 속도결정단계라는 합의가 늘어나고 있다.

There is a growing consensus that the rate-limiting step in the evolution to CBME is faculty development.72


교수가 교육, 평가, 피드백을 포함하는 코치로서 전문가 역할을 할 수 있어야 한다.

The role of faculty as expert “coaches” must encompass teaching, assessment, and feedback.



아직 우리는 효과적인 교수개발 모델이 없다.

We have yet to develop the most effective faculty development models. The good news from a recent systematic review is that the faculty who participate in educational training activities report 

(1) high levels of satisfaction, 

(2) positive changes in their attitudes, 

(3) increased understanding of educational principles and teaching skills, 

(4) changes in behavior as noted by their students, and 

(5) greater involvement in teaching.75



교수개발에서 행동 변화까지.

However, few studies have investigated whether faculty training translates into actual behavior changes among trainees. In addition, most faculty development is designed as a one-time “bolus” activity and less often as a longitudinal designed program.














 2011 Apr;86(4):460-7. doi: 10.1097/ACM.0b013e31820cb2a7.

Faculty development in assessment: the missing link in competency-based medical education.

Author information

  • 1American Board of Internal Medicine, Philadelphia, Pennsylvania 19106, USA. eholmboe@abim.org

Abstract

As the medical education community celebrates the 100th anniversary of the seminal Flexner Report, medical education is once again experiencing significant pressure to transform. Multiple reports from many of medicine's specialties and external stakeholders highlight the inadequacies of current training models to prepare a physician workforce to meet the needs of an increasingly diverse and aging population. This transformation, driven by competency-based medical education (CBME) principles that emphasize the outcomes, will require more effective evaluation and feedback by faculty. Substantial evidence suggests, however, that current faculty are insufficiently prepared for this task across both the traditional competencies of medical knowledge, clinical skills, and professionalism and the newer competencies of evidence-based practice, quality improvement, interdisciplinary teamwork, and systems. The implication of these observations is that the medical education enterprise urgently needs an international initiative of faculty development around CBME and assessment. In this article, the authors outline the current challenges and provide suggestions on where faculty development efforts should be focused and how such an initiative might be accomplished. The public, patients, and trainees need the medical education enterprise to improve training and outcomes now.

© by the Association of American Medical Colleges.

PMID:
 
21346509
 
[PubMed - indexed for MEDLINE]


바람직한 변화에 도달하는 길: 성과바탕의학교육의 다음 단계에 관한 생각 (Acad Med, 2015)

Achieving the Desired Transformation: Thoughts on Next Steps for Outcomes-Based Medical Education

Eric S. Holmboe, MD, and Paul Batalden, MD






Change and OBME

성과바탕(역량중심, 역량바탕, 등등)의학교육의 개념이 새로운 것은 아님. McGaghie 등은 WHO의 1978년 보고서에서 OBME가 지역인구의 요구에 더 잘 맞는다고 주장함. 약 40년이 지난 지금 OBME는 전세계적으로 확실히 변혁을 이끌고 있음. 변화는 불가피하지만 (헤라클리투스는 '변하지 않는 유일한 것은 변화이다' 라고 했지만) 동시에 매우 어렵다. 왜 변화가 어려운지를 이해해야 한다.

Outcomes-, or competency-based, medical education represents a fundamental shift in educational philosophy and perspective,9–11 yet the idea is not new. McGaghie and colleagues,9 writing on behalf of the World Health Organization in 1978, argued that OBME could better meet the needs of populations in local contexts. Now, nearly 40 years later, OBME is clearly driving substantial change and disruption across the globe. Although change is inevitable— the Greek philosopher Heraclitus (402 BCE) observed that the only constant is change (“you could not step twice into the same river”)12—it is often extremely difficult. Understanding why change can be so hard may help elucidate some of the barriers and challenges to implementing OBME.



첫 번째로 Heifetz와 Linsky의 말처럼 변화는 새로운 대안에 대한 것이 아니라, 기존에 존재했던 것에 대한 익숙함을 상실하는 것에 더 가깝다. 의사들은 이미 플렉스너의 GME의 시간-기반 모델을 위해 많은 희생을 해왔다.

First, we echo Heifetz and Linsky,13 who have argued that change is often not about the new alternative but, rather, about the loss of its familiar antecedent. Many physicians and others have made substantial personal investments and sacrifices to lead and serve in educator roles in Flexnerian and time-based models of GME.



두 번째로, 우리는 과거의 경험 때문에 많은 가정을 '사실'로 받아들이는데, 사실 그것들은 의문을 가져야 할 가정일 뿐이다. 현재 전공의 교육의 구조는, 그것이 항상 효과적이지는 않다는 것, 변화가 필요하다는 것, 최소한 실험과 혁신은 허용되고 환영받아야 한다는 것을 보여주는 반대 증거에도 불구하고 '사실'처럼 여겨진다. 과거의 모든 것이 나쁘다고 말하는 것은 아니다. 오히려 bedside rounding과 같은 일부 요소들은 여전히 교훈적이고 가치가 있다. 우리가 마주한 도전은 기존의 가정, 현재의 상황, 비판적으로 평가된 관행을 모두 함께 고려하는 통합적 사고를 받아들이는 것이다. 이 노력의 일환으로 우리는 "서로 상반되는 두 가지 생각이 건설적 긴장상태를 유지하도록 하는 능력"을 향상시킬 필요가 있다.

Second, our past experience leads us to accept many assumptions as “truths” when we should be questioning them for what they are: assumptions.15 The current structure and architecture of residency have become “truths” despite the disconfirming evidence that they are not always effective; that change is needed; that, minimally, experimentation and innovation should be allowed and welcomed. We do not mean to imply that all aspects of our past are “bad.” To the contrary, some elements of our educational history and experience, such as bedside rounding, remain instructive and valuable.16–18 Our challenge is to embrace the integrative thinking that allows us to consider assumptions, current reality, and critically assessed practices all together. As part of that effort, we need to increase our ability “to hold two conflicting ideas in constructive tension.”19



마지막으로, OBME와 미국 인증체계의 최근 변화들을 도입하는 과정은 Scharmer의 Theory U 프레임워크에 주목하게 하는데, 이 프레임워크는 왜 도입이 그토록 어려운지를 보여준다. 'U'는 도입까지의 여정을 나타낸다. 왼쪽(하강)은 더 나은 무언가가 나타날 수 있도록 기존의 것을 놓아주는 과정을 나타내며, 그 더 나은 것은 U의 오른쪽(상승)으로 나타난다.

Finally, the implementation of OBME and other recent changes to the U.S. accreditation system invite attention to Scharmer’s21 Theory U framework, which illuminates why implementation can be so difficult. The “U” describes the journey to implementation: The left-hand side, or down stroke, represents letting go so that something better can emerge; the something better is represented by the upstroke or right-hand side of the U.



의료시스템에는 단순한 문제, 복합적 문제, 복잡한 문제가 있다.

These health care systems involve simple, complicated, and complex problems.23 

    • Simple problems can be solved by following an established recipe. 
    • A stable, reliable solution or outcome to complicated problems involves multiple steps and requires critical formulas and expertise from multiple fields. 
    • For complex problems, however, formulae and algorithms have limited value; each problem usually requires attention to shared purpose and relationships, often involving unique and iterative approaches.23



Richard Normann은 서비스 로직(service logic)의 중요성을 강조했다. 그는 "생산에서 활용으로, 결과에서 과정으로, 교환에서 관계로 초점을 옮길 것"을 주장했다. "service"라는 단어를 통해서 우리는 환자 돌봄에 거의 가치를 더하지 않는 그런 활동을 의미하거나 수련생에 의해 행해지지 말아야 할 그런 활동을 의미하는 것이 아니다. 슬프게도 GME에서 '서비스'라는 단어는 많은 경우 불필요한 '인턴잡'을 의미하는 용어처럼 되어왔다. 우리는 그것이 아니라 '서비스'를 서로 공유하고 있는 의미있는 직무로 정의내리고자 한다. 과정과 관계에 초점을 두는 Normann에 의해 정의된 서비스로직의 의미를 되새기면서 서비스로직을 새롭게 바라보는 것이 필요하다.

Richard Normann27 highlights the importance of a service logic—that is, “forcing a shift of attention from production to utilization, from product to process, from transaction to relationship.” By “service,” we do not mean clinical care activities that add little to patient care or those activities that should not be performed by trainees. Sadly, in the context of GME, the term “service” has too often become conflated with unnecessary “scut work” performed by trainees. Instead, we define “service” as shared and meaningful work. Recapturing the service logic as described by Normann—focusing on process and relationship—is urgently needed in medical education.



'시간'은 소중한 자원이라기보다는 일종의 '개입'의 한 수단으로 여겨지는 경우가 많다. Ten Cate는 쓸모없는 '잘못된 이분법'을 지적했다. 역량을 키우려면 경험이 필요하고, 경험을 쌓으려면 시간이 필요하나, 시간 그 자체만으로는 역량이 되지 않는다. 시간은 현명하고 생산적으로 쓰여져야 한다.

Time is viewed as an intervention (e.g., a “three-month rotation”) rather than a precious resource. Ten Cate recently highlighted the unhelpful “false dichotomy” surrounding the presumed positive association of greater time and better learning or increased skills.33 Competency requires experience, experience requires time, but time alone does not produce competence. Time should be used wisely and constructively.



우리가 사실로 받아들이는 또 다른 가정은 로테이션, 블록 모델인데 이것을 지지하는 근거는 별로 없다.

Another assumption-as-truth is that the rotational or block model of medical education, used in many medical school clerkships and in disciplines such as internal medicine and pediatrics, is necessary to ensure that learners’ experiences are sufficiently broad and deep. Little evidence exists to support this assumption.




의학교육계는 OBME를 "결과물"을 만들어내는 것으로부터 "서비스"를 강조하는 것으로 새롭게 프레이밍해야한다. "서비스"는 언제나 그것의 수혜자와 함께 '공동생산'되고 '공동창조'된다. 이러한 서비스 수혜자의 적극적 관여를 인정하는 것이 '학습자와 교수자의 공동 작업'이라는 '의료전문성 개발'의 새로운 정의를 만들어내는 길이다. OBME에서의 학습자는 그들의 교육경험과 평가활동을 공동으로 이끄는 자발적 에이전트여야 한다.

The medical education community has to reframe the OBME model from one that produces a “product” or “material good” to one that emphasizes “service.” A “service” is always cocreated and coproduced to some degree by the recipient of the service.24–27,48,49 The recognition of this need for active engagement seems to invite a new definition of “health professional development” as the shared work of teacher and learner. Learners in an OBME system must be active agents coguiding both their curricular experiences and assessment activities.



Wagner 등은 의료에서 '활성화된 환자'의 중요성을 역설했다. 서비스로서의 의학교육도 다를 바가 없다. 의학교육자들은 새롭게 설계된 의료시스템에서 일하고 배울 수 있도록 피교육자를 활성화시켜야 한다. 학습자들은 평가와 피드백을 구하기 위해서 자발적이 되어야 한다.

Wagner and colleagues50 have described the importance of “activated patients” for the development of good care. Medical education-as-service should be no different. Medical education needs activated trainees who are able to work in redesigned care delivery systems and learn through a redesigned curriculum. Learners must also be self-directed in seeking assessment and feedback.51



학습자들에게 은퇴할 때까지 쓰기에 충분한 지식을 주입하여 전문직 궤도에 올려놓으려는 시도는 논리적이지 않다. 새로운 의사들은 의학 지식과 술기 역량 이상이 필요하며, '자기주도적 평가 탐색', '성찰', '자신의 진료 향상을 위한 지속적 자료 활용'의 기술을 가지고 있어야 한다. 지식이 풍부하고, 팀 기반이며, 공동창조하고 공동생산하는 의료 서비스의 세계에서 자율적, 독립적 의료인 모델은 거의 의미가 없으며 점점 더 비효율적이 되고 있다.

Attempting to launch learners with enough knowledge into a professional orbit that is set up so that they retire before burning up on reentry is illogical. New physicians need more than just competence in medical knowledge and technical skills; they also require the skills of self-directed assessment, reflection, and using continuous data for improving their own clinical practice.57,58 In a knowledge-rich, team-based, cocreating, coproducing world of health care service, the model of the autonomous, independent practitioner is largely irrelevant and increasingly ineffective.






비록 그것이 전부는 아닐지라도 의학교육자들은 학습과 평가에 새로운 기술을 활용해야 한다.

Medical education must also embrace pedagogical technology alongside personal relationships in learning and assessment. On the technology side, content and some skills can be more efficiently provided through online tools and other forms of simulations (e.g., virtual patient cases; practice tests).



예전에는 개별 의사의 영역이라고 여겨진 진단과 같은 분야에까지도 '공동 절차'가 영역을 넓혀가고 있다. '인지 분산 이론'(distributed cognition theory)와 상황 이론(theory of situativity)은 의료에서 사람들 사이의 상호작용과 맥락이 중요하다는 것을 강조한다. DCT는 한 개인이 환자 관리를 위한 모든 지식을 가질 수 없으며, 따라서 피교육자를 교육하는 것에 대해서도 마찬가지이다. Situativity Theory는 Durning과 Artino가 말한 바와 같이 '지식/사고/학습이 경험에 위치(situated)한다'라는 것을 의미한다. 의학교육은 그 모든 과정에서 group learning, interprofessional, cross-displinary work를 효과적으로 포함시켜야 한다.

Group processes now extend into activities traditionally seen as the province of the individual clinician, such as diagnosis.64 Emerging ideas such as the distributed cognition theory and the theory of situativity highlight the critical importance of human interaction and context in medicine. The distributed cognition theory argues that no single individual holds all of the knowledge necessary to care for a patient or, for that matter, to teach medical trainees,65 and situativity theory, as described by Durning and Artino,66 “refers to theoretical frameworks which argue that knowledge, thinking, and learning are situated (or located) in experience.” The curriculum, at every point along the medical education continuum, needs to more effectively incorporate group learning and interprofessional, cross-disciplinary work.29,65,66




Assessment as Continuous Quality Improvement

지난 세기는 'psychometric' century였다. 순전히 psychometric하고 개개인에 초점을 맞춘 평가법은 21세기 교육과 진료에는 충분하지 않다. 전문직 간 팀, 기술의 적절한 활용, 그리고 의료제공자·환자·가족의 변화하는 역할에 대한 강조는 실제 근무지기반평가의 역할을 개선하고 확대해야 할 시급한 필요로 이어진다.

In addition to a service ethos and an interprofessional focus, we need a more robust national system of faculty development in educational pedagogy, quality, safety, value, and assessment.67 In many ways, given its intense focus on measuring the individual across multiple domains, the last century was the “psychometric” century. A purely psychometric, individually focused approach to assessment is insufficient for 21st-century education and clinical practice.68 The focus on interprofessional teams, appropriate uses of technology, and the changing roles of health care providers, patients, and families translates into an urgent need to improve and expand the role of actual work-based assessments.40,69,70



완벽한 전문가로 전공의를 마치는 사람은 거의 없으나, 대부분의 위임(entrustment) 결정은 전공의 수련을 마친 사람이라면 독립적, 자율적 진료에 들어갈 준비가 되었다는 믿음을 기반으로 한다. 그러나 GME의 새로운 목표는 "능숙한 평생 공동학습자"를 길러내는 것이 되어야 한다.

Almost no one leaves GME training already an expert, yet most entrustment decisions are based on the belief that an individual who has completed his or her residency is ready to enter independent, autonomous practice. A revised goal for GME might be the codevelopment of a physician ready to enter practice, competent in a particular initial role, and activated to be an effective lifelong learner who is self-directed in seeking ongoing assessment and feedback and capable of continuous reflection on, evaluation of, and modification of his or her emerging roles. Better definition of “a proficient lifelong colearner” might help the medical education community progress toward that goal.



평가는 교육과정에 longitudinally embedded되어야 한다. 이것은 '지속적 습관'이 되어야 한다. 의사들은 그들의 커리어가 '수련'의 특성을 가지는 환경에서 절대 벗어날 수 없음을 인지해야 한다.

Assessment must be longitudinally embedded into the medical curriculum such that incoming evaluation data create continuous feedback loops for learners at every point from medical school to practice. Assessment must become an ongoing “habit”; physicians must recognize that their careers never leave the training environment.


Judah Folkman이 말한 바와 같이 '모든 의료행위는 연습이다'. 우리의 도전은 새롭게 양성할 의사들이 자신의 진료에 대한 자료, 그리고 그 이상의 자료를 활용하여 지속적 학습을 할 수 있게 하는 것이다.

As Judah Folkman76 once noted in his commencement address at the Geisel School of Medicine at Dartmouth, “all medical practice is practice.” Our new challenge is to better prepare future physicians for continual learning, using data both from and beyond their own practices.





Changing Roles and Professional Self-Regulation

정책결정자들과 의료계의 지도자들은 수련을 마친 의사들이 21세기의 의료와 의료요구에 부합하지 않는다고 좌절해왔다.

Policy makers and leaders of health care delivery systems have expressed frustration that physicians graduating from training programs are not meeting 21st-century health and health care system needs— despite decades of certification and accreditation assessment practices and standards. In other words, the graduating medical professional coming into the health care delivery system is not sufficiently prepared or proficient to actually deliver the effective, efficient, timely, safe, equitable, and patient-centered care required.



아마 가장 큰 변화는 총괄평가에서 형성평가, CQI로 옮겨가는 것일 것이다. 이는 규제를 하는 조직과 규제를 받는 조직간의 관계가 변한다는 것을 뜻한다. 유의미하고 중요한 기준을 강화하는 것이 멈추진 않을 것이다. 그러나 새로운 기준과 접근법을 개발하고 도입하는 것은 '공동생산'과 '공동창조'가 필요하다.

Perhaps the greatest change is the shift from almost exclusively summative assessment practices to more formative, continuous quality improvement (CQI) practices. The Next Accreditation System (NAS), which includes elements cocreated by members of both the educational community and regulatory groups (i.e., the ACGME, ABMS), represents an initial attempt to shift to a CQI model.77–80 This shift means that the relationship between the regulated and regulators is also changing. Enforcement of meaningful and important standards will not cease: The public expects and deserves nothing less. However, the development and implementation of new approaches and standards requires coproduction and cocreation.24–27



이 지저분하고 어려운 일에는 모든 이해관계자가 co-configuration으로 포함되어야 한다. 근로와 학습의 co-configuration에 반드시 필요한 것은 사용자의 변화하는 요구에 맞는 서비스를 생산하는 것이다.

This messy and difficult work and learning required of all stakeholders (including regulators) must occur in co-configuration. A critical requirement of co-configured work and learning is the creation of services that adapt to the changing needs of the user.81,82 Learning and work involve


(1) adaptive, “customer-intelligent” product-service combinations, 

(2) continuous relationships of mutual exchange …, 

(3) on-going configuration and customization … over lengthy periods of time, 

(4) active customer involvement and input…, 

(5) multiple collaborative producers that … operate in networks…, and 

(6) mutual learning from interactions between the parties involved.82















 2015 Sep;90(9):1215-23. doi: 10.1097/ACM.0000000000000779.

Achieving the Desired Transformation: Thoughts on Next Steps for Outcomes-Based Medical Education.

Author information

  • 1E.S. Holmboe is senior vice president, Milestones Development and Evaluation, Accreditation Council for Graduate Medical Education, Chicago, Illinois. P. Batalden is active emeritus professor, Dartmouth Institute for Health Policy and Clinical Practice, Geisel Medical School at Dartmouth, Hanover, New Hampshire.

Abstract

Since the introduction of the outcomes-based medical education (OBME) movement, progress toward implementation has been active but challenging. Much of the angst and criticism has been directed at the approaches to assessment that are associated with outcomes-based or competency frameworks, particularly defining the outcomes. In addition, these changes to graduate medical education (GME) are concomitant with major change in health care systems-specifically, changes to increase quality and safety while reducing cost. Every sector, from medical education to health care delivery and financing, is in the midst of substantial change and disruption. The recent release of the Institute of Medicine's report on the financing and governance of GME highlights the urgent need to accelerate the transformation of medical education. One source of continued tension within the medical education community arises from the assumption that the much-needed increases in value and improvement in health care can be achieved by holding the current educational structures and architecture of learning in place while concomitantly withdrawing resources. The authors of this Perspective seek to reframe the important and necessary debate surrounding the current challenges to implementing OBME. Building on recent change and service theories (e.g., Theory U and coproduction), they propose several areas of redirection, including reexamination of curricular models and greater involvement of learners, teachers, and regulators in cocreating new training models, to help facilitate the desired transformation in medical education.

PMID:

 

26083400

 

[PubMed - in process]


가짜 교수의 의심스러운 수업: 의과대학생들의 생각없는 강의평가 (Med Educ, 2015)

A curious case of the phantom professor: mindless teaching evaluations by medical students

Sebastian Uijtdehaage & Christopher O’Neal






강의평가의 타당성에 대한 의문과 비뚤림에 대한 우려가 심해지고 있다. 또한 진실되지 못하다거나 학생들이 생각없이 평가를 해서 그 타당성을 해한다는 연구도 있다.

Unfortunately, there is a burgeoning body of research from undergraduate and professional courses at North American campuses that casts serious doubt on the validity of SETs,1 suggests they are rife with biases2,3 and even untruths,4,5 and indicates that students may complete evaluations in a mindless manner that further harms the validity of the process.6



우리는 8주 과목에 가상의 교수자를 집어넣었다. 이 교수자에게는 다른 교수들과는 확실히 구분되는, 성-중립적 이름을 넣고, 일반적 강의제목(폐질환 개론)을 넣었다. 학생들은 과목 종료 후 2주 이내에 이 가상의 교수자를 포함한 모든 교수에 대해서 강의평가를 하게 되었다.

We inserted one fictitious lecturer into the evaluation forms for two 8-week, pre-clinical, classroom-style courses (for the Year 2 class of 2010 and the Year 1 class of 2011). We gave these ‘lecturers’ gender-ambiguous names (e.g. ‘Pat Turner’, ‘Chris Miller’) that were distinct from existing names, and added generic lecture titles (e.g. ‘Introduction, Lung Disease’) (Table 1). Students were required to submit their anonymous ratings of all lecturers, including the fictitious ones, within 2 weeks after the course using our online evaluation system CoursEval (ConnectEDU, Inc., Boston, MA, USA).


학생들은 '적용불가능함' 이라는 옵션을 선택할 수도 있었다.

Students could choose not to evaluate a lecturer by marking the option ‘Not Applicable’.


그 다음 해에, 우리는 같은 절차를 매력적인 젊은 모델의 사진을 넣어서 다시 시행했다. 실제로 교육에 참여한 교수는 23명에서 52명 사이였다.

The following year, we repeated this process (in the classes of 2011 and 2012), but also included a small portrait (150 × 150 pixels) of an attractive young model who, perhaps regretfully, did not resemble any of our faculty members. The number of actual lecturers in each course ranged from 23 to 52, most of whom were depicted in portraits of similar dimensions in our evaluation system.



심지어 많은 학생들은 가상의 교수자의 교육 수행능력에 대한 코멘트를 남기기도 했다. 3명의 학생은 이 강의를 기억하지는 못하지만 이런 강의가 있었으면 한다라고 기술하였지만, 또 다른 3명의 학생은 '정말 훌륭한 강의였습니다' 와 같이 지어내서 기술하기도 했다.

A handful of students even went so far as to provide comments on the performance of the fictitious lecturers. Although three students explicitly stated that they did not recall the lectures but wished they had (‘I don’t think we had this lecture but it would have been useful!’), three other students confabulated: ‘She provided a great context’; ‘Lectures moved too fast for me’, and ‘More time for her lectures’.






이런 생각없는 평가는 근래만의 문제가 아니다. 본 연구 결과는 1977년 Reynolds의 기념비적 연구와도 상통한다. 당시 학부 심리학과 학생 대다수는, 둘 다 취소되어 실제로는 열리지 않았음에도, 성(性)에 관한 영화 상영을 심리학사 강의보다 높게 평가했다.

Mindless evaluation is not a modern problem. Our findings echo those described by Reynolds in a landmark 1977 paper.7 Like us, he serendipitously found that a vast majority of undergraduate psychology students rated a movie on sexuality higher than a lecture on the history of psychology, although in fact neither event had taken place (due to cancellations). Where our scenario becomes more problematic than that described by Reynolds7 is in the unique structure of a medical curriculum, in which a multitude of instructors teach in the same course and are evaluated in bulk by students.



Dunegan과 Hrivnak은 강의평가의 세 가지 위험요소를 꼽았다.

Dunegan and Hrivnak6 describe three risk factors that may encourage mindless evaluation practices: 

(i) the cognitively taxing nature of SETs; 

(ii) the lack of perceived impact of SETs on the curriculum, and 

(iii) the degree to which the evaluation task is experienced as just another routine ‘chore’. 

Clearly, all of these risk factors may present themselves in a medical school environment.


  • 첫 번째: With regard to the first risk factor, evaluating teachers is a cognitively demanding task when it is done conscientiously weeks after the fact.
  • 두 번째: Chen and Hoshower9 use expectancy theory to show that students are less motivated to partake in an activity such as evaluation if they fail to see the likelihood that the activity will lead to a desired outcome (e.g. teacher change).
  • 세 번째: Lastly, medical students can certainly be forgiven for finding evaluations to be painfully routine and burdensome. As SETs are part of the mainstay of teaching assessment, medical students fill out evaluations numerous times per year during all 4 years of their training




Dunegan과 Hrivnak의 프레임워크를 기반으로, 강의평가의 대안을 마련할 수 있다. 예컨대 코스를 시작할 때 일부 학생들이 '전향적(후향적이 아닌)' 평가자가 되는 것이다. 이들은 전문성 훈련의 일환으로 평가도구의 효과적 활용에 대해 먼저 교육을 받는다. 또한 팀으로서 이 학생들은 건설적 피드백을 제공하는 방법을 연습하고, 공동으로 종합적 보고서를 작성하여 과목 책임교수와 각 교수들에게 전달한다. 평가도구는 opinion-based evaluation이 아니라 predictive evaluation에 초점을 두는데, predictive evaluation은 같은 결과를 내면서도 더 적은 응답만을 요구한다는 것이 연구된 바 있다. 교수자들은 이 보고서에 응답할 것을 요구받을 수 있으며, 이러한 피드백이 어떻게 활용되거나 활용되지 않을 것인가를 설명해야 한다. 학생 평가팀을 통해 교육적으로 얻는 것이 많을 뿐 아니라, 이러한 방법은 학생들이 '생각을 하며' 평가를 하도록 하며, 더 중요하게는 교육개선과 승진결정에 더 강건한 기반을 제공한다.

Using the framework described by Dunegan and Hrivnak,6 we can conceive of an alternative approach to the SET that may mitigate the risk factors described here. For example, at the beginning of a course, a sample of students in a class could be charged to be prospective (not retrospective) course and faculty evaluators. As part of their professionalism training, these students could first be educated in the effective use of evaluation tools that can be employed in situ (e.g. with hand-held devices) and that do not rely on the activation of episodic memory. As a team, these students could practise providing constructive feedback when, upon completion of the course, they collaborate on a comprehensive report to the course chair and teachers involved in the course. Evaluation tools could be focused on predictive evaluations (e.g. by asking students to predict their peers’ opinions of a teacher) rather than on opinion-based evaluations; predictive evaluations have been shown to require fewer responses to achieve the same result.10 Faculty members could be required to respond to these reports and explain how the feedback is to be used or not used so that students understand the impact of their efforts. In addition to the educational benefit to be derived from practising teamwork and providing constructive feedback, such an approach may engage students in a mindful way and, importantly, may yield information that provides a more robust foundation for programme improvement and promotion decisions.



'학생들이 참석하지도 않은 강의를 평가하는 데 충분히 능숙해지면, 학기가 끝날 때까지 기다렸다가 평가를 할 필요도 없을 것이다'

The present study should raise a red flag to medical schools in which students are asked to evaluate numerous lecturers after a time delay. It defies common sense (and a huge body of literature) to expect that such an evaluation approach procures a solid foundation on which decisions regarding faculty promotions and course improvement can be based. If we continue along this path, we may just as well follow Reynolds’s tongue-in-cheek suggestion that ‘as students become sufficiently skilled in evaluating [. . .] lectures without being there, [. . .] there would be no need [for them] to wait until the end of the semester to fill out evaluations’.7









 2015 Sep;49(9):928-32. doi: 10.1111/medu.12647.

A curious case of the phantom professor: mindless teaching evaluations by medical students.

Author information

  • 1Center for Educational Development and Research, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, USA.

Abstract

CONTEXT:

Student evaluations of teaching (SETs) inform faculty promotion decisions and course improvement, a process that is predicated on the assumption that students complete the evaluations with diligence. Anecdotal evidence suggests that this may not be so.

OBJECTIVES:

We sought to determine the degree to which medical students complete SETs deliberately in a classroom-style, multi-instructor course.

METHODS:

We inserted one fictitious lecturer into each of two pre-clinical courses. Students were required to submit their anonymous ratings of all lecturers, including the fictitious one, within 2 weeks after the course using a 5-point Likert scale, but could choose not to evaluate a lecturer. The following year, we repeated this but included a portrait of the fictitious lecturer. The number of actual lecturers in each course ranged from 23 to 52.

RESULTS:

Response rates were 99% and 94%, respectively, in the 2 years of the study. Without a portrait, 66% (183 of 277) of students evaluated the fictitious lecturer, but fewer students (49%, 140 of 285) did so with a portrait (chi-squared test, p < 0.0001).
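
아래는 초록에 보고된 수치(초상화 없이 183/277, 초상화 포함 140/285)를 그대로 옮겨 카이제곱 검정을 재현해 보는 최소한의 Python 스케치이다. 저자들의 실제 분석 코드가 아니며, scipy를 이용한 예시일 뿐이다.

```python
# Minimal sketch (not the authors' code): reproduce the reported chi-squared
# comparison from the counts given in the abstract.
from scipy.stats import chi2_contingency

# rows: [rated the fictitious lecturer, marked "Not Applicable" / skipped]
no_portrait = [183, 277 - 183]    # 66% rated the phantom without a portrait
with_portrait = [140, 285 - 140]  # 49% rated the phantom with a portrait

chi2, p, dof, expected = chi2_contingency([no_portrait, with_portrait])
print(f"chi2 = {chi2:.1f}, df = {dof}, p = {p:.2g}")  # p < 0.0001, as reported
```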

CONCLUSIONS:

These findings suggest that many medical students complete SETs mindlessly, even when a photograph is included, without careful consideration of whom they are evaluating and much less of how that faculty member performed. This hampers programme quality improvement and may harm the academic advancement of faculty members. We present a framework that suggests a fundamentally different approach to SET that involves students prospectively and proactively.

© 2015 John Wiley & Sons Ltd.

PMID: 26296409 [PubMed - in process]


"상대적으로 본다면...", 대조효과가 평가자의 점수와 서술 피드백에 미치는 영향(Med Educ, 2015)

Relatively speaking: contrast effects influence assessors’ scores and narrative feedback

Peter Yeates,1 Jenna Cardell,2 Gerard Byrne3 & Kevin W Eva4







피교육자의 임상수행능력 평가의 중요성. 그러나 이 때 항상 문제가 되는 것은 평가자간 차이.

Accurate assessment of trainees’ clinical performance is vital both to guide trainees’ educational development1 and to ensure they achieve an appropriate standard of care.2 Although overall support exists for workplace-based assessment (WBA),3–6 inter-assessor score variability has been cited as a cause for concern.7,8



평가자에 대한 훈련은 대체로 드문 편이며, 평가자 훈련이나 평가척도 조정의 효과도 제한적이었다. 이러한 인터벤션에는, 평가자가 평가 과제를, 그리고 관찰된 수행과 비교 기준이 되는 정의가능한 기준 사이의 관계를 더 잘 이해하도록 도우면 수행능력 평정의 문제를 극복할 수 있다는 가정이 깔려 있다.

Training of assessors is generally sparse,9 and both training10,11 and scale alterations12,13 have produced only limited improvements. Implicit in such interventions is the assumption that challenges with rating performance can be overcome by helping raters better understand the rating task and the relationship between the observed performance and some definable criterion against which they can compare.


평가자의 판단은 두 가지 방향으로 나타난다. Contrast effect와 Assimilation effect.

As a result, the performance of recently viewed candidates can bias assessors’ judgements of current candidates. Such effects can (theoretically) occur in either of two directions: 

      • contrast effects occur when a preceding good performance reduces the scores given to a current performance by making the current performance seem poor ‘by contrast’, and 
      • assimilation effects occur when a preceding good performance increases the scores given to a current performance by focusing attention on similar aspects of performance.17


평가자가 관찰한 수행의 수준이 섞여 있을 때에는 다양한 상황이 발생할 수 있다. Primacy, Recency, Averaging.

When assessors observe a mixture of preceding performances, a variety of effects may occur: 

      • assessors may be most influenced by the initial performances they encounter (primacy), 
      • by the latest performances (recency), or 
      • by the aggregation of previous performances they have seen (averaging).



대조효과가 나타났을 때, 이것이 피교육자의 수행에 대한 평가자의 인식에 영향을 준 것인지, 평가자의 판단을 점수로 옮기는 과정에서 나타난 결과인지 알아보는 것이 목적.

An additional objective of the current study is to determine whether the observation of contrast effects represents an influence on assessors’ perceptions of performance or is an artefact of the way that assessors translate judgements into scores.



평가자의 판단을 점수로 옮기는 것이 어렵다는 것에 대한 지적

Crossley et al.19 have suggested that score variations arise in part because scales may not align with assessors’ thinking, suggesting that disappointing psychometric performance of WBA to date may stem not from disagreements about the performance observed, but from different interpretations of the questions and the scales. Other authors have suggested that assessments should focus on narrative comments rather than scores.20–23



Ecological validity를 보존하기 위하여 이 평가자들이 익숙하게 사용해온 형식을 활용함.

As this study was intended to be a fundamental examination of the mechanism through which rater judgements might be influenced, we chose to preserve ecological validity by not altering the assessment format to which these examiners were accustomed. Similarly, we did not specify additional criteria or provide additional training to assessors.



Study design

The study used an Internet-based, randomised, double-blinded design. 



6단계 평가

Scores were assigned using a 6-point scale with reference to expectations for completion of FY-1: 

      • 1 = well below expectations; 
      • 2 = below expectations; 
      • 3 = borderline; 
      • 4 = meets expectations; 
      • 5 = above expectations, and 
      • 6 = well above expectations. 

The UK FP intends these judgements to be criterion-referenced as ‘expectations for completion of FY-1’ are defined by reference to the outcomes of its curriculum. 26



T-test의 robustness. 

Adding to the literature suggesting that t-tests are very robust to deviations from normality,27–29 a recent systematic review has reported that equivalent non-parametric tests are more flawed than their parametric counterparts when such deviations exist.30 Nonetheless, we examined the skewedness of the distributions (and found it to be < 1 in all instances) prior to proceeding with analysis via independent-samples t-tests.
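
아래는 위 절차(각 집단의 왜도를 확인한 뒤 독립표본 t-검정 수행)를 가상의 점수로 흉내 낸 최소한의 스케치이다. 점수 값은 임의로 만든 가정이며, 논문의 실제 데이터가 아니다.

```python
# Minimal sketch with made-up scores (not the study's data): check skewness of
# each group, then run an independent-samples t-test, mirroring the analysis
# sequence described above.
import numpy as np
from scipy.stats import skew, ttest_ind

group_a = np.array([5.2, 4.8, 5.0, 5.5, 4.6, 5.1])  # hypothetical scores, condition A
group_b = np.array([4.3, 4.5, 4.1, 4.6, 4.2, 4.4])  # hypothetical scores, condition B

for name, scores in [("A", group_a), ("B", group_b)]:
    print(f"group {name} skewness: {skew(scores):.2f}")  # proceed if |skewness| < 1

t, p = ttest_ind(group_a, group_b)
print(f"t = {t:.2f}, p = {p:.3f}")
```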




자유형식 코멘트의 분석

Analysis of free-text comments

To address RQ 3, free-text feedback comments were coded using content analysis. Researchers, blinded to the group from which comments arose, segmented the feedback into phrases. Next, each comment was coded independently by two researchers to indicate whether it was ‘communication-focused’ (i.e. commenting on the candidate’s interpersonal skills) or ‘content-focused’ (i.e. commenting on the candidate’s knowledge or clinical skills). The two researchers used an initial subset of data to develop a shared understanding of the codes and then independently coded all remaining segments. The researchers also independently coded comments as positive, negative or equivocal. The independently coded comments were then compared and agreement was calculated using Cohen’s κ values. Discrepant codes were independently reconsidered by both researchers and remaining differences were discussed and resolved. Frequencies of each thematic category (communication and content) were calculated for each performance by each participant. The positive, negative and equivocal codes were assigned scores of +1, −1 and 0, respectively, and their sum was calculated for each participant for each performance. This variable was termed ‘positive/negative (pos/neg) balance’.
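
아래는 위에서 기술한 두 가지 계산, 즉 두 연구자 간 코딩 일치도(Cohen's κ)와 +1/−1/0 점수를 합산한 pos/neg balance를 예시 데이터로 보여주는 스케치이다. 라벨과 코멘트는 설명을 위해 임의로 만든 것이며 논문의 데이터가 아니다.

```python
# Minimal sketch (illustrative labels, not the study's data): inter-rater
# agreement via Cohen's kappa, and the pos/neg balance formed by summing
# +1 / -1 / 0 valence codes per performance.
from sklearn.metrics import cohen_kappa_score

rater1 = ["communication", "content", "content", "communication", "content"]
rater2 = ["communication", "content", "communication", "communication", "content"]
print("kappa:", round(cohen_kappa_score(rater1, rater2), 2))

valence = {"positive": 1, "negative": -1, "equivocal": 0}
comments = ["positive", "negative", "positive", "equivocal"]  # one participant, one performance
print("pos/neg balance:", sum(valence[c] for c in comments))  # +1 here
```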


피어슨 product moment correlation

We examined the relationship between scores and feedback measures using Pearson’s product moment correlations


Effect size

All analyses were performed using IBM SPSS Statistics for Windows Version 20.0 (IBM Corp., Armonk, NY, USA). A p-value of < 0.05 was set as the significance threshold. Cohen’s d is used to report effect size for all statistically significant comparisons. By convention, d = 0.8 is considered to represent a large effect, d = 0.6 represents a moderate effect, and d = 0.4 represents a small effect.
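
아래는 본문이 언급하는 Cohen's d(합동 표준편차 기준 효과크기)를 직접 계산해 보는 최소한의 스케치이다. 입력한 점수는 임의의 가정이다.

```python
# Minimal sketch: Cohen's d for two independent groups using the pooled
# standard deviation (the scores below are hypothetical, not the study's data).
import numpy as np

def cohens_d(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    pooled_var = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) / (len(a) + len(b) - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

print(round(cohens_d([5.0, 5.2, 4.8, 5.1], [4.3, 4.5, 4.2, 4.4]), 2))
```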














피드백의 valence(자극과 연합된 정서적 가치)는 수행능력 점수와 중간-강한 상관관계를 보였다. 그러나 피드백의 내용은 평가자가 최근에 본 다른 수행들에 영향을 받지 않았는데, 이는 평가자들이 조건과 무관하게 비슷한 이슈를 파악하되, 이전에 본 사례에 따라 그 심각도에 대한 판단이 달라짐을 시사한다.

The valence of feedback showed moderate to strong relationships with performance scores and revealed similar influences of contrast. The content of participants’ feedback, by contrast, was not altered by their recent experiences of other performances, which suggests that raters across conditions identified similar issues, but interpreted their severity differently depending on the cases they had previously seen.



단 한 차례의 경험만으로도 다음 평가에 영향을 준다. 장기기억에 있는 정보보다 즉각적 맥락에 의해 더 영향을 받음.

Firstly, this study has shown that a single performance is sufficient to produce a contrast effect on assessors’ judgements despite the fact that participants claimed to have considerable experience in conducting assessments of this type. Schwarz 32 has suggested that contrast effects occur because humans are more readily influenced by information from their immediate context than by information in long-term memory.



즉각적 맥락에 의해 영향을 받긴 하나, 이전 평가의 평균적 경험에 기반하여 비교한다.

This study further suggests that whilst assessors are readily influenced by information in their immediate context, the comparison they make is against an evolving standard that may be based on the average of preceding performances.



자신의 판단을 숫자(점수)로 환산하는 과정에서 발생하는 오류인가?

Prior studies have yielded questions about whether apparent biases arise because assessors find it difficult to translate their judgements into scores.24,34




평가자의 피드백 내용 자체는 변하지 않았다. 다만 그것의 중요도에 대한 인식이 달라진 것으로 보인다. 단순하게 판단을 점수로 변환하는 과정이 아니라 평가자의 근본적 인상에 영향을 미치는 것으로 보인다.

In this study, the content of assessors’ feedback (linguistic expressions of their judgement) was unchanged as a function of our experimental manipulation, which suggests that they saw the issues similarly in each condition. The considerable variability in the valence of their comments, however, suggests that the perceived severity of the issues observed was influenced by contrasts with previously seen cases. This suggests that such cases fundamentally influence assessors’ impressions of performance rather than simply biasing their translation of judgements into scores.


소규모 프로그램에서는 장기간에 걸쳐서 상대적으로 작은 숫자의 학생만을 만나게 되고 학생 하나하나가 서로 더 비교되는 효과를 낳을 수도 있다.

In smaller programmes, however, such as the longitudinal integrated clerkships that are becoming increasingly popular,36 particular examiners will interact with a relatively small number of trainees over a long period of time, creating risk that the trainees will appear more different from one another than is realistic because they are implicitly contrasted against one another.


평가에 대한 사회적 압력이 있다. 많은 경우 평가자는 피평가자와 계속 함께 일을 해야하므로 긍정적인 쪽으로 비뚤림이 생기는 경우가 많다. 평가를 내릴 수 있는 폭이 좁을수록 대조효과가 현실적으로 영향을 미칠 가능성이 낮아질 것이다. higher-stake평가에서는 평가자와 피평가자가 서로 모르는 상황에서 이루어지기 때문에 평가자가 더 분포를 넓게 할 수 있고, 대조효과의 영향도 더 커질 것이다. 

Finally, it should be noted that social pressures of various types can have both implicit and explicit influences on the ratings that assessors assign. In many assessment contexts, including those in which assessors must continue to work with those being assessed, a positive skew in the ratings assigned is commonplace. The more the ratings are compressed into a narrow range, the less likely it is that a contrast effect of discernible practical significance will be observed. In higher-stakes examinations in which the examiner and examinee are unknown to one another or in the context of a video-based review of performance in which the examiner is anonymous (as in this study), raters may be more likely to spread their ratings out, thereby creating greater potential for the psychological contrasts observed in this study to be seen to have influence.







 2015 Sep;49(9):909-19. doi: 10.1111/medu.12777.

Relatively speaking: contrast effects influence assessors' scores and narrative feedback.

Author information

  • 1Centre for Respiratory Medicine and Allergy, Institute of Inflammation and Repair, University of Manchester, Manchester, UK.
  • 2Royal Bolton Hospital, Bolton NHS Foundation Trust, Bolton, Lancashire, UK.
  • 3Health Education North West, Health Education England, Manchester, UK.
  • 4Centre for Health Education Scholarship, Division of Medicine, University of British Columbia, Vancouver, BC, Canada.

Abstract

CONTEXT:

In prior research, the scores assessors assign can be biased away from the standard of preceding performances (i.e. 'contrast effects' occur).

OBJECTIVES:

This study examines the mechanism and robustness of these findings to advance understanding of assessor cognition. We test the influence of the immediately preceding performance relative to that of a series of prior performances. Further, we examine whether assessors' narrative comments are similarly influenced by contrast effects.

METHODS:

Clinicians (n = 61) were randomised to three groups in a blinded, Internet-based experiment. Participants viewed identical videos of good, borderline and poor performances by first-year doctors in varied orders. They provided scores and written feedback after each video. Narrative comments were blindly content-analysed to generate measures of valence and content. Variability of narrative comments and scores was compared between groups.

RESULTS:

Comparisons indicated contrast effects after a single performance. When a good performance was preceded by a poor performance, ratings were higher (mean 5.01, 95% confidence interval [CI] 4.79-5.24) than when observation of the good performance was unbiased (mean 4.36, 95% CI 4.14-4.60; p < 0.05, d = 1.3). Similarly, borderline performance was rated lower when preceded by good performance (mean 2.96, 95% CI 2.56-3.37) than when viewed without preceding bias (mean 3.55, 95% CI 3.17-3.92; p < 0.05, d = 0.7). The series of ratings participants assigned suggested that the magnitude of contrast effects is determined by an averaging of recent experiences. The valence (but not content) of narrative comments showed contrast effects similar to those found in numerical scores.

CONCLUSIONS:

These findings are consistent with research from behavioural economics and psychology that suggests judgement tends to be relative in nature. Observing that the valence of narrative comments is similarly influenced suggests these effects represent more than difficulty in translating impressions into a number. The extent to which such factors impact upon assessment in practice remains to be determined as the influence is likely to depend on context.

© 2015 John Wiley & Sons Ltd.

PMID: 26296407 [PubMed - in process]


의과대학 입학시 스무고개 수행능력을 통한 임상수행능력 예측 (Med Educ, 2015)

Twenty Questions game performance on medical school entrance predicts clinical performance

Reed G Williams1 & Debra L Klamen2






"지식이 있다"는 것은 무엇인가? - 네 가지가 있음

-'누가'에 대한 지식

-'무엇을'에 대한 지식

-'어떻게'에 대한 지식

-'언제'에 대한 지식

White,2 in a careful analysis of what it means to possess knowledge, concluded that knowledge is manifested as ability and that ability takes many forms. The more typical and easily measured forms of knowledge involve knowing...

      • who (e.g. who invented the transistor) or 
      • what (e.g. what significant event occurred in New York City on 11 September 2001). 


However, knowing also includes knowing 

      • how (e.g. how to ride a bike) and 
      • when (e.g. when to add elements to a mixture in a process). 


'안다'라는 것은 단순히 기존에 알려진 답을 다시 되풀이하는 것이 아니라 새로운 문제에 대한 답을 찾아내는 기능을 뜻한다.

Most importantly, White argued that knowing does not merely involve the ability to produce old answers previously acquired, but also the facility to find new answers to new problems.2



기존의 지식은 최소한 다섯 가지 형태로 결합될 수 있다. (사실, 개념과 원리, 절차, 전략, 신념)

To cope successfully with the tasks and demands of everyday life, humans must be proficient in combining previously learned knowledge, skills and attitudes (beliefs) into at least five forms: 

  • facts (bits of information); 
  • concepts and principles (e.g. knowledge of cause-and-effect relationships); 
  • procedures (e.g. knowledge of step-by-step processes regarding how and when to carry out an action, such as in how and when to carry out long division computations); 
  • strategies (general methods for approaching problems such as by breaking a problem into parts), and 
  • beliefs (e.g. stable attitudes that lead to predictable behaviours, such as beliefs about the factors that lead to patient behaviour changes). 

이 다섯 가지 요소의 다양한 조합을 통해서 업무를 수행하게 된다.

All five fit White’s2 construction of knowledge as ability. Various combinations of these five elements are compiled and drawn upon to meet the various tasks and demands placed on people.


본 연구는 스무고개 게임의 능력이 일상생활에서 얻은 지식과 어떻게 그것을 구조화하고, 효과적으로 저장하여 효율적/효과적으로 인출, 결합, 활용할 수 있는가를 보여줄 수 있다는 전제에서 시작하였다.

The present study is based on the premise that the TQ parlour game tests the knowledge people have acquired in the course of their everyday lives and how well their organising and storing of that knowledge allows them to efficiently and effectively retrieve, combine and use it to address the challenges posed in everyday life.


본 연구에서 가능한 추가적 이점은 '인내심'의 척도로 활용가능하다는 것이다. 답을 맞추지 못하였더라도 20개의 질문을 모두 한 학생이 있고 중간에 포기한 학생이 있다.

A further benefit of this task is that it provides a  measure of perseverance. Students who fail to solve the problem posed and quit before they have used their entire quota of questions may be providing evidence of low perseverance, which may be a negative indicator of potential success as a medical student.




Study design

This was a prospective, longitudinal, observational cohort study. All students entering Southern Illinois University School of Medicine in 2009 were invited to play a single game of TQ on a non-medical topic during the first week of medical school



Description of TQ tasks and game process

Each participating entering student played a single game of TQ in a one-to-one encounter with the investigator at the time of his or her orientation to medical school. The TQ tasks posed were based on non-medical knowledge acquired through normal life experiences. A number of objects (correct answers) were selected in advance. The object (correct answer) for each participant was selected using a random selection process.


The investigator kept an essentially verbatim record of the number and nature of the questions asked and guesses offered by the student. The investigator also kept detailed notes about the strategies used in playing the game.


질문의 접근법을 네 가지로 구분

Based on the notes taken about student performance on the TQ task, a second investigator, blinded to SCCX and diagnosis justification (DJ) performance, classified the performance into one of four groups based on the student’s approach to the task: Essentially random; Somewhat random; Somewhat logical, and Logical.


논리적 접근은 이런 것이다.

These performances: 

(i) started out with broad questions in order to define the right path; 

(ii) built new questions based on previous answers, and 

(iii) offered educated guesses rather than random (off-the-wall) guesses.



Diagnosis justification exercise (진단 정당화 시험)

For eight cases, students were required to provide a written justification for their final diagnosis as part of the post-encounter exercise. The specific instructions given to students were as follows: ‘Please explain your thought processes in getting to your final diagnosis; how you used the data you collected from the patient and from laboratory work to move from your initial differential diagnoses to your final diagnosis. Be thorough in listing your key findings (both pertinent positives and negatives) and explaining how they influenced your thinking’.


Each response was blindly read and rated by two physician judges (the case author and one additional physician who was an expert in diagnostic reasoning). More details on this task and its use have been published in earlier manuscripts.11,12





Logical Group이 하위 두 개 그룹보다 SCCX와 DJ Performance가 우수함


Logical Group이 SCCX와 DJ에서 가장 우수하며 경향성을 보임


답이 틀렸어도 끝까지 20문항을 모두 질문했는지 여부에 따른 차이


DJ exercise 예측력에 있어 MCAT 점수와의 비교




Effect size interpretations (large and medium) are based on conventions described by Kirk.13




다른 의과대학, 문화권에서 확인이 필요함

As with any research results, confidence in these results will increase if the study can be successfully replicated in this and other medical schools. It is especially important to determine whether similar results are observed in other cultures because it is certainly possible, if not probable, that the results will vary based on the child-rearing and educational practices observed elsewhere.









 2015 Sep;49(9):920-7. doi: 10.1111/medu.12758.

Twenty Questions game performance on medical school entrance predicts clinical performance.

Author information

  • 1Department of Surgery, Indiana University School of Medicine, Indianapolis, IN, USA.
  • 2Department of Medical Education, Southern Illinois University, Springfield, IL, USA.

Abstract

CONTEXT:

This study is based on the premise that the game of 'Twenty Questions' (TQ) tests the knowledge people acquire through their lives and how well they organise and store it so that they can effectively retrieve, combine and use it to address new life challenges. Therefore, performance on TQ may predict how effectively medical school applicants will organise and store knowledge they acquire during medical training to support their work as doctors.

OBJECTIVES:

This study was designed to determine whether TQ game performance on medical school entrance predicts performance on a clinical performance examination near graduation.

METHODS:

This prospective, longitudinal, observational study involved each medical student in one class playing a game of TQ on a non-medical topic during the first week of medical school. Near graduation, these students completed a 14-case clinical performance examination. Performance on the TQ task was compared with performance on the clinical performance examination.

RESULTS:

The 24 students who exhibited a logical approach to the TQ task performed better on all senior clinical performance examination measures than did the 26 students who exhibited a random approach. Approach to the task was a better predictor of senior examination diagnosis justification performance than was the Medical College Admission Test (MCAT) Biological Science Test score and accounts for a substantial amount of score variation not attributable to a co-relationship with MCAT Biological Science Test performance.

CONCLUSIONS:

Approach to the TQ task appears to be one reasonable indicator of how students process and store knowledge acquired in their everyday lives and may be a useful predictor of how they will process the knowledge acquired during medical training. The TQ task can be fitted into one slot of a mini medical interview.

© 2015 John Wiley & Sons Ltd.

PMID: 26296408 [PubMed - in process]


입학 후 학업능력 예측: 학생의 배경환경과 과거 학업능력의 상대적 중요성 (Med Educ, 2015)

Predicting performance: relative importance of students’ background and past performance

Karen M Stegers-Jager,1 Axel P N Themmen,1,2 Janke Cohen-Schotanus3 & Ewout W Steyerberg4






academic failure와 관계되어있다고 보고된 입학전 특성들

Pre-admission characteristics that have been reported to relate to academic failure are ethnic minority status,6,7 maturity,8,9 male gender,7,8,10 and lower levels of previous academic performance, in particular low Medical College Admission Test (MCAT) scores and low science grade point averages (GPAs).7–9


의과대학 입학 후 첫 달의 성적과의 관계를 밝힌 것도 있음.

Several studies have confirmed the relationship between student performance during the first months at university and subsequent performance. 15–17 A recent study by Winston et al. showed that results on an examination administered after the first 2 weeks of medical school represented a strong early predictor of success or failure.4



그러나 각 의학교육의 단계마다 낮은 학업능력의 risk factor는 서로 다름

In addition, it has been shown that risk factors for poor performance vary at different stages of the medical course.7,19,20




Participants and procedure

코호트 선택의 이유

Students from six consecutive cohorts (2002–2007) at Erasmus MC Medical School were included in this study (n = 2357). We selected these six cohorts for two reasons: (i) the curriculum did not change during this period, and (ii) data on the ethnicity of these cohorts were available from a national database of students in higher education in the Netherlands (1cijferHO). Data on academic performance were derived from the university student administration system and confidentiality was guaranteed. Because data were collected as part of regular academic activities, individual consent was not necessary.



모든 variable을 넣고 logistic regression.

As all variables are known to be associated with medical school performance, all were entered simultaneously in a multivariable logistic regression model. We also included cohort as a stratification variable. Statistical interaction terms were used to study the potentially differential effects of one predictor by values of another predictor.
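
아래는 위에 기술된 분석(모든 예측변수를 동시에 투입한 다변량 로지스틱 회귀, 코호트 변수 포함, 상호작용항 검토)을 statsmodels로 흉내 낸 스케치이다. 변수명과 파일명은 설명용 가정이며(논문의 실제 변수가 아님), 코호트를 범주형 공변량으로 넣은 것은 논문이 말하는 층화를 단순화한 근사이다.

```python
# Hedged sketch (not the authors' SPSS analysis): all predictors entered
# simultaneously in a multivariable logistic regression, cohort included as a
# categorical covariate (a simplification of stratification), plus an
# interaction term (here gender x GPA at 4 months) to test differential effects.
# Column names and the CSV file are illustrative assumptions.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("students.csv")  # hypothetical: one row per student

model = smf.logit(
    "year1_completed ~ gender * gpa_4months + pu_gpa + age_group + ethnicity + C(cohort)",
    data=df,
).fit()
print(model.summary())  # exponentiate model.params to obtain odds ratios
```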










1학년의 첫 4개월의 GPA의 영향은 남학생보다 여학생에서 두드러졌다. 같은 성적이라면 남학생이 1학년을 이수할 가능성이 더 높았다.

The effect of GPA at 4 months on the Year 1 completion rate was less prominent for males than for females: male students with a high GPA at 4 months were less likely to complete the Year 1 course on time than female students with a similar GPA, whereas male students with a low GPA at 4 months (< 4.5 on a scale of 1–10) were more likely to complete the Year 1 course on time than female students with a similar GPA.



본 연구는 미래 학업능력 예측에 가장 최근의 학업능력(입학 전이든 후든)의 중요성을 보여준다. 그러나 임상실습 단계에서는 학생의 background가 주요한 예측인자로 작용한다.

This study confirms the importance of the most recent past performance – either before or at medical school – as a predictor of future performance in pre-clinical training. However, it also reveals that in clinical training the student’s background becomes the main predictor of performance as students from all minority groups and first-generation university students had a higher risk of achieving lower clinical grades in a model that included pre-clinical performance.



입학전 GPA는 입학전 자료만 포함한 모델에서는 가장 강력한 예측인자였지만, 입학 후 초기 수행능력이 포함된 모델에서는 유의성을 잃었다. 즉 의과대학 입학 직후 4개월간의 학업능력이 훨씬 더 중요하다는 것이다. 이는 학생과 학업환경의 상호작용의 첫 결과(의과대학에서의 첫 성적)를 바탕으로 위험학생을 선별하는 것이 입학시점의 자질만을 바탕으로 한 선별보다 더 정확하다는 견해를 뒷받침한다.

However, our study offers a more nuanced picture. Although pu-GPA was the most important predictor in the model that included only preadmission data, this factor was rendered insignificant by the addition of early performance at medical school. In other words, the factor pu-GPA was greatly outweighed by study performance data that became available during the first 4 months at medical school. This confirms the suggestion that the identification of at-risk students based on the first results of the interaction between a student and the academic environment is more accurate than identification using entry qualifications and is also in line with the findings of others.4,18



그러나 남학생의 경우는 1학년 학업능력을 첫 몇 개월의 성적으로 예측하는 것에서 예외였는데, 여학생에서보다 남학생에서 예측력이 더 낮았다. 이는 아마도 남학생의 self-efficacy가 더 높은 것이 원인일 수 있다. 남학생은 첫 4개월동안 성적이 높으면 스스로를 과대평가 할 수도 있으나, 낮은 성적을 받는 것 역시 여학생에 비해서 self-confidence에 덜 detrimental하다.

Apparently, the differences found in Year 1 performance for these subgroups can be explained to a large extent by performance during the first months at medical school. An exception to this is male gender, as being male remains a predictor of poorer Year 1 performance after the addition of early medical school performance. The interaction effect we found between gender and GPA at 4 months suggests that early performance is less predictive of later performance for males than it is for females. This may be explained by higher self-efficacy in males than in females, which has been reported previously in medical students.30,31 In male students, high grades during the first 4 months may lead to an over-estimation of their own ability, whereas achieving low early grades may be less detrimental to their self-confidence than it is in female students



인종과 사회적 배경은 전임상 수행능력을 보정한 후에도 임상 수행능력의 중요한 예측인자였다. 이는 임상수련에서의 수행이 전임상교육과는 다른 기전으로 설명됨을 시사한다. 한 가지 가능한 기전은 문화적 자본(cultural capital)이다. Bourdieu에 따르면 문화적 자본이란 '특정한 사회적 환경에 스며 있는 규범, 스타일, 관습, 취향에 대한 지식으로, 개인이 성공 가능성을 높이는 방식으로 그 환경을 헤쳐나갈 수 있게 해주는 것'이다. 비전통적인(소수인종 혹은 가계 내 최초 대학입학자) 학생의 문화적 자본은 institutional habitus에 잘 맞지 않을 수 있다. 임상실습에서는 보다 주관적인 평가방법이 많이 사용되기 때문에, 문화적 자본의 역할이 전임상교육 기간보다 더 두드러질 수 있다.

The finding that ethnicity and social background were important predictors of clinical performance, even after adjusting for pre-clinical performance, suggests that performance in clinical training is explained by mechanisms other than those referred to in pre-clinical training. A possible mechanism refers to the concept of cultural capital, which, according to Bourdieu, can be understood as ‘knowledge of the norms, styles, conventions and tastes that pervade specific social settings and allow individuals to navigate them in ways that increase their odds of success’ (see Massey et al.33). The cultural capital of non-traditional – ethnic minority and first-generation university – medical students is less likely than that of traditional medical students to be recognised and positively valued within medical school; that is, it does not fit the ‘institutional habitus’ (see Thomas34). As more subjective examination methods are used in clinical training than in pre-clinical training,35 it may be that the role of cultural capital is more prominent during clinical than pre-clinical training. Although further research is required to confirm these proposed effects of cultural capital, our assumption is supported by our finding that having a medical doctor as a parent is related to poorer performance in pre-clinical but better performance in clinical training.


임상실습과 관련하여 또 다른 가능한 기전은, 문화적 자본과도 관련되어 있지만, 평가자의 문화적 편견이다. 사람들은 자기와 같은 그룹에 속한 사람을 더 긍정적으로 보는 경향(in-group bias)이 있고, 자신과 비슷하거나 자신이 좋아하는 사람과 비슷한 사람을 더 신뢰하는 경향이 있다(similarity principle). 평가자가 이러한 자동적 반응을 인식하고 스스로 통제하려 하지 않는 한, 소수인종이나 first-generation university student는 '전통적' 학생그룹보다 낮은 점수를 받기 쉽다.

Another possible mechanism in clinical training, which is related to cultural capital, is cultural bias on the part of the examiners. Inevitably, people will have more positive views of those they believe to be part of their group (referred to as ‘in-group bias’36) and people tend to trust those who are similar to themselves or who are similar to people they like (a phenomenon known as the ‘similarity principle’37). Unless traditional examiners are aware of and attempt to control these automatic reactions,38 it is likely that ethnic minority and first-generation university students will receive lower grades than their traditional counterparts. More detailed experimental studies may assist in elucidating the processes underlying judgement and decision making in clinical assessments.


본 연구의 첫 번째 한계점은 일부 factor에 대해서 제한된 숫자의 학생만 응답한 것인데 missing value에 대해서 multiple imputation 기법을 활용하였다. 

A first limitation of our study is that data on the pre-admission factors ‘first-generation university student’ and ‘medical doctor as parent’ were collected for a restricted number of participants. However, to deal with the missing values, we used the technique of multiple imputation, which is widely accepted as suitable.26 As they allow the use of data that are available for other predictors that would otherwise be lost, imputation methods, especially multiple imputations, are superior to complete case analysis. 26,41,42 The ORs calculated in the imputed dataset in our study were similar and, if different, generally more conservative than the ORs in the unimputed dataset (Table S1).
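
아래는 위에서 언급된 다중대체(multiple imputation)의 아이디어를 보여주는 최소한의 스케치이다. 저자들이 사용한 절차나 소프트웨어가 아니며, 확률적 대체를 여러 번 수행해 각 모델의 승산비를 단순 평균하는 근사(완전한 Rubin's rules는 아님)이다. 변수명은 설명용 가정이다.

```python
# Minimal sketch (not the authors' procedure): create several stochastic
# imputations of partially observed pre-admission predictors, refit a logistic
# model on each, and average the odds ratios. This is a simplified pooling,
# not full Rubin's rules; variable names are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
import statsmodels.api as sm

df = pd.read_csv("students.csv")  # hypothetical data with missing predictor values
predictors = ["pu_gpa", "first_gen_university", "doctor_parent"]

odds_ratios = []
for seed in range(5):  # five imputed datasets
    imputed = IterativeImputer(sample_posterior=True, random_state=seed).fit_transform(df[predictors])
    X = sm.add_constant(imputed)
    fit = sm.Logit(df["year1_completed"].to_numpy(), X).fit(disp=0)
    odds_ratios.append(np.exp(np.asarray(fit.params)[1:]))

print("pooled odds ratios:", np.round(np.mean(odds_ratios, axis=0), 2))
```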




 2015 Sep;49(9):933-45. doi: 10.1111/medu.12779.

Predicting performance: relative importance of students' background and past performance.

Author information

  • 1Institute of Medical Education Research Rotterdam, Erasmus MC University Medical Centre Rotterdam, Rotterdam, the Netherlands.
  • 2Department of Internal Medicine, Erasmus MC University Medical Centre Rotterdam, Rotterdam, the Netherlands.
  • 3Centre for Research and Innovation in Medical Education, University Medical Centre Groningen, University of Groningen, Groningen, the Netherlands.
  • 4Centre for Medical Decision Making, Department of Public Health, Erasmus MC-University Medical Centre Rotterdam, Rotterdam, the Netherlands.

Abstract

CONTEXT:

Despite evidence for the predictive value of both pre-admission characteristics and past performance at medical school, their relative contribution to predicting medical school performance has not been thoroughly investigated.

OBJECTIVES:

This study was designed to determine the relative importance of pre-admission characteristics and past performance in medical school in predicting student performance in pre-clinical and clinical training.

METHODS:

This longitudinal prospective study followed six cohorts of students admitted to a Dutch, 6-year, undergraduate medical course during 2002-2007 (n = 2357). Four prediction models were developed using multivariate logistic regression analysis. Main outcome measures were 'Year 1 course completion within 1 year' (models 1a, 1b), 'Pre-clinical course completion within 4 years' (model 2) and 'Achievement of at least three of five clerkship grades of ≥ 8.0' (model 3). Pre-admission characteristics (models 1a, 1b, 2, 3) and past performance at medical school (models 1b, 2, 3) were included as predictor variables.

RESULTS:

In model 1a - including pre-admission characteristics only - the strongest predictor for Year 1 course completion was pre-university grade point average (GPA). Success factors were 'selected by admission testing' and 'age > 21 years'; risk factors were 'Surinamese/Antillean background', 'foreign pre-university degree', 'doctor parent' and male gender. In model 1b, number of attempts and GPA at 4 months were the strongest predictors for Year 1 course completion, and male gender remained a risk factor. Year 1 GPA was the strongest predictor for pre-clinical course completion, whereas being male or aged 19-21 years were risk factors. Pre-clinical course GPA positively predicted clinical performance, whereas being non-Dutch or a first-generation university student were important risk factors for lower clinical grades. Nagelkerke's R(2) ranged from 0.16 to 0.62.

CONCLUSIONS:

This study not only confirms the importance of past performance as a predictor of future performance in pre-clinical training, but also reveals the importance of a student's background as a predictor in clinical training. These findings have important practical implications for selection and support during medical school.

© 2015 John Wiley & Sons Ltd.

PMID: 26296410 [PubMed - in process]



"문제"학생: 이 문제는 누구의 문제인가? AMEE Guide No.76

The ‘‘problem’’ learner: Whose problem is it? AMEE Guide No. 76

YVONNE STEINERT

McGill University, Canada






Introduction

이 가이드는 문제학생들을 대처하는 방법에 대한 프레임워크를 제시하고자 한다.

Clinical teachers often work with students or residents whom they perceive as a “problem”. For some, it is a knowledge deficit that first alerts them to a problem; for others it is an attitudinal problem or distressing behaviour (Steinert & Levitt 1993). And in some cases, it is difficult to know if the learner is, indeed, presenting with a problem. The goal of this Guide is to outline a framework for working with “problem” learners, which includes strategies for identifying and defining learners’ problems, designing and implementing appropriate interventions, and assuring due process. The potential stress of medical school and residency training will also be addressed, as will a number of prevention strategies. Although some of the issues involved in teaching students and residents may differ (e.g. length of exposure to the learner; available methods of assessment), the principles for working with “problem” learners remain the same. Moreover, although many of the examples in the Guide come from working with students and residents in medical specialties, the approaches apply to learners in all of the health professions (e.g. Clark et al. 2008). Identifying learners’ problems early – and providing guidance from the outset – can be an important investment in the training and development of future health professionals. It is hoped that this Guide, based on experiences in working with students and residents (Steinert & Levitt 1993; Steinert 2008) will be of help to clinical teachers, program directors, and faculty developers.


정의 Definitions

다양한 용어가 사용된 바 있다. ABIM에서는 "문제 레지던트"를 "윗사람 - 대체로 프로그램 관리자나 수석전공의 - 으로부터 인터벤션이 필요할 정도로 중대한 문제를 일으킨 전공의"라고 정의하고 있다. 다른 연구자는 "정동적, 인지적, 구조적, 대인관계적 어려움으로 인해서 학업 수행능력이 크게 떨어지는 사람"이라고 정의한 바 있으며, 정서적 스트레스나 물질 남용으로 인한 이차적 손상으로 인한 문제를 지적하기도 한다. 본 가이드에서는 "지식, 태도, 술기의 중대한 문제로 인해서 훈련 프로그램의 기대치를 충족시키지 못하는 학생/레지던트"라고 정의하고자 함.

A variety of terms have been used to describe the “problem” learner: the “resident in difficulty”; the “troublesome learner”; the “disruptive student”; and the “impaired physician” (Shapiro et al. 1987; Grams et al. 1992; Gordon 1993; Steinert et al. 2001; Yao & Wright 2001). The American Board of Internal Medicine (1999) has defined a “problem resident” as a “trainee who demonstrates a significant enough problem that requires intervention by someone of authority, usually the program director or chief resident”, whereas Vaughn et al. (1998) have provided the following definition: “a learner whose academic performance is significantly below performance potential because of a specific affective, cognitive, structural, or interpersonal difficulty”. The term has also been used to refer to impairment, secondary to emotional stress or substance abuse (Grams et al. 1992). This Guide will define a “problem” learner as a student or resident who does not meet the expectations of the training program because of a significant problem with knowledge, attitudes or skills (Steinert 2008).



유병률 Prevalence

Prevalence를 보고한 연구는 많지 않지만, 보고된 비율은 5.8%에서 9.1%에 이른다. 한 연구에 따르면 가장 흔한 문제는 '불충분한 의학지식', '부족한 임상판단', '비효율적 시간사용' 등이었다.

Studies reporting the prevalence of “problem” learners are limited (Roback & Crowder 1989; Yao & Wright 2000; Reamy & Harman 2006). However, reported rates vary from 5.8% over a four-year period in a Psychiatry program (Yao & Wright 2000) to 9.1% over a 25-year period in a Family Medicine program (Reamy & Harman 2006). In one study (Yao & Wright 2000), the most frequent problems identified by teachers were: insufficient medical knowledge (48%); poor clinical judgment (44%); and inefficient use of time (44%). In another study (Reamy & Harman 2006), insufficient knowledge and attitudinal problems were identified as the most common challenges, followed by interpersonal conflict, psychiatric illness, family stress and substance abuse. Not surprisingly, “problem” residents rarely identify themselves (Yao & Wright 2000).


중요하게 기억해야 할 사실 중 하나는, '문제학생'을 다룰 때 우리는 흔히 선생으로서 선입견을 갖기 쉽지만, 대부분의 학습자는 강한 학업수행능력과 높은 성공 동기를 가지고 있다는 사실이다. 또한 Brenner et al은 "대부분의 지원자는 졸업할 때까지 별다른 간섭이 필요 없는 성공적인 레지던트가 될 것이다. 이들은 그 길을 가는 동안 평균적인 정도의 어려움을 겪을 것이다"라고 했다. 그러나 문제학생의 존재는 프로그램 전체에 영향을 줄 수 있다. 왜냐하면 이들에 대한 모니터링, 상담, remediation등이 프로그램과 교수의 자원을 잡아먹기 때문이다. 일부 교육자들은 '문제학생'의 존재가 프로그램 전체의 integrity를 손상시키거나, 동료들의 경험에 악영향을 줄 것을 우려한다.

It is also important to remember that, although working with “problem” students or residents can easily color our perceptions as teachers, the majority of learners demonstrate strong academic performance and high motivations to succeed (Hays et al. 2011). Moreover, as Brenner et al. (2010) have stated, “most applicants will become successful residents who progress without interruption towards graduation, facing only the usual stumbles of normal professional development along the way”. However, the presence of a “problem” learner can significantly affect an entire program (Brenner et al. 2010), as increased monitoring, counseling, or remediation may tax the resources of both the program and the faculty. Some educators also fear that the presence of a “problem” learner may damage the integrity of the training program or negatively influence the experience of peers (Yao & Wright 2001).


선생으로서 우리는 어떤 학생이 이러한 '문제아'가 될 것인가를 알고 싶어 한다. 미리 안다면 회피할 수 있기 때문이다. 그러나 지금까지 여러 연구에서 의과대학/레지던트 지원자를 screen하거나 예측하는 용도로 신뢰할 수 있는 요인은 밝혀진 바 없다.

As teachers, we often wonder if it is possible to predict who will become a “problem” learner, hoping that we can avoid some of the anguish that is related to this educational experience. To date, however, studies have not been able to isolate factors that we can reliably use to either screen applicants to medical school/residency or predict future problems (Dubovsky et al. 2005; Brenner et al. 2010).



증상과 증후 “Signs and symptoms”

학습자가 어려움에 처해 있음을 시사하는 다양한 종류의 징후가 있다.

A range of “signs” may suggest that a learner is in difficulty (Evans & Brown 2010; Evans et al. 2010). These signs include 

    • failing a written or practical test; 
    • poor (or late) attendance at regularly scheduled events; 
    • inadequate knowledge or clinical skills that are inconsistent with stage of training; 
    • unprofessional behaviors with patients or peers; 
    • poor interpersonal skills; 
    • a lack of insight; 
    • anxiety; 
    • depression or reluctance to become part of the team. 
    • A lack of professional behavior is also a common indicator (Bennett et al. 2005; Greenburg et al. 2007). 


Hays 등은 전형적인 문제로 학습기술의 부족, 조직화 기술의 부족, 정신건강 악화, 미성숙함, 통찰(insight)의 부족, 중대한 개인적 위기 등을 언급한다.

In an exploratory study, Hays et al. (2011) developed a framework of “typical” problems that included poor learning skills, poor organizational skills, poor mental health, immaturity, poor insight and major personal crises. Interestingly, a lack of insight has been identified as one of the most difficult problems to address.


학습자들이 다양한 요인의 결과로서 어려움을 겪을 수 있다는 것을 알아야 한다.

It is also important to note that learners can encounter difficulty as a result of many factors, including exhaustion and fear of failing, substance abuse, illness, family and personal issues or academic challenges (Bennett & O’Donovan 2001; Tyssen & Vaglum 2002; Evans & Brown 2010). Mental and physical illnesses, as well as learning disabilities, are relatively common in the general population; not surprisingly, they frequently occur among medical students and residents as well (Frank-Josephson & Scott 1997; Faigel 1998; Dyrbye et al. 2005; Midtgaard et al. 2008).



문제학생을 다루기 위한 프레임워크 A framework for working with “problem” learners

Although different approaches to working with problem learners exist in the literature (e.g. Shapiro et al. 1987; Gordon 1993; Vaughn et al. 1998; Kahn 2001; Mitchell et al. 2005), the following framework, which has been described previously (e.g. Steinert & Levitt 1993; Steinert 2008) and is outlined in Table 1, has been found to be helpful to clinical teachers and program directors.





직관에서 문제 파악까지 

From intuition to problem identification


학생이나 전공의의 문제를 정의하는 것은 여러 단계를 거치게 되며, 무언가 이상하다는 직관이나 느낌에서 시작되는 경우가 많다. 

Defining a student's or resident's problem usually involves several steps (Steinert & Levitt 1993), beginning with a hunch or intuition that something is amiss. This intuition may come from the direct observation of a learner with a patient or repeated interactions in both formal and informal settings. When teachers (or primary supervisors) first suspect a problem, they should ask themselves three initial questions in order to verify their suspicion: What is the problem? Whose problem is it? Is it a problem that must be changed? Answering these questions will help to determine whether the learner actually has a problem, what it might be, and whether something needs to be done. By going through this process, teachers will also be able to develop a working hypothesis that they can later confirm with the learner and other colleagues.


문제가 무엇인가? What is the problem?

우리의 경험상, 학습자의 문제는 대체로 지식, 태도, 술기 중 하나의 범주에 들어간다. 

In our experience, learners’ problems usually lie in one of three areas: knowledge, attitudes or skills (Steinert 2008). 

      • Knowledge problems, sometimes called cognitive difficulties (Hicks et al. 2005), often include deficiencies in basic or clinical sciences. 
      • Attitude problems (often manifested as behaviors) usually include difficulties related to motivation, insight, doctor-patient relations or self-assessment. For many, attitude problems are easy to identify but challenging to resolve. 
      • Skill deficits can include problems with interpretation of information, interpersonal or technical skills or clinical judgment and organization of work. More importantly, there is often an overlap between skill deficits and attitudinal problems (Steinert & Levitt 1993). (...)







누구의 문제인가? Whose problem is it?

문제가 어디 있는지를 찾아내는 것은 문제 정의 단계에서 가장 어려운 것 중 하나다. 우리의 경험에 따르면, 교사들은 주로 학생들한테 문제가 있다고 생각한다. 그러나 문제는 교사나 시스템에 있을 수도 있다.

Determining where the problem lies may be one of the most challenging aspects of problem definition. Based on our experience, it appears that teachers often assume that it is the learner who has the problem. However, difficulties may also lie with the teacher or the system.


교사의 문제 Teachers’ issues

교사는 다양한 역할을 하고 있으며, 스스로 원하는 수준만큼 역할을 다해내지 못한 이유로 학생이나 레지던트에게 '문제아' 딱지를 붙일 수도 있다. 모든 경우에 교사들은 확인된 문제를 위해서 스스로 어떤 기여를 하고 있는가를 분석해봐야 한다. 예컨대, 교사들은 단순히 개인적으로 스트레스를 받거나 교사 역할에 불만족스럽기 때문에 - 학생이 문제학생이 아님에도 - '문제아'라는 딱지를 붙이곤 한다.

Teachers play many roles (Whitman & Schwenk 1997) and may label a student or resident as a “problem” because they cannot fulfill the role they wish to fill (Steinert & Levitt 1993). Teachers also enter educational situations with specific assumptions, expectations and experiences, all of which can lead to problems; so can the teachers’ own stresses or biases. At all times, teachers should try to carefully analyze to what extent they are contributing to the identified problem. For example, they may label a learner a “problem” because they are personally stressed or dissatisfied with their teaching role, not because the learner is “in trouble”.


문제학생을 다루는 것은 교사들에게 다양한 반응을 불러일으킨다. 보통 다음과 같다.

Working with “problem” learners also engenders a variety of reactions in teachers. Common responses reported by teachers include the following (Steinert 2008):


          • Denial (Maybe he's just having a bad day …)
          • Avoidance (I think I’ll schedule another clinic during my teaching session.)
          • Desire to rescue or protect (If I work hard enough, I will be able to help her …)
          • Anger/frustration (Oh no! Why do I always get the challenging residents?)
          • Helplessness/impotence (It's so hard! We’ll never be able to do it.)
          • Acceptance (Let's get on with it and design a good remediation!)


당연하지만, 교사의 감정은 학습자의 감정을 보여주는 거울이기도 하다. 따라서 개인의 반응을 확인하는 것은 유용한 평가도구가 될 수 있다.

Not surprisingly, teachers’ sentiments often mirror the learner's feelings. Identifying personal responses can, therefore, serve as a useful assessment tool.



학습자의 문제 Learners’ issues

학습자의 문제에는 다음과 같은 것들이 있을 수 있다.

In addition to gaps in knowledge, attitudes or skills (as described above), learners’ problems can include: 

          • stress relating to training or career concerns; 
          • life stresses, such as immigration, moving to another location, marriage or divorce; 
          • medical or psychiatric illness; 
          • substance abuse; 
          • learning disabilities or interpersonal conflict. 


예를 들어 한 보고에서 25%의 인턴은 약간의 우울감을 보였고, 12.5%의 주니어 의사는 알코올을 남용하고 있었다. 동시에 학습자의 기대, 가정, 그리고 인식된 문제에 대한 반응이 문제 파악에 영향을 줄 수 있다. 추가적으로 학생이나 레지던트에게 '문제아'라는 딱지를 붙이는 것 자체가 중대한 영향을 미칠 수 있으므로, 가급적 교사들은 어떤 딱지도 붙이지 말아야 한다. 득보다 실이 더 클 수 있다.

As an example, in one report, 25% of interns were mildly depressed and 12.5% of junior doctors were misusing alcohol (Lake & Ryan 2005). At the same time, learners’ expectations, assumptions, and reactions to the perceived problem (e.g. a sense of inadequacy or insecurity; anger or fear of losing control) may also contribute to problem identification. In addition, the process of labeling a student or resident as a “problem” can have a significant impact, and whenever possible, teachers should try to avoid all labels. They may cause more harm than good.


시스템의 문제 Systems’ issues

시스템의 문제는 보통 찾아내기가 쉽지 않을 수 있다. 

Systems problems, which are often difficult to identify, can include unclear standards and responsibilities beyond perceived levels of competence, an overwhelming workload, inconsistency in teaching or supervision, or a lack of feedback or assessment (Steinert & Levitt 1993). Learners will often report that they do not receive feedback from their supervisors on a routine basis or that their summative assessment is a “surprise,” while teachers will say that they did not have enough time to observe performance. Clearly, this challenge lies with the educational system and not the learner. Other systems’ issues include reduced clinical exposure, fragmentation of clinical teams (Evans et al. 2010), conflicting demands or expectations, and difficult patient problems. In multiple ways, identifying systems’ constraints is critical in defining the problem and designing an appropriate intervention. At the same time, teachers must feel supported by the system and know that they have access to resources when dealing with challenging situations.


반드시 바뀌어야 하는 문제인가? Is it a problem that must be changed?

학습자 및 다른 동료들과 대화를 나누기 전에, 교사들은 과연 그 문제가 반드시 해결되어야 하는 문제인지, 그리고 더 중요하게는, 만약 해결되지 않는다면 무슨 일이 생길지 생각해봐야 한다. 많은 교사들은 학습자들이 즐겁고 유쾌하며 협력적이기를 바라지만 이러한 기대는 현실적이지 않으며, 교사들은 어떤 행동이 자신들의 목적이나 가정에 방해가 되기 때문에 '문제'라고 바라보지는 않았는지 스스로 물어야 한다. 동시에, 초기에 문제를 발견하는 것이 중요한데, Evans et al은 "어려움에 빠진 학습자가 발견되더라도, 이들은 보통 결정적 사건이 생기기 전까지는 방치되곤 한다"라고 했다. 가능하다면 이러한 결정적 사건을 피해야 한다.

Before talking to the learner and other colleagues, a critical next step, teachers should ask themselves whether a particular problem must be changed, and more importantly, what would happen if it was not addressed (Steinert 2008). Although many teachers would like their learners to be happy, pleasant and cooperative (Steinert & Levitt 1993), this expectation is not realistic, and teachers must ask themselves whether they have labeled specific behaviors as problematic because they interfere with their own objectives or assumptions. It is not surprising for a teacher to realize that a suspected problem does not need to be addressed. At the same time, early identification is critical, for as Evans et al. (2010) have stated, “although learners in difficulty are often recognized, they frequently go unchallenged until a critical event occurs”. To the extent that is possible, we should try to avoid these critical events.



문제 발견에서 문제 정의까지 

From identification to problem definition


Once teachers have identified the problem(s) and considered their own role in the process, careful data-gathering is needed to confirm the teachers’ working hypothesis. This step includes a detailed description of the problem (e.g. when did it start; what makes it worse), the learner's perception of the problem, the learner's strengths and weaknesses in knowledge base, attitudes and skills (if not already identified), the learner's relevant life history (e.g. current life stresses; substance abuse; coping strategies), the teacher's perceived strengths and weaknesses, and colleagues’ perceptions, feelings, expectations and assumptions (Steinert 2008).


임상 교사들은 흔히 학생과 직접 이야기하는 것을 꺼린다. 일부는 그것이 자신들의 역할이 아니라고 느끼기 때문이며, 일부는 이를 효과적으로 수행할 기술이 부족하거나 오히려 벌집을 건드리는 격이 되는 것은 아닐지 걱정하기 때문이다. 일부 교사들은 이미 너무 일이 많아서 그럴 여력이 없다고 하고, 어떤 사람들은 보복적 법률 소송을 당할 것을 걱정한다. 이러한 생각과 무관하게, 교사는 다음의 질문들을 짚어가며 직접적으로 접근할 필요가 있다.

Importantly, clinical teachers are often reluctant to talk to the learner directly. Some believe that it is not their role to do so; others feel that they lack the skills to do so effectively or worry that they are opening a potential “can of worms” that will make things worse (Evans et al. 2010). Some teachers feel that they are already “overstretched” and cannot take the time to get involved, whereas others fear reprisal through legal action (Lake & Ryan 2005). Irrespective of these sentiments, however, a direct approach is needed as teachers work through the following questions:



문제가 무엇인가?

1. What is the problem?

지식인지 태도인지 술기인지. 문제를 개선시키거나 악화시키는 요인 뿐 아니라 관찰가능한 행동이나 패턴을 찾아보아야 함. 학습자가 가진 문제의 "functional inquiry"를 위해서 임상기술을 활용할 수도 있음.

Teachers need to ascertain a detailed description of the learner's problem(s) and must decide if it is primarily one of knowledge, attitude, or skill. They must also try to identify observable behaviors and patterns as well as factors that either alleviate – or exacerbate – the problem. In multiple ways, teachers should rely on their clinical skills in order to conduct a “functional inquiry” of the learner's problem(s).



문제에 대한 학습자의 생각은 어떠한가?

2. What is the learner's perception of the problem?

교사가 문제가 있을 것이라는 의심을 가졌을 때, 그것이 실제로 문제인가를 확인하기 위해서는 학생이나 레지던트와 이야기해볼 필요가 있다. 어떤 이유에서인지 많은 교사들은 이 단계를 회피하려고 하지만, 학습자가 자신의 어려움과 강점, 동기와 가정에 대해 어떤 생각을 가졌는지를 알아보는 것은 중요하며, 필수적 첫 단계이다. 더 중요하게는 학습자 중심의 인터뷰가 문제에 대한 학습자의 인식, 문제의 역사와 관련된 요인, 개인적 요인 등을 밝혀줄 수도 있다. 이러한 면담이 그 자체로 인터벤션이 될 수 있음을 기억하는 것이 중요한데, 왜냐하면 일부 학습자들은 그들이 겪고 있는 문제에 대해서 이야기할 기회를 갖는 것 자체를 좋아할 것이며, 자신에게 관심을 가지고 지지해주려는 교사에 대해 감사할 것이기 때문이다.

Talking to the student or resident is the most important step in confirming the teacher's suspicion that there is, indeed, a problem. For some reason, many teachers try to avoid this step, but ascertaining the learner's perception of his/her difficulties and strengths, motivations and assumptions, as well as training and career objectives, are an essential first step. More specifically, a learner-centred interview may uncover the learner's perception of the problem (as well as its causes), the history of the problem and related factors (e.g. academic difficulties) and personal factors (Evans & Brown 2010). It is also important to remember that such an interview can be considered an intervention in itself, as some learners welcome the opportunity to talk about what is troubling them and appreciate the teacher's support and interest in helping them from the outset.



자신의 강점과 약점에 대해서 어떻게 생각하는가?

3. What are the learner's perceived strengths and weaknesses?

학습자와의 대화는 학습자의 장점과 개선이 필요한 부분에 대한 철저한 평가가 필요하다. 그러나 안타깝게도 교사들은 부족한 부분을 찾아가는 식의 접근법에 의존한다. 대신 학습자의 강점과 개인적 자질에 대한 평가가 필요하다.

The discussion with the learner should include a thorough assessment of his or her strengths and areas for improvement in knowledge, attitudes and skills. Unfortunately, teachers often rely upon a deficit-based approach to teaching and learning; instead, an appreciation of the learner's strengths and personal qualities is needed. This information may also be gleaned by observing the learner in multiple situations (and different electives or rotations) or talking to colleagues and other members of the health care team. As described above, learners may struggle for a number of reasons. It behooves us to explore these issues together with the student or resident – and to draw upon our clinical skills in the assessment process.


학습자의 관련 과거 경험은 무엇이 있는가?

4. What is the learner's relevant life history?

비록 교사들은 '개인적' 질문을 함으로써 도를 넘을 수 있다는 걱정을 하기도 하지만, 이러한 정보가 진단을 내리고 적절한 개입 계획을 결정하는 데 필요하다. Yao와 Wright는 학습자의 낮은 수행능력은 다음의 것과 관련되어 있을 수 있다고 했다.

Teachers often ask themselves how much – and what kind of – information they should gather. In fairness to the learner and the teacher's ability to make an accurate diagnosis and treatment plan (Steinert & Levitt 1993), teachers should inquire about current life stresses, recurrent problems and support systems. It is also important to inquire whether the learner has experienced similar problems in the past or whether this is a new challenge for him/her. As an example, a student with a learning disability is often aware of this problem long before the teacher has made the diagnosis. Although teachers are often concerned that they may be crossing a boundary by asking “personal” questions, this information is needed to make a diagnosis and to determine an appropriate intervention plan. Yao and Wright (2001) have suggested that a learner's poor performance may be related to one of the following causes: 

      • behavioral issues, such as those related to professionalism; 
      • medical conditions, including psychiatric illness; 
      • difficulty coping with stress; 
      • substance abuse and cognitive issues, including learning disabilities. 


Mitchell et al은 "레지던트의 수행능력을 수행능력에 영향을 미치는 배경요인에 대한 이해 없이 이해하려고 하는 것은, 개별 환자와 그 환자가 처한 환경에 대한 이해 없이 환자의 약물 복용 순응도만 살펴보는 것과 같다"라고 했다.

This classification may be helpful in guiding this line of questioning. As Mitchell et al. (2005) have stated, “attempting to understand resident performance without understanding factors that influence performance is analogous to examining patient adherence to medication regimens without understanding the individual patient and his or her environment”.



교사와 시스템의 강점 및 약점은 무엇인가?

5. What are the teacher's – and the system's – perceived strengths and weaknesses?

Cleland et al은 의학교육자들이 학생들의 underperformance를 보고하기 꺼려하는 것을 보여준 바 있다.

As stated earlier, the problem may lie with the teacher and/or the system. It is therefore important to ascertain the teachers’ own strengths (and areas for improvement) in knowledge, attitudes and skills, as well as his/her current life stresses and challenges. In an interesting study, Cleland et al. (2008) explored the reluctance of medical educators to report underperformance in students. In multiple ways, their findings, which included teachers’ attitudes towards a specific student (as well as failing students in general), normative beliefs and motivations, skills and knowledge, and environmental constraints, are all relevant in this context. We must also be aware of the potential role that the system can play in contributing to a “problem” situation. As stated earlier, it is worthwhile to identify systems issues so that we can try to minimize their influence as a contributing factor to the learner's problem.



그 학습자의 동료들은 어떻게 바라보고 있는가?

6. How do colleagues perceive the learner?


(...)


자료를 효과적으로 모으기 위해서는 임상 교사들은 다양한 상황에서 학습자를 관찰해야 하며, 환자의 문제를 학생/레지던트의 문제와 함께 보아야 하고, 그들의 평가가 동료들의 평가와 일치하는지도 확인해보아야 한다. 공식적 시험 결과도 도움이 될 수 있으며, 다른 로테이션에서의 피드백도 도움이 될 수 있다. 그러나 학습자를 직접 관찰하는 것, 그리고 직접 이야기해 보는 것의 중요성을 간과해서는 안 되는데, Yao와 Wright가 보고한 바와 같이, 문제는 보통 '직접 관찰' 또는 '결정적 사건'을 통해서 드러나기 때문이다.

To gather data effectively, clinical teachers need to observe learners in multiple situations, systematically review patients’ problems with students and residents, and work to ensure that their assessments are congruent with those of their colleagues (Steinert & Levitt 1993). Formal test results may also be helpful (Evans et al. 2010), and when appropriate, so is feedback from other rotations. However, the importance of direct observation and talking to the learner cannot be overstated. As noted by Yao and Wright (2001), problems are most often identified through direct observation (82%) and critical incidents (52%).



From definition to intervention

어떤 문제들은 긴급한 조치가 필요할 수 있고, 어떤 것은 시간이 더 필요할 수도 있다. 앞에서 기술한 바와 같이, 학습자를 모든 단계에 포함시키는 것이 중요하며, 계획이 무엇이든 인터벤션은 학습자의 well-being에 대한 진지한 관심을 가지고, 환자와 환자 가족의 안전을 고려하여 진행되어야 한다.

Once a working diagnosis has been established, teachers must design an appropriate intervention. This step includes a consideration of the problem(s) to be addressed, the available intervention options, who should be involved in the intervention, the proposed timeline for both the intervention and the evaluation of outcomes, and the process for documentation. Some problems (e.g. psychiatric illness; substance abuse) will require urgent attention (Steinert 2008); others will require additional time for observation or monitoring. As stated previously, it is essential to involve the learner in every step. In addition, whatever the plan, the intervention should ideally be conducted with genuine concern for the well-being of the learner (Winter & Birnberg 2002) and the safety of patients and their families.


어떤 문제를 해결하고자 하는가?

1. What problem are you trying to address?

대부분의 문제가 독립적으로 발생하는 것이 아니므로, 문제의 우선순위를 정하고 어떤 것을 먼저 해결할 것인가를 정하는 것이 중요하다. 교사간, 교사와 학습자간의 합의를 이루는 것이 중요한 첫 단계이다. 이 단계에서 교사들은 학습자들로 하여금 문제를 인지하고 인정하도록 도와야 한다. 또한 가능한 전략이나 해결책에 대한 학습자의 의견도 구해야 한다. 경험에 따르면 공동의 의사결정이 필수적이다. 미리 설계된 인터벤션은 학습자가 그 계획에 동의하지 않으면 대체로 실패하고 만다.

Most problems are complex in nature and do not occur in isolation. It is therefore important to prioritize the perceived problems and to decide which one will be addressed first. Consensus between teachers, and between the teacher and the learner, is also a critical first step. During this phase, the teacher may need to help the learner recognize and acknowledge the issues affecting performance (Evans et al. 2010) and solicit feedback on possible strategies and solutions. Based on experience, shared decision-making is essential; in fact, the designed intervention will usually fail if the learner does not agree with the intended plan.


확인된 문제를 어떻게 해결할 것인가?

2. How will you address the identified problem?

A number of interventions, outlined in Table 2, can be considered when working with “problem” learners. In some instances, the clinical teacher will be involved in all components; at other times, program directors or other senior administrators will be responsible (Steinert 2008). However, in all situations, we must be aware of what options are available to us and one person must be accountable. Frequently, time with monitoring, or further assessment, is sufficient. In other cases, we need to enhance teaching and learning opportunities, either by increasing time for observation or feedback, or by arranging one-on-one coaching with staff or peers. In some situations, workloads might need to be reduced to allow for independent study and reading (for knowledge problems) or increased practice and feedback (for skill-related deficits). Alternatively, a formal remedial program may be required, with clearly defined goals and objectives, learning strategies, and evaluation methods (Steinert 2008). Although suspension, probation or dismissal (from the program) are not desirable options, they must, at times, also be considered (Ikkos 2000).






추가 시간

Additional time

As in medicine generally, time can be an effective healer (Steinert & Levitt 1993). Some learners can overcome their difficulties by moving out of a particularly challenging or stressful rotation, or by working with a different clinical teacher. Others gain confidence or skill as time progresses. Whenever possible, additional time should be accompanied by careful monitoring through observation.


추가 평가와 모니터링

Further assessment and monitoring

In other situations, further assessment will be needed. This will include spending more time with the learner and carefully monitoring what they do. It will also involve observing the student or resident in different contexts, with different patients and families. Including colleagues and other members of the team in this assessment phase can be equally beneficial. It is often surprising how invaluable team coordinators’ comments can be with regard to a student's or resident's behaviors with patients and other health professionals.


일대일 토론

One-on-one discussions

One-on-one discussion with the learner constitutes an important strategy that is often taken for granted. Although frequently not considered part of an intervention, meeting with the learner, to review specific issues or concerns, can be very worthwhile. Such a meeting can also be used to clarify expectations (which learners often feel are not explicit) and discuss pre-assigned readings, clinical problems or identified deficits (e.g. problem-solving).


교수 학습 기회 향상

Enhanced teaching and learning opportunities

At times, increased observation and feedback can help to address identified problems. This is especially true for knowledge-based problems or skill-related deficits. More frequent case discussions and chart reviews can facilitate knowledge acquisition, as can mini-tutorials, review of patient management problems and discussion of pre-assigned readings. Increased opportunities to observe role models in action can encourage the acquisition of interpersonal skills, as can time in a simulation-based environment. The latter can also help to address deficiencies related to technical skills, interviewing skills and team work. A skill-based training course, tailored to individual needs, might also be recommended.


근무량 감축

A reduced clinical workload

A reduced clinical workload, with protected time to focus on knowledge or skill acquisition, may at times be in order. If the learner is feeling overwhelmed by the clinical demands (in relation to their own expertise and competence), a lesser workload may decrease stress so that learning can occur.


로테이션, 장소, 감독관 변경

A change in rotation, venue or supervisor

Changes at the system level should also be considered. Changing the learner's rotations (e.g. scheduling an easier rotation, working in a different setting or clinical environment) can be another alternative, as can changing the primary supervisor or adding other teachers (with different skill sets) to the roster. Working with “problem” learners is generally quite time-consuming for teachers, and sharing the workload may be beneficial to all concerned.


동료나 멘토의 지지

Peer or mentor support

Medical school and residency training can be a stressful time for students and residents (Dyrbye et al. 2005). At times, a supportive peer or teacher can be very helpful. The role of peers in working with “problem” residents has been debated by clinical teachers and residents alike; however, the value of “near-peer” support cannot be underestimated as long as peers maintain confidentiality and respect.


레미디얼 프로그램

A remedial program, with defined goals, objectives and strategies

The above components are frequently used in a more formal remedial program, which may include a variety of teaching methods (e.g. videotape reviews of clinical encounters, role plays of difficult doctor–patient interactions) or extra rotations in a specific discipline, with protected time for increased supervision, study and review (Steinert & Levitt 1993). Known to address specific problems with reasonable success, such programs require clearly defined goals and pre-determined outcomes. Moreover, in some settings, they have had considerable success with both students (Schwartz et al. 1998) and residents (Catton et al. 2002).


상담, 치료

Counseling or therapy

Although most clinical teachers find this a difficult option to pursue, counseling or therapy may be indicated, especially if the learner is presenting with aggressive or depressive symptoms, substance abuse, or psychiatric problems. Learning disabilities can also not be ignored as an underlying factor for perceived problems and often require intervention (Coles 1990). This is also an area where outside consultants or expertise should be sought.


휴가

A leave of absence

A survey of internal medicine programs from 1979 to 1984 found that 1% of the residents required a leave, and 56% of the programs granted leaves of absence because of “emotional impairments” (Smith et al. 2007). Although teachers are often reluctant to consider this option, it should be part of the repertoire of interventions, especially as leaves of absence are one of the suggested options for health-related problems including substance abuse (Long 2009).


정학/퇴학

Probation, suspension or dismissal

명확한 정책이 있어야 하지만, 한편으로는 Ikkos가 언급한 바와 같이, 문제학생을 다루는 법적 제도적 장치는 국가나 기관에 따라서 다르다.

Academic dismissal(학업능력에 따른 퇴학)과 disciplinary dismissal(그 조직이나 기관의 정책에 위배되는 행동에 따른 퇴학)은 구분되어야 한다.

무엇이 성공인지에 대한 장기적인 관점이 필요하다. 

In order for this option to work, clear policies must be in place. It is also true that this intervention is dependent on local norms and values, and as Ikkos (2000) had said, the legal and administrative framework to deal with “problem” learners differs across countries and authorities. In addition, only a few reports describe termination policies in medical training programs (Irby et al. 1981; Tulgan et al. 2001). However, this option must be seriously considered, despite teachers’ reluctance to do so. Irby and Milam (1989) distinguish between academic dismissals, which result from academic or clinical performance issues, and disciplinary dismissals, which follow violations of institutional rules or policies. Irrespective of the nomenclature, however, we might need to dismiss learners from their programs when remediation efforts fail (Catton et al. 2002). As Winter and Birnberg (2002) have stated in the description of their work with impaired residents, we must have a long range view of success and “recognize that suspension or dismissal may only be a temporary setback … short-term failure, including relapse, may in fact lead to long-term success”. It is also important to remember that re-directing a student to another specialty – or career – may not be a failure in the long run.


Dudek et al은 교사가 학생을 낙제시키기를 머뭇거리게 하는 네 가지 요인을 밝혔다.

In an interesting study, Dudek et al. (2005) identified four factors to explain teachers’ reluctance to fail students and residents: 

        • a lack of documentation; 
        • a lack of knowledge about what to document; 
        • anticipation of an appeal; and 
        • a lack of remediation options. 

These factors are equally important in this context and must be addressed by program directors, educational leaders and administrators. In fact, we must put systems into place to protect our teachers as well as our learners.



흔한 인터벤션의 방법들은 다음과 같다.

As described previously (Steinert 2008), experience has shown that common interventions include: 

      • increased observation and feedback (for gaps in knowledge or skills); 
      • increased time with a faculty advisor (for knowledge deficits, attitudinal problems, interpersonal conflict or family stress); 
      • weekly study sessions, core content review and videotaping of clinical encounters (for knowledge, attitudinal or skill problems); and 
      • psychiatric counseling (for attitudinal problems, interpersonal conflict, family stress or substance abuse). 



어떤 성과가 기대되는지, 인터벤션의 실패는 어떨지가 초창기에 결정되어야 한다.

Anticipated outcomes, and consequences of failed interventions, must also be determined early in the process, though it is heartening to note that close to 90% of “problem” learners succeeded after a structured intervention or remediation program (Winter & Birnberg 2002; Reamy & Harman 2006).



인터벤션에 누가 관여할 것인가?

3. Who will be involved in the intervention?

비록 규정에 따라 정해져 있을 수도 있지만, 가능하다면 프로그램 관리자나 관련된 부학장이 인터벤션 계획에 관여해야 한다.

At times, the primary supervisor (or clinical teacher) will be responsible for both designing and implementing the intervention. At other times, another member of the team or outside consultant will be involved. Although this decision is often dependent on institutional policy or local norms, whenever possible, the program director or associate dean (or someone in a similar position) should be consulted and involved in the intervention plan. So should the student or resident. Depending on the design and complexity of the intervention, and the specific educational context, it may also be helpful to have more than one person involved in the intervention plan, and ideally, this should be discussed with the learner. In all cases, it is important that the learner is comfortable with the teacher(s) involved in the intervention, all of whom should have the time and expertise to deal with the learner's difficulties. As highlighted above, peer support can also be invaluable.


인터벤션의 time frame은 어떻게 되는가?

4. What is the time frame for the intervention?

교사들이 흔히 하는 실수는 명확한 목적이나 목표, 시간계획 없이 인터벤션에 뛰어드는 것이다. 

Teachers often err by “jumping into” an intervention without clear goals, objectives or time frames. Clearly, both the teacher and the learner would benefit from knowing how long the intervention will last and what the expected outcomes will be. It is also important to recognize that time frames may be context-specific. For example, much of undergraduate training occurs in one-month blocks; postgraduate training often provides more time for intervention and problem resolution. Clearly, the dimension of time must be seriously considered.


인터벤션을 어떻게 평가할 것인가?

5. How will the intervention be evaluated?

Whatever the intervention, learners often lament that they do not know what is expected of them. Accordingly, the criteria for success must be carefully laid out from the outset. For example, if the teacher and learner are working on improving technical skills, the expectations for success should be clearly enunciated at the outset and a system for evaluating progress should be determined. It is equally important to schedule regular, pre-arranged meetings between the learner and the supervisor to monitor ongoing progress, to determine whether the intervention plan has been able to achieve its specified goals (Steinert & Levitt 1993), and to make mid-course corrections. These meetings should also be scheduled before the intervention starts so that they are not viewed as a method of crisis intervention. Finally, it is essential to outline what consequences will be considered if no improvement is noted. At times, the problem may need to be re-defined; at other times, the remediation program will need to be extended or altered. And as stated earlier, probation or dismissal may need to be considered as a viable option. In this era of outcomes-based education, clear outcomes are needed at every step of the way.


인터벤션의 기록을 어떻게 남길 것인가?

6. How will the intervention be documented?

필수적 요소임에도 이 단계는 보통 생략되거나 우연에 맡겨지곤 한다.

Although thorough documentation is an essential component of all interventions, this step is often omitted or left to happenstance. For example, 

      • teachers must document the identified problem (with supporting data), 
      • the discussions with the learner and colleagues, 
      • the intervention plan, and 
      • the observed outcome of designated activities. 

Some teachers find it helpful to write up the intervention plan as a “learning contract”, outlining how the problem will be dealt with, in a particular time period; others prefer to keep carefully documented process notes. Though often skeptical at first, learners frequently express appreciation at knowing what is expected of them and what outcomes are desired. Documentation is also essential in ensuring due process.



정당한 절차를 어떻게 확보할 것인가?

7. How will due process be assured?

교사들은 반드시 정당한 절차에 따라 협력적으로 접근해야 하며, 공정함을 담보해야 하고, 비밀을 유지해야 하며, 충분한 정보를 제공하고 동의를 받아야 (informed consent) 한다. 공정함이란 학습자가 교육 프로그램의 목적을 알고, 승진의 규칙을 아는 것이다. 이는 또한 정기적으로 피드백이 주어지며, 교사의 평가는 직접 관찰한 객관적 자료에 기반한다는 것을 말한다. Documentation은 자연정의(natural justice)를 공고히 하는데 중요하며, 교사들은 평가, 인터벤션, 토론 등을 기록해야 한다. 동시에 이러한 정당한 절차는 bilateral한 과정이며, 동료들을 위하여 natural justice를 확실히 해야 함을 기억해야 한다. 많은 교사들이 문제학생을 다루는 데 있어서 '외로움', '취약함' 등을 어려움으로 꼽았다. 

Teachers must work collaboratively to ensure due process (Rankin & Kelly 1986; Rose 1989) and to guarantee fairness, confidentiality, and informed consent. Fairness implies that the learner is aware of the program's educational objectives and rules of promotion. It also implies that feedback is given on a regular basis and that the teachers’ evaluations are based on first-hand exposure and objective data. Documentation is critical in assuring natural justice, and teachers must be encouraged to document their assessments, interventions, evaluations and discussions with the learner. At the same time, we must remember that due process is a bilateral process and we must work to ensure natural justice for our colleagues. Many a teacher has commented on the “loneliness” and “vulnerability” that they experience when working with “problem” learners (Steinert 2008).


조직 차원의 정책을 개발하고 학습자의 문제를 다루기 위한 프로토콜을 만드는 것은 레지던트의 권리와 정당한 절차를 확실히 하기 위해서 중요하다. 비록 이러한 정책이나 프로토콜이 각 조직마다 다르다고 하더라도, "chain of command"를 반드시 명시하여 누가 어떤 부분에 책임이 있는지, 보고 구조는 어떻게 되는지, 평가와 개입의 time frame은 어떤지, 명확하고 세심한 기록의 필요성 등이 기술되어 있어야 한다.

Developing an institutional policy and protocol for handling learners’ problems can also help to assure residents’ rights and due process. Although such a policy and protocol will differ for each organization (or institution), it should describe the preferred sequence of events, the “chain of command” and who is responsible for which part of the protocol, the reporting structure, the time frame for assessment and intervention, and the need for clear and careful documentation. For example, some schools have entrusted a Board of Examiners (Catton et al. 2002) to handle residents’ problems; others have designated program directors or postgraduate deans to be responsible. Irrespective of the chain of command, it is important that all faculty members are aware of local policies and protocols and that the institution maintain a uniform approach to learners requiring attention. 


왜 robust system이 필요한가?

Long (2009) has described a number of reasons why it is important to have robust systems in place to work with “problem” learners. This includes 

      • the need for uniformity, 
      • the development of expertise, and most importantly, 
      • the early identification of learners in difficulty.



Prevention of problems

의과대학과 수련기간은 많은 학습자들에게 스트레스가 심한 혼란의 시기이다. 스트레스의 원인에는 다음과 같은 것들이 있다.

Medical school and residency training is “a time of stress and turmoil for many learners” (Dabrow et al. 2006). As stated earlier in this Guide, and as described in the literature, these stresses come from a number of sources, including 

    • communication problems in the workplace, 
    • feelings of not being respected, 
    • the constraints of collaborative work, 
    • the potential gap between the medical school and clinical care, 
    • work overload, 
    • responsibility towards patients, 
    • worries about career plans and a perceived lack of knowledge (Luthy et al. 2004). 

Depending on their life experiences and coping strategies, students’ responses to stress may – or may not – be adaptive (Dyrbye et al. 2005). Although a full discussion of prevention strategies is beyond the scope of this article, a number of approaches are worth considering. 


유용한 프레임워크

For example, Langlois and Thach (2000) have provided a helpful framework by which to look at the prevention of difficult learning situations, modeled along the lines of primary, secondary, and tertiary prevention. 

    1. At the level of primary prevention (i.e. preventing the problem before it occurs), they suggest a well-developed orientation program that includes the sharing of course expectations, a discussion of mutual goals and objectives, and ongoing assessment. 
    2. With respect to secondary prevention (i.e. early detection), they concur with the suggestions made in this Guide and re-affirm the importance of paying attention to early clues, responding quickly, and providing ongoing feedback and monitoring. 
    3. Tertiary prevention (i.e. managing a problem to minimize impact) is of course more complex and includes a number of carefully crafted intervention strategies; it is also wise at this stage to not try to “rescue” the learner by ignoring the problem or accepting poor performance. 

"다양한 잠재적 위험 상황은 기대를 설정하고, 피드백을 주고, 사려깊은, 지속적 평가를 제공함으로서 예방 가능하다"

Interestingly, few prevention programs for teachers in distress have been described in the literature. However, each of these suggestions would be equally relevant to the teacher and the system. As Langlois and Thach (2000) have said, “many potentially difficult situations can be prevented by setting expectations, giving feedback, and providing thoughtful, ongoing evaluation”.



수련과정의 스트레스 인정

Acknowledge the stress of training

As Hays et al. (2011) have said, “academically bright and ambitious medical students must cope with a combination of curriculum, assessment, career choice, [and] personal, family and social pressures”. As teachers, we must acknowledge the stress and strain of undergraduate and postgraduate training and offer support to deal with systemic issues (Howell & Schroeder 1984; Peterkin 1991). We must also provide an educational environment that allows for learner differences, timely feedback and ongoing assessment so that problems are identified early and evaluations are not a “surprise”. In addition, we should consider the role of faculty advisors or mentors, so that learners can receive support and guidance in an atmosphere of trust and respect. Peer support, which can help to guard against delay in problem identification, can also be a useful intervention (Steinert 2008).


학습 기술과 평생학습 전략 증진

Promote study skills and life-long learning strategies

Although life-long learning is often identified as an important attribute of competent practitioners, the skills inherent to this process are not frequently taught. Perhaps it is time to redress this gap and teach students and residents ways in which to maximize learning in the workplace, direct their own learning, seek input from others, and use evidence at the point of care (Teunissen & Dornan 2008).


관련된 교육 이벤트 구성

Organize relevant educational events

Some programs have held annual retreats to combat stress in residency training (e.g., Klein et al. 2000). Others have developed wellness (or assistance) programs to deal with the stress inherent in medical training (Borenstein 1985; Zoller et al. 1985). Irrespective of the program design, these activities include a discussion of relevant stresses and ways of identifying high stress levels, strategies for coping with stress, and information about available resources. Some programs have also included psychiatric counseling as part of their wellness or assistance program (Dabrow et al. 2006). As an example, the program at the University of South Florida College of Medicine offers confidential evaluation, brief counseling, and referral services (as appropriate). Importantly, this program is not focused solely around crisis intervention; it also incorporates a number of components of a successful assistance program: total confidentiality; easy access; education regarding availability of services and overall integration with the educational program (Dabrow et al. 2006). Educational courses and seminars on professionalism may also be warranted (Marco 2002). A lack of professionalism, demonstrated through unprofessional behaviour, is often seen among “problem” learners. It is, therefore, important to both teach and assess these behaviors in an explicit manner (Cruess et al. 2009) and make expectations clear.


교수 개발

Develop your faculty

교육의 목적과 구성

As stated earlier, most teachers do not feel prepared to handle “problem” learners effectively and faculty development has a critical role to play in this context. In our setting, we frequently offer workshops on the “problem” student and resident to our faculty members. The goal of these workshops is to provide a systematic framework for teachers “to help them in their task by emphasizing early identification, accurate diagnosis, and appropriate interventions” (Steinert et al. 2001). Workshop topics include: defining the problem; data gathering: confirming the diagnosis; designing and implementing the intervention; and assuring residents’ rights. Participants work in small groups and are encouraged to focus on their own challenges and lessons pertinent to their own settings. Program evaluations have shown that this workshop can be an effective way to sensitize teachers to the challenges of working with “problem” learners, to increase their knowledge and skill, and to help them become more aware of systems issues that may impact learner progress. Muller et al. (2000) have also highlighted the benefits of a faculty development workshop in helping teachers to apply an “interactional model to working with learners in difficulty”. As they pointed out, such an activity can help faculty to explore critical issues, test out their assumptions, identify new ways of working with learners’ challenges and begin to work collaboratively.


Some general principles

In closing, some general principles will be emphasized. Although “success” is not always possible, most “problem” learners do succeed in finding their way to a fulfilling career.


조기 발견이 중요하다. Early identification is critical

As Evans et al. (2010) have stated, “early identification and early support, before the trainee or student runs into major difficulties, should be regarded as the gold standard for educational supervision.” Most educators have encountered learners with significant gaps in knowledge or professional behaviors that have not been addressed earlier in their training. We fail this group by not failing them, and at a minimum, we must provide them with feedback, remedial guidance, and a plan (LeBlanc & Beatty in press).


학생이나 레지던트는 고되다. It is not easy to be a student or resident

As teachers and program directors, we need to remember that it is not easy to be a student or resident. It is also true that some learners complete their trajectory without any problems, but the essence of training can be stressful for many. Awareness – and acknowledgement – of this fact can be very helpful for both the learner and the teacher.


성과에 초점을 두자 An outcomes approach is warranted

문제 해결에는 두 가지 프레임이 있다. 하나는 '문제'적 관점이고, 다른 하나는 '성과'적 관점이다. 

Claridge and Lewis (2005) describe two frames for problem solving: a problem frame and an outcome frame. 

      • In the former, which focuses on the details of the problem and the deficiencies at hand, the over-riding motivation is to “escape”. 
      • The outcome frame, on the other hand, focuses on internal motivation to change, finding solutions and moving towards a positive outcome. Belief in the individual as resourceful and capable underlies this frame, as does the notion of exploration and change. Clearly, all of these factors are important in working with “problem” learners.










 2013 Apr;35(4):e1035-45. doi: 10.3109/0142159X.2013.774082. Epub 2013 Mar 15.

The "problem" learner: whose problem is it? AMEE Guide No. 76.

Author information

  • 1Centre for Medical Education, Faculty of Medicine, McGill University, Canada. yvonne.steinert@mcgill.ca

Abstract

Clinical teachers often work with students or residents whom they perceive as a "problem". For some, it is a knowledge deficit that first alerts them to a problem; for others it is an attitudinal problem or distressing behaviour. And in some cases, it is difficult to know if the learner is, indeed, presenting with a problem. The goal of this Guide is to outline a framework for working with "problem" learners. This includes strategies for identifying and defining learners' problems, designing and implementing appropriate interventions, and assuring due process. The potential stress of medical school and residency training will also be addressed, as will a number of prevention strategies. Identifying learners' problems early - and providing guidance from the outset - can be an important investment in the training and development of future health professionals. It is hoped that this Guide will be of help to clinical teachers, program directors and faculty developers.

PMID: 23496125 [PubMed - indexed for MEDLINE]


성공적인 교육과정 개혁에는 조직문화가 중요하다 (Academic Medicine, 2015)

Culture Matters in Successful Curriculum Change: An International Study of the Influence of National and Organizational Culture Tested With Multilevel Structural Equation Modeling

Mariëlle Jippes, MD, PhD, Erik W. Driessen, PhD, Nick J. Broers, PhD,

Gerard D. Majoor, PhD, Wim H. Gijselaers, PhD, and Cees P.M. van der Vleuten, PhD






전 세계적으로 의학교육은 교육과정의 한계로 인해 혁신을 요구받고 있다. 약 1/3의 의과대학이 통합교육 및 PBL을 도입했지만, 여전히 변화가 없거나 변화에 난항을 겪는 학교가 많다. 의과대학 교육과정 개편의 성공이 국가적 문화 특성과 관련된 요인들에 영향을 받긴 하나, 변화에 대한 조직문화의 영향에 관해서는 주로 기업체에서 논의되어왔다. 직관적으로 조직의 가치, 신념, 행동방식 등이 국가적 가치, 신념, 행동에 영향을 받았을 것이다. 국가적 문화와 조직 문화의 상호연관성에 대한 연구의 부족을 고려하면, 더 연구가 필요하다.

Medical education has seen a rising demand internationally for innovation due to perceived shortcomings of medical curricula: theoretical overload, lack of practical experience, insufficient community orientation, and inefficient teaching methods.1–3 Although around one-third of all medical schools have adopted integrated and problem-based learning (PBL) curricula in the past decade in response to this demand,4 there are many schools that continued unaltered or whose efforts in innovation have floundered. Whereas the success of medical curricular reforms has been associated with factors related to national culture,4,5 the influence of organizational culture on change processes has only been described for business reorganizations.6–9 Intuitively, it seems to make sense that values, beliefs, and practices of organizations can be expected to derive from national values, beliefs, and behavior.10 Given the paucity of results from empirical research on the interconnectedness of national and organizational culture,10–13 this subject deserves further investigation. Available research revealed that organizations in the same country vary because of differences in organizational culture; however, organizations in different countries vary even more because of the additional influence of national culture.10,13 


어떻게 국가와 조직의 문화가 의과대학 교육에 영향을 주는가에 대한 통찰력은 교육과정 개혁을 촉진하기 위해서 학교가 반드시 해결해야 할 문제를 찾아줄 수도 있다. 기존 연구가 사실상 거의 없기 때문에 우리는 조직과 국가 문화가 교육과정 변화에 영향을 주는가를 알아보고자 하였다. 기존 문헌에 근거하여 국가와 조직 문화, 교육과정 변화에 영향을 주는 요인들의 상호관련성에 대한 가설을 세웠다.

Insight into how national and organizational culture influence medical curriculum change may identify issues that schools must address to facilitate curriculum innovation. Because no existing research seemed available, we explored the influence of national and organizational culture on curriculum change in medical schools. After reviewing the literature on the concepts of successful curriculum change and national and organizational culture, we arrived at a definition of these concepts for this study. On the basis of the literature, we hypothesized seven relationships between factors related to national and organizational culture and curriculum change, which we incorporated in a conceptual model (Figure 1). 



Background 


성공적인 교육과정 변화

Successful curriculum change 


'성공적인 교육과정 변화'에 대한 공통된 정의나 척도가 없기에 두 가지를 활용하였다. MORC와 Employee resistance 

There exists no universal definition and measure of successful (curriculum) change. Instead, we used two derivates to operationalize successful curriculum change: medical schools’ organizational readiness for curriculum change (MORC) and employee resistance.14–18 

      • MORC consists of two positive dimensions (motivation and capability) and one negatively phrased dimension (extrinsic pressure).17,18 
      • In addition, employee resistance has been shown to decrease the chance of successful organizational change.19,20 

We expect that organizational readiness for change and a low level of faculty resistance, in general, are positively related to successful change. 



조직 문화

Organizational culture 


조직 문화에 대한 정의와 척도는 여럿 개발되어 있음. Kalliath et al 의 설문이 가장 적합해보였음. 두 축을 가지고 있음.

Many definitions21 and measuring instruments22 have been developed to advance understanding of organizational culture. The compact and widely used22 questionnaire developed by Kalliath and colleagues23 based on Quinn and Spreitzer’s24 competing values framework seemed most appropriate for the setting and purpose of this study. The competing values framework comprises elements of organizational effectiveness sorted along two axes: “flexibility–control” and “internal–external,” which results in four competing organizational models (Supplemental Digital Figure 1, http://links.lww.com/ACADMED/A264).24 

For example, medical schools that emphasize belongingness and trust tend to be dominant in the human relations quadrant. The leadership style in such medical schools reflects teamwork, participation, empowerment, and concern for employee ideas. Flexible organizations (“human relations” and “open systems”) tend to respond more positively to change than those featuring control-driven policies and regulations (“rational goal” and “internal process”).15,24,25 We expect that flexible policies, in general, are positively related to successful change. 



국가 문화

National culture 


국가의 문화를 정의하기 위한 시도 중 Hofstede의 모델이 가장 많이 활용된다. 여섯 개의 dimension.

Among numerous attempts to define and quantify national culture,12,26–29 Hofstede’s26 model is applied most widely. It distinguishes six dimensions of national culture, three of which are most relevant in relation to curriculum change.4 Supplemental Digital Table 1 (http://links.lww.com/ACADMED/A264) provides a list of all participating countries with their scores on the different dimensions. 


불확실성을 기피하거나 통제위주의 정책을 할수록 성공적 변화 가능성은 낮다고 가설 설정

“Uncertainty avoidance” describes the degree of acceptance of uncertainty and a need for predictability, which is often pursued by adherence to written or unwritten rules. In countries with strong uncertainty avoidance (e.g., Belgium and El Salvador), organizations,30–33 including medical schools,4,5 tend to be averse to change. Uncertainty avoidance features, such as strict rules and regulations, correspond to the organizational models of rational goal and internal process. Support for the effect of national values on organizational values with respect to uncertainty avoidance was demonstrated by House and colleagues.12 We expect that national uncertainty avoidance values and control-driven policies, in general, are negatively related to successful change. 



위계적이거나 불평등한 관계를 받아들이는 정도이다. 이 정도가 클수록 성공적인 변화 가능성이 낮음. 국가적 Power distance value와 rigid hierarchy가 성공적인 변화와 부의 관계가 있다고 가설 설정

“Power distance” describes the degree of acceptance of hierarchical or unequal relationships, which demonstrated diverse effects on different phases of the change process. Low power distance (e.g., in Sweden and Canada) in the initiation phase may invite employees to suggest innovations to their superiors, thus stimulating change.34,35 By contrast, the implementation phase may benefit from hierarchic control as a result from strong power distance.30,35–37 Research in medical schools has demonstrated a negative relation between power distance and the presence of innovative curricula.4,5 Overall, there seems to be a tendency for a negative relation between power distance and organizational readiness for change.14,38 Features of power distance, such as a rigid hierarchy, resemble those of the organizational model “internal process.” The effect of national values on organizational values with respect to power distance was also demonstrated by House and colleagues.12 We expect that national power distance values and a rigid hierarchy, in general, are negatively related to successful change. 


개인주의. 개인주의 성향이 높을수록 개개인의 구분과 새로운 아이디어를 받아들일 가능성이 높음. 그러나 변화를 실행(implementation)하는 단계에서는 개인주의 성향이 낮은 것이 더 좋다. 우리는 국가적 수준의 개인주의 성향과 성장과 혁신에 초점을 두는 성향이 성공적 변화와 관련된다고 가설 설정

“Individualism” refers to the degree of emphasis placed on an individual’s accomplishment, with the opposite being “collectivism.” National levels of individualism were also shown to have contrasting effects on different phases of the change process. High individualism (e.g., the United States and Australia) may increase the tendency to individual distinction and championing of new ideas, stimulating the adoption phase of change.32,36,39 In contrast, during the implementation phase of change, low individualism, which characterizes emphasis on teamwork and consensus, has been favored.35,37,40 In medical schools, empirical research has shown a positive relation between individualism and the presence of innovative curricula.4 Overall, there seems to be a tendency toward a positive relation between individualism and change.38 Features of individualism, such as growth and innovation, correspond to the organizational model open systems. With respect to collectivism, House and colleagues12 have also demonstrated a relationship between national and organizational values. We expect that national individualism values and a focus on growth and innovation, in general, are positively related to successful change. 


국가 수입

National income 


높은 개인주의 성향, 작은 Power distance, 낮은 불확실성 회피성향이 높은 GDP와 연관됨. 일반적으로 GDP와 성공적 변화가 관계가 있다고 가설 설정

National cultural values frequently showed a relation with national gross domestic product at purchasing power parity levels (GDP).26,38,41 High individualism, low power distance, and low uncertainty avoidance were associated with higher GDP.26,41 Intuitively, a lack of financial resources has an inhibiting effect on curricular change. We expect that national income, in general, is positively related to successful change. 


연구 가설

Study hypotheses 


We derived the following hypotheses, which are all incorporated in our conceptual model and will be analyzed simultaneously (Figure 1). 


• Hypothesis 1: Medical schools with more successful curriculum change have higher levels of MORC–capability and MORC–motivation and lower levels of MORC–extrinsic pressure, which will cause lower levels of faculty resistance. 


• Hypothesis 2: Flexible policies and procedures (human relations and open systems) have a positive effect on successful curriculum change. 


• Hypothesis 3: Control-oriented policies and procedures (rational goal and internal process) have a negative effect on successful curriculum change. 


• Hypothesis 4: Uncertainty avoidance has a positive effect on rational goal and internal process and a negative effect on successful curriculum change. 


• Hypothesis 5: Power distance has a positive effect on internal process and a negative effect on successful curriculum change. 


• Hypothesis 6: Individualism has a positive effect on open systems and a positive effect on successful curriculum change. 

• Hypothesis 7: National GDP level has a positive effect on successful curriculum change. 






Method 



Design 


We used data from a questionnaire conducted worldwide among medical schools in the process of curriculum change to test the hypotheses in our conceptual model (Figure 1) by using a multivariate statistical approach. 


Participants and sampling procedure 


4달간 email을 통한 설문조사.

Between January and April 2012, we sent e-mails to 1,073 international staff contacts of Maastricht University inquiring whether they were contemplating or implementing changes in their undergraduate or postgraduate medical curriculum and, if so, inviting them to participate in the study. We excluded newly established medical schools and schools where the implementation was completed (i.e., the first students had graduated from the new curriculum). We sent two e-mail reminders. We asked our contacts from schools in the process of change to distribute an anonymous Web-based questionnaire to at least 20 of their colleagues who were actively involved in medical education, preferably representing a mix of professional backgrounds: basic scientists, clinicians, and members of the curriculum committee. If necessary, two reminders were sent to the contact persons. For every completed questionnaire, we donated €5 to the World Wildlife Fund (www.wwf.org), and we offered to send each participating school the anonymized results for their school. 



Measurements 


국가 문화

National culture. 


We used Hofstede’s26 national or regional scores (if no national score was available) on uncertainty avoidance, power distance, and individualism (Supplemental Digital Table 1, http://links.lww.com/ACADMED/A264) to measure national culture. 


조직 문화

Organizational culture. 


Participants were asked to answer the 16 questions related to four types of organizational culture (human relations, open systems, rational goal, and internal process) from the questionnaire developed by Kalliath and colleagues,23 which were scored on a seven-point Likert scale (1 = not valued at all; 7 = highly valued). 
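As a concrete illustration of this scoring step, the sketch below averages each respondent's 16 Likert-item responses into the four culture-type scores. The item-to-subscale mapping (q1–q16) and column names are assumptions for illustration only; the actual assignment of items follows the instrument of Kalliath and colleagues.

```python
import pandas as pd

# Hypothetical mapping of the 16 seven-point Likert items to the four
# competing-values culture types; the real item assignment follows
# Kalliath and colleagues' questionnaire and is not reproduced here.
SUBSCALES = {
    "human_relations":  ["q1", "q2", "q3", "q4"],
    "open_systems":     ["q5", "q6", "q7", "q8"],
    "rational_goal":    ["q9", "q10", "q11", "q12"],
    "internal_process": ["q13", "q14", "q15", "q16"],
}

def culture_scores(items: pd.DataFrame) -> pd.DataFrame:
    """Average each respondent's items into one score per culture type."""
    return pd.DataFrame(
        {name: items[cols].mean(axis=1) for name, cols in SUBSCALES.items()}
    )
```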


MORC. 


We measured organizational readiness for change using the 53-item MORC questionnaire,18 which was scored on a five-point Likert scale (1 = strongly disagree; 5 = strongly agree) (Supplemental Digital Table 2, http://links.lww.com/ACADMED/A264). 


변화 관련 행동

Change-related behavior. 


Change-related behavior was measured using five types of behavior described by Herscovitch and Meyer20: 

      • active resistance, 
      • passive resistance, 
      • compliance, 
      • cooperation, and 
      • championing. 

Participants were asked to characterize the behavior of the members of their organization in relation to the curriculum change by distributing 100 points over the five types of behavior. For our analysis, we used the percentage of organizational members showing resistance (both active and passive). 
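A minimal sketch, under assumed column names, of how the resistance measure used in the analysis could be derived from each respondent's 100-point allocation:

```python
import pandas as pd

# Hypothetical columns holding the 100 points each respondent distributed
# over the five change-related behaviors described by Herscovitch and Meyer.
BEHAVIORS = ["active_resistance", "passive_resistance",
             "compliance", "cooperation", "championing"]

def resistance_share(points: pd.DataFrame) -> pd.Series:
    """Percentage of organizational members showing active or passive resistance."""
    totals = points[BEHAVIORS].sum(axis=1)
    assert (totals.round() == 100).all(), "each allocation should sum to 100 points"
    return points["active_resistance"] + points["passive_resistance"]
```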


GDP. 

We obtained current annual data on GDP per capita (U.S. dollars) from the Web site Trading Economics.42 



Data analysis 

문화간 디자인, 응답자들은 학교 안에 nested, 학교는 국가 안에 nested. 따라서 다수준 접근법이 필요하였다. 가설에 따른 인과관계를 기대하였으며, Structural equation modeling이 필요하였음. 따라서 Multilevel structural equation modeling을 활용하였음.

The cross-cultural design with participants nested within schools and schools nested within countries required a multilevel approach.43 In addition, we expected causal relations described in the hypotheses and summarized in our model (Figure 1), requiring structural equation modeling.44 We therefore used multilevel structural equation modeling to analyze the data.43 An advantage of this approach is that multiple relations can be tested simultaneously in one model. 
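To make the nesting structure concrete, a stripped-down two-level random-intercept model with one outcome, one individual-level predictor, and one school-level predictor can be written as below. The study's actual model contains several latent constructs and paths, so this is only an illustrative skeleton of the "within" and "between" levels, not the fitted model.

```latex
% Within (individual) level: respondent i nested in medical school j
y_{ij} = \beta_{0j} + \beta_{w}\, x_{ij} + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma^{2})

% Between (school) level: only the intercept varies across schools (fixed slope \beta_w)
\beta_{0j} = \gamma_{00} + \gamma_{b}\, z_{j} + u_{0j},
\qquad u_{0j} \sim N(0, \tau^{2})
```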


우선 construct scale의 신뢰도를 추정하였음. 변수들이 정규분포를 하지 않았으므로 robust maximum likelihood estimation을 사용하였음. multicolinearity는 보이지 않았으며, ICC는 충분히 컸다. 개념모델에 대한 검정은 MSEM을 Mplus statistical software로 맞춰보았다. 

We first estimated the reliability of the construct scales. Because the variables were not distributed normally, we performed robust maximum likelihood estimation,45,46 which produces maximum likelihood parameter estimates and standard errors that are robust to nonnormality. There were no signs of multicolinearity, implying an absence of strong correlations between the predictors (all tolerance values > 0.10). Intraclass correlations (ICCs) computed to examine between-cluster variability (Table 2) were sufficiently large (ICC > 0.05) to justify the use of multilevel structural equation modeling.47 The conceptual model (Figure 1) was tested by fitting a multilevel structural equation model to the data using Mplus statistical software, version 5.21 (Muthén and Muthén, Los Angeles, California). Observed scores at the individual level were included in the first “within level.” We added average MORC, organizational culture, and faculty resistance scores of participants from the same school in the second “between level.” Scores at the national level (national culture and GDP) were also included in the second level because the number of schools per country was too low to include these variables in a third level. We assumed random intercepts and fixed slopes across medical schools.44 The following fit indices and criteria were used: the root mean square error of approximation (RMSEA< 0.08), the comparative fit index (CFI > 0.9), and the standardized root mean square residual (SRMR < 0.08).48,49 
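For readers who want to check the between-cluster variability criterion themselves, a one-way ANOVA estimate of ICC(1) can be computed as in the sketch below. The column names are hypothetical, and the average-cluster-size correction is a common approximation for unbalanced designs rather than necessarily the exact estimator used by Mplus.

```python
import pandas as pd

def icc1(df: pd.DataFrame, value: str, cluster: str) -> float:
    """One-way ANOVA estimate of ICC(1): share of variance lying between clusters."""
    groups = df.groupby(cluster)[value]
    k_bar = groups.size().mean()                 # average cluster size (approximation)
    grand_mean = df[value].mean()
    # between- and within-cluster mean squares
    ss_between = (groups.size() * (groups.mean() - grand_mean) ** 2).sum()
    ms_between = ss_between / (groups.ngroups - 1)
    ss_within = ((df[value] - groups.transform("mean")) ** 2).sum()
    ms_within = ss_within / (len(df) - groups.ngroups)
    return (ms_between - ms_within) / (ms_between + (k_bar - 1) * ms_within)

# Hypothetical usage: respondents nested in schools, MORC-capability as outcome.
# icc1(data, value="morc_capability", cluster="school_id")  # > 0.05 justifies multilevel modeling
```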


Ethical considerations 


After explaining the aim and purpose of the study, voluntary nature of participation, and confidentiality of the contributions, we obtained digital informed consent from all participants. The study was approved by the ethical review board of the Dutch Association for Medical Education. 



Results 


991명의 교수, 131개의 의과대학, 56개 국가. 이전 MORC 연구의 일반화가능도 분석에 따르면 5명 미만이 응답한 학교는 제외해야 했으나, 여기에서는 1명만 참여한 37개 학교만 배제하였음.


Of the 1,073 contact persons from 345 medical schools in 80 countries we invited to administer the MORC questionnaire at their schools, 708 (66%) agreed. We were not informed how many colleagues each of the contact persons invited to complete the MORC questionnaire. The questionnaire was completed by 991 staff members from 131 medical schools in 56 countries (Supplemental Digital Table 1, http:// links.lww.com/ACADMED/A264). The average age of the participants was 47 years (range 21–84), and 475 (47.9%) were male. All characteristics of participants are presented in Table 1. Supplemental Digital Table 3 (http:// links.lww.com/ACADMED/A264) shows the means, standard deviations, and intercorrelations (Pearson) of all variables. On the basis of the generalizability analysis of MORC in a previous study, schools with fewer than 5 participants should have been excluded.18 However, to maintain a sufficient number of schools while conforming to the minimum of 2 participants per cluster as required for two-level modeling, we excluded 37 schools with only 1 participant. Exclusion of 7 medical schools from 3 countries for which no data on national or regional culture were available resulted in a total of 911 respondents from 87 medical schools (on average, 10.5 respondents per school) in 48 countries. Missing values and nonapplicable answers were below 10% of the total number of observations and replaced by the item means.50 
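The item-mean replacement of missing and non-applicable answers described above amounts to a few lines of code; the file name and the wide, one-column-per-item layout are assumptions for illustration.

```python
import pandas as pd

# Hypothetical wide data frame: one row per respondent, one column per item.
responses = pd.read_csv("morc_responses.csv")   # file name is illustrative

# Missing / non-applicable answers (coded as NaN) stayed below 10% of all
# observations in the study; replace each by the mean of its item.
total_missing_share = responses.isna().to_numpy().mean()
assert total_missing_share < 0.10, "missingness exceeds the 10% reported in the study"
responses = responses.fillna(responses.mean(numeric_only=True))
```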






Cronbach alphas of the organizational culture subscales (0.80–0.87) suggested reliable replication in our population (all above 0.67) (Table 2). The process of validation of MORC for our population is described in a previous study.18 
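For reference, the internal-consistency coefficient reported here is Cronbach's alpha; for a subscale of k items whose total score X is the sum of the item scores, it is computed as

```latex
\alpha \;=\; \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^{2}_{Y_i}}{\sigma^{2}_{X}}\right)
```

where the sigma-squared terms are the item variances and the variance of the total score, respectively.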







Our initial two-level structural equation model showed a poor fit with the data (CFI = 0.91, Tucker–Lewis index [TLI] = 0.70, RMSEA = 0.12, standard root mean square within [SRMRW] = 0.05, standard root mean square between [SRMRB] = 0.21) (Table 3). The modification indices suggested strong significant effects between underlying MORC dimensions (MORC–capability on MORC–motivation and vice versa) as well as direct effects of all four organizational types on resistance to change. As we considered it plausible that perceived capability and motivation would impact each other and organizational types would not have only an indirect, but also a direct effect on resistance to change, we applied the modifications (between MORC–capability and MORC–motivation and between open systems organizations and resistance to change). This yielded a reasonable fit (CFI = 0.96, TLI = 0.87, RMSEA = 0.08, SRMRW = 0.02, SRMRB = 0.21), which means that with the two adaptations the model gives an acceptable representation of the data. Therefore, the causal paths within the model may be interpreted (Table 3). Figure 2 presents a summary of the results from fitting our two-level model to the data. 



This final model 

  • fully supported hypothesis 1 (a positive effect of MORC–extrinsic pressure on faculty resistance and a negative effect of MORC–capability and MORC–motivation—via MORC–capability—on faculty resistance) and 
  • fully supported hypothesis 2 (a positive effect of human relations and open systems on successful curriculum change). 
  • Partial support was found for hypothesis 4 (an expected negative effect of uncertainty avoidance on successful curriculum change and an unexpected negative effect of uncertainty avoidance on rational goal). 
  • Partial support was also found for hypothesis 5 (expected positive effect of power distance on internal process and an unexpected positive effect of power distance on successful curriculum change). 










Discussion 

개념 모델이 잘 맞았음. 국가와 조직문화가 교육과정 변화의 성공에 큰 영향을 준다. 국가문화가 의학교육에 미치는 영향은 이미 논의된 바 있는데, 조직문화에 대해서는 별로 없다. 

Our findings revealed a reasonable fit of our conceptual model with the data after two plausible modifications, necessitating further research to test the adapted conceptual model. Nevertheless, the findings revealed significant effects of national and organizational culture on the success of medical curriculum change. The influence of national culture on medical education has been demonstrated previously.4,5,51–53 However, the impact of organizational culture on change has only been demonstrated in business and health care organizations.6–9 To our knowledge, our study is the first to demonstrate this effect in medical schools. 


긍정적 영향을 미치는 요인들로는 아래와 같은 것들이 있음. 

GDP가 일부 영향을 주긴 하지만, 유의한 역할을 하고 있지는 않음.

Specific characteristics of national culture (high power distance and/ or low uncertainty avoidance) and organizational culture (human relations and/or open systems) had a positive effect on successful curriculum change. Clear positive effects on successful change were a certain level of risk taking and flexible policies and procedures (low uncertainty avoidance/open systems), strong leadership and strict hierarchy (high power distance/internal process), a high concern for new ideas and teamwork (human relations), and focus on growth and innovation (open systems). As expected, a certain level of risk taking and flexible policies and procedures stimulated the introduction of innovative ideas.12,26,35 Power distance unexpectedly stimulated successful curriculum change, perhaps through the positive impact of centralized command on the coordination of the complex process of curriculum change.35 Although a certain level of financial investment is required for curriculum change, the level of national wealth (GDP) did not have a significant role in the process of curriculum change, so perhaps the effect of national wealth is much smaller than the effect of national and organizational culture. 


Interpretation in light of previous research

With regard to organizational culture, teamwork (human relations), especially beyond one’s own discipline, is uncommon in medical schools with traditional curricula, but may be advantageous for integrated curricula, such as PBL curricula.54 Adaptation of the curriculum to the external environment (open systems), including to local community needs, is one of the main challenges for medical schools.2,55–57 As other (regional) medical schools are facing the same problems, collaborations could serve the exchange of effective solutions.57,58 Although the rational goal and internal process organizational culture types did not show a direct effect on MORC, they indirectly had a positive effect through open systems and human relations, which indicates that it is important for an organization to aim for more balanced norms and values (congruence) with a strong focus on human relations/open systems and also a reasonable share of values related to internal process and rational goal. Similar findings were described by Quinn and Spreitzer,24 who argued that emphasis on one organizational type can lead to narrowness and an inability to adapt to a changing environment. 


There is a fundamental tension between national and organizational culture. The same holds in medical schools. The case of PBL.

There is a fundamental tension in the relationship between national and organizational culture.10 Organizations likely feel compelled to conform to existing cultural norms and values on the one hand, while they also have to innovate, which may challenge the cultural norms and values and cause the organizational culture to deviate from the dominant national cultural context. In medical schools, the same tension between national and organizational culture exists; 

for instance, the introduction of PBL requires an open communication style, which seems less feasible in more collectivistic cultures with a strong fear of loss of face.59,60 Nevertheless, many medical schools in collectivistic cultures have successfully introduced PBL.52,59 


Which of the two is more important? National culture explained 40% of the differences between medical schools, while organizational culture explained differences within medical schools.

In our model, both national and organizational culture influenced successful curriculum change, making us wonder whether both are equally important. As national and organizational culture are included in different levels in the model, we can only conclude that national culture explained 40% of differences in MORC– capability among different medical schools, and organizational culture explained differences within different medical schools (27.5% of differences in MORC–capability, 12.3% of differences in MORC–motivation, and 6.5% of differences in MORC–extrinsic pressure, respectively; data not shown). 


Since national culture cannot be changed, it is more efficient simply to anticipate its effects. In a risk-averse culture, change leaders should lower anxiety by explaining what is being done to minimize risks. In a culture with high power distance, medical school leaders should use the centralized structure and top-down decision making so that fast decisions can follow once the rationale has been communicated.

National and organizational culture factors should be taken into account by medical schools in the process of curriculum change. Because it may be impossible to change national culture, it may be more efficient to anticipate its effects. In a culture that is risk-averse, the leader of a change project could mitigate the perception of risk by explaining what efforts are being made to minimize the risks. In a culture with high power distance, the leader of a medical school could use the centralized organizational structure and top-down decision making to make the required fast decisions after communicating the rationale behind the decisions to the organizational members. 


Limitations of Hofstede's dimensions

For the operationalization of national culture, we used Hofstede’s dimensions, which have their own limitations—for instance, with regard to the study population of IBM employees only.61 Unfortunately, the absence of Hofstede’s index scores for some countries forced us to use regional scores and exclude participants from three countries without national or regional scores (Supplemental Digital Table 1, http://links.lww.com/ACADMED/A264). 


There may be objections to this procedure, but substituting mean dimension scores for the missing countries had no significant effect on the fit indices of the multilevel SEM.

Although objections to this procedure may be valid, a separate analysis in which mean dimension scores were substituted for the missing country scores had no significant effect on the fit indices of the multilevel structural equation model (data not shown). Although we studied a relatively large cross-national sample, the relatively low number of respondents and especially the limited number of medical schools per country with respect to the large number of parameters may explain the initial poor fit indices of our conceptual model. In addition, because of this limited number of medical schools per country, we had to include observed scores on the national level in the second level, preventing analysis of variance in MORC between different countries. Another limitation is the inability to provide a response rate of the invited participants. Because it was left to the contact persons of Maastricht University to invite faculty members in their medical schools, we have no insight into how many individuals were eventually invited to participate. Further research to test the adapted model would benefit from a larger randomly selected sample. 


The ICCs show that between-group variance is smaller than within-group variance; that is, perceptions of organizational culture and MORC can differ within the same medical school. Members of the same school also reported being in different phases of change. Individual readiness for change may likewise vary with experience, level of involvement, and personal preference, which in turn shapes each person's judgment of the school's readiness.

ICCs of both organizational culture and MORC scores showed that the between-group variance was small compared with the within-group variance, suggesting that perceptions of organizational culture and MORC may differ between members of the same medical school. In addition, members from the same school reported their school to be in a different phase of change (i.e., preparation or implementation phase). Perhaps perceptions of members of the same team or department may be more homogeneous than perceptions within the school as a whole, which would require further analysis of variance of the perceptions of readiness for change within teams and departments. Additionally, individual readiness for change may differ between organizational members on the basis of their previous experiences, their level of involvement in the change process, and their personal preferences, all of which can influence individual perceptions of a medical school’s readiness for change.25 Unfortunately, the software Mplus did not allow us to insert the variables of Table 1 (e.g., gender, age, context of change, and size of the medical school) as covariates. We expect these aspects to have an influence on the change process as well, which indicates the need for future expansion of this research. 


In a future study, it would be illuminating to use cluster analysis to investigate interactions between the different organizational types by comparing the effect of different organizational culture profiles on successful curriculum change.24 It would also be interesting to explore whether medical schools show similar profiles of organizational culture across countries. If confirmed, this might indicate the presence of a medical-school-specific macro-culture, similar to specific hospital cultures reported in other studies.10,62 





Conclusion 


Our findings show that change is influenced by national and organizational culture characteristics such as flexible policies and procedures, interdisciplinary teamwork, adaptation to local community needs, and by collaboration with regional schools. Medical schools contemplating or implementing curriculum change should consider the potential impact of cultural factors in designing strategies to deal with potential sources of resistance. As it may be impossible to change national culture, it may be more efficient to anticipate its effects.











 2015 Mar 17. [Epub ahead of print]

Culture Matters in Successful Curriculum Change: An International Study of the Influence of National and Organizational Culture Tested With Multilevel Structural Equation Modeling.

Abstract

PURPOSE:

National culture has been shown to play a role in curriculum change in medical schools, and business literature has described a similar influence of organizational culture on change processes in organizations. This study investigated the impact of both national and organizational culture on successful curriculum change in medical schools internationally.

METHOD:

The authors tested a literature-based conceptual model using multilevel structural equation modeling. For the operationalization of national and organizational culture, the authors used Hofstede's dimensions of culture and Quinn and Spreitzer's competing values framework, respectively. To operationalize successful curriculum change, the authors used two derivates: medical schools' organizational readiness for curriculum change developed by Jippes and colleagues, and change-related behavior developed by Herscovitch and Meyer. The authors administered a questionnaire in 2012 measuring the described operationalizations to medical schools in the process of changing their curriculum.

RESULTS:

Nine hundred ninety-one of 1,073 invited staff members from 131 of 345 medical schools in 56 of 80 countries completed the questionnaire. An initial poor fit of the model improved to a reasonable fit by two suggested modifications which seemed theoretically plausible. In sum, characteristics of national culture and organizational culture, such as a certain level of risk taking, flexible policies and procedures, and strong leadership, affected successful curriculum change.

CONCLUSIONS:

National and organizational culture influence readiness for change in medical schools. Therefore, medical schools considering curriculum reform should anticipate the potential impact of national and organizational culture.

PMID: 25785674 [PubMed - as supplied by publisher]


Analysing objective test scores: AMEE Guide No. 66

Post-examination interpretation of objective test data: Monitoring and improving the quality of high-stakes examinations: AMEE Guide No. 66

MOHSEN TAVAKOL & REG DENNICK

University of Nottingham, UK






Introduction

Measurement error can arise for the following reasons.

The output of the examination process is transferred to students either formatively, in the form of feedback, or summatively, as a formal judgement on performance. Clearly, to produce an output which fulfils the needs of students and the public, it is necessary to define, monitor and control the inputs to the process. Classical Test Theory (CTT) assumes that inputs to post-examination analysis contain sources of measurement error that can influence the student's observed scores of knowledge and competencies. Sources of measurement error are derived from test construction, administration, scoring and interpretation of performance. For example: quality variation among knowledge-based questions, differences between raters, differences between candidates and variation between standardised patients (SPs) within an Objective Structured Clinical Examination (OSCE).


The simplest interpretation of reliability is that the proportion of error equals 1 minus the square of the reliability coefficient.

To improve the quality of high-stakes examinations, errors should be minimised and, if possible, eliminated. CTT assumes that minimising or eliminating sources of measurement errors will cause the observed score to approach the true score. Reliability is the key estimate showing the amount of measurement error in a test. A simple interpretation is that reliability is the correlation of the test with itself; squaring this correlation, multiplying it by 100 and subtracting from 100 gives the percentage error in the test. For example, if an examination has a reliability of 0.80, there is 36% error variance (random error) in the scores. As the estimate of reliability increases, the fraction of a test score that is attributable to error will decrease. Conversely, if the amount of error increases, reliability estimates will decrease (Nunnally & Bernstein 1994).


(...)


(...)



Interpretation of basic post-examination results

Individual questions

Descriptive analysis is the first step. If there are no missing responses, students either had adequate time (or knowledge) or were guessing on some questions. Conversely, many missing responses may mean that time was insufficient, the exam was too hard, or negative marking was used.

A descriptive analysis is the first step in summarising and presenting the raw data of an examination. A frequency distribution for each question immediately shows the number of missing responses and the patterns of guessing behaviour. For example, if no missing question responses were identified, this would suggest that students either had adequate time (and knowledge) to attempt every question or were guessing on some questions. Conversely, if there were missing question responses, this might be an indication of inadequate time for completing the examination, a particularly hard exam, or the use of negative marking (Stone & Yeh 2006; Reeve et al. 2007).


The SD shows the variation.

The means and variances of test questions can provide us with important information about each question. The mean of a dichotomous question, scored either 0 or 1, is equal to the proportion of students who answer correctly, denoted by p. The variance of a dichotomous question is calculated from the proportion of students who answer a question correctly (p) multiplied by the proportion who answer the question incorrectly (q). To obtain the standard deviation (SD), we merely take the square root of p × q. For example, if in an objective test, 300 students answered Question 1 correctly and 100 students answered it incorrectly, the p value for Question 1 will be equal to 0.75 (300/400), and the variance and SD will be 0.18 (0.75 × 0.25) and 0.42 (√0.18) respectively. The SD is useful as a measure of variation or dispersion within a given question. A low SD indicates that the question is either too easy or too hard. For example, in the above example, the SD is low, indicating that the item is too easy. Given the item difficulty of Question 1 (0.75) and a low item SD, one can conclude that responses to the item were not dispersed (there is little variability on the question) as most students selected the correct response. If the question had a high variability with a mean at the centre of the distribution, the question might be useful.
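
As a worked illustration of the arithmetic just described, the short Python sketch below (a hypothetical helper, not part of the Guide) reproduces the difficulty, variance and SD of a dichotomously scored question:

```python
import math

def item_stats(n_correct, n_total):
    """Difficulty (p), variance (p*q) and SD of a dichotomously scored question."""
    p = n_correct / n_total      # proportion answering correctly
    q = 1 - p                    # proportion answering incorrectly
    variance = p * q
    sd = math.sqrt(variance)
    return p, variance, sd

# Question 1 from the example above: 300 of 400 students answered correctly
p, variance, sd = item_stats(300, 400)
print(p, variance, sd)   # 0.75, 0.1875, 0.433... (rounded to 0.18 and 0.42 in the text)
```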


Total performance

To evaluate total performance, the total score and its mean and SD can be obtained.

After obtaining the mean and SD for each question, the test can be subjected to conventional performance analysis where the sum of correct responses of each student for each item is obtained and then the mean and SD of the total performance are calculated. Creating a histogram using SPSS allows us to understand the distribution of marks on a given test. Students’ marks can take either a normal distribution or may be skewed to the left or right or distributed in a rectangular shape. Figure 1(a) illustrates a positively skewed distribution. This simply shows that most students have a low-to-moderate mark and a few students received a relatively high mark in the tail. In a positively skewed distribution, the mode and the median are less than the mean, indicating that the questions were hard for most students. Figure 1(b) shows a negatively skewed distribution of students’ marks. This shows that most students have a moderate-to-high mark and a few students received a relatively low mark in the tail. In a negatively skewed distribution, the mode and the median are greater than the mean, indicating that the questions were easy for most students.




Figure 1(c) shows most marks distributed in the centre of a symmetrical distribution curve. This means that half the students scored greater than the mean and half less than mean. The mean, mode and median are identical in this situation. Based on this information, it is hard to judge whether the exam is hard or easy unless we obtain differences between the mode, median or mean plus an estimate of the SD. We have explained how to compute these statistics using SPSS elsewhere (Tavakol & Dennick 2011b; Tavakol & Dennick 2012).


As an example, we would ask you to consider the two distributions in Figure 2, which represent simulated marks of students in two examinations.




Both the mark distributions have a mean of 50, but show a different pattern. Examination A has a wide range of marks, with some below 20 and some above 90. Examination B, on the other hand, shows few students at either extreme. Using this information, we can say that Examination A is more heterogeneous than Examination B and that Examination B is more homogenous than Examination A.


In order to better interpret the exam data, we need to obtain the SD for each distribution. For example, if the mean marks for the two examinations are 67.0, with different SDs of 6.0 and 3.0, respectively, we can say that the examination with a SD of 3.0 is more homogenous and hence more consistent in measuring performance than the examination with a SD of 6.0. A further interpretation of the value of the SD is how much it shows students’ marks deviating from the mean. This simply indicates the degree of error when we use a mean to explain the total student marks. The SD also can be used for interpreting the relative position of individual students in a normal distribution. We have explained and interpreted it elsewhere (Tavakol & Dennick 2011a).
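
A minimal sketch of this kind of descriptive comparison, using simulated marks and assuming only numpy and scipy (the numbers are illustrative, not the marks behind the figures):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated total marks for two examinations with the same mean
exam_a = rng.normal(loc=50, scale=15, size=300).clip(0, 100)  # heterogeneous marks
exam_b = rng.normal(loc=50, scale=5, size=300).clip(0, 100)   # homogeneous marks

for name, marks in [("A", exam_a), ("B", exam_b)]:
    print(f"Exam {name}: mean={marks.mean():.1f}, SD={marks.std(ddof=1):.1f}, "
          f"median={np.median(marks):.1f}, skew={stats.skew(marks):.2f}")

# The larger SD of Exam A means its marks deviate more from the mean;
# a positive skew statistic indicates a tail of high marks (a hard test),
# a negative skew a tail of low marks (an easy test).
```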



Interpretation of classical item analysis

However, because a test cannot be repeated an infinite number of times, the alternative is to have as many students as possible take the test once.

In scientific disciplines, it is often possible to measure variables with a great deal of accuracy and objectivity, but when measuring student performance on a given test, due to a wide variety of confounding factors and errors, this accuracy and objectivity becomes more difficult to obtain. For instance, if a test is administered to a student, he or she will obtain a variety of scores on different occasions, due to measurement errors affecting his or her score. Under CTT, the student's score on a given test is a function of the student's true score plus random errors (Alagumalai & Curtis 2010), which can fluctuate from time to time. Due to the presence of random errors influencing examinations, we are unable to exactly determine a student's true score unless they take the exam an infinite number of times. Computing the mean score over all these exams would eliminate random errors, resulting in the student's score eventually equalling the true score. However, it is practically impossible to take a test an infinite number of times. Instead we ask an infinite number of students (in reality a large cohort!) to take the test once, allowing us to estimate a generalised standard error of measurement (SEM) from all the students’ scores. The SEM allows us to estimate the true score of each student, which has been discussed elsewhere (Tavakol & Dennick 2011b).
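
The idea that averaging over (hypothetically) infinite repeated administrations removes random error can be illustrated with a small simulation; the numbers below are purely hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

true_score = 70.0           # the student's (unknowable) true score
n_replications = 10_000     # imagine the same exam taken many times

# Under CTT, each administration yields observed = true + random error
errors = rng.normal(loc=0.0, scale=5.0, size=n_replications)
observed = true_score + errors

print(f"mean observed score over replications: {observed.mean():.2f}")
# -> close to 70: averaging over many replications cancels random error,
#    leaving (an estimate of) the true score.
```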


Reliability

It is worth reiterating here that just as the observed score is composed of the sum of the true score and the error score, the variance of the observed score in an examination is made up of the sum of the variances of the true score and the error score, which can be formulated as follows:

variance(observed) = variance(true) + variance(error)    (1)


Now imagine a test has been administered to the same cohort several times. If there is a discrepancy between the variance of the observed scores for each individual, on each test, the reliability of the test will be low. The test reliability is defined as the ratio of the variance of the true score to the variance of the observed score:

reliability = variance(true) / variance(observed)    (2)

Given this, the greater the ratio of the true score variance to the observed score variance, the more reliable the test. If we substitute variance (true scores) from Equation (1) in Equation (2), the reliability will be as follows:

reliability = [variance(observed) − variance(error)] / variance(observed)

And then we can rearrange the reliability index as follows:

reliability = 1 − variance(error) / variance(observed)

This equation simply shows the relationship between source of measurement error and reliability. For example, if a test has no random errors, the reliability index is 1, whereas if the amount of error increases, the reliability estimate will decrease.
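
The same decomposition can be written out as a quick numerical check (simulated true and error scores, not real exam data):

```python
import numpy as np

rng = np.random.default_rng(2)
n_students = 500

true_scores = rng.normal(60, 10, n_students)    # variance(true)  ~ 100
error_scores = rng.normal(0, 5, n_students)     # variance(error) ~ 25
observed = true_scores + error_scores           # variance(observed) ~ 125

reliability = 1 - error_scores.var(ddof=1) / observed.var(ddof=1)
print(f"reliability ≈ {reliability:.2f}")        # expected around 100/125 = 0.80
```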



Increasing the test reliability

The statistical procedures employed for estimating reliability are Cronbach's alpha and the Kuder–Richardson 20 formula (KR-20). If the test reliability was less than 0.70, you may need to consider removing questions with low item-total correlation. For example, we have created a simulated SPSS output for four questions in Tables 1 and 2.




Table 1 shows Cronbach's alpha for four questions, 0.72. Table 2 shows item-total correlation statistics with the column headed ‘Cronbach's Alpha if Item deleted’. (Item-total correlation is the correlation between an individual question score and the total score).


The fourth question in the test has an item-total correlation of −0.51, implying that responses to this particular question have a negative correlation with the total score. If we remove this question from the test, the alpha of the three remaining questions increases from 0.725 to 0.950, making the test significantly more reliable.
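
For readers who want the same statistics outside SPSS, a minimal Python sketch is shown below; the 0/1 scores are hypothetical and are not the data behind Tables 1 and 2:

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a students x questions score matrix."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical dichotomous scores: 8 students x 4 questions
scores = pd.DataFrame({
    "Q1": [1, 1, 0, 1, 0, 1, 1, 0],
    "Q2": [1, 1, 0, 1, 0, 1, 0, 0],
    "Q3": [1, 0, 0, 1, 0, 1, 1, 0],
    "Q4": [0, 0, 1, 0, 1, 0, 0, 1],   # a flawed, negatively keyed question
})

print(f"alpha (all questions): {cronbach_alpha(scores):.3f}")
for q in scores.columns:
    rest = scores.drop(columns=q)
    item_total_r = scores[q].corr(rest.sum(axis=1))   # corrected item-total correlation
    print(f"{q}: item-total r = {item_total_r:+.2f}, "
          f"alpha if deleted = {cronbach_alpha(rest):.3f}")
```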


Tables 3 and 4 show the SPSS output after removing Question 4:



Tables 3 and 4 illustrate the impact of removing Question 4 from the test, which significantly increases the value of alpha.


If Question 2 in Table 4 is also removed, alpha becomes perfect (= 1); that is, the remaining questions measure exactly the same thing. This is not necessarily good, because it implies redundant repetition in the test. In that case the test could be shortened without compromising reliability. Reliability is a function of the number of items: the more items, the higher the reliability.

However, if we now remove Question 2, the value of the alpha for the test will be perfect, i.e. 1, which means each question in the test must be measuring exactly the same thing. This is not necessarily a good thing as it suggests that there is redundancy in the test, with multiple questions measuring the same construct. If this is the case, the test length could be shortened without compromising the reliability (Nunnally & Bernstein 1994). This is because reliability is a function of test length: the more items, the higher the reliability of a test.
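
The dependence of reliability on test length is usually quantified with the Spearman–Brown prophecy formula. The formula is not named in the Guide, but it is the standard expression of this point; a quick sketch:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened (or shortened) by
    `length_factor`, assuming the added items are comparable to the originals."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# A 20-item test with alpha = 0.60, doubled to 40 comparable items:
print(f"{spearman_brown(0.60, 2.0):.2f}")   # -> 0.75
# The same test halved to 10 items:
print(f"{spearman_brown(0.60, 0.5):.2f}")   # -> 0.43
```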


Although alpha and KR-20 are useful for estimating reliability, they conflate all possible sources of error into a single value. Error can arise from many sources, however, and the influence of each source can be estimated with the generalisability coefficient.

Although Cronbach's alpha and KR-20 are useful for estimating the reliability of a test, they conflate all sources of measurement error into one value (Mushquash & O'Connor 2006). Recall that observed scores equal true scores plus error, which derives from a variety of sources. The influence of each source of error can be estimated by the coefficient of generalisability, which is similar to a reliability estimate in the true score model (Cohen & Swerdlik 2010). Later we will describe how to identify and reduce sources of measurement errors using generalisability theory, or G-theory as it is known. What is more, in our previous Guide (Tavakol & Dennick 2012), we explained and interpreted item difficulty level, item discrimination index and point bi-serial coefficient in terms of CTT. In this Guide, we will explain and interpret these concepts in terms of Item Response Theory (IRT) using item characteristic parameters (item difficulty and item discrimination) and student ability/performance across all questions using the Rasch model.



Factor analysis

Linear factor analysis is widely used by test developers in order to reduce the number of questions and to ensure that important questions are included in the test. For example, the course convenor of cardiology may ask all medical teachers involved in teaching cardiology to provide 10 questions for the exam. This might generate 100 questions, but all these questions are not testing the same set of concepts. Therefore, identifying the pattern of correlations between the questions allows us to discover related questions that are aimed at the underlying factors of the exam. A factor is a construct which represents the relationship between a set of questions and will be generated if the questions are correlated with the factor. In factor analysis language, this refers to factor ‘loadings’. After factor analysis is carried out, related questions load onto factors which represent specific named constructs. Questions with low loadings can therefore be removed or revised.


If a test measures a single trait, only one factor with high loadings will explain the observed question relationships and hence the test is uni-dimensional. If multiple factors are identified, then the test is considered to be multi-dimensional.


There are two main kinds: EFA and CFA.

There are two main components to linear factor analysis: exploratory and confirmatory. Exploratory Factor Analysis (EFA) identifies the underlying constructs or factors within a test and hypothesises a model relationship between them. Confirmatory Factor Analysis (CFA) validates whether the model fits the data using a new data set. Below, each method is explained.


Exploratory factor analysis

EFA is used to revise questions or to select questions belonging to a specific knowledge domain. It also calculates the communality of each question.

EFA is widely used to identify the relationships between questions and to discover the main factors in a test as previously described. It can be used either for revising exam questions or choosing questions for a specific knowledge domain. For example, if in the cardiology exam we are interested in testing the clinical manifestations of coronary heart disease, we simply look for the questions which load on to this domain. The following simulated example, using an examination with 10 questions taken by 50 students, demonstrates how to improve the questions in an examination. This allows us to demonstrate how to revise and strengthen exam questions and to calculate the loadings on the domain of interest. As well as identifying the factors EFA also calculates the ‘communality’ for each question. To understand the concept of communality, it is necessary to explain the variance (the variability in scores) within the EFA approach.


In factor analysis, the variance of each question is divided into two parts.

We have already learnt from descriptive statistics how to calculate the variance of a variable. In the language of factor analysis, the variance of each question consists of two parts. 

      • One part can be shared with the other questions, called ‘common variance’; 
      • the rest may not be shared with other questions, called ‘error’ or ‘random variance’. 

The communality of a question is the proportion of its variance explained by the identified factors, ranging from 0 to 1.

The communality for a question is the value of the variance accounted for by the particular set of factors, ranging from 0 to 1.00. 

For example, a question that has no random variance would have a communality of 1.00; a question that has not shared its variance with other questions would have a communality of 0.00. The communality shown for Question 9 (Table 5) is 0.85, that is 85% of the variance in Question 9 is explained by factor 1 and factor 2, and 15% of the variance of Question 9 has nothing in common with any other question. To compute the shared variances for each question in SPSS, the following steps are carried out in SPSS (SPSS 2009). From the menus, choose ‘Analyse’, ‘Dimension Reduction’ and ‘Factor’, respectively. Then move all questions on to the ‘Variables’ box. Choose ‘Descriptive’ and then click ‘Initial Solution’ and ‘Coefficients’, respectively. Then click ‘Rotation’. Choose ‘Varimax’ and click on ‘Continue’ and then ‘OK’. In Table 5, we have combined the simulated data of the SPSS output together.
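
Outside SPSS, a varimax-rotated EFA and the communalities can be sketched in Python with recent versions of scikit-learn (the data frame below is simulated and purely illustrative; other packages such as factor_analyzer would also work):

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)

# Hypothetical scores: 50 students x 10 questions driven by two latent factors
latent = rng.normal(size=(50, 2))
true_loadings = np.zeros((2, 10))
true_loadings[0, :7] = rng.uniform(0.6, 0.9, 7)   # factor 1 drives Q1-Q7
true_loadings[1, 7:] = rng.uniform(0.6, 0.9, 3)   # factor 2 drives Q8-Q10
items = pd.DataFrame(latent @ true_loadings + rng.normal(scale=0.5, size=(50, 10)),
                     columns=[f"Q{i+1}" for i in range(10)])

z = StandardScaler().fit_transform(items)              # analyse the correlation structure
fa = FactorAnalysis(n_components=2, rotation="varimax").fit(z)

loadings = pd.DataFrame(fa.components_.T, index=items.columns,
                        columns=["Factor 1", "Factor 2"])
loadings["h2"] = (loadings ** 2).sum(axis=1)            # communality of each question
print(loadings.round(2))
# Questions that load below 0.32 on every factor, or with h2 below 0.30,
# are candidates for revision or removal.
```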



How large should loadings be?

Table 5 shows that two factors have emerged. Factor 1 demonstrates excellent loading with Questions 9, 2, 6, 10, 4, 1 and 3 and Factor 2 demonstrates excellent loading with Questions 7 and 8, indicating these items have a strong correlation with Factors 1 and 2. 

      • It should be noted that loadings with values greater than 0.71 are considered excellent (0.71 × 0.71 ≈ 0.50, i.e. 50% common variance between the item and the factor, or 50% of the variation in the item can be explained by the variation in the factor), 
      • 0.63 (40% common variance) very good, 
      • 0.45 (20% common variance) fair. 
      • Values less than 0.32 (10% common variance) are considered poor, contribute less to the overall test, and should be investigated (Comrey & Lee 1992; Tabachnick & Fidell 2006). 


The column labelled h² gives each question's communality. For Question 5, a value of 8% means that only 8% of its variance is explained by the factors; questions below 30% are not related to the other questions that load on the identified factors. In Table 5, Question 5 has the lowest communality and does not load on Factor 1 or Factor 2, so it should be revised or discarded.

Table 5 also shows communalities for each question in the column labelled h2. For example, 92% of the variance in Question 2 is explained by the two factors that have emerged from the EFA approach. The lowest communality is for Question 5, indicating that only 8% of its variance is explained. Low values of less than 30% indicate that the variance of the question does not relate to the other questions loaded on to the identified factors. In Table 5, Question 5 has the lowest communality figure and has not loaded onto Factors 1 or 2, suggesting this question should be revised or discarded.


Before Question 5 is deleted, Factor 1 accounts for 0.47 and Factor 2 for 0.23 of the variance; after deleting it, the total rises to 0.78. Most questions load on Factor 1, which provides evidence of convergence and discrimination for construct validity: the test is 'convergent' (high loadings on Factor 1) and 'discriminant' (questions loading on Factor 1 do not load on Factor 2). Cronbach's alpha should therefore be calculated separately for each factor.

Table 5 also shows the values of variance explained by the two factors that have been identified from the EFA approach; 0.47 of the variance is accounted for by Factor 1 and 0.23 of the variance is accounted for by Factor 2. Therefore, 0.70 of the variance is accounted for by all of the questions. However, if we delete Question 5, we can increase the total variance accounted for to 0.78. A further interpretation of Table 5 is that the vast majority of questions have been loaded on to Factor 1, providing evidence of convergence and discrimination for the construct validity of the test. We can argue that the test is convergent as there are high loadings on to Factor 1. The test is also discriminant as the questions that have loaded on to Factor 1 have not loaded on to Factor 2. This means that Factor 2 measures another construct/concept which is discriminated from Factor 1. Because two factors have been identified, it would be appropriate to calculate Cronbach's alpha co-efficient for each factor because they are measuring two different constructs. It should be noted that items which load on more than two factors need to be investigated.


Confirmatory factor analysis

CFA uses the hypothesised model extracted from EFA to confirm the latent factors. To avoid a circular argument, however, a new data set is needed to confirm model fit.

The technique of CFA has been widely used to validate psychological tests but has been less used to evaluate and improve the psychometric properties of exam questions. The EFA approach can reveal how exam questions are correlated or connected to an underlying domain of factors. For example, an EFA approach may show that the internal structure of a 100 question test consist of three underlying domains, say physical examination, clinical reasoning and communication skills. The number of factors identified constitutes the components of a hypothesised model, the factor structure model. In the above example, the model would be termed a three-factor model. The CFA approach uses the hypothesised model extracted by EFA to confirm the latent (underlying) factors. However, in order to confirm model fitting, a new data set must be used to avoid a circular argument. For example, the same test could be administered to a different but comparable group of students.


Therefore, a model is first identified with EFA and then tested with CFA. Structural equation modelling is used to assess how well the new sample data fit the hypothesised model.

Therefore, educators must first identify a model using EFA and test it using CFA. This approach also allows educators to revise exam questions and the factors underlying their constructs (Floyd & Widaman 1995). For example, suppose EFA has revealed a two-factor model from an exam consisting of history-taking and physical examination questions. The researcher wishes to measure the psychometric characteristics of the questions and test the overall fit of the model to improve the validity and reliability of the exam. This can be achieved by the use of structural equation modelling (SEM), which determines the goodness-of-fit of the newly input sample data to the hypothesised model. The model fit is assessed using Chi-square testing and other fit indices. In contrast to other statistical hypothesis testing procedures, if the value of Chi-square is not significant, the new data fit and the model is confirmed. However, as the value of Chi-square is a function of increasing or decreasing sample size, other fit indices should also be investigated (Dimitrov 2010). These indices are the comparative fit index (CFI) and the root mean square error of approximation (RMSEA): 

      • A CFI value of greater than 0.90 shows a psychometrically acceptable fit to the exam data. 
      • The value of RMSEA needs to be below 0.05 to show a good fit (Tabachnick & Fidell 2006). A RMSEA of zero indicates that the model fit is perfect. 
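
For readers without access to AMOS or LISREL, a confirmatory model of this kind can also be sketched with the open-source semopy package in Python. Everything below (the factor names ht and pe, the questions q1–q8, and the simulated data) is purely illustrative, and function names may differ slightly across semopy versions:

```python
# pip install semopy
import numpy as np
import pandas as pd
import semopy

# Simulated scores for 8 questions driven by two correlated latent factors
rng = np.random.default_rng(4)
n = 200
ht_factor = rng.normal(size=n)
pe_factor = 0.7 * ht_factor + rng.normal(scale=0.7, size=n)
df = pd.DataFrame(
    {f"q{i+1}": (ht_factor if i < 4 else pe_factor) + rng.normal(scale=0.6, size=n)
     for i in range(8)})

# Two-factor CFA: history-taking (ht) and physical examination (pe)
model_desc = """
ht =~ q1 + q2 + q3 + q4
pe =~ q5 + q6 + q7 + q8
ht ~~ pe
"""

model = semopy.Model(model_desc)
model.fit(df)

print(model.inspect())              # loadings and the factor covariance
print(semopy.calc_stats(model).T)   # chi-square, CFI, TLI, RMSEA, etc.
# Rules of thumb from the Guide: CFI > 0.90 and RMSEA < 0.05 suggest an
# acceptable / good fit of the hypothesised model to the new data set.
```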

Available statistical programs and how to run the analysis

It should be noted that CFA can be run in a number of popular statistical software programmes such as SAS, LISREL, AMOS and Mplus. For the purpose of this article, we chose AMOS (Analysis of Moment Structures) for its ease of use. The AMOS software program can easily create models and calculate the value of Chi-square as well as the fit indices. In the above example, a test of 8 questions has two factors, history-taking and physical examination, and the variance of these eight exam questions can be explained by these two highly correlated factors. The test developer draws the two-factor model (the path diagram) in AMOS to test the model (Figure 3). Before estimating the parameters of the model, click on ‘view’, click on ‘Analysis Properties’ and then click on ‘Minimization history’, ‘Standardised estimates’, ‘Squared multiple Correlations’ and ‘Modification indices’. To run the estimation, from the menu at the top, click on ‘Analyze’, then click on ‘Calculate Estimates’.



The output gives intercepts and slopes: the intercept is analogous to the item difficulty index and the slope to the discrimination index.

The output is given in Table 6. SEM calculates the slopes and intercepts of calculated correlations between questions and factors. From a CTT, the intercept is analogous to the item difficulty index and the slope (standardised regression weights/coefficients) is analogous to the discrimination index.



Question 4 of the history-taking component has low discrimination and therefore contributes little to the overall history-taking score.

Table 6 shows that Question 1 in history-taking and Question 3 in physical examination were easy (intercept = 0.97) and hard (0.08), respectively. Table 6 also shows that Question 4 in history-taking is not contributing to the overall history-taking score (slope = −0.03). Further analysis was conducted to assess the degree of fit of the model to the exam data. Focusing on Table 7, the absence of significance for the Chi-square value (p = 0.49) implies support for the two-factor model in the new sample. In reviewing the values of both CFI and RMSEA in Table 7, it is evident that the two-factor model represents a good fit to the exam data for the new sample.


The correlation of 0.70 between the history-taking and physical examination factors further supports the hypothesised two-factor model.

Further evidence for the relationship between the history-taking and physical examination components of the test is revealed by the calculation of a 0.70 correlation between the two factors, supporting the hypothesised two-factor model. It should be noted that AMOS will display the correlation between factors/components by clicking the ‘view the output diagram’ button. You can also view correlation estimates from ‘text output’. From the main menu, choose view and then click on ‘text output’.



Generalisability theory analysis

We would ask you to recall that reliability is concerned with the ability of a test to measure students' knowledge and competencies consistently. For example, if students are re-examined with the same items and with the same conditions on different occasions, the results should be more or less the same. In CTT, the items and conditions may be the causes of measurement errors associated with the obtained scores. Reliability estimates, such as KR-20 or Cronbach's alpha, cannot identify the potential sources of measurement error associated with these items and conditions (also known as facets of the test) and cannot discriminate between each one. However, an extension of CTT called Generalisability Theory or G-theory, developed by Lee J. Cronbach and colleagues (Cronbach et al. 1972), attempts to recognise, estimate and isolate these facets allowing test constructors to gain a clearer picture of sources of measurement error for interpreting the true score. One single analysis of, for example, the results of an OSCE examination, using G-theory can estimate all the facets, potentially producing error in the test. Each facet of measurement error has a value associated with it called its variance component, calculated via an analysis of variance (ANOVA) procedure, described below. These variance components are next used to calculate a G-coefficient which is equivalent to the reliability of the test and also enables one to generalise students’ average score over all facets.


For example, imagine an OSCE has used SPs, a range of examiners and various items to assess students' performance on 12 stations. SPs, examiners and items and their interactions (e.g. the interaction between SPs and items) are considered as facets of the assessment. The score that the student obtains from the OSCE will be affected by these facets of measurement error and therefore the assessor should estimate the amount of error caused by each facet. Furthermore, we examine students using a test to make a final decision regarding their performance on the test. To make this decision, we need to generalise a test score for each student based on that score. This indicates that assessors should ensure the credibility and trustworthiness of the score as a means to making a good decision (Raykov & Marcoulides 2011). Therefore, the composition of errors associated with the observed (obtained) scores gained from a test needs to be investigated. G-theory analysis can then provide useful information for test constructors to minimise identified sources of error (Brennan 2001). We will now explain how to calculate the G-coefficient from variance components.


G-coefficient calculation

To calculate the G-coefficient from variance components of facets, test analysers traditionally use the ANOVA procedure. ANOVA is a statistical procedure by which the total variance present in a test is partitioned into two or more components which are sources of measurement error. Using the calculated mean square of each source of variation from the ANOVA output (e.g. SPs, items, assessors, etc.), investigators determine the variance components and then calculate the G-coefficient from these values.


However, SPSS and other statistical packages like the Statistical Analysis System (SAS) now allow us to calculate the variance components directly from the test data. We will now illustrate how to obtain the variance components from SPSS directly for calculating the G-coefficient. The procedure used varies according to the number of facets in the test. There are single facet and multiple facet designs as described below.


Single facet design

A single facet design examines only a single source of measurement error in a test although in reality others may exist. For example, in an OSCE examination, we might like to focus on the influence of examiners as sources of error. In G-theory, this is called a one-facet ‘student (s) crossed-with-examiner (e)’ design: (s × e). Consider an OSCE in which three examiners independently rate a cohort of clinical students on three different stations using a 1–5 check list of 5 items. The total mark can therefore range from 5 to 25, with higher mark suggesting a greater level of performance in each station. Using G-theory, we can find out what amount of measurement error is generated by the examiners. For illustrative purpose, only 10 students and the three examiners are presented in the Data Editor of SPSS in Figure 4.




Before analysing, the data needs to be restructured. To this end, from the data menu at the top of the screen, one clicks on ‘restructure’ and follows the appropriate instructions. In Figure 5, the restructured data format is presented.




To obtain the variance components, the following steps are carried out:


How to run the analysis in SPSS

From the menus, choose ‘Analyse’ and then ‘General Linear Model’. Then click on ‘variance components’. Click on ‘Score’ and then click on the arrow to move ‘Score’ into the box marked ‘dependent variable’. Click on student and examiner to move them into ‘random factors’. After ‘variance estimates’ appears, click OK and the contribution of each source of variance to the result is presented as shown in Table 8.



The variance due to students is not classified as a facet of measurement error; it is the object of measurement. Here the examiners account for 6.20% of the variance, which is reasonably low. The residual variance is not attributable to any specific cause and reflects the interactions between the facets and the object of measurement.

Table 8 shows that the estimated variance components associated with student and examiner are 10.144 and 1.578, respectively. Expressed as a percentage of the total variance, it can be seen that 40.00 % is due to the students and 6.20 % to the examiners. However, the variance of the students is not considered a facet of measurement error as this variation is expected within the student cohort and in terms of G-theory, it is called the ‘object of measurement’ (Mushquash & O'Connor 2006). Importantly for our analysis, the findings indicate that the examiners generated 6.20% of the total variability, which is considered a reasonably low value. Higher values would create concern about the effect of the examiners on the test. The residual variance is the amount of variance not attributed to any specific cause but is related to the interaction between the different facets and the object of measurement of the test. In this example, 13.656 or 53.80% of the variance is accounted for by this factor.



On the basis of the findings of Table 8, we are now in a position to calculate the generalisability coefficient. In this case, the G-coefficient is defined as the ratio of the student variance component (denoted σ²_student) to the sum of the student variance component and the residual variance component (denoted σ²_residual) divided by the number of examiners (k) (Nunnally and Bernstein 1994) and written as follows:

G = σ²_student / (σ²_student + σ²_residual / k)

Inserting the values from above, this gives:

G = 10.144 / (10.144 + 13.656 / 3) ≈ 0.69

The G-coefficient is the counterpart of the reliability coefficient and ranges from 0 to 1. It is interpreted as the reliability of the test when the various sources of error, estimated from their variance components, are taken into account.

The G-coefficient, traditionally depicted as ρ², is the counterpart of the well-known reliability coefficient, with values ranging from 0 to 1.0. (It is worth noting that the G-coefficient in the single facet design described above is equal to Cronbach's alpha coefficient for non-dichotomous data and to Kuder–Richardson 20 for dichotomous data.) The interpretation of the value of the G-coefficient is that it represents the reliability of the test taking into account the multiple sources of error calculated from their variance components. The higher the value of the G-coefficient, the more we can rely on (generalise) the students’ scores and the less influence the study facets have had. In the above example, the G-coefficient has a reasonably high value and the variance component for examiners is low. This shows that the examiners did not vary significantly in scoring students and that we can have confidence in the students’ scores.
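
A compact Python version of this one-facet (student × examiner) analysis is sketched below. The variance components are estimated from the mean squares of a two-way crossed design with one observation per cell; the scores are simulated, not the data behind Table 8:

```python
import numpy as np

rng = np.random.default_rng(5)
n_s, n_e = 10, 3   # students, examiners

# Hypothetical student x examiner marks (one total mark per student per examiner)
scores = (15
          + rng.normal(0, 3, (n_s, 1))       # student effect (object of measurement)
          + rng.normal(0, 1, (1, n_e))       # examiner facet
          + rng.normal(0, 2, (n_s, n_e)))    # residual / interaction

grand = scores.mean()
ss_student = n_e * ((scores.mean(axis=1) - grand) ** 2).sum()
ss_examiner = n_s * ((scores.mean(axis=0) - grand) ** 2).sum()
ss_resid = ((scores - grand) ** 2).sum() - ss_student - ss_examiner

ms_student = ss_student / (n_s - 1)
ms_examiner = ss_examiner / (n_e - 1)
ms_resid = ss_resid / ((n_s - 1) * (n_e - 1))

# Expected-mean-square estimates (negative estimates are set to zero in practice)
var_resid = ms_resid
var_student = (ms_student - ms_resid) / n_e
var_examiner = (ms_examiner - ms_resid) / n_s

g = var_student / (var_student + var_resid / n_e)
print(f"variance components: student={var_student:.2f}, "
      f"examiner={var_examiner:.2f}, residual={var_resid:.2f}")
print(f"G-coefficient = {g:.2f}")
```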


A multi-facet design

In many situations more facets need to be taken into account.

Clearly in an OSCE examination, there are a number of other potential facets that need to be taken into consideration in addition to the examiners. For example, the number of stations, the number of SPs and the number of items on the OSCE checklist. We will now explain how to calculate the variance components and a G-coefficient for a multi-facet design building on the previous example. Each of three stations now has a SP and a 5-item checklist leading to an overall score for each student. Here, examiners, stations, SPs and items can affect the student performance and hence are facets of measurement error.


Items (i), students (s), stations (st), SPs (sp) and examiners (e) can all be entered as sources of error.

However, because we are now interested in the influence of the number of items as a source of error, we need to input the score for each item (i), for each student (s), for each station (st), for each SP (sp) and for each examiner (e). After entering the exam data into SPSS and restructuring it, the analysis of variance components is carried out as described before. Table 9 shows the hypothetical results of variance components for potential sources of measurement error in the OSCE results.



Interactions that do not appear in the table contribute no measurement error.

Table 9 shows that 59.16%, 16.37% and 15.04% of the sources of measurement error are generated by the interaction between student, item and examiner, the interaction between student and examiner, and the student, respectively. The lack of residual variance between other combinations of facets indicates that student scores cannot fluctuate owing to these interactions and consequently they do not lead to any measurement error. The value for the variance component for examiners (0.06) in Table 9 differs from the value in Table 8 (1.57) because in creating the multi-facet matrix, we are using individual item scores from students rather than their total mark for all stations. These findings also indicate that there is little disagreement about the actual scores given to students by each examiner (2.88%). We can insert the values of the variance components and the numbers associated with each facet shown in Table 9 into the following equation:

G = σ²_student / (σ²_student + σ²_student×examiner / n_examiners + σ²_residual / (n_items × n_examiners))

Zero values of variance components are not inserted, thus excluding SPs and stations.



In this example, the G-coefficient is high and the variance components of the facets are low, hence the reliability of the OSCE is very good. If higher values of variance components are found for particular facets, then they need to be examined in more detail. This might lead to better training for examiners or modifying items in checklists or the number of stations. Given the high G-coefficient shown with these hypothetical data, we could in principle reduce the values of k for individual facets whilst maintaining a reasonably high value of G and hence maintaining the reliability of the OSCE exam. In the real world of OSCEs, this could lead to simplifications and a reduction in the cost of OSCE examining. As for Cronbach's alpha statistic, there are different views concerning acceptable values for G ranging from 0.7 to 0.95 (Tavakol and Dennick 2011a, b). This ability to manipulate the generalisability equation in order to see how examination factors can influence sources of measurement error and hence reliability lies at the heart of decision study or D-study (Raykov & Marcoulides 2011). Thus G-theory and D-study provide a greater insight into the various processes occurring in examinations, hidden by merely measuring Cronbach's alpha statistic. This enables assessors to improve the quality of assessments in a much more specific and evidence-based way.
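
The D-study idea described here—asking how G would change if the numbers of examiners or items were altered—is easy to explore once variance components are available. The components below are purely illustrative and are not taken from the Guide's tables:

```python
def g_coefficient(var_s, var_se, var_si, var_res, n_e, n_i):
    """Relative G-coefficient for a crossed student x examiner x item design."""
    relative_error = var_se / n_e + var_si / n_i + var_res / (n_e * n_i)
    return var_s / (var_s + relative_error)

# Illustrative variance components (hypothetical, not the Guide's data)
components = dict(var_s=0.50, var_se=0.10, var_si=0.05, var_res=0.60)

for n_e in (1, 2, 3):
    for n_i in (5, 10):
        g = g_coefficient(**components, n_e=n_e, n_i=n_i)
        print(f"examiners={n_e}, items={n_i}: G = {g:.2f}")
# Adding examiners and/or items shrinks the relative error term and raises G;
# a D-study picks the cheapest design that still keeps G at an acceptable level.
```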



The IRT and Rasch modelling

CTT says little about student ability or about how item scores behave across ability levels. IRT focuses on exactly this, and such analyses can also strengthen item banks for computer adaptive testing (CAT).

Test constructors have traditionally quantified the reliability of exam tests using the CTT model. For example, they use item analysis (item difficulty and item discrimination), traditional reliability coefficients (e.g. KR-20 or Cronbach's alpha), item-total correlations and factor analysis to examine the reliability of tests. We have just shown how G-theory can be used to make more elaborate analyses of examination conditions with a view to monitoring and improving reliability. CTT focuses on the test and its errors but says little about how student ability interacts with the test and its items (Raykov & Marcoulides 2011). On the other hand, the aim of IRT is to measure the relationship between the student's ability and the item's difficulty level to improve the quality of questions. Analyses of this type can also be used to build up better question banks for Computer Adaptive Testing (CAT).



Consider a student taking an exam in anatomy. The probability that the student can answer item 1 correctly is affected by the student's anatomy ability and the item's difficulty level. If the student has a high level of anatomical knowledge, the probability that he/she will answer the item 1 correctly is high. If an item has a low index of difficulty (i.e. a hard item), the probability that the student will answer the item correctly is low. IRT attempts to analyse these relationships using student test scores plus factors (parameters) such as item difficulty, item discrimination, item fairness, guessing and other student attributes such as gender or year of study. In an IRT analysis, graphs are produced showing the relationship between student ability and the probability of correct item responses, as well as item maps depicting the calibrations of student abilities with the above parameters. Also tables showing ‘fit’ statistics for items and students, to be described later.


Depending on the number of parameters, there are 1PL (Rasch model), 2PL and 3PL models.

A variety of forms of IRT have been introduced. If we wish to look at the relationship between item difficulty and student ability alone, we use the one-parameter logistic IRT (1PL). This is called the Rasch model in honour of the Danish statistician who promoted it in the 1960s. The Rasch model assesses the probability that a student will answer an item correctly given their conceptual ability and the item difficulty. Two-parameter IRT (2PL) or three-parameter IRT (3PL) are also available where further parameters such as item discrimination, item difficulty, gender or year of study can be included. For the purposes of this article, we are going to concentrate on 1PL or Rasch modelling.


In Rasch modelling, student ability is standardised to a mean of 0, and item difficulty is also standardised to a mean of 0. After standardisation, a student with a score of 0 has exactly average ability, and a score of 1.5 means 1.5 SDs above the mean. Similarly, an item with a difficulty of 0 is an item of average difficulty.

In Rasch modelling, the scores of students’ ability and the values of item difficulty are standardised to make interpretation easier. After standardising, the mean student ability level is set to 0 and the SD is set to 1. Similarly, the mean item difficulty level is set to 0 and the SD is set to 1. Therefore, after standardisation a student who receives a mean score of 0 has an average ability for the items being assessed. With a score of 1.5, the student's ability is 1.5 SDs above the mean. Similarly, an item with a difficulty of 0 is considered an average item and an item with a difficulty of 2 is considered to be a hard item. In general, if the value of a given item is positive, that item is difficult for that cohort of students and if the value is negative, that item is easy (Nunnally & Bernstein 1994).


To standardise the student ability and item difficulty, consider Table 10, presenting the simulated dichotomous data for seven items on an anatomy test from seven students, showing the ability of each student and the difficulty level of each of the seven items. To calculate the ability of a student, which is called θ, the natural logarithm of the ratio of the fraction correct to the fraction incorrect (or 1 − fraction correct) for that student is taken. For example, the ability of student 2 (θ₂) is calculated as follows:

θ₂ = ln(fraction correct / fraction incorrect) = 0.89


That is, this student's ability is above the mean.

This indicates that the ability of student 2 is 0.89 SDs above the mean. To calculate the difficulty level of each item, which is called b, the natural log of the ratio of the fraction incorrect (or 1 − fraction correct) to the fraction correct for each item is calculated. For example, the difficulty of item 1 is calculated as follows:

b₁ = ln(fraction incorrect / fraction correct) = −1.73


A value of −1.73 means the item was easy.

A value of −1.73 suggests that the item is relatively easy. This standardisation process is carried out for all students and all items and can easily be facilitated in an Excel spreadsheet (Table 10).
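
These log-odds transformations are easy to reproduce for a whole response matrix. The sketch below uses hypothetical 0/1 data and reproduces only the simple starting estimates described in the Guide, not a full Rasch calibration as produced by Winsteps:

```python
import numpy as np
import pandas as pd

# Hypothetical dichotomous responses: 7 students (rows) x 7 items (columns)
rng = np.random.default_rng(6)
responses = pd.DataFrame(rng.integers(0, 2, size=(7, 7)),
                         index=[f"student {i+1}" for i in range(7)],
                         columns=[f"item {j+1}" for j in range(7)])

# Clip proportions away from 0 and 1 so the log-odds stay finite
p_student = responses.mean(axis=1).clip(0.01, 0.99)   # fraction correct per student
p_item = responses.mean(axis=0).clip(0.01, 0.99)      # fraction correct per item

theta = np.log(p_student / (1 - p_student))   # ability:    ln(correct / incorrect)
b = np.log((1 - p_item) / p_item)             # difficulty: ln(incorrect / correct)

print(theta.round(2))
print(b.round(2))
# Positive theta: above-average ability; positive b: a hard item for this cohort.
```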






The probability that a student with a given ability answers an item of a given difficulty correctly can be calculated as follows.

We are now in a position to estimate the probability that a student with a specific ability will correctly answer a question with a specific item difficulty. For 1PL, the following equation is used to estimate the probability:

p = 1 / (1 + e^−(θ − b))

For example, this gives the probability that student 1 answers item 1 correctly. When a student's ability equals an item's difficulty (student 3 and item 3), the probability of a correct answer is 50%, the same as chance. The basic aim of Rasch analysis is to create items whose difficulty matches student ability; put simply, 'clever' students should be matched with 'clever' items.

Where p is the probability, θ is the student ability and b the item difficulty. Referring to Table 10, the ability of student 1 is 0.28 SD below the average (θ = −0.28), and item 1, with a difficulty level of −1.73, is well below average difficulty. On the basis of the above formula, the probability that student 1 will answer item 1 correctly is [1/(1 + e^−(−0.28−(−1.73)))] ≈ 0.81. Considering student 3's ability level (0.28) and the difficulty of item 3 (0.28), the probability that the student will answer item 3 correctly is [1/(1 + e^−(0.28−0.28))] = [1/(1 + e^0)] = 0.50. This shows that if the level of student ability and the level of item difficulty are matched, the probability that the student will select the correct answer is 50%, which is equal to chance. The fundamental aim of Rasch analysis is to create test items that match their degree of difficulty with student ability. In simple terms, the ‘cleverness’ of the students should be matched with the ‘cleverness’ of the items. In order to further examine the relationship between student ability and item difficulty, the data in Table 11 show the probability (p) that a student will answer item 1, with item difficulty (b), correctly given their ability (θ), using data taken from Table 10 and the equation above.
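
A minimal function for the 1PL probability, with the matched-ability case from the text as a check (the other two calls are illustrative, not values quoted in the Guide):

```python
import math

def p_correct(theta: float, b: float) -> float:
    """One-parameter (Rasch) probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

print(round(p_correct(0.28, 0.28), 2))    # ability equals difficulty -> 0.50
print(round(p_correct(-0.28, -1.73), 2))  # weak student, very easy item -> high p
print(round(p_correct(-0.28, 1.73), 2))   # weak student, hard item -> low p
```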


Item characteristic curves

In Rasch analysis, the relationship between item difficulty and student ability is depicted graphically in an item characteristic curve (ICC) shown in Figure 6.






In Figure 6, dotted lines are drawn to interpret the characteristics of item 1. There is a 50% probability that students with an ability of −1.85 will answer this question correctly. This implies that students with ability below −1.85 have less than an even chance of answering it correctly. In addition, a student with an average ability (θ = 0) has an 80% chance of giving a correct answer. The implication is that this question is too easy. It should be noted that an easier item shifts the curve to the left along the theta axis, while a harder item shifts it to the right. Examples of ICC curves for items taken from an examination analysis shown in Figure 8 are displayed in Figure 7. Figure 7(a) shows a difficult question (Question 101) and Figure 7(b) shows an easy question (Question 3). Figure 7(c) shows the ‘perfect’ question (Question 46), in which students of average ability have a 50% chance of giving the correct answer.








Item-student maps

Figure 8 below can be read in two halves. The left side shows student ability, and all students have above-average ability. The right side shows item difficulty: some items are very hard and some very easy, but overall the students are 'cleverer' than the items.

The distribution of students’ ability and the difficulty of each item can also be presented on an Item–student map (ISM). Using IRT software programmes such as Winsteps® (Linacre, 2011) item difficulty and student ability can be calculated and displayed together. Figure 8 shows the ISM using data from a knowledge-based test. The map is split into two sides. The left side indicates the ability of students whereas the right side shows the difficulty of each item. The ability of each student is represented by ‘hash’ (#) and ‘dot’ (.), items are shown by their item number. Item difficulty and student ability values are transformed mathematically, using natural logarithms, into an interval scale whose units of measurement are termed ‘logits’. With a logit scale, differences between values can be quantified and equal distances on the scale are of equal size (Bond & Fox 2007). Higher values on the scale imply both greater item difficulty and greater student ability. The letters of ‘M’, ‘S’ and ‘T’ represents mean, one standard deviation and two standard deviations of item difficulty and student ability, respectively. The mean of item difficulty is set to 0. Therefore, for example, items 46, 18 and 28 have an item difficulty of 0, 1, and −1 respectively. A student with an ability of 0 logits has a 50% chance of answering items 46, 60 or 69 correctly. The same student has a greater than 50% probability of correctly answering items less difficult, for example items 28 and 62. In addition, the same student has a less than 50% probability of correctly answering more difficult items such items 64 and 119.


By looking at the ISM in Figure 8 we can now interpret the properties of the test. First, the student distribution shows that the ability of students is above the average, whereas more than half of the items have difficulties below the average. Second, the students on the upper left side are ‘cleverer’ than the items on the lower right side meaning that the items were easy and unchallenging. Third, most students are located opposite items to which they are well matched on the upper right and there are no students on the lower left side. However, items 101, 40, 86 and 29 are too difficult and beyond the ability of most students.


Overall, in this example, the students are ‘cleverer’ than most of the items. Many items in the lower right hand quadrant are too easy and should be examined, modified or deleted from the test. Similarly, some items are clearly too difficult. The advantage of Rasch analysis is that it produces a variety of data displays encapsulating both student and item characteristics that enable test developers to improve the psychometric properties of items. By matching items to student ability, we can improve the authenticity and validity of items and develop higher quality item banks, useful for the future of computer-adaptive testing.



Conclusions

Objective tests and OSCE stations should be psychometrically sound instruments for measuring the proficiency of students, and post-examination analysis can be of use to medical educators who will continue to rely on these examinations in the future. In this Guide, we have tried to explain simply how to interpret the outcomes of psychometric analyses of objective test data. Examinations should be standardised both nationally and locally, and we need to be assured of their psychometric soundness. A natural question is to what extent our exam data measure student ability, that is, the extent to which students have learned the subject matter. Interpreting exam data with psychometric methods is central to understanding students’ competence in a subject and to identifying students of low ability. Furthermore, these methods can be employed for test validation research. We would suggest that medical teachers, especially those not trained in psychometric methods, practise these methods on hypothetical data and then analyse their own real exam data in order to improve the quality of their examinations.











 2012;34(3):e161-75. doi: 10.3109/0142159X.2012.651178.

Post-examination interpretation of objective test data: monitoring and improving the quality of high-stakes examinations: AMEE Guide No. 66.

Author information

  • 1University of Nottingham, UK.

Abstract

The purpose of this Guide is to provide both logical and empirical evidence for medical teachers to improve their objective tests by appropriate interpretation of post-examination analysis. This requires a description and explanation of some basic statistical and psychometric concepts derived from both Classical Test Theory (CTT) and Item Response Theory (IRT) such as: descriptive statistics, explanatory and confirmatory factor analysis, Generalisability Theory and Rasch modelling. CTT is concerned with the overall reliability of a test whereas IRT can be used to identify the behaviour of individual test items and how they interact with individual student abilities. We have provided the reader with practical examples clarifying the use of these frameworks in test development and for research purposes.

PMID:
 
22364473
 
[PubMed - indexed for MEDLINE]


General overview of the theories used in assessment: AMEE Guide No. 57

General overview of the theories used in assessment: AMEE Guide No. 57

LAMBERT W. T. SCHUWIRTH1 & CEES P. M. VAN DER VLEUTEN2

1Flinders University, Australia, 2Maastricht University, The Netherlands





Introduction

Assessment is on everyone's agenda, and this is not surprising.

It is our observation that when the subject of assessment in medical education is raised, it is often the start of extensive discussions. Apparently, assessment is high on everyone's agenda. This is not surprising because assessment is seen as an important part of education in the sense that it not only defines the quality of our students and our educational processes, but it is also seen as a major factor in steering the learning and behaviour of our students and faculty.


Discussions about assessment, however, often rest on tradition and intuition. Heeding tradition is not necessarily a bad thing; as George Santayana put it, 'those who do not learn from history are doomed to repeat it'.

Arguments and debates on assessment, however, are often strongly based on tradition and intuition. It is not necessarily a bad thing to heed tradition. George Santayana already stated (quoting Burk) that Those who do not learn from history are doomed to repeat it.1 So, we think that an important lesson is also to learn from previous mistakes and avoid repeating them.


Nor should intuition be brushed aside capriciously; it is often a powerful driver of people's behaviour. Equally, though, intuition frequently fails to match research findings.

Intuition is also not something to put aside capriciously, it is often found to be a strong driving force in the behaviour of people. But again, intuition is not always in concordance with research outcomes. Some research outcomes in assessment are somewhat counter intuitive or at least unexpected. Many researchers may not have exclaimed Eureka but Hey, that is odd instead.


Two important tasks follow. First, to avoid repeating mistakes, we need to examine critically whether the things we do out of tradition and routine are still worth doing. Second, to correct faulty intuitions, research findings must be translated into appropriate methods and approaches.

This leaves us, as assessment researchers, with two very important tasks. First, we need to critically study which common and tradition-based practices still have value and consequently which are the mistakes that should not be repeated. Second, it is our task to translate research findings to methods and approaches in such a way that they can easily help changing incorrect intuitions of policy makers, teachers and students into correct ones. Both goals cannot be attained without a good theoretical framework in which to read, understand and interpret research outcomes. The purpose of this AMEE Guide is to provide an overview of some of the most important and most widely used theories pertaining to assessment. Further Guides in assessment theories will give more detail on the more specific theories pertaining to assessment.


Unfortunately, as in many other scientific fields, assessment in medical education rests on no single unifying theory and draws on a variety of theories from adjacent disciplines. In addition, theoretical frameworks more directly relevant to the assessment of health professionals have been developing, the most important of which is the distinction between 'assessment of learning' and 'assessment for learning'.

Unfortunately, like many other scientific disciplines, medical assessment does not have one overarching or unifying theory. Instead, it draws on various theories from adjacent scientific fields, such as general education, cognitive psychology, decision-making and judgement theories in psychology and psychometric theories. In addition, there are some theoretical frameworks evolving which are more directly relevant to health professions assessment, the most important of which (in our view) is the notion of ‘assessment of learning’ versus ‘assessment for learning’ (Shepard 2009).


In this AMEE Guide we will present the theories that have featured most prominently in the medical education literature in the recent four decades. Of course, this AMEE Guide can never be exhaustive; the number of relevant theoretical domains is simply too large, nor can we discuss all theories to their full extent. Not only would this make this AMEE Guide too long, but also this would be beyond its scope, namely to provide a concise overview. Therefore, we will discuss only the theories on the development of medical expertise and psychometric theories, and then end by highlighting the differences between the assessment of learning and assessment for learning. As a final caveat, we must say here that this AMEE Guide is not a guide to methods of assessment. We assume that the reader has some prior knowledge about this or we would like to refer to specific articles or to text books (e.g. Dent & Harden 2009).



Theories on how (medical) experts are made

Theories on the development of (medical) expertise

What characterises an expert in medicine? What distinguishes novices from experts? These questions inevitably arise, because without knowing what to assess you cannot decide how to assess it.

What distinguishes someone as an expert in the health sciences field? What do experts do differently compared to novices when solving medical problems? These are questions that are inextricably tied to assessment, because if you do not know what you are assessing it also becomes very difficult to know how you can best assess.


It may seem self-evident that one becomes an expert by accumulating experience and learning.

It may be obvious that someone can only become an expert through learning and gaining experience.


One of the earliest studies was de Groot's work on why chess grandmasters became grandmasters and what they did differently from good amateurs. He initially thought they could look more moves ahead, but this turned out not to be the case; both groups looked roughly seven moves ahead. What de Groot found instead was that grandmasters were better at remembering the positions of the pieces on the board: he and his successors showed that, even after a brief glance, grandmasters could reproduce board positions accurately.

One of the first to study the development of expertise was de Groot (1978), who wanted to explore why chess grandmasters became grandmasters and what made them differ from good amateur chess players. His first intuition was that grandmasters were grandmasters because they were able to think more moves ahead than amateurs. He was surprised, however, to find that this was not the case; players of both expertise groups did not think further ahead than roughly seven moves. What he found, instead, was that grandmasters were better able to remember positions on the board. He and his successors (Chase & Simon 1973) found that grandmasters were able to reproduce positions on the board more correctly, even after very short viewing times. Even after having seen a position for only a few seconds, they were able to reproduce it with much greater accuracy than amateurs.


One might think this simply reflects a superior memory, but it does not: human working memory holds roughly seven units (plus or minus two), and this capacity cannot be improved by learning.

One would think then that they probably had superior memory skills, but this is not the case. The human working memory has a capacity of roughly seven units (plus or minus two) and this cannot be improved by learning (Van Merrienboer & Sweller 2005, 2010).


The most salient difference was not the number of units that could be held in working memory, but the amount of information each unit contained.

The most salient difference between amateurs and grandmasters was not the number of units they could store in their working memory, but the richness of the information in each of these units.


When storing information in your native language, every word and some fixed expressions can be stored as a single unit, because they link directly to memories already present in long-term memory. With a foreign script, for which nothing has yet been laid down in long-term memory, part of your cognitive resources must be spent on memorising the characters themselves. (...) Storing information in units that carry ever more information is called chunking, and it is central to what expertise is and how it develops.

To illustrate this, imagine having to copy a text in your own language, then a text in a foreign Western European language and then one in a language that uses a different character set (e.g. Cyrillic). It is clear that copying a text in your own language is easiest and copying a text in a foreign character set is the most difficult. While copying you have to read the text, store it in your memory and then reproduce it on the paper. When you store the text in your native language, all the words (and some fixed expressions) can be stored as one unit, because they relate directly to memories already present in your long-term memory. You can spend all your cognitive resources on memorising the text. In the foreign character set you will also have to spend part of your cognitive resources on memorising the characters, for which you have no prior memories (schemas) in your long-term memory. A medical student who has just started his/her study will have to memorise all the signs and symptoms when consulting a patient with heart failure, whereas an expert can almost store it as one unit (and perhaps only has to store the findings that do not fit to the classical picture or mental model of heart failure). This increasing ability to store information as more information-rich units is called chunking and it is a central element in expertise and its development. Box 1 provides an illustration of the role of chunking.




So, why were the grandmasters better than good amateurs? Well, mainly because they possessed much more stored information about chess positions than amateurs did, or in other words, they had acquired so much more knowledge than the amateurs had.


If one lesson is to be drawn from these chess studies, replicated in many other domains, it is that a rich and well-organised knowledge base underpins successful problem solving.

If there is one lesson to be drawn from these early chess studies – which have been replicated in such a plethora of other expertise domains that it is more than reasonable to assume that these findings are generic – it is that a rich and well-organised knowledge base is essential for successful problem solving (Chi et al. 1982; Polsen & Jeffries 1982).


The next question is what 'well-organised' means. Fundamentally, organisation is what allows new information to be stored quickly and retained well, and relevant information to be retrieved when it is needed. (...) The computer is often used as a metaphor for the human brain, but humans do not use a File Allocation Table as a computer does; this means it is very difficult to store new information when there is no existing knowledge to which it can be linked.

The next question then would be: What does ‘well-organised’ mean? Basically, it comes down to organisation that will enable the person to store new information rapidly and with good retention and to be able to retrieve relevant information when needed. Although the computer is often used as a metaphor for the human brain (much like the clock was used as a metaphor in the nineteenth century), it is clear that information storage on a hard disk is very much different from human information storage. Humans do not use a File Allocation Table to index where the information can be found, but have to embed information in existing (semantic) networks (Schmidt et al. 1990). The implication of this is that it is very difficult to store new information if there is no existing prior information to which it can be linked. Of course, the development of these knowledge networks is quite individualised, and based on the individual learning pathways and experiences. For example, we – the authors of this AMEE Guide – live in Maastricht, so our views, connotations and association with ‘Maastricht’ differ entirely from those of most of the readers of the AMEE Guides, although we may share the knowledge that it is a city (and perhaps that it is in the Netherlands) and that there is a university with a medical school, the rest of the knowledge is much more individualised.


Knowledge is highly domain-specific: a person can know a great deal about one topic and next to nothing about another. And because expertise rests on a well-organised knowledge base, expertise is domain-specific too. For assessment this means that a person's performance on one case or item is a poor predictor of performance on other cases or items, so one should never rely on limited assessment information; a high-stakes decision based on a single case is highly unreliable.

Knowledge generally is quite domain specific (Elstein et al. 1978; Eva et al. 1998); a person can be very knowledgeable on one topic and a lay person on another, and because expertise is based on a well-organised knowledge base, expertise is domain specific as well. For assessment, this means that the performance of a candidate on one case or item of a test is a poor predictor for his or her performance on any other given item or case in the test. Therefore, one can never rely on limited assessment information, i.e. high-stakes decisions made on the basis of a single case (e.g. a high-stakes final VIVA) are necessarily unreliable.


A second important lesson is that problem-solving ability is idiosyncratic. Whereas the domain specificity discussed above means that the same person's performance varies across cases, idiosyncrasy means that different experts may solve the same case in different ways, which is logical given that each individual organises knowledge differently. The implication for assessment is that, when 'diagnosing' a candidate's expertise, assessing the process of problem solving is less informative than assessing its outcome.

A second important and robust finding in the expertise literature – more specifically the diagnostic expertise literature – is that problem-solving ability is idiosyncratic (cf. e.g. the overview paper by Swanson et al. 1987). Domain specificity, which we discussed above, means that the performance of the same person varies considerably across various cases, idiosyncrasy here means that the way different experts solve the same case varies substantially between different experts. This is also logical, keeping in mind that the way the knowledge is organised is highly individual. The assessment implication from this is that when trying to capture, for example, the diagnostic expertise of candidates, the process may be less informative than the outcome, as the process is idiosyncratic (and fortunately the outcome of the reasoning process is much less).


The third and probably most important issue is transfer, which is related to domain specificity and idiosyncrasy. Transfer is the extent to which a problem-solving approach that works for one problem can be applied to another; it requires seeing the similarity between the two problems and recognising the principle that applies to both.

A third and probably most important issue is the matter of transfer (Norman 1988; Regehr & Norman 1996; Eva 2004). This is closely related to the previous issue of domain specificity and idiosyncrasy. Transfer pertains to the extent to which a person is able to apply a given problem-solving approach to different situations. It requires that the candidate understands the similarities between two different problem situations and recognises that the same problem-solving principle can be applied. Box 2 provides an illustration (drawn from a personal communication with Norman).




The concrete way the two problems are presented constitutes their 'surface features', while the principle underlying both is their 'deep structure'. Transfer is the ability to identify the deep structure without being blinded by the surface features.

Most often, the first problem is not recognised as being essentially the same as the second and that the problem-solving principle is also the same. Both solutions lie in the splitting up of the total load into various parts. In problem 1, the 1000 W laser beam is replaced by 10 rays of 100 W each, but converging right on the spot where the filament was broken. In the second problem the solution is more obvious: build five bridges and then let your men run onto the island. If the problem were represented as: you want to irradiate a tumour but you want to do minimal harm to the skin above it, it would probably be recognised even more readily by physicians. The specific presentation of these problems is labelled as the surface features of the problem and the underlying principle is referred to as the deep structure of the problem. Transfer exists by the virtue of the expert to be able to identify the deep structure and not to be blinded by the surface features.


One of the most widely used theories of the development of medical expertise holds that it starts with the accumulation of isolated facts, which are then combined into meaningful semantic networks. These networks are condensed into denser illness scripts and, after years of experience, into instance scripts that allow a particular diagnosis to be recognised instantly. The difference between illness scripts (congealed patterns of a particular diagnosis) and instance scripts is that the latter also incorporate contextual features a lay person might overlook, including a patient's appearance or even an odour.

One of the most widely used theories on the development of medical expertise is the one suggested by Schmidt, Norman and Boshuizen (Schmidt 1993; Schmidt & Boshuizen 1993). Generally put, this theory postulates that the development of medical expertise starts with the collection of isolated facts which further on in the process are combined to form meaningful (semantic) networks. These networks are then aggregated into more concise or dense illness scripts (for example pyelonephritis). As a result of many years of experience, these are then further enriched into instance scripts, which enable the experienced clinician to recognise a certain diagnosis instantaneously. The most salient difference between illness scripts (that are a sort of congealed patterns of a certain diagnosis) and instance scripts is that in the latter contextual, and for the lay person sometimes seemingly irrelevant, features are also included in the recognition. Typically, these include the demeanour of the patient or his/her appearance, sometimes even an odour, etc.


Important lessons for assessment.

These theories then provide important lessons for assessment:


  1. Do not rely on short tests. The domain specificity problem informs us that high-stakes decisions based on short tests or tests with a low number of different cases are inherently flawed with respect to their reliability (and therefore also validity). Keep in mind that unreliability is a two-way process, it does not only imply that someone who failed the test could still have been satisfactorily competent, but also that someone who passed the test could be incompetent. The former candidate will remain in the system and be given a re-sit opportunity, and this way the incorrect pass–fail decision can be remediated, but the latter will escape further observation and assessment, and the incorrect decision cannot be remediated again.
  2. For high-stakes decisions, asking for the process is less predictive of the overall competence than focussing on the outcome of the process. This is counterintuitive, but it is a clear finding that the way someone solves a given problem is not a good indicator for the way in which she/he will solve a similar problem with different surface features; she/he may not even recognise the transfer. Focussing on multiple outcomes or some essential intermediate outcomes – such as with extended-matching questions, key-feature approach assessment or the script concordance test – is probably better than in-depth questioning the problem-solving process (Bordage 1987; Case & Swanson 1993; Page & Bordage 1995; Charlin et al. 2000).
  3. Assessment aimed only at reproduction will not help to foster the emergence of transfer in the students. This is not to say that there is no place for reproduction-orientated tests in an assessment programme, but they should be chosen very carefully. When learning arithmetic, for example, it is okay to focus the part of the assessment pertaining to the tables of multiplication on reproduction, but with long multiplications, focussing on transfer (in this case, the algorithmic transfer) is much more worthwhile.
  4. When new knowledge has to be built into existing semantic networks, learning needs to be contextual. The same applies to assessment. If the assessment approach is to be aligned with the educational approach, it should be contextualised as well. So whenever possible, set assessment items, questions or assignments in a realistic context.



Psychometric theories

Whatever purpose an assessment may pursue in an assessment programme, it always entails a more or less systematic collection of observations or data to arrive at certain conclusions about the candidate. The process must be both reliable and valid. Especially, for these two aspects (reliability and validity) psychometric theories have been developed. In this chapter, we will discuss these theories.


Validity

The central notion of validity has changed substantially several times over the past hundred years.

Simply put, validity pertains to the extent to which the test actually measures what it purports to measure. In the recent century, the central notions of validity have changed substantially several times. 


The first theories of validity were framed largely in terms of criterion or predictive validity. This is not entirely illogical; it resembles the question many teachers ask: 'does this really produce better doctors?'. But as long as there is no single, adequate, measurable criterion for a 'good doctor', the question cannot be answered, which illustrates how hard it is to define validity in such terms. There is also the problem of validating the criterion itself: if a researcher proposed a measure of 'good doctorship' and used it as the criterion for an assessment, that measure would itself need to be validated, which requires a criterion for the criterion, and so on ad infinitum.

The first theories on validity were largely based on the notion of criterion or predictive validity. This is not illogical as the intuitive notion of validity is one of whether the test predicts an outcome well. The question that many medical teachers ask when a new assessment or instructional method is suggested is: ‘But does this produce better doctors?’ This question – however logical – is unanswerable in a simple criterion-validity design as long as there is no good single measurable criterion for good ‘doctorship’. This demonstrates exactly the problem with trying to define validity exclusively in such terms. There is an inherent need to validate the criterion as well. Suppose a researcher were to suggest a measure to measure ‘doctorship’ and to use it as the criterion for a certain assessment, then she/he would have to validate the measure for ‘doctorship’ as well. If this again were only possible through criterion validity, it would require the researcher to validate the criterion for the criterion as well – etcetera ad infinitum.


A second intuitive approach is simply to observe and judge performance. Assessing flute playing, for example, is not complicated: assemble a panel of flute experts and have each candidate play. Some blueprinting may be needed so that playing is sampled across a range of music, and for orchestral applicants the orchestra's repertoire must be covered well. Such content validity has played, and still plays, an important role.

A second intuitive approach would be to simply observe and judge the performance. If one, for example, wishes to assess flute-playing skills, the assessment is quite straightforward. One could collect a panel of flute experts and ask them to provide judgements for each candidate playing the flute. Of course, some sort of blueprinting would then be needed to ensure that the performances of each candidate would entail music in various ranges. For orchestral applicants, it would have to ensure that all classical music styles of the orchestra's repertoire would be touched upon. Such forms of content validity (or direct validity) have played an important role and still do in validation procedures.


However, most of what we want to assess in students is not so directly observable or inferable from observation. Intelligence and neuroticism are latent traits, and so are knowledge, problem-solving ability and professionalism: they cannot be observed directly and can only be assessed through assumptions based on observed behaviour.

However, most aspects of students we want to assess are still not clearly visible and need to be inferred from observations. Not only are characteristics such as intelligence or neuroticism invisible (so-called latent) traits, but also are elements such as knowledge, problem-solving ability, professionalism, etc. They cannot be observed directly and can only be assessed as assumptions based on observed behaviour.


Cronbach and Meehl advanced the notion of 'construct validity', which in their view is analogous to the inductive empirical process: the researcher first makes explicit a clear theory or conception of the construct the test is meant to measure, and then designs and carries out critical evaluations of the test data to see whether they support that theoretical notion of the construct.

In an important paper, Cronbach and Meehl (1955) elaborated on the then still young notion of construct validity. In their view, construct validation should be seen as analogous to the inductive empirical process; first the researcher has to define, make explicit or postulate clear theories and conceptions about the construct the test purports to measure. Then, she/he must design and carry through a critical evaluation of the test data to see whether they support the theoretical notions of the construct. An example of this is provided in Box 3.



The so-called 'intermediate effect' is an important falsification of the assumption that such a test is valid.

The so-called ‘intermediate effect’, as described in the example (Box 3) (especially when it proves replicable) is an important falsification of the assumption of validity of the test.


The lessons to be drawn are as follows.

We have used this example deliberately, and there are important lessons that can be drawn from it. 


The presence of such an intermediate effect is a powerful argument against validity. Evidence supporting a test's validity must come from critical 'observations'. By analogy: to confirm the presence of a disease with maximum likelihood, one must use the test with the highest possible chance of being negative when the disease is absent. Evidence from 'weak' experiments cannot establish validity.

First, it demonstrates that the presence of such an intermediate effect in this case is a powerful falsification of the assumption of validity. This is highly relevant, as currently it is generally held that a validation procedure must contain ‘experiments’ or observations which are designed to optimise the probability of falsifying the assumption of validity (much like Popper's falsification principle2). Evidence supporting the validity must therefore always arise from critical ‘observations’. There is a good analogy to medicine or epidemiology. If one wants to confirm the presence of a certain disease with the maximum likelihood, one must use the test with the maximum chance of being negative when disease is absent (the maximum specificity). Confirming evidence from ‘weak’ experiments therefore does not contribute to the validity assumption.


Second, a common misconception is that assessing performance in the authentic work setting guarantees validity; it does not. There are good reasons to strive for high authenticity in assessment, but the added value shows mainly in the formative rather than the summative function. Consider this situation: to judge the day-to-day performance of a practising physician we could either observe real consultations or review charts, laboratory test ordering and referral data. The latter is clearly less authentic, yet it may be more valid, because the observer effect in the first approach can change the physician's behaviour and give a biased picture.

Second, it demonstrates that authenticity is not the same as validity, which is a popular misconception. There are good reasons in assessment programmes to include authentic tests or to strive for high authenticity, but the added value is often more prominent in their formative than in their summative function. An example may illustrate this: Suppose we want to assess the quality of the day-to-day performance of a practising physician and we had the choice between observing him/her in many real-life consultations or extensively reviewing charts (records and notes), ordering laboratory tests and referral data. The second option is clearly less authentic than the first one but it is fair to argue that the latter is a more valid assessment of the day-to-day practice than the former. The observer effect, for example, in the first approach may influence the behaviour of the physician and thus draw a biased picture of the actual day-to-day performance, which is clearly not the case in the charts, laboratory tests and referral data review.


Third, validity is not a property of the test itself but of the extent to which the test measures the intended characteristic. If the example in Box 3 had been aimed at measuring thoroughness of data gathering it would have been valid, but as a measure of expertise it failed to include efficiency of information gathering and use as an essential element of the construct.

Third, it clearly demonstrates that validity is not an entity of the assessment per se; it is always the extent to which the test assesses the desired characteristic. If the PMPs in the example in Box 3 were aimed at measuring thoroughness of data gathering – i.e. to see whether students are able to distinguish all the relevant data from non-relevant data – they would have been valid, but if they are aimed at measuring expertise they failed to incorporate efficiency of information gathering and use as an essential element of the construct.


Current views (Kane 2001, 2006) highlight the argument-based inferences that have to be made when establishing validity of an assessment procedure.


In short, a chain of inferences is made: from observations to scores, from observed scores to universe scores, from universe scores to the target domain, and from the target domain to the construct.

In short, inferences have to be made from observations to scores, from observed scores to universe scores (which is a generalisation issue), from universe scores to target domain and from target domain to construct.


For blood pressure measurement: the sounds heard through the stethoscope (observation) => the reading on the sphygmomanometer (observed score) => repeated readings under different circumstances (universe score) => the patient's cardiovascular status (target domain) => health (construct).

To illustrate this, a simple medical example may be helpful: When taking a blood pressure as an assessment of someone's health, the same series of inferences must be made. When taking a blood pressure, the sounds heard through the stethoscope when deflating the cuff have to be translated into numbers by reading them from the sphygmomanometer. This is the translation from (acoustic and visual) observation to scores. Of course, one measurement is never enough (the patient may just have come running up the stairs) and it needs to be repeated, preferably under different circumstances (e.g. at home to prevent the ‘white coat’ effect). This step is equivalent to the inference from observed scores to universe scores. Then, there is the inference from the blood pressure to the cardiovascular status of the patient (often in conjunction with other signs and symptoms and patient characteristics) which is equivalent to the inference from universe score to target domain. And, finally this has to be translated into the concept ‘health’, which is analogous to the translation of target domain to construct. There are important lessons to be learnt from this.


  • First, validation is building a case based on argumentation. The argumentation rests preferably on the outcomes of validation studies, but it may also include plausible, defensible arguments.
    First, validation is building a case based on argumentation. The argumentation is preferably based on outcomes of validation studies but may also contain plausible and/or defeasible arguments.
  • Second, an assessment procedure cannot be validated without a clear definition of, or theory about, the construct it is intended to capture; an instrument is therefore never valid in itself, only valid for capturing a certain construct.
    Second, one cannot validate an assessment procedure without a clear definition or theory about the construct the assessment is intended to capture. So, an instrument is never valid per se but always only valid for capturing a certain construct.
  • Third, validation is never finished and often requires numerous observations, expectations and critical experiments.
    Third, validation is never finished and often requires a plethora of observations, expectations and critical experiments.
  • Fourth, generalisability is a necessary step in making all of these inferences.
    Fourth, and finally, in order to be able to make all these inferences, generalisability is a necessary step.


Reliability

Reliability corresponds to the 'generalisation' step described in the validity section above. But even if generalisation is only one of the steps needed for validation, the way this generalisation is made is the subject of theories in its own right. It helps to distinguish three levels of generalisation.

Reliability of a test indicates the extent to which the scores on a test are reproducible, in other words, whether the results a candidate obtains on a given test would be the same if she/he were presented with another test or all the possible tests of the domain. As such, reliability is one of the approaches to the generalisation step described in the previous section on validity. But even if generalisation is ‘only’ one of the necessary steps in the validation process, the way in which this generalisation is made is subject to theories in its own right. To understand them, it may be helpful to distinguish three levels of generalisation.


First we need the notion of the 'parallel test': a hypothetical test of similar content, equal difficulty and a similar blueprint, ideally administered to the same students immediately after the original test, under the assumption that the students are not tired and that exposure to the original items does not affect their performance.

First, however, we need to introduce the concept of the ‘parallel test’ because it is necessary to understand the approaches to reproducibility described below. A parallel test is a hypothetical test aimed at a similar content, of equal difficulty and with a similar blueprint, ideally administered to the same group of students immediately after the original test, under the assumption that the students would not be tired and that their exposure to the items of the original test would not influence their performance on the second.


Three types of generalisation are made.

Using this notion of the parallel test, three types of generalisations are made in reliability, namely if the same group of students were presented with the original and the parallel test:



  1. Do the same students pass and fail on both tests?
    Whether the same students would pass and fail on both tests.
  2. Is the rank ordering from best to worst the same on both tests?
    Whether the rank ordering from best to most poorly performing student would be the same on both the original and the parallel tests.
  3. Does every student obtain the same score on both tests?
    Whether all students would receive the same scores on the original and the parallel tests.


Three classes of theories are in use for this: classical test theory (CTT), generalisability theory (G-theory) and item response theory (IRT).


Classical test theory

CTT is the most widely used theory. It is the oldest and perhaps easiest to understand. It is based on the central assumption that the observed score is a combination of the so-called true score and an error score (O = T + e).3 The true score is the hypothetical score a student would obtain based on his/her competence only. But, as every test will induce measurement error, the observed score will not necessarily be the same as the true score.
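
The decomposition is easy to see in a simulation, where, unlike in real test data, the true scores are known by construction. The toy Python sketch below (all numbers are made up) generates true scores and error, adds them to form observed scores, and shows that the ratio of true-score variance to observed-score variance behaves like a reliability coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n_students = 500

true_scores = rng.normal(70, 10, n_students)   # hypothetical true scores (T)
error = rng.normal(0, 5, n_students)           # hypothetical measurement error (e)
observed = true_scores + error                 # O = T + e

# Reliability as the proportion of observed-score variance that is true-score variance
print(round(true_scores.var() / observed.var(), 2))   # roughly 10^2 / (10^2 + 5^2) = 0.80
```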


This in itself may be logical but it does not help us to estimate the true score. How would we ever know how reliable a test is if we cannot estimate the influence of the error term and the extent it makes the observed score deviate from the true score, or the extent to which the results on the test are replicable?


The first step in this is determining the correlation between the test and a parallel test (test–retest reliability). If, for example, one wanted to establish the reliability of a haemoglobin measurement one would simply compare the results of multiple measurements from the same homogenised blood sample, but in assessment this is not this easy. Even the ‘parallel test’ does not help here, because this is, in most cases, hypothetical as well.


The next step, as a proxy for the parallel test, is to randomly divide the test in two halves and treat them as two parallel tests. The correlation between those two halves (corrected for test length) is then a good estimate of the ‘true’ test–retest correlation. This approach, however, is also fallible, because it is not certain whether this specific correlation is a good exemplar; perhaps another subdivision in two halves would have yielded a completely different correlation (and thus a different estimate of the test–retest correlation). One approach is to repeat the subdivision as often as possible until all possibilities are exhausted and use the mean correlation as a measure of reliability. That is quite some work, so it is simpler and more effective to subdivide the test in as many subdivisions as there are possible (the items) and calculate the correlations between them. This approach is a measure of internal consistency and the basis for the famous Cronbach's alpha. It can be taken as the mean of all possible split half reliability estimates (cf. e.g. Crocker & Algina 1986).
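
For a concrete illustration, the sketch below computes Cronbach's alpha for a small, made-up person × item matrix using the standard formula alpha = k/(k − 1) × (1 − Σ item variances / total-score variance); the matrix itself is purely hypothetical.

```python
import numpy as np

# Hypothetical person x item matrix of dichotomous scores (1 = correct, 0 = incorrect)
scores = np.array([
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
], dtype=float)

def cronbach_alpha(x):
    """Cronbach's alpha: internal consistency of the items in matrix x."""
    k = x.shape[1]                           # number of items
    item_vars = x.var(axis=0, ddof=1)        # variance of each item
    total_var = x.sum(axis=1).var(ddof=1)    # variance of the total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print(round(cronbach_alpha(scores), 2))
```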


Although Cronbach's alpha is widely used, it is only appropriate from a norm-referenced perspective; used from a criterion-referenced perspective it overestimates reliability, as explained in Box 4.

Although Cronbach's alpha is widely used, it should be noted that it remains an estimate of the test–retest correlation, so it can only be used correctly if conclusions are drawn at the level of whether the rank orderings between the original and the parallel tests are the same, i.e. a norm-referenced perspective. It does not take into account the difficulty of the items on the test, and because the difficulty of the items of a test influences the exact height of the score, using Cronbach's alpha in a criterion-referenced perspective overestimates the reliability of the test. This is explained in Box 4.



Although the notion of Cronbach's alpha is based on correlations, reliability estimates can range from 0 to 1. In rare cases, calculations could result in a value lower than zero, but this is then to be interpreted as being zero.


Reliability also has to be interpreted in relation to the actual scores: is a reliability of 0.90 always better than one of 0.75?

Although it is often helpful to have a measure of reliability that is normalised, in that for all data, it is always a number between 0 and 1, in some cases, it is also important to evaluate what the reliability means for the actual data. Is a test with a reliability of 0.90 always better than a test with a reliability of 0.75? Suppose we had the results of two tests and that both tests had the same cut-off score, for example 65%. The score distributions of both tests have a standard deviation (SD) of 5%, but the mean, minimum and maximum scores differ, as shown in Table 1.



Based on these data, we can calculate a 95% confidence interval (95% CI) around each score or the cut-off score. For this, we need the standard error of measurement (SEM). In the beginning of this section, we showed the basic formula in CTT (observed score = true score + error). In CTT, the SEM is the SD of the error term or, more precisely put, the square root of the error variance. It is calculated as follows:

SEM = SD × √(1 − reliability)




Both tests have the same cut-off score of 65%. Test 1 has a higher mean but a lower reliability; Test 2 a lower mean but a higher reliability. In Test 1, despite the lower reliability, the higher mean means that only a small proportion of students fall within the 95% CI around the cut-off, whereas in Test 2, despite the higher reliability, the lower mean places many students within the 95% CI. In other words, the lower reliability nevertheless goes with a smaller chance of an incorrect pass-fail decision.

If we use this formula, we find that in test 1, the SEM is 2.5% and in test 2, it is 1.58%. The 95% CIs are calculated by multiplying the SEM by 1.96. So, in test 1 the 95% CI is ±4.9% and in test 2 it is ±3.09%. 

    • In test 1 the 95% CI around the cut-off score ranges from 60.1% to 69.9% but only a small proportion of the score of students falls into this 95% CI.4 This means that for those students we are not able to conclude, with a p ≤ 0.05, whether these students have passed or failed the test. 
    • In test 2, the 95% CI ranges from 61.9% to 68.1% but now many students fall into the 95% CI interval. We use this hypothetical – though not unrealistic – example to illustrate that a higher reliability is not automatically better. To illustrate this further, Figure 1 presents a graphical representation of both tests.
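
The whole worked example can be reproduced in a few lines, using SEM = SD × √(1 − reliability) with the figures quoted above (SD 5%, reliabilities 0.75 and 0.90, cut-off 65%); the script is only a sketch of the arithmetic.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: the SD of the error term in O = T + e."""
    return sd * math.sqrt(1 - reliability)

cut_off = 65.0
for name, sd, r in [("Test 1", 5.0, 0.75), ("Test 2", 5.0, 0.90)]:
    s = sem(sd, r)                 # 2.5% for Test 1, 1.58% for Test 2
    half_width = 1.96 * s          # +/-4.9% and +/-3.09%, as in the text
    print(f"{name}: SEM = {s:.2f}%, 95% CI around the cut-off = "
          f"{cut_off - half_width:.1f}% to {cut_off + half_width:.1f}%")
```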





Generalisability theory

G-theory is not per se an extension to CTT but a theory on its own. It has different assumptions than CTT, some more nuanced, some more obvious. These are best explained using a concrete example. We will discuss G-theory here, using such an example.


When a group of 500 students sit a test, say a 200-item knowledge-based multiple-choice test, their total scores will differ. In other words, there will be variance between the scores. From a reliability perspective, the goal is to establish the extent to which these score differences are based on differences in ability of the students in comparison to other – unwanted – sources of variance. In this example, the variance that is due to differences in ability (in our example ‘knowledge’) can be seen as wanted or true score variance. Level of knowledge of students is what we want our test to pick up, the rest is noise – error – in the measurement. G-theory provides the tools to distinguish true or universe score variance from error variance, and to identify and estimate different sources of error variance. The mathematical approach to this is based on analysis of variance, which we will not discuss here. Rather, we want to provide a more intuitive insight into the approach and we will do this stepwise with some score matrices.


In Table 2, all students have obtained the same score (for reasons of simplicity, we have drawn a table of five test items and five candidates). From the total scores and the p-values, it becomes clear that all the variance in this matrix is due to systematic differences in items. Students collectively ‘indicate’ that item 1 is easier than item 2, and item 2 is easier than item 3, etc. There is no variance associated with students. All students have the same total score and they have collected their points on the same items. In other words, all variance here is item variance (I-variance).



Table 3 draws exactly the opposite picture. Here, all variance stems from differences between students. Items agree maximally as to the ability of the students. All items give each student the same marks, but their marks differ for all students, so the items make a consistent, systematic distinction between students. In the score matrix, all items agree that student A is better than student B, who in turn is better than student C, etc. So, here, all variance is student-related variance (person variance or P-variance).




Table 4 draws a more dispersed picture. For students A, B and C, items 1 and 2 are easy and items 3–5 difficult, and the reverse is true for students D and E. There seems to be a clearly discernible interaction effect between items and students. Such a situation could occur if, for example, items 1 and 2 are on cardiology and 3–5 on the locomotor system, and students A, B and C have just finished their clerkship in cardiology while the other students have just finished their orthopaedic surgery placements.




Of course, real life is never this simple, so matrix 5 (Table 5) presents a more realistic scenario, some variance can be attributed to systematic differences in item difficulty (I-variance), some to differences in student ability (P-variance), some to the interaction effects (P × I-variance), which in this situation cannot be disentangled from general error (e.g. perhaps student D knew the answer to item 4 but was distracted or he/she misread the item).





Generalisability is then determined by the portion of the total variance that is explained by the wanted variance (in our example, the P-variance). In a generic formula:

Generalisability coefficient = wanted variance / (wanted variance + error variance)


Or, in the case of our 200-item multiple choice test example:

σ²(P) / [σ²(P) + σ²(I)/nI + σ²(PI,e)/nI]



The example of the 200-item multiple-choice test is called a one-facet design. There is only one facet on which we wish to generalise, namely would the same students perform similarly if another set of items (another ‘parallel’ test) were administered. The researcher does not want to draw conclusions as to the extent to which another group of students would perform similarly on the same set of items. If the latter were the purpose, she/he would have to redefine what is wanted and what is error variance. In the remainder of this paragraph we will also use the term ‘factor’ to denote all the components of which the variance components are estimates (so, P is a factor but not a facet).


What is placed in the error term of the formula above determines which kind of generalisation can be made.

If we are being somewhat more precise, the second formula is not always a correct translation of the first. The first deliberately does not call the denominator ‘total variance’, but ‘wanted’ and ‘error variance’. Apparently, the researcher has some freedom in deciding what to include in the error term and what not. This of course, is not a capricious choice; what is included in the error term defines what type of generalisations can be made.



If, for example, the researcher wants to generalise as to whether the rank ordering from best to most poorly performing student would be the same on another test, the I-variance does not need to be included in the error term (for a test–retest correlation, the systematic difficulty of the items or the test is irrelevant). For the example given here (which is a so-called P × I design), the generalisability coefficient without the I/ni term is equivalent to Cronbach's alpha.


The situation is different if the reliability of an exact score is to be determined. In that case, the systematic item difficulty is relevant and should be incorporated in the error term. This is the case in the second formula.


To distinguish between both approaches, the former (without the I-variance) is called the ‘generalisability coefficient’ and the latter the ‘dependability coefficient’. This distinction further illustrates the versatility of G-theory: when the researcher has a good overview of the sources of variance that contribute to the total variance, she/he can clearly distinguish and compare the wanted from the unwanted sources of variance.
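
For a one-facet P × I design such as the multiple-choice example, the variance components can be estimated from the mean squares of a two-way ANOVA without replication. The sketch below does this for a small, invented score matrix and then computes both coefficients; it is an illustration of the logic, not a replacement for dedicated software such as UrGENOVA.

```python
import numpy as np

# Hypothetical person x item score matrix (rows = students, columns = items)
x = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
], dtype=float)

n_p, n_i = x.shape
grand = x.mean()

# Mean squares of a two-way ANOVA without replication (crossed P x I design)
ms_p = n_i * ((x.mean(axis=1) - grand) ** 2).sum() / (n_p - 1)
ms_i = n_p * ((x.mean(axis=0) - grand) ** 2).sum() / (n_i - 1)
resid = x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + grand
ms_pi = (resid ** 2).sum() / ((n_p - 1) * (n_i - 1))

# Expected mean squares give the variance component estimates
var_pi_e = ms_pi                      # interaction confounded with general error
var_p = (ms_p - ms_pi) / n_i          # person (wanted) variance
var_i = (ms_i - ms_pi) / n_p          # item (difficulty) variance

# Relative (generalisability) and absolute (dependability) coefficients
g_coef = var_p / (var_p + var_pi_e / n_i)
d_coef = var_p / (var_p + var_i / n_i + var_pi_e / n_i)
print(round(g_coef, 2), round(d_coef, 2))   # d_coef <= g_coef: item difficulty counts as error
```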


The same versatility holds for the calculation of the SEM. As discussed in the section on CTT, the SEM is the SD of the error term, so in a generalisability analysis it can be calculated as the square root of the error variance components: either √(σ²(PI,e)/nI) when only relative (norm-orientated) decisions are intended, or √(σ²(I)/nI + σ²(PI,e)/nI) when absolute (domain-orientated) decisions are intended.


In this example the sources of variance are easy to understand, because there is in fact one facet, but more complicated situations can occur. In an OSCE with two examiners per station, things already become more complicated. 

    • First, there is a second facet (the universe of possible examiners) on top of the first (the universe of possible stations). 
    • Second, there is crossing and nesting. 


A crossed design is most intuitive to understand. The multiple-choice example is a completely crossed design (P × I, the ‘×’ indicating the crossing), all items are seen by all students. Nesting occurs when certain ‘items’ of a factor are only seen by some ‘items’ of another factor. This is a cryptic description, but the illustration of the OSCE may help. The pairs of examiners are nested within each station. It is not the same two examiners who judge all stations for all students, but examiners A and B are in station 1, C and D in station 2, etc. The examiners are crossed with students (assuming that they remain the same pairs throughout the whole OSCE), because they have judged all students, but they are not crossed with all stations as A and B have only examined in station 1, etc. In this case examiner pairs are nested within stations.


There is a second part to the analyses in a generalisability analysis, namely the decision study or D-study. You may have noticed in the second formula that both the I-variance and the interaction terms are divided by nI. This indicates that the variance component is divided by the number of elements in the factor (in our example the number of items in the I-variance) and that the terms in the formula are the mean variances per element in the factor (the mean item variance). From this, it is relatively straightforward to extrapolate what the generalisability or dependability would have been if the numbers were to change (e.g. what would the dependability be if the number of items on the test were twice as high, or which is more efficient, using two examiners per OSCE station or having more stations with only one examiner?), just by inserting another value in the divisor(s). Although it may seem very simple, one word of caution is needed: such extrapolations are only as good as the original variance component estimates. The higher the number of original observations, the better the extrapolation. In our example, we had 200 items on the test and 500 students taking it, but it is obvious that this leads to better estimates and thus better extrapolations than 50 students sitting a 20 item test.
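
The D-study step itself is little more than re-evaluating the coefficient with different numbers of items. A minimal sketch, assuming some hypothetical variance component estimates from a G-study, is shown below.

```python
def dependability(var_p, var_i, var_pi_e, n_items):
    """Dependability (absolute) coefficient for a P x I design with n_items items."""
    return var_p / (var_p + var_i / n_items + var_pi_e / n_items)

# Hypothetical variance component estimates taken from a G-study
var_p, var_i, var_pi_e = 0.04, 0.02, 0.16

for n in (50, 100, 200, 400):    # candidate test lengths to compare in the D-study
    print(n, round(dependability(var_p, var_i, var_pi_e, n), 2))
```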



Item response theory

A shared disadvantage of CTT and G-theory is that the effect of item difficulty cannot be disentangled from the effect of the candidate group: a low score on a test may reflect very difficult items or a group of candidates of low ability. IRT tries to solve this by estimating item difficulty independently of student ability, and student ability independently of item difficulty.

Both CTT and G-theory have a common disadvantage. Both theories do not have methods to disentangle test difficulty effects from candidate group effects. If a score on a set of items is low, this can be the result of a particularly difficult set of items or of a group of candidates who are of particularly low ability level. Item response theories try to overcome this problem by estimating item difficulty independent of student ability, and student ability independent of item difficulty.


In CTT, difficulty is expressed as the p-value, the proportion of students who answered the item correctly, and indices such as Rit and Rir correlate performance on an item with performance on the whole test or on the remaining items. If a different group sat the same test, or an item were reused in another test, the p-values would differ. In IRT, candidates' responses to each individual item are modelled given their ability.

Before we can explain this, we have to go back to CTT again. In CTT, item difficulty is indicated by the so-called p-value, the proportion of candidates who answered the item correctly, and discrimination is indicated by indices such as point biserials, Rit (item-total correlation) or Rir (item-rest correlation), all of which correlate the performance on an item with the performance on the total test or on the rest of the items. If a different group of candidates (of different mean ability) were to take the test, the p-values would be different, and if an item were re-used in a different test, all the discrimination indices would be different. With IRT, the responses of the candidates to each individual item on the test are modelled, given their ability.
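
The CTT indices named above are simple to compute; the sketch below calculates p-values and item-rest correlations (Rir) for a small, invented score matrix.

```python
import numpy as np

# Hypothetical person x item matrix of dichotomous scores
scores = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
], dtype=float)

# p-value: the proportion of candidates answering each item correctly
p_values = scores.mean(axis=0)

# Rir: correlation between each item and the total score on the remaining items
rir = []
for i in range(scores.shape[1]):
    rest = scores.sum(axis=1) - scores[:, i]
    rir.append(np.corrcoef(scores[:, i], rest)[0, 1])

print(p_values)          # 0.8, 0.6, 0.4, 0.2 for this matrix
print(np.round(rir, 2))
```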


This modelling requires a number of assumptions.

Such modelling cannot be done without making certain assumptions. 

    • The first assumption is that the ability of the candidates is uni-dimensional and 
    • the second is that all items on a test are locally independent except for the fact that they measure the same (uni-dimensional) ability. If, for example, a test would contain an item asking for the most probable diagnosis in a case and a second for the most appropriate therapy, these two items are not locally independent; if a candidate answers the first item incorrectly, she/he will most probably answer the second one incorrectly as well.
    • The third assumption is that modelling can be done through an item response function (IRF) indicating that for every position on the curve, the probability of a correct answer increases with a higher level of ability. The biggest advantage of IRT is that difficulty and ability are modelled on the same scale. IRFs are typically graphically represented as an ogive, as shown in Figure 2.






Modelling requires data, so pre-testing is needed before modelling can be done.

Modelling cannot be performed without data. Therefore pre-testing is necessary before modelling can be performed. The results on the pre-test are then used to estimate the IRF. For the purpose of this AMEE Guide, we will not go deeper into the underlying statistics but for the interested reader some references for further reading are included at the end.


Three levels of modelling can be applied.

Three levels of modelling can be applied, conveniently called one-, two- and three-parameter models. 

    • A one-parameter model distinguishes items only on the basis of their difficulty, or the horizontal position of the ogive. Figure 3 shows three items with three different positions of the ogive. The curve on the left depicts the easiest item of the three in this example; it has a higher probability of a correct answer with lower abilities of the candidate. The most right curve indicates the most difficult item. In this one-parameter modeling, the forms of all curves are the same, so their power to discriminate (equivalent to the discrimination indices of CTT) between students of high and low abilities are the same.
    • A two-parameter model includes this discriminatory power (on top of the difficulty). The curves for different items not only differ in their horizontal position but also in their steepness. Figure 4 shows three items with different discrimination (different steepness of the slopes). It should be noted that the curves do not only differ in their slopes but also in their positions, as they differ both in difficulty and in discrimination (if they would only differ in slopes, it would be a sort of one-parameter model again).
    • A three-parameter model includes the possibility that a candidate with extremely low ability (near-to-zero ability) still produces the correct answer, for example through random guessing. The third parameter determines the offset of the curve or more or less its vertical position. Figure 5 shows three items differing on all three parameters.
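
The three models differ only in which parameters of the logistic item response function are allowed to vary. A compact sketch (the parameter values are chosen purely for illustration) is given below.

```python
import math

def irf(theta, b, a=1.0, c=0.0):
    """Logistic item response function: 1-parameter (b only), 2-parameter (a and b)
    or 3-parameter (a, b and c) model depending on which arguments are supplied."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

theta = 0.0                                   # a candidate of average ability
print(irf(theta, b=-1.0))                     # 1PL: an easy item, difficulty only
print(irf(theta, b=0.5, a=2.0))               # 2PL: a steeper, more discriminating item
print(irf(theta, b=0.5, a=2.0, c=0.20))       # 3PL: plus a 20% 'guessing' floor
```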





As a rule of thumb, one-parameter modelling needs roughly 200–300 responses and a three-parameter model roughly 1000.

As said before, pre-testing is needed for parameter estimation and logically there is a relationship between the number of parameters and the number of candidate responses needed for good estimates; the more parameters have to be estimated, the higher the number of responses needed. As a rule of thumb, 200–300 responses would be sufficient for one-parameter modelling, whereas a three-parameter model would require roughly 1000 responses. Typically, large testing bodies employing IRT mix items to be pre-tested with regular items, without the candidates knowing which item is which. But it is obvious that such requirements in combination with the complicated underlying statistics and strong assumptions limit the applicability of IRT in various situations. It will be difficult for a small-to-medium-sized faculty to produce enough pre-test data to yield acceptable estimates, and, in such cases, CTT and G-theory will have to do.


IRT is the most powerful theory of test reliability.

On the other hand, IRT must be seen as the strongest theory in reliability of testing, enabling possibilities that are impossible with CTT or G-theory. One of the ‘eye-catchers’ in this field is computer-adaptive testing (CAT). In this approach, each candidate is presented with an initial small set of items. Depending on the responses, his/her level of ability is estimated, and the next item is selected to provide the best additional information as to the candidate's ability and so on. In theory – and in practice – such an approach reduces the SEM for most if not all students. Several methods can be used to determine when to stop and end the test session for a candidate. One would be to administer a fixed number of items to all candidates. In this case, the SEM will vary between candidates but most probably be lower for most of the candidates than with an equal number of items with traditional approaches (CTT and G-theory). Another solution is to stop when a certain level of certainty (a certain SEM) is reached. In this case, the number of items will vary per candidate. But apart from CAT, IRT will mostly be used for test equating, in such situations where different groups of candidates have to be presented with equivalent tests.
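
A highly simplified sketch of the adaptive loop described above is given below: after every response the ability estimate is updated and the next item is the unused one whose difficulty is closest to that estimate. The item bank, the stopping rule and especially the crude shrinking-step update are toy assumptions; operational CAT engines use maximum-likelihood or Bayesian ability estimation.

```python
import math
import random

def p_correct(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

item_bank = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]   # item difficulties (logits)
true_theta = 0.7        # the candidate's real ability, used here only to simulate responses
theta_hat, step = 0.0, 1.0
administered = []

random.seed(1)
for _ in range(8):                            # fixed-length stopping rule
    # Pick the unused item whose difficulty best matches the current ability estimate
    item = min((b for b in item_bank if b not in administered),
               key=lambda b: abs(b - theta_hat))
    administered.append(item)
    correct = random.random() < p_correct(true_theta, item)
    # Toy update: move the estimate up after a correct answer, down after an incorrect
    # one, with a shrinking step (a stand-in for proper ML/Bayesian estimation)
    theta_hat += step if correct else -step
    step *= 0.6

print(round(theta_hat, 2))                    # final ability estimate
```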


Recommendations

The three theories – CTT, G-theory and IRT seem to co-exist. This is an indication that there is good use for each of them depending on the specific test, the purpose of the assessment and the context in which the assessment takes place. Some rules of thumb may be useful.


    • CTT is helpful in straightforward assessment situations such as the standard open-ended or multiple choice test. In CTT, item parameters such as p-values and discrimination indices can be calculated quite simply with most standard statistical software packages (a small illustrative computation follows this list). The interpretation of these item parameters is not difficult and can be taught easily. Reliability estimates, such as Cronbach's alpha, however, are based on the notion of test–retest correlation. Therefore, they are most suitable for reliability estimates from a norm-orientated perspective and not from a domain-orientated perspective. If they are used in the latter case, they will overestimate the actual reproducibility.

    • G-theory is more flexible in that it enables the researcher to include or exclude sources of variance in the calculations. This presupposes that the researcher has a good understanding of the meaning of the various sources of variance and the way they interact with each other (nested versus crossed), but also of how they represent the domain. The original software for these analyses is quite user-unfriendly and requires at least some knowledge of older programming languages such as Fortran (e.g. UrGENOVA; http://www.education.uiowa.edu/casma/GenovaPrograms.htm, last access 17 December 2010). Variance component estimates can be done with SPSS, but the actual G-analysis would still have to be done by hand. Some years ago, two researchers at McMaster wrote a graphical shell around UrGenova to make it more user friendly (http://fhsperd.mcmaster.ca/g_string/download.html, accessed 17 December 2010). Using this shell spares the user from having to know and employ a difficult syntax. Nevertheless, it still requires a good understanding of the concepts of G-theory. In all cases where there is more than one facet of generalisation (as in the example with the two examiners per station in an OSCE), G-theory has a clear advantage over CTT: in CTT multiple parameters would have to be used and somehow combined (in this OSCE example, Cronbach's alpha plus Cohen's kappa or an ICC for inter-observer agreement), whereas in a generalisability analysis both facets are incorporated. If a one-facet situation exists (like the multiple choice examination) and a domain-orientated perspective is taken (e.g. with an absolute pass–fail score), a dependability coefficient is a better estimate than those of CTT.

    • IRT should only be used if people with sufficient understanding of the statistics and the underlying concepts are part of the team. Furthermore, sizeable item banks are needed and pre-testing on a sufficient number of candidates must be possible. This limits the routine applicability of IRT to large testing bodies, large schools or collaboratives.
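To illustrate the point in the CTT recommendation above that item statistics are simple to compute, here is a minimal Python sketch with a small fabricated 0/1 score matrix: it returns each item's p-value, an item-rest correlation as a discrimination index, and Cronbach's alpha for the test as a whole.

```python
import math
import statistics as st

def item_analysis(scores):
    """CTT item statistics for a 0/1 score matrix (rows = candidates, columns = items)."""
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    p_values = [sum(row[i] for row in scores) / len(scores) for i in range(k)]

    def corr(x, y):  # Pearson correlation
        mx, my = st.mean(x), st.mean(y)
        num = sum((a - mx) * (b - my) for a, b in zip(x, y))
        den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
        return num / den if den else 0.0

    # discrimination index: correlation of each item with the rest of the test
    disc = [corr([row[i] for row in scores],
                 [t - row[i] for row, t in zip(scores, totals)]) for i in range(k)]
    item_vars = [st.pvariance([row[i] for row in scores]) for i in range(k)]
    alpha = (k / (k - 1)) * (1 - sum(item_vars) / st.pvariance(totals))
    return p_values, disc, alpha

# five candidates, four items (fabricated data)
print(item_analysis([[1, 1, 0, 1], [1, 0, 0, 1], [0, 0, 0, 1], [1, 1, 1, 1], [0, 1, 0, 0]]))
```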



새롭게 떠오르는 이론들 Emerging theories

새롭게 떠오르는 이론의 대부분은 '학습의 평가'에서 '학습을 위한 평가'로의 관점 전환과 관련되어있다. 비록 이 자체는 이론의 변화는 아니지만, 관점의 변화가 새로운 이론을 가져오거나 기존 이론의 확장을 가져왔다.

Although we by no means possess a crystal ball, we see some new theories, or extensions to existing theories, emerging. Most of these are related to the changing views from (exclusively) assessment of learning to more assessment for learning. Although this in itself is not a theory change but more a change of views on assessment, it does lead to the incorporation of new theories or extensions to existing ones.


첫째로, '학습을 위한 평가'라는 것이 무엇인지 알 필요가 있다. 기존의 관점을 상징적으로 보여주는 것이 바로 교과목 종료 후에 보는 총괄평가이다. 이러한 방법은 전 세계적으로 흔하게 사용되는 것이지만, 교육적 맥락에서는 이러한 방법에 대한 불만이 커지고 있다. 이러한 평가는 학습환경의 변화를 잘 따라가지 못하고 있으며, 이러한 'purely selective test'는 의료의 'screening procedure'에 비견될 수 있다. 필수 역량에 미달한 학생의 졸업 여부를 판별하는 데는 좋을 수 있으나, 아직 역량이 부족한 학생이 어떻게 충분한 역량을 키울 수 있을 것인가에 대한 정보는 주지 못한다. 또한 각 학생을 어떻게 가장 좋은 의사로 키울 수 있을 것인가에 대한 정보도 주지 못한다. 환자를 더 낫게 만드는 것은 screening 그 자체가 아니라 잘 맞춰진 진단과 치료인 것처럼, 학습자에 대한 진단 그 자체로는 학습을 향상시키지 못하며, 학습을 위한 평가만이 이를 가능하게 한다.

First, however, it might be helpful to explain what assessment for learning entails. For decades, our thinking about assessment has been dominated by the view that assessment's main purpose is to determine whether a student has successfully completed a course or a study. This is epitomised in the summative end-of-course examination. The consequences of such examinations were clear: if she/he passes, the student goes on and does not have to look back; if she/he fails, on the other hand, the test has to be repeated or (parts of) the course have to be repeated. Successful completion of a study was basically a matter of passing a string of individual tests. We draw – deliberately – somewhat of a caricature, but in many cases, this is the backbone of an assessment programme. Such an approach is not uncommon and is used at many educational institutes in the world, yet there is a growing dissatisfaction in the educational context. Some discrepancies and inconsistencies are felt to be increasingly incompatible with learning environments. These are probably best illustrated with an analogy. Purely selective tests are comparable to screening procedures in medicine (e.g. for breast cancer or cervical cancer). They are highly valuable in ensuring that candidates lacking the necessary competence do not graduate (yet), but they do not provide information as to how an incompetent candidate can become a competent one, or how each student can become the best possible doctor she/he could be. Just as screening does not make patients better, but tailored diagnostic and therapeutic interventions do, assessment of learning does not help much in improving learning, but assessment for learning can.


We will mention the most striking discrepancies between assessment of and assessment for learning.


  • 교육과정의 가장 중심이 되는 목표는 학생이 공부를 열심히 해서 가능한 많이 배울 수 있게 하는 것이다. 따라서 평가도 이러한 목적에 맞게 이뤄져야 한다. 충분한 역량을 갖춘 학생을 골라내는데만 집중된 평가는 이러한 목표에 도달할 수 없다.
    A central purpose of the educational curriculum is to ensure that students study well and learn as much as they can; so, assessment should be better aligned with this purpose. Assessment programmes that focus almost exclusively on the selection between the sufficiently and insufficiently competent students do not reach their full potential in steering student learning behaviour.
  • '학습의 평가'에서 하는 질문은 'A가 B보다 낫나?'이다. CTT나 G-theory에서는 학생간 차이가 없을 경우 신뢰도를 계산해낼 수 없다. '학습을 위한 평가'에서 질문은 '오늘의 A가 어제의 A보다 낫나?'이다. 수월성을 위하여 끊임없이 나아간다는 의미를 가지고 있는데, 모든 학생이 '우수'에 도달하면, 그 '우수'는 다시 '평범함'이 되기 때문이다. '학습을 위한 평가'에서 질문은 A와 B의 향상이 충분한가에 대해서도 당연히 생각해보게 된다.
    If the principle of assessment of learning is exclusively used, the question all test results need to answer is: is John better than Jill?, where the pass–fail score is more or less one of the possible ‘Jills’. Typically, CTT and G-theory cannot calculate test reliability if there are no differences between students: a test–retest correlation does not exist if there is no variance in scores, and generalisability cannot be calculated if there is no person variance. The central question in the view of assessment for learning is therefore: is John today optimally better than he was yesterday, and is Jill today optimally better than she was yesterday? This also gives more meaning to the desire to strive for excellence, because now excellence is defined individually rather than on the group level (if everybody in the group is excellent, ‘excellent’ becomes mediocre again). It goes without saying that in assessment for learning, the question whether John's and Jill's progress is good enough needs to be addressed as well.
  • 보다 어려운 개념은 '학습의 평가'에서 '일반화' 혹은 '예측'이란 '동질성(uniformity)'에 기반하고 있다는 점이다. 즉 학생이 동일한 상황에서 동일한 시험을 잘 볼 것인가에 대한 예측과 일반화를 한다는 것이다. 그러나 '학습을 위한 평가'에서 '예측'이란 여전히 중요하긴 하지만, 평가법의 선택은 진단적 목적이 더 크고, 평가법을 학생의 구체적 특징에 따라서 선택할 수 있는 유연성이 있다. CAT나 임상의사의 진단적 사고 - 구체적 추가적 진단기술을 환자에 맞추어 사용하는 것 - 이 이와 유사하다 할 수 있다.
    A difficult and more philosophical result of the previous point is that the idea of generalisation or prediction (how well will John perform in the future based on the test results of today) in an assessment of learning view is mainly based on uniformity. It states that we can generalise and predict well enough if all students sit the same examinations under the same circumstances. In the assessment for learning view, prediction is still important, but the choice of assessment is more diagnostic in that there should be room for sufficient flexibility to choose the assessment according to the specific characteristics of the student. This is analogous to the idea of (computer) adaptive testing or the diagnostic thinking of the clinician, tailoring the specific additional diagnostics to the specific patient.
  • '학습의 평가'에서는 최선의 평가법을 개발해내는 것이 중요하다. 이러한 관점에서 가장 이상적인 평가 프로그램은 각 의학적 역량을 평가하기에 '가장 좋은' 평가도구만을 사용하게 된다. 예컨대 지식의 평가를 위한 객관식 문항, 술기 평가를 위한 OSCE, 문제해결능력을 위한 long simulation 등이다. 그러나 '학습을 위한 평가'에서는 다양한 정보를 얻기 위해서 다양한 도구를 사용하며, 다음의 세 가지 질문에 답하는 것이 중요하다.
    In the assessment of learning view, developments are focussed more on the development (or discovery) of the optimal instrument for each aspect of medical competence. The typical example of this is the OSCE for skills. In this view, an optimal assessment programme would incorporate only the best instrument for each aspect of medical competence. Typically, such a programme would look like this: multiple-choice tests for knowledge, OSCEs for skills, long simulations for problem-solving ability, etc. From an assessment for learning view, information needs to be extracted from various instruments and assessment moments to optimally answer the following three questions:


    1. 진단적 질문: 이 학생에 대한 완전한 그림을 그리기에 충분한 정보를 가지고 있는가?
       Do I have enough information to draw the complete picture of this particular student or do I need specific additional information? (the ‘diagnostic’ question)
    2. 치료적 질문: 이 시점에서 가장 필요한 교육적 개입은 무엇인가?
       Which educational intervention is most indicated for this student at this moment? (the ‘therapeutic’ question)
    3. 예후적 질문: 이 학생이 옳은 길을 가고 있으며 유능한 전문직으로 성장할 것인가?
       Is this student on the right track to become a competent professional on time? (the ‘prognostic’ question).


  • 단일한 혹은 소수의 평가로만 위의 질문에 답을 할 수는 없을 것이다. 평가프로그램이 필요하며, 각각의 장점과 단점이 있는 다양한 평가법이 필요하며 이는 의사가 다양한 진단적 도구를 활용할 수 있는 것과 마찬가지다. 이 도구들은 양적일 수도 있고 질적일 수도 있으며, 더 객관적일수도, 주관적일수도 있다. 비유를 좀 더 해보자면 만약에 의사가 환자의 Hb 수치 오더를 내리면 단순히 객관적인 수치를 알고 싶은 것일 수 있다. 그러나 한편으로 의사는 병리학자에게 특정 숫자가 아니라 서술적 판단을 요청할 수도 있다. 유사하게 평가 프로그램도 양적, 질적 요소를 다 갖출 수 있다.
    It follows logically from the previous point that this cannot be accomplished with one single assessment method or even with only a few. A programme of assessment is needed instead, incorporating a plethora of methods, each with its own strengths and weaknesses, much like the diagnostic armamentarium of a clinician. These can be qualitative or quantitative, more ‘objective’ or more ‘subjective’. To draw the clinical analogy further: if a clinician orders a haemoglobin level for a patient, she/he does not want the laboratory analyst's opinion but merely the ‘objective’ numerical value. If, on the other hand, she/he asks a pathologist, she/he does not expect a number but a narrative (‘subjective’) judgement. Similarly, such a programme of assessment will consist of both qualitative and quantitative elements.


이 이론들 중 많은 부분은 여전히 더 개발이 필요하나 일부는 다른 분야의 이론에서 가져올 수도 있다.
Much of the theory to support the approach of assessment for learning still needs to be developed. Parts can be adapted from theories in other fields; parts need to be developed within the field of health professions assessment research. We will briefly touch on some of these.


  • 평가 프로그램의 질을 결정하는 것은 무엇인가? 한 가지 중요한 것은 좋은 평가프로그램은 개별 구성요소의 합보다 전체가 더 커야 한다는 점이다. 그러나 이런 목표를 달성하기 위해서 각 요소를 어떻게 결합할 것인가는 또 다른 문제이다. 
    What determines the quality of assessment programmes? It is one thing to state that in a good assessment programme the total is more than the sum of its constituent parts, but it is another to define how these parts have to be combined in order to achieve this. Emerging theories describe a basis for the definition of quality. Some adopt a more ideological approach (Baartman 2008) and some a more utilistic ‘fitness-for-purpose’ view (Dijkstra et al. 2009). 
    • 평가의 질이란 평가프로그램이 얼마나 '이상적인 모습'에 가까운가에 따른 것이다.
      In the former,
      quality is defined as the extent to which the programme is in line with an ideal (much like formerly quality of an educational programme was defined in terms of whether it was PBL or not); 
    • 평가의 질이란 프로그램에서 명확하게 정의한 목표에 의해서 정의되는 것이며, 각 부분이 이 목표를 달성하기 위해서 최적화되어야 한다. 
      in the latter
      the quality is defined in terms of a clear definition of the goals of the programme and whether all parts of the programmes optimally contribute to the achievement of this goal. This approach is more flexible in that it would allow for an evaluation of the quality of assessment of learning programmes as well. 
  • At this moment, theories about the quality of assessment programmes are being developed and researched (Dijkstra et al. 2009, submitted 2011).

  • 평가가 어떻게 학습에 영향을 미치는가? 상당한 합의가 있어 보인다. 그러나 연구가 그리 많이 되어있지는 않다.
    How does assessment influence learning? Although there seems to be complete consensus about this – a completely shared opinion – not much empirical research has been performed in this area. For example, many of the intuitive ideas and uses of this notion are strongly behaviouristic in nature and do not incorporate motivational theories very well. The research, especially in health professions education, has focussed either on the test format (Hakstian 1971; Newble et al. 1982; Frederiksen 1984) or on the opinions of students (Stalenhoef-Halling et al. 1990; Scouller 1998). Currently, new theories are emerging that incorporate motivational theories and describe better which factors of an assessment programme influence learning behaviour, how they do that and what the possible consequences of these influences are (Cilliers et al. submitted 2010, 2010).

  • Test-enhanced learning이 최근 논의되고 있다. 전문가 이론에 따르면 시험을 보는 것 자체가 다양한 측면에서 지식의 저장, 유지, 인출에 도움이 된다고 보는 것은 합당하다. 그러나 평가프로그램에 있어서, 특히 '학습을 위한 평가' 차원에서 어떻게 해야하는가는 별로 아는 바가 많지 않다.
    The phenomenon of test-enhanced learning has been discussed recently (Larsen et al. 2008). From expertise theories it is logical to assume that from sitting a test, as a strong motivator to remember what was learned, the existing knowledge is not only more firmly stored in memory, but also reorganised from having to produce and apply it in a different context. This would logically lead to better storage, retention and more flexible retrieval. Yet we know little about how to use this effect in a programme of assessment especially with the goal of assessment for learning.

  • 피드백이 효과를 나타내게 해주는 것은 무엇인가? 피드백을 총괄평가와 함께 주는 것은 그 가치를 떨어뜨린다는 지적이 있지만, 어떤 요인이 여기에 영향을 주는가에 대해서는 알려져 있는 바가 적다.
    What makes feedback work? There are indications that the provision of feedback in conjunction with a summative decision limits its value, but there is little known about which factors contribute to this. Currently, research not only focusses on the written combination of summative decisions and formative feedback, but also on the combination of a summative and formative role within one person. This research is greatly needed as in many assessment programmes it is neither always possible nor desirable to separate teacher and assessor role.

  • 평가프로그램의 차원에서 인간의 판단은 포함될 수밖에 없다. 심리학에서 인간의 판단(human judgement)은 actuarial한 방법에 비해서 오류의 가능성이 더 높다고 본다. 그 이유에는 여러 가지가 있다.
    In a programme of assessment the use of human judgement is indispensable, not only in the judgement of more elusive aspects of medical competence, such as professionalism, reflection, etc., but also because there are many situations in which a prolonged one-on-one teacher-student relationship exists, as is for example the case in long integrated placements or clerkships. From psychology it has long been known that human judgement is fallible when compared to actuarial methods (Dawes et al. 1989). There are many biases that influence the accuracy of the judgement.
    • The most well-known are primacy, recency and halo effects (for a more complete overview, cf. Plous 1993). 
    • A primacy effect indicates that the first impression (e.g. in an oral examination) often dominates the final judgement unduly; 
    • a recency effect indicates the opposite, namely that the last impressions largely determine the judgement. There is good indication that the length of the period between the observation and the making of the judgement determines whether the primacy or the recency effect is the most prominent.
    • The halo effect pertains to the inability of people to judge different aspects of someone's performance and demeanour fully independently during one observation, so they all influence each other. 
    • Other important sources of bias are cognitive dissonance, the fundamental attribution error, ignoring base rates and confirmation bias. All have their specific influences on the quality of the judgement. As such, these theories shed a depressing light on the use of human judgement in (high-stakes) assessment.
  • 그러나, 이러한 이론들에도 불구하고 human judgement에서 오는 편향을 줄일 수 있는 방법이 있다. 자연주의적 의사결정에 대한 이론에서는 왜 사람의 의사결정이 딱 잘라지는, 숫자를 기반으로 한 결정보다 더 부정확한가에 초점을 두는 것이 아니라, 왜 사람들이 '정보가 불충분하거나' '이상적이지 못한 상황'에서의 '명확히 정의되지 않는 문제'를 훌륭히 수행하는가에 대해서 연구한다. 정보의 저장, 경험으로부터의 학습, 상황-특이적 스크립트의 보유 등이 중요한 역할을 하는 것으로 보인다. 그리고 많은 부분이 빠른 패턴 인식과 매칭에 기반하고 있다.
    Yet, from these theories and the studies in this field, there are also good strategies to mitigate such biases. Another useful theoretical pathway is the one on naturalistic decision making (Klein 2008; Marewski et al. 2009). This line of research does not focus on why people are such poor judges when compared to clear-cut and number-based decisions, but on why people still do such a good job when faced with ill-defined problems, with insufficient information and often under less than ideal circumstances. Storage of experiences, learning from experiences and the possession of situation-specific scripts seem to play a pivotal role here, enabling the human to employ a sort of expertise-type problem solving. Much is based on quick pattern recognition and matching.
  • 두 가지 theoretical pathway 모두 관찰에서 얻은 제한된 정보만을 가지고 접근하는 인간의 접근법에 대해서 다루고 있다. 의료전문가가 임상추론을 하고 진단활동을 하는 것과 평가를 위해서 학생의 수행능력을 판단하는 것에는 유사성이 있음이 많은 연구에서 보고되고 있다.  
    Both theoretical pathways have commonality in that they both describe human approaches that are based on a limited representation of the actual observation. When, as an example, a primacy effect occurs, the judge is in fact reducing information to be able to handle it better, but when the judge uses a script, she/he is also reducing the cognitive load by a simplified model of the observation. Current research increasingly shows parallels between what is known about medical expertise, clinical reasoning and diagnostic performance and the act of judging a student's performance in an assessment setting. The parallels are such that they most probably have important consequences for our practices of teacher training.

  • 위에서 다룬 것을 설명하는데 필요한 이론이 CLT이다. CLT는 인간의 작업기억이 제한적이어서 제한된 수의 정보를 짧은 시간만큼만 기억할 수 있다는 것으로부터 시작한다. CLT에서 인지부하는 세 가지 종류가 있다. 내재적, 외재적, 본유적이다. 내재적 부하는 과제에 내재되어있는 복잡성에 의해서 생기는 부하이다. 외재적 부하는 그 과제와 직접적으로 관련되어있지는 않지만, 그 과제를 처리하기 위해서 필요한 모든 정보들과 관련되어있다. CLT에 근거하자면 authentic setting에서 의과대학 교육과정을 바로 시작하는 것은 바람직하지 않은데, authenticity는 도움이 될지 모르지만, 외재적 부하가 과도하게 걸려서 학습을 위해 필요한 자원(본유적 부하)까지 다 잡아먹기 때문이다.
    An important underlying theory to explain the previous point is cognitive load theory (CLT) (Van Merrienboer & Sweller 2005, 2010). CLT starts from the notion that human working memory is limited in that it can only hold a low number of elements (typically 7 ± 2) for a short period of time. Much of this we already discussed in the paragraphs on expertise. CLT builds on this as it postulates that cognitive load consists of three parts: intrinsic, extraneous and germane load.
    • Intrinsic load is generated by the innate complexity of the task. This has to do with the number of elements that need to be manipulated and the possible combinations (element interactivity). 
    • Extraneous load relates to all information that needs to be processed yet is not directly relevant for the task. If, for example, we were to start the medical curriculum by placing the learners in an authentic health care setting and require them to learn from solving real patient problems, CLT states that this is not a good idea. The authenticity may seem helpful, but it distracts; the cognitive resources needed to deal with all the practical aspects would constitute such a high extraneous load that it would minimise the resources left for learning (the germane load).

내재적 인지부하

내재적 인지부하(intrinsic cognitive load)란 학습자료나 과제 자체가 가지고 있는 난이도와 복잡성이라 할 수 있다. 상호작용성이 높은 학습자료를 해결하기 위해서는 개념을 획득하고 개념들 간의 관련성을 이해하는 것이 작동기억의 부하를 감소시킬 수 있다. 내재적 인지부하는 학습의 난이도에 따라 상대적일 수 있으며, 이는 사전지식의 보유와 관련이 있다.

외재적 인지부하

외재적 인지부하(extraneous cognitive load)는 학습 과제 자체의 난이도가 아닌 학습방법, 자료제시방법 등 교수전략에 의해 개선될 수 있는 인지부하이다. 그러나 외재적 인지부하는 내재적 인지부하에 영향을 받는다(김 경, 김동식, 2004). 즉, 학습 과제 자체의 내재적 인지부하가 낮다면 교수 설계가 부적절하여 외재적 인지부하가 발생하더라도 이것이 작동기억의 범위 내에 있기 때문에 문제를 해결하는 데 어려움이 없게 된다.


본유적 인지부하

본유적 인지부하(germane cognitive load)란 작동기억의 범위 안에서 학습과 직접 관련이 있는 정신적인 노력을 의미한다. 인지부하는 학습자에게 지나치게 낮은 수준의 학습자료를 제공하거나 지나치게 높은 수준의 자료를 제시하면 일어나지 않는다. 그러나 학습자에게 적절한 수준의 학습자료를 제공하면 학습자는 문제를 해결하기 위해 정신적인 노력을 기울이게 되는데, 이때 발생하는 인지부하를 '본유적 인지부하'라고 한다.
  • 마지막으로 새로운 모델이 개발되고 옛 모델도 재발견이 이뤄지고 있다.
    Finally, new psychometric models are being developed and old ones are being rediscovered at the present time. It is clear that, from a programme-of-assessment view, when many instruments are incorporated in the programme, no single psychometric model will be useful for all elements of the programme. In the 1960s and 1970s, some work was done on domain-orientated reliability approaches (Popham and Husek 1969; Berk 1980). Currently, internal consistency (like Cronbach's alpha) is often used as the best proxy for reliability or universe generalisation, but one can wonder whether this is the best approach in all situations. Most standard psychometric approaches do not handle a changing object of measurement very well. By this we mean that the students – hopefully – change under the influence of the learning programme. In the situation of a longer placement, for example, the results of repeatedly scored observations (for instance, repeated mini-CEX) will differ in their outcomes, with part of this variance being due to the learning of the student and part to measurement error (Prescott-Clements et al. submitted 2010). Current approaches do not provide easy strategies to distinguish between both effects. Internal consistency is a good approach to reliability where stability of the object of measurement and of the construct can reasonably be expected; it is problematic when this is not the case. The domain-orientated approaches therefore were not focussed primarily on internal consistency but on the probability that a new observation would shed new and unique light on the situation, much like the clinical adage never to ask for additional diagnostics if the results are unlikely to change the diagnosis and/or the management of the disease. As said above, these methods are being rediscovered and new ones are being developed, not to replace the existing theories, but rather to complement them.
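To make the last point a little more tangible, the following deliberately simple Python sketch (our illustration with invented ratings, not a method proposed in the Guide) fits a straight line to one student's repeated mini-CEX scores and splits the observed variance into a part explained by the learning trend and a residual part that a conventional internal-consistency view would simply treat as error:

```python
import statistics as st

def growth_vs_noise(occasions, scores):
    """Fit a straight line to one student's repeated scores (e.g. serial mini-CEX ratings)
    and report the variance explained by the learning trend vs the residual 'error' variance.
    Illustrative only: real analyses would use longitudinal/multilevel models."""
    mx, my = st.mean(occasions), st.mean(scores)
    slope = sum((x - mx) * (y - my) for x, y in zip(occasions, scores)) / \
            sum((x - mx) ** 2 for x in occasions)
    intercept = my - slope * mx
    fitted = [intercept + slope * x for x in occasions]
    trend_var = st.pvariance(fitted)                                  # variance due to growth
    resid_var = st.pvariance([y - f for y, f in zip(scores, fitted)]) # occasion-to-occasion noise
    return slope, trend_var, resid_var

# Six mini-CEX ratings over a placement: an upward trend plus occasion-to-occasion noise.
print(growth_vs_noise([1, 2, 3, 4, 5, 6], [5.0, 5.5, 5.2, 6.1, 6.0, 6.6]))
```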






 2011;33(10):783-97. doi: 10.3109/0142159X.2011.611022.

General overview of the theories used in assessment: AMEE Guide No. 57.

Author information

  • 1Flinders University, Adelaide 5001, South Australia, Australia. lambert.schuwirth@flinders.edu.au

Abstract

There are no scientific theories that are uniquely related to assessment in medical education. There are many theories in adjacent fields, however, that can be informative for assessment in medical education, and in the recent decades they have proven their value. In this AMEE Guide we discuss theories on expertise development and psychometric theories, and the relatively young and emerging framework of assessment for learning. Expertise theories highlight the multistage processes involved. The transition from novice to expert is characterised by an increase in the aggregation of concepts from isolated facts, through semantic networks to illness scripts and instance scripts. The latter two stages enable the expert to recognise the problem quickly and form a quick and accurate representation of the problem in his/her working memory. Striking differences between experts and novices is not per se the possession of more explicit knowledge but the superior organisation of knowledge in his/her brain and pairing it with multiple real experiences, enabling not only better problem solving but also more efficient problem solving. Psychometric theories focus on the validity of the assessment - does it measure what it purports to measure and reliability - are the outcomes of the assessment reproducible. Validity is currently seen as building a train of arguments of how best observations of behaviour (answering a multiple-choice question is also a behaviour) can be translated into scores and how these can be used at the end to make inferences about the construct of interest. Reliability theories can be categorised into classical test theory, generalisability theory and item response theory. All three approaches have specific advantages and disadvantages and different areas of application. Finally in the Guide, we discuss the phenomenon of assessment for learning as opposed to assessment of learning and its implications for current and future development and research.

PMID:
 
21942477
 
[PubMed - indexed for MEDLINE]


선택과목: AMEE Guide No. 46

Student Selected Components (SSCs): AMEE Guide No 46

SIMON C. RILEY

College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh, UK






Aims of this guide

이 가이드의 목적

This Guide aims to provide practical advice and guidance to those faculty involved in developing and planning medical curricula, in programmes that are either new or undergoing significant review, with the inclusion of an increased element of student choice. The Guide is of equal practical use to those already running or improving an existing SSC programme. The previous AMEE Guide “The Core Curriculum with Options or Special Study Modules” (Harden & Davis 1995) reviewed this then-new subject, while concentrating mostly on defining the core and the place of Special Study Modules (SSMs). This current guide will expand further on this, examining the role and value added by integrating Student Selected Components (SSCs) into the core medical curriculum, as well as examining any potential pitfalls and problems. The review covers a very broad spectrum of topics in medical education and will emphasise their relevance in the SSC context, including some up-to-date references as entry points to the current literature on each. It will draw upon examples from existing SSC programmes, highlighting good practice and novel ideas. It is intended that this guide will also help to open the debate on the concept of personal choice in the medical curriculum: the opportunities, challenges and educational validity raised by this somewhat experimental component in use throughout UK medical schools, which is being adopted to a similar or lesser degree elsewhere across the world.



Introduction

선택과목(SSC)에 대해서 GMC는 TD에서 모든 영국의 의과대학에 '필수'교육과정이 2/3, 그 외, 즉 '선택'교육과정이 1/3로 구성된 의대 교육과정의 상당한 변화를 요구했다. 초기에는 SSM이라고 불린 이 과정은 후에 SSC로 재명명되었고(TD2), '학생들에게 장기간에 걸쳐서 끊임없이 증가하는 지식과 환경의 변화에 일생동안 직면하게 될 전문직에게 요구될 지적, 태도적 자세'를 준비하기 위한 기간으로 설명하고 있다.

The Student Selected Components (SSCs) can perhaps be regarded as one of the more radical innovations in medical curricula (Lowry 1992; Harden and Davis 1995). The General Medical Council (GMC) in its document ‘Tomorrows Doctors’ (GMC 1993), directed all UK medical schools to profoundly change the design of medical curricula and move into a new framework of ‘core curriculum’ constituting two thirds of the course, in conjunction with the remaining third, or ‘options’ component. These options were originally termed Special Study Modules (SSMs), and later relabeled SSCs in Tomorrow's Doctors 2 (GMC 2003). They were to provide opportunities for both choice, and depth of study to prepare students for the ‘long term intellectual and attitudinal demands of a professional life that will be constantly challenged by growth of knowledge and change of circumstance’ (GMC 1993).


선택과목 도입의 원동력은 지식의 과도한 주입에서 오는 문제, 보다 학습자 중심의 학생 경험을 강조하는 변화 등이 주되게 작용하고 있다. 1980년대 후반과 1990년대 초반에는 전통적 교육과정의 근본적 문제를 인식하였다. 이 근본적 문제란 지식 습득을 지나치게 강조하고, 전문직업성은 그만큼 강조하지 않는다는 것이다. GMC Recommendation에서는 다음과 같이 말한다.

우리는 여기서 다시 한 번 1957년과 1967년에 학생에게 가해지는 과도한 지식이 최대한 감소되어야함을 강조한다. 사실적 지식을 암기하고 재생산해내는 것이 중요한 학습의 원칙과 독립적 사고력의 개발을 가로막아서는 안된다. 학생은 독립적으로 일할 수 있는 능력을 배양해야 한다. 이를 위해서 학생은 개인적 학습을 위한 충분한 자유 시간을 가져야 하며, 교육과정 전체에 걸쳐서 자기 (주도) 학습을 해야 한다.

The main driving forces behind these changes were the fulminating problems of factual overload, and a proposed shift to a more learner-centred and stimulating student experience. In the late 1980s and early 1990s, there was a strong perception of fundamental problems with previous traditional curricula, which were narrow and constrained, with far too much emphasis placed upon knowledge acquisition and insufficient emphasis on professionalism (Lowry 1992). Indeed, the introduction to Tomorrow's Doctors (GMC 1993) quoted the GMC Recommendations on Basic Medical Education report of 1980: ‘We therefore reiterate the views expressed in the recommendations of 1957 and 1967 that the student's factual load should be reduced as far as possible, to ensure that “the memorizing and reproduction of factual data should not be allowed to interfere with the primary need for fostering the critical study of principles and the development of independent thought”. The student should also acquire and cultivate the ability to work independently. He must therefore have a certain amount of free time for private study and self education throughout the curriculum’. Underpinned by significant advances in educational theory, medical education was ripe for major change.



GMC의 급진적인 주장에도 불구하고 어떻게 해야 하는지에 대해서는 설명된 바가 별로 없었다. 다만 교육과정의 1/3에 달하는 상당 부분을 차지해야 한다는 것뿐이었다. 그러나 어떤 요소를 포함시킬지에 대한 전례가 전혀 없는 것은 아니다. 1980년대까지 일부 영국 의과대학에서는 임상 'elective'를 하고 있었고, 종종 해외로까지 나가기도 했다. 이어진 TD2에는 보다 자세한 내용이 담겨 있는데, 여기서는 '전통적인 5년 교육과정에서는 25%~33% 정도가 SSC로 가능하며, 최소한 각 학생이 수강하는 SSC의 2/3는 의학과 관련된 과목이어야 한다'라고 나와 있다.

Despite this radical proposal in Tomorrow's Doctors (GMC 1993) to introduce choice and depth of study with SSMs, there was very limited guidance about what or how these changes should be implemented, except that they should make up the very significant proportion (approximately one third) of the total curriculum time. However, this choice element was not completely without precedent. Previous to and through the 1980s there had been some choice in UK medical curricula, with the clinical ‘elective’ attachments, often taken abroad, becoming well established. Subsequently, Tomorrow's Doctors 2 (GMC 2003) was somewhat more prescriptive about what knowledge, skills and attitudes should be delivered in the curriculum. This guidance also decreased the time commitment expecting ‘…that in a standard five-year curriculum between 25% and 33% would normally be available for SSCs’, with ‘…at least two thirds of each student's SSCs must be in subjects related to medicine’.


또한 SSC에 대해서 'SSC는 핵심 교육과정을 보충해주면서, 학생들에게 다음의 것을 제공해야 한다'라고 한다.

It was also more prescriptive about SSCs, indicating ‘SSCs support the core curriculum and must allow students to do the following:


  • 연구 능력 Learn about and begin to develop and use research skills.
  • 자신의 학습에 대한 통제, 자기주도학습 능력 Have greater control over their own learning and develop their self-directed learning skills.
  • 필수과정을 벗어나는 내용에 대한 심도있는 내용 Study, in depth, topics of particular interest outside the core curriculum.
  • 자신의 능력과 기술에 대한 자신감 향상 Develop greater confidence in their own skills and abilities.
  • 자신이 한 작업을 말/시각/글로 발표 Present the results of their work verbally, visually or in writing.
  • 진로 계획 Consider potential career paths.’

평가와 관련해서 TD2에서는 '필수와 선택 교육의 내용은 모두 최종 결과에 반영되어야 한다. 두 가지 모두에서 합격(satisfied) 판정을 받지 못한 학생은 졸업할 수 없다'라고 말한다.

With respect to assessment, Tomorrow's Doctors 2 also stated that ‘student performance in both the core and SSC parts of the curriculum must be assessed and must contribute to their overall result. Students who have not satisfied the examiners in both parts of the curriculum must not be allowed to graduate’.


모든 영국 의과대학은 SSC를 개발하여 포함시키고 있다. 그러나 여전히 어떻게 해야하는지는 불분명해서 학교마다 편차가 크다. 심지어 동일한 인증평가 보고서에서도 어떤 SSC가 바람직한가에 대해서 견해 차이가 관찰된다.

All UK medical schools have now developed and embedded SSCs into their curricula in a wide variety of different ways, sometimes exhibiting great innovation, and creating diversity between schools (Christopher et al. 2002). These were stated aims in Tomorrow's Doctors (GMC 1993) to reflect the autonomy and the particular strengths of their school and help to define their own programme. Nevertheless, the lack of guidance in Tomorrow's Doctors (GMC 1993), and what represented independent evolution of SSCs in different schools, has resulted in some SSC programmes upon accreditation being deemed unsatisfactory to varying degrees in their learning objectives, content, student choice, timetabling and assessment (Christopher et al. 2002; Murdoch-Eaton et al. 2004; GMC 2009). The lack of guidance is also reflected in those same GMC accreditation reports, which have a somewhat inconsistent view on what is acceptable as an SSC (Ellershaw et al. 2007).


학부 교육과정 시간표상에서 SSC는 지속적으로 필수 교육과정으로부터 축소 압박을 받고 있고, 이는 4년제 교육과정에서 특히 두드러진다. TD3에서 25~33%의 시간을 할애해야 한다는 요건은 사라졌으며, 어떻게 SSC가 교육과정에 통합되어야 하는가가 재평가받고 있다.

In undergraduate curricula timetables, SSCs are continually under pressure from the requirements of the core curriculum, especially in shorter medical courses such as four year graduate entry programmes. In the GMC document Strategic Options for Undergraduate Medical Education (GMC 2006) ‘Most respondents did not support increasing SSCs in the curricula’. In the subsequent consultation document of Tomorrow's Doctors 3 (GMC 2008) the previous requirement of 25–33% of the curriculum time dedicated to SSCs is absent, and how SSCs integrate into the curriculum is still being reappraised.


이러한 초창기의 문제에도 불구하고 SSC 과정과 전체 교육프로그램의 개발에 대한 가이드가 만들어지고 있다.

Despite these initial problems, help to develop SSC courses and whole programmes has begun to emerge, particularly from consortia of schools who have offered definitions of purpose and assessable key tasks (Murdoch-Eaton et al. 2004; Stark et al. 2005; Ellershaw et al. 2007; Scottish Doctor 2007; Riley et al. 2008a). Nevertheless, the long term outcomes and educational benefits of this GMC initiative to implement SSCs, although based on sound educational principles, have not been examined in any depth. The evidence of their contribution to improving medical education remains to be determined.



SSCs in an international context


세계적으로 국가에 따라서 학생들이 교육과정 모델에 의해서 제약을 받기도 한다. 일부에서는 UK와 같은 scheme을 도입하려는 정도의 열정을 보이지는 않기도 하지만, 전 세계적으로 교육과정 설계를 위한 국제적 협력을 하여 선택 요소를 늘리려는 방향으로 가고 있는 것은 확실하다. 미국에서도 의과대학 교육에 관하여 변화를 요구하며 프로페셔널리즘을 강조해야 한다는 우려가 있었다. 그러나 여전히 교육과정을 변화시키는 가장 좋은 방법에 대한 논란이 있으며, 일부는 '근본적 재구조화'를 그 대안으로 제시한다.

By and large the underlying reasoning behind the implementation of SSCs by the GMC in the UK is the same as that in the medical education community world-wide. Namely, that there is information overload, lack of choice and intellectual challenge, and a need for learner-centred curricula; all reflecting the need to prepare medical graduates to be adaptable to profound change which is present in the modern medical profession. Elsewhere in the world student choice can be more constrained depending upon their curricula models. Nevertheless, there is a worldwide trend towards courses with a significant choice component (Karle 2004) and in international collaborations on curriculum design (Harden & Hart 2002), although there does not seem to be the same enthusiasm to implement schemes with anything like the timetable commitment as in the UK. In the USA, similar concerns have been expressed about the delivery of medical teaching, extending back nearly a century (Christakis 1995), with demands for a change of culture (Brater et al. 2007) and an increase in professionalism (Humphrey et al. 2007). There remains a major debate on the best way forward to change the curriculum, with some suggesting “fundamental restructuring” (Cooke et al. 2006). The appropriateness and validity of an SSC type format should be carefully considered in these consultations.


WFME는 의학교육의 스탠다드를 제시하며 '선택 영역'을 교육과정 설계시 중요한 요소로 다룬 바 있다. 

The World Federation for Medical Education (WFME) has produced its document ‘Global Standards for Quality Improvement’ (WFME 2003), which identifies ‘optional content’ as an important component of curriculum design. In conjunction with the Association for Medical Education in Europe (AMEE), WFME has commented on medical education and the Bologna Process, which aims for convergence on an accepted set of standards throughout tertiary education across Europe (WFME 2005). This document also acknowledges the importance of establishing not only pan-European, but also global standards. In this regard, a major project undertaken by the Institute for International Medical Education identified seven domains that make up the ‘Global Minimum Essential Requirements’ for a physician (Schwarz & Wojtczak 2002). As will be discussed in this guide, at least four of these seven domains can in part be achieved through SSCs: 

  • ‘Critical Thinking and Research’, 
  • ‘Professional Values, Attitudes, Behaviour and Ethics’, 
  • ‘Communication Skills’, and 
  • ‘Management of Information’. 


SSCs can potentially contribute to the other domains, namely, 

  • ‘Scientific Foundation of Medicine’, and perhaps even include 
  • ‘Population Health and Health Systems’ and 
  • ‘Clinical Skills’, depending upon the design and content of the SSC programme.



Defining the ‘purpose’ of an SSC programme



SSC를 필수 교육과정과 통합하기

Integrating SSCs with the core curriculum

선택과정은 필수 교육과정과의 관계에서 다음과 같은 목적을 가지고 있을 수 있다.

Much care and consideration is required to create a learner-centred educational environment for medicine (Ludmerer 2004; Graffam 2007), and incorporation of an SSC programme should assist in this. In a well designed medical curriculum providing a learning environment which recognises learner autonomy, and delivers its teaching through a timetable with both core and significant student choice elements, an SSC programme needs to be well integrated, with clear purpose. In the UK, where SSCs are required to have a 25–33% curriculum time commitment, most schools have implemented purposeful SSC programmes. These permit choice but also perhaps reflect the strengths and distinctive qualities of their own programmes (GMC 2003). The purpose may be closely linked to complement the core curriculum by 

  • providing opportunities to gain a greater depth of clinical skill, insight and knowledge of specialties, which create choice in career exploration. In addition, or alternatively, it may...
  • provide a theme outside core or even an opportunity outside medicine, permitting students to explore broad and rich external interests (GMC 2003, 2008; Riley et al. 2008a).


Scottish Medical Schools SSC Liaison Group은 SSC의 목적에 대해서 Box 1과 같이 기술한 바 있다.

The Scottish Medical Schools SSC Liaison Group has developed a consensus statement on the purpose of SSC programmes in Scottish Medical Schools, as stated in Box 1. The group is made up of the Directors of SSCs from all of the Scottish Medical Schools. They all represent undergraduate medical programmes and consist of a preclinical school, a school with an integrated problem-based learning curriculum, and three other schools which use a hybrid range of curriculum teaching and learning methodologies. Despite these differences, there is good consensus in their courses, reflecting the purpose specified in the GMC guidelines in Tomorrow's Doctors 2 (GMC 2003) and by Murdoch-Eaton et al. (2004).





Deriving purpose from learning outcomes



SSC에서 다루는 필수 역량

Core learning is delivered by SSCs

SSC의 목적은 구체적인 학습성과를 도달하는 것이다. 이들 학습성과는 순차적이고, 점진적이며, 핵심내용의 성과와 통합되어야 하며, 교수와 학생 모두에게 명확해야 한다. 명확성과 연속성은 교육과정 mapping을 통해서 이뤄질 수 있다. 통합된, 상보적인, 일관된 프로그램을 만들기 위해서는 교육과정 성과와 평가의 alignment가 중요하다. 

The purpose of an SSC programme is to achieve specific learning outcomes. These learning outcomes should be sequential, progressive and integrated with the outcomes derived from core teaching, with clarity to both students and staff (Hirsh et al. 2007). This clarity and continuity can be demonstrated by curriculum mapping the learning outcomes of both core and SSC programmes (Harden 2001; Prideaux 2003; Willett 2008), although mapping some of the more generic skills represents a different type of challenge (Robley et al. 2005). Alignment of curriculum outcomes with assessment is a critical step to create an integrated, complementary and coherent programme, which should result in an educationally stimulating and successful course (Harden 2001; Willett 2008).


질문을 확장시키면 다음과 같다. "SSC에서 필수 학습성과를 다루어야 하는가?" TD3에 따르자면 아마 이 답은 '그렇다'라고 해야 할 것이다. SSC에서 다루는 것 중 '필수'라고 할 수 없는 유일한 부분은 의학을 벗어나는 범위의 지식일텐데, 이 조차도 이들 내용에서 학습한 일반적인 기술들이 많은 경우 '전이가능'하기 때문에 core라고 할 수 있다.

The question now arises as curricula develop and expand: “should SSCs deliver core learning outcomes?” The answer appears to be a resounding ‘yes’, when judged in the consultation document for Tomorrow's Doctors 3 (GMC 2008). Delivery of core learning outcomes by SSCs has become much better defined since 1993, and this can be best represented in Figure 1, which shows that now in 2009, most of the learning outcomes attained during SSCs are defined as core. The only learning outside core may be knowledge outside medicine, although generic skills developed in this type of learning environment are usually transferable, and hence defined as core.






모든 학생이 모든 핵심성과에 도달해야 한다. 

All students should achieve all core outcomes

SSC에서 필수 지식/술기/태도를 다룬다고 한다면, 어떤 과목을 선택하든 불이익이 없도록 각 SSC에서 동등한 기회가 주어져야 할 것이다. 이를 위해서는 아마도 광범위한 일반적인 인성적 전문직업적 기술이 SSC에서 다뤄져야 할 것이며, 필수 내용과 어느 정도 상보적 관계에 있어야 할 것이다. 이 관계가 Fig2에 나와 있다. 일부 학생들만 특정 SSC에 대한 기회가 주어진다면 그것은 문제일 것이므로 필수 성과와 선택 사이에 균형을 잘 맞춰야 한다. 그러나 모든 필수 임상술기를 SSC에 의해서 다룰 수는 없는데, 왜냐하면 이것은 '진정한'선택권을 주는 것이 아니며, 균질하지 못한 SSC의 특성상 내용 전달의 일관성이나 모든 학생이 그 목표에 도달했다는 것이 불가능할 수 있다.

If core knowledge, skills and attitudes are delivered in SSCs, then there needs to be equal opportunity in each SSC, so students are not disadvantaged depending upon which SSC they take. It is now perhaps appropriate that a wide range of generic personal and professional skills are delivered in SSCs, and this will to a greater or lesser degree complement the core teaching. This accumulation of overall generic learning outcomes is illustrated in Figure 2. It is also appropriate that SSCs permit the further development of extra clinically-based skills and knowledge that represent greater depth of learning beyond the basic core requirements, reflecting student choice and interests. A problem in the delivery of the curriculum exists if only some students get the opportunity to achieve this outcome whilst others miss out (Bidwai 2001). This requires achieving a balance between core learning outcomes and providing student choice. It is unlikely that all core clinical skills could be delivered by SSCs, firstly, because this cannot then be defined as providing true choice, and secondly, because consistency of delivery and attainment by all students cannot be assured due to the heterogeneity of SSCs (GMC 2003; Murdoch-Eaton et al. 2004).





개별 학생이 자신의 진전과 성취, 학습목표에 대해서 스스로 mapping하는 것이 필요할 것이며, 이 자체가 중요한 학습 성과이기도 하다. 완전히 통합된 교육과정이라면 단 하나의 SSC를 놓치는 것조차 부적절한데, 왜냐면 이로 인해 필수 역량을 배양하지 못했을 것이기 때문이다.

It may be appropriate for individual students to take responsibility for mapping the attainment of their own progress, achievements and learning outcomes throughout the programme (Riley et al. 2009), forming an important learning outcome in itself. SSCs are sometimes seen as an opportunity for a student to catch up, for instance if they missed part of the course for health reasons, or to take remedial teaching after failing a clinical attachment. In a fully integrated programme, missing an SSC should be regarded as inappropriate because the student may not attain core skills and competencies.



SSC학습성과 통합하기 

Integrating SSC learning outcomes

새로운 교육과정 혹은 교육과정 변화를 겪는 것은 완전히 통합된 SSC를 만들 수 있는 좋은 기회이다. 그러나 SSC를 도입하려는 시도는 상당한 저항에 부딪칠 수 있는데, 그럼에도 불구하고 보건의료가 급격한 주변 환경의 변화를 겪는 것을 고려한다면, 교육과정 내에서 SSC는 그러한 변화에 빠르게 대응할 수 있을 뿐 아니라, 향후 필수 내용으로 다뤄질 것을 선도적으로 다루는 역할도 가능하다. 

A new programme, or an existing programme undergoing profound curriculum change represents an obvious opportunity to construct from first principles a course with fully integrated SSCs. The conceptual framework of a learner-centred curriculum, particularly with the integration between core and SSCs, needs careful planning, and its implementation is dependent upon the course delivery, the medical environment where the teaching is delivered, its assessment methodology and the course ethos or theme (Harden 2000; Davis & Harden 2003; Ludmerer 2004). Succeeding in curricular change has been well described elsewhere (Bland et al. 2000; Cooke et al. 2006; Davis & Harden 2003), and the implementation of SSCs as a significant proportion of a whole programme, perhaps being regarded as a more radical feature, may encounter some resistance and hostility during these changes. Nevertheless, with health care being such a rapidly changing environment, SSCs may be an area in the curriculum that can readily respond to these changes, and indeed act as a forerunner to what may subsequently become core.


SSC프로그램의 목적과 어떻게 필수내용과 통합될 것인가는 다음의 것들에 달려있다.

The purpose of the SSC programme, and how it integrates with the core will be influenced by:


  • Theme of the Medical School – a medical school may have a particular reputation, ethos, approach, and range of clinical or research expertise. This can provide a theme for the SSC programme, which creates these distinct qualities in the medical graduate.
  • Type of programme – Most UK medical schools have a five year undergraduate medical programme, although with the newly developing graduate-entry programmes, some of these are over four years. Some UK schools also still create opportunity for an integrated year leading to either an honours programme or the attainment of an extra degree, and a preclinical school should liaise with their linked clinical schools to share learning objectives to prepare its students. Many international schools take six years to graduate their medical students, usually because of a first year course that equips the student in the basic sciences needed for medicine. Hence, there are many variations on a theme. Inclusion of an SSC programme into these variations needs careful planning with clarity of purpose for the various components. Intercalated years were often designed to give the student experience, depth and breadth of an interest, the very purpose of an SSC. Graduate entry courses expect students to have successfully established adult learning and transferable professional skills in other environments (Macpherson & Kenny 2008), some of the competencies that SSC are designed for. Providing challenging opportunities, ensuring that all students achieve all the learning outcomes, whilst avoiding any duplication so students do not become disengaged by repetition are essential to ensure full integration.
  • Type of curriculum – It is true to say that there is now greater thought given to the way curricula are structured and delivered. Curricula may be based on learning outcomes (Harden 2007), or competencies (Scottish Doctor 2008), but whatever way they are designed and delivered they must also be matched by appropriate assessment (Schuwirth 2007). What is not clear is the way that SSCs influence curricula delivery or how the teaching methodology adopted by the curriculum influences the SSCs. Many SSCs are designed to develop student-centred learning, which represents a quality measure of a Problem Based Learning type course. Despite schools becoming more integrated, many still believe in a pre-clinical, clinical divide. Should SSCs be pre-clinical and clinical and can we assure that there is effective communication of purpose between the two components?


SSC와 필수 사이의 균형 

Balance between SSCs and core (“core and options”)

필수와 SSC사이의 주된 갈등요인을 Box 2에 정리하였다. 필수적 'generic' 기술 등도 어떤 선택을 하든 학생들에게 같은 내용을 일관되게 가르칠 수만 있다면 SSC로 다룰 수도 있다.

There is a balance to be struck between what are presented as core learning outcomes in the main part of the curriculum and what core learning outcomes are derived from SSCs. Some of the main tensions between core and SSCs are highlighted in Box 2. Core ‘generic’ professional skills, for instance teamwork and critical appraisal, can reliably be delivered in an environment of choice, as long as there is consistency between the choices and each is capable of delivering the same to each student. It is also important to define a list of optional competencies or learning outcomes, beyond the core, which may be defined or shaped strategically by the students themselves, and may even be aspirational, to reflect their personal interests, motivation and career plans.





프로페셔널리즘을 학습성과로 다루기 

Delivering “professionalism” as a learning outcome

Designing a curriculum and creating a learning environment to deliver professionalism can be regarded as a complex and significant challenge, but a well designed and integrated SSC programme can make an important contribution to this. Accreditation agencies have defined the professional standards required in medical graduates and which patients expect (Royal College of Physicians 2005; Medical Schools Council & GMC 2009). However, there is still concern from patients that medical education has not responded sufficiently (Hasman et al. 2006). The appropriate learning environment to ensure professional development has always been the subject of debate, although over the last decade it has become more clearly defined (Cruess & Cruess 2006; Stern & Papadakis 2006; Hilton & Southgate 2007), so that it can be incorporated (Gordon 2003) and mapped into the curriculum (Humphrey et al. 2007), whilst recognising the influence of personal factors (West & Shanafelt 2007). Our understanding of professionalism and its supportive literature is guided by both sociology and bioethics, and the two have developed somewhat independently. A more inclusive and integrated blended approach is probably the most appropriate (Cruess & Cruess 2008), which will also suit students with a wide range of learning styles and background experiences.


전문직업성과 관련된 기술을 가르쳐야 한다는 것에 대한 요구는 높지만, 어떻게 이들 기술을 장려할 수 있는지 (특히 SSC에서)는 잘 다뤄진 바가 없다. 이는 일부분 '비공식 교육과정'이 '공식 교육과정'으로 침투해 들어온다는 것을 반영하는 것이기도 하다. '비공식 교육과정'은 장점과 단점이 있다.

There is much extolling the need to develop professional skills, but much less evidence in the literature indicating how development of these skills can be encouraged, particularly in SSCs. This in part reflects the relatively recent migration of professionalism from the informal or ‘hidden’ curriculum into the declared curriculum (Hafferty 1998; Jha et al. 2002; Whittle and Murdoch-Eaton 2002; Jha et al. 2007). There are both positive and negative influences within the hidden curriculum. 

  • Positive influences include opportunistic encounters with excellent role models and mentors and teachers. 
  • Negative influences include potential erosion of previously taught ideals, created by inappropriate communication, teaching of poor standard or even teaching by humiliation, and attempts to apply ethical principles in an increasingly demanding and busy target-led working environment (Hafferty 1998; Lempp & Seale 2004). 

SSC는 감독관/멘토/동료들과 오랜 시간 관계를 유지하고 상호작용할 수 있는 기회이다. 적절하게 잘 훈련받은 교수자를 선정하는 것이 중요한 우선순위이다.

SSCs can present opportunities for longer attachments and interactions with a supervisor, mentor and indeed peers when working within a team, to develop as well as assess these professional competencies as a formal part of the curriculum. The selection of appropriately trained faculty to both lead and participate as teachers in these SSCs is of high priority.



Providing choice and depth of study

학생의 선택권에 대해서 상당히 중요한 문제가 있는데, 과연 의학을 벗어난 영역의 선택권도 줘야 할 것인가 또는 학생들이 스스로 교수의 간섭 없이 선택할 수 있게 해야 할 것인가 등의 문제이다. SMSSSCLG의 경험에 따르면, 대부분의 학생들은 다양한 경험을 하고 싶어하지만 일부 학생은 자신들의 선택권을 제한하고 싶어한다. 

The ability to provide a wide choice of subjects and opportunity to study a particular area of interest in depth are key requirements of an effective SSC (GMC 2003). Of these, provision of choice can raise quite significant challenges. For instance, should these choices include topics outside of medicine or should students be able to choose their own topic, without faculty interference? The experiences from the Scottish Medical Schools SSC Liaison Group (personal communication), are that most students wish to sample a range of experiences, although others may want to limit their choices for other reasons. 

  • A highly focused and purposeful student may have a clear commitment to a specialty and a desire to gain as much experience in that field as possible, although their rationale may actually be unfounded and inadvertently biased. 
  • In contrast, a poorly motivated or weak student may not want to leave the familiarity of a clearly defined curriculum, and face challenge. 


Student autonomy, engagement, and development of mature learning are key qualities which modern medical curricula are expected to develop, and these should be expressed as explicit learning outcomes.



선택권 제공 Provision of Choice – 

80~90%의 학생이 경험을 한다는 것은 여전히 10~20%의 학생이 필수 학습성과에 도달하지 못한다는 의미이기도 하며, 이들은 다른 필수 교육과정에서도 이를 배우지 못할 수 있다. SSC를 통해서 모든 학생들이 필수 학습성과를 달성하게 하기 위해서는 상당한 고려가 필요하다. 각각의 SSC는 다음과 같은 요건을 갖추어야 한다.

Choice can or should only exist if it supports the attainment of appropriate learning outcomes; appropriate within a wide range of options as defined by an individual school. The essential competencies may not be satisfactorily achieved or uniformly delivered to all students if the learning experiences are too varied, with too much choice and variation in teaching methodology. This has been highlighted by Stark et al. (2005), who, in their consortium of SSC programmes, have determined that 80–90% of their students gain real research skills and experience. However, this also means that 10–20% of students have not attained what they describe as a core learning outcome, which they may be unlikely to achieve elsewhere in the curriculum. Designing a series of SSCs within a programme so that all students achieve all the core learning outcomes has to be carefully considered. It can be achieved by each SSC:


    a. having some design and format constraints to ensure key learning outcomes are achieved;
    b. a clear list of learning outcomes to be achieved by each student over a period of time, where each student has to manage their own programme of learning, ensuring that they achieve the required outcomes over the course of the programme;
    c. having a variation of (b) above, whereby students have to attend a specified number of defined SSCs with specified learning outcomes.

However, the management of (b) and (c) may be difficult as students will have increasing incremental or hierarchical levels of skill development as they pass through the course, which will affect the experience, the level of learning, and overall skills and outcomes attained.


선택권 - SSC는 Generalist를 만드는가 Specialist를 만드는가? Choice – do SSCs create generalists or specialists? – 

초기 단계에서 너무 많은 선택권을 주는 것은 부적절할 수 있다. 그 시기에는 TV의 의학드라마 정도가 학생들이 가지고 있는 배경지식이기 때문이다. 그러나 학년이 올라가면서 경험과 통찰이 쌓이게 된다. SMSSSCLG에 따르면 약 절반의 학생이 나중에 일차의료를 하게 됨에도 극히 일부 학생들만이 1학년에 general practice를 선택하였다.

It still remains to be ascertained whether SSCs encourage students to become generalists or specialists, or affect their future career aspirations. Indeed, too much choice at particular stages of the curriculum may cause problems through confusion. In early stages of the course too much choice may be inappropriate because students want to do something perceived as exciting, without having a foundation of true understanding that accompanies choice. Television hospital dramas are a fertile ground on which to base preconceived ideas in first year students! In later years, students should have greater insight and experience. In the experience of the Scottish Medical Schools SSC Liaison Group, (personal communication) very few students select general practice in their first year, although around half of students will enter primary care as their final career. It can of course be argued equally strongly that students need these early formative experiences to gain that insight, but the correct balance needs to be sought.


Selection from a list or self-proposal of projects? – 

A common problem with giving students a choice is that students who are not allocated their first preference may become unhappy or disengaged.

The assignment of students to their chosen project often presents a major problem: students understandably want their first choice, and may become unhappy and disengaged with the process if they do not receive it. Allocation systems that allow at least 90% of the student cohort to achieve their first choice, perhaps with guaranteed preference for first choice in the next SSC for those who miss out, together with an open understanding of why SSCs are constructed and of their value, can overcome some of these issues.
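A minimal sketch of one possible allocation approach follows, assuming ranked preferences and capacity-limited projects. The project names, student labels and the greedy strategy are illustrative assumptions only, not the system used by any particular school.

```python
import random

# Hypothetical data: each project has a capacity and each student ranks the projects.
capacities = {"Dermatology": 2, "Tropical diseases": 1, "Medical journalism": 2}
preferences = {
    "Student A": ["Dermatology", "Tropical diseases", "Medical journalism"],
    "Student B": ["Dermatology", "Medical journalism", "Tropical diseases"],
    "Student C": ["Tropical diseases", "Dermatology", "Medical journalism"],
    "Student D": ["Tropical diseases", "Medical journalism", "Dermatology"],
}

def allocate(preferences, capacities, priority=()):
    """Greedy allocation: students carrying priority from a previous round are placed
    first, then the remainder in random order; each student receives their
    highest-ranked project that still has a free place."""
    remaining = dict(capacities)
    others = [s for s in preferences if s not in priority]
    random.shuffle(others)
    allocation, missed_first_choice = {}, []
    for student in list(priority) + others:
        for project in preferences[student]:
            if remaining[project] > 0:
                allocation[student] = project
                remaining[project] -= 1
                if project != preferences[student][0]:
                    missed_first_choice.append(student)
                break
    return allocation, missed_first_choice

allocation, carry_over = allocate(preferences, capacities)
print(allocation)
# Students who missed their first choice can be given priority in the next round:
# allocate(next_preferences, next_capacities, priority=carry_over)
```

Monitoring the proportion of the cohort placed in their first choice against the 90% figure mentioned above is then straightforward.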


Having students submit their own proposals can be more demanding for faculty, but it is not without precedent.

Permitting self-proposal of SSCs by the student may be regarded by faculty as being more challenging (Riley et al. 2008a). However, they are not without precedent, as medical electives are a well established self-proposed opportunity. Students need to be instructed to start out with clear learning outcomes which, with further development and modification and with appropriate academic support, can be realistically achieved.


At the University of Edinburgh, UK, all SSCs from the middle of year two onwards are proposed by the students themselves.

At the University of Edinburgh, UK, all SSCs from the middle of year two onwards are self-proposed by students. The student is required to take the initiative, decide on their field of interest and make contact with and sign up to an appropriate staff tutor, who will facilitate their work (University of Edinburgh, MB ChB overview 2009). Students receive support and guidance to find an appropriate project or attachment. They are given contact details of potential tutors, supplied with clear advice and information about what is expected of them and their tutor, the importance of defining clear learning outcomes and how they may go about achieving them.


Depth of study and increased student involvement – 

To achieve the outcome proposed in Tomorrow's Doctors, 'depth' needs to be well defined, and it must be realistic and achievable. Even so, an overall learning outcome stating 'depth of study' is rather vague and, beyond signalling an aspiration, not very helpful. The purpose behind a student's choice should also be explored: whether it is made to explore a future career, out of interest and enthusiasm, or to address a perceived weakness.

Tomorrow's Doctors (GMC 2003) indicated that students should study a topic in depth to create a more stimulating environment which enables students to develop self-directed learning skills. To achieve this outcome, depth needs to be well defined, as well as being realistic and achievable. Nevertheless, an overall learning outcome indicating 'depth of study' is vague and unhelpful except to indicate the aspiration held. It should be further elaborated upon for each SSC, and can indeed be developed by the students themselves. It may be sub-divided into core objectives as well as more specific non-core objectives, which may reside within any of the educational domains. The purpose behind the student's selection should also be explored, whether it is based upon future career identification, interest and enthusiasm, or improving on a perceived weakness.



Types of individual SSCs: an opportunity for innovation



There are a range of different formats that can be developed to build up an SSC programme, some highly innovative and with a wide range of themes and topics. Box 3 identifies many of these possible themes, and some of the different format options are listed below:


    • Clinical attachment to study a subject, over longer time and in more depth – this is perhaps the more familiar format, in which most clinical departments can quite readily develop an appropriate attachment. These experiences can offer extra clinical skills and more complex clinical scenarios. They present opportunities in mainstream specialties as well as in specialties that are represented to a lesser depth in the core curriculum. They can also provide opportunities for a specialty where core teaching and content is delivered in a systems-based, integrated curriculum, so that the range and boundaries of the specialty become blurred. Examples of less well represented specialties depend somewhat on curriculum delivery, or on whether the specialty lies largely outside the core, but may include clinical genetics, radiology, ENT, plastic surgery, dermatology, ophthalmology, clinical microbiology, biochemistry, pathology, psychiatry and tropical diseases.
    • Elective attachments –
      In the past, the educational value of elective attachments, often abroad, has very much been based on opportunistic experiences. The validity of this approach has been questioned (Dowell & Merrylees 2009). Learning outcomes for an educationally beneficial elective attachment should be clearly specified and appropriately assessed to ensure they are both achievable and attained. Perhaps by linking to selected international institutions to help ensure quality of teaching and assessment, electives can become an important, well-recognised and accepted activity that resides within the remit of the SSC programme.
    • Research project – Students can be involved in a range of research, audit, or blended research and audit projects. There is good evidence that these form an optimal environment for students to develop a wide range of research skills that enhance graduate attributes and professionalism (Kanna et al. 2005; Jenkins et al. 2007; Macpherson & Kenny 2008; Struthers et al. 2008). These types of projects must present a stimulating and enjoyable opportunity for both student and supervisor if they are to be sustainable. They should be well resourced and include effective support for study and questionnaire design and applied statistics (MacDougall 2008). The development of library and literature search skills, including the ability to comprehend and indeed perform systematic reviews, often features highly in such SSCs. Support for appropriate ethical review and approval may be required, with careful consideration of how faculty screen large numbers of projects with a significant educational component. Unless a relatively simple and rapid ethical review process is in place, the ability to offer research projects may be significantly constrained (Robinson et al. 2007). These types of project may also offer a significant contribution to the institution's research capacity and productivity, including publications, or pilot findings for further study or for future grant applications. They may also deliver useful audits that influence local care provision. Virtually all projects will provide some clearly defined and valued outcomes for the student. This may include being able to demonstrate attainment of specific skills when applying for their first postgraduate training post or subspecialty training, which is important in the UK with the erosion of opportunities for career exploration in early postgraduate training.


A fundamental requirement to ensure success and sustainability of this type of SSC is to ensure good alignment of outcomes for student and supervisor, where both have ownership and are motivated. The fourth-year, 14-week self-proposed research project SSC at the University of Edinburgh has proved very successful and sustainable in this way. Over the last five years it has contributed a large number of useful audits and more than 200 conference abstracts, peer-reviewed systematic reviews or research articles on which the student involved is included as an author (University of Edinburgh, MB ChB website).


    • Skills-based project – This may take the form of a short-term attachment to develop specific clinical or research skills. These can be quite innovative; for instance, Queen's University, Belfast, offers an opportunity to learn “signing for the deaf” (Queen's University website 2008). SSCs can also provide an opportunity to fulfil the GMC requirement stated in Tomorrow's Doctors (GMC 2003) that students should develop teaching skills. This can be peer teaching within the curriculum on a topic of choice (Ross & Cameron 2007; Sobral 2008), or externally, in primary (Brown 2005) or secondary schools (Furmedge 2008), for instance delivering a sex education programme (Jobanputra et al. 1999).


SSCs can also be used for in-depth scientific skills and principles that help students understand the modern science underpinning medicine.

A further example may be to use SSCs so students can gain in-depth scientific skills and principles that are functionally useful for understanding the modern science underpinning medicine, rather than a superficial knowledge. These principles, for instance in molecular and cellular biology, systems biology, genetics and public health can be delivered as SSCs from a menu of in-depth topics, where each student has to manage their own learning portfolio, which can then be applied between fields and specialties.


    • Providing wider insight into medicine and the care team – This may be arranged by providing a choice of project themes or topics within the medical humanities and medical ethics. A third of North American schools have this sort of programme (Charon et al. 1995; Downie et al. 1997; Hodgson & Smart 1998; Charon 2001). Teamwork interactions with other health professionals, interdisciplinary working and awareness of the extended health care team provide other essential key professional skills that can be developed. In the third-year SSC at the University of Edinburgh, students self-propose and organise a short attachment to shadow a member of the care team who is not a doctor. This SSC is described in more detail in Box 4 (available at www.medicalteacher.org).
    • Experience outside the field of medicine – The existing GMC guidelines have indicated that quite significant amounts of time may be spent on study outside medicine (GMC 2003). Evidence from the literature is scarce, although many schools do provide at least limited opportunities, reflecting the time restrictions in most curricula. Nevertheless, some students may not want, or may not regard as worthwhile, this broadened perspective. With careful design and consideration, these types of projects can contribute to developing a range of relevant core professional skills (Murdoch-Eaton & Jolly 2000). These SSCs may be wide ranging, from archaeology to zoology. There are many examples in different schools, through the arts (Lazarus & Rosslyn 2003), sciences (Macpherson & Kenny 2008), journalism (Gibson 2006), language, literature and creative writing (Thomas 2006). If the opportunity is presented to a student or a group of students to work outside medicine, the options are potentially limitless. Many topics in these fields have been addressed in a second-year SSC at the University of Edinburgh, where students self-select their topic and, as a small group, sign up their own facilitator, producing a project report as a wiki (Riley et al. 2008b). This type of SSC can provide an appropriate break and a different challenge and, even for the few students who may be concerned that they are on the wrong course heading into the wrong profession, a chance to explore other courses, fields or professions.


SSCs should not be overambitious, nor should they demand too high a skill base; equally, they should recognise and build on prior learning. SSCs are enormously varied.

This outlines some of the options that can be considered, adapted and developed as SSC courses. There also needs to be a pragmatic recognition by staff at the development stage of what should or could be achieved by students. SSCs should not be overambitious or anticipate too high a skill base, and similarly they should recognise and build on existing prior learning. There is tremendous variety throughout SSCs (Heylings 1998), with opportunities for innovation in topic, delivery, including virtual learning environments and collaborative e-systems (Sandars 2006), and also in the way assessments are designed to ensure learning outcomes are achieved successfully.



Structure and timetable: integration of SSCs with core


SSCs may be spread evenly across the whole curriculum, or clustered for educational reasons (for example, to allow particular skills to be acquired at the ideal time, or to support informed career choice). Consider the following.

Timetabling of any SSC programme needs to be considered within the curriculum as a whole, recognising the individual subject outcomes and the overall learning outcomes and where these can be optimally developed to ensure progressive attainment for the student. There is often some fragmentation across a curriculum, with students attending short attachments in a wide range of different specialties. It should be considered whether SSCs are spread evenly throughout the curriculum, or whether there are educational reasons to cluster SSCs, perhaps to permit certain skills to be developed at an optimal time, or to facilitate informed career choice. The following should be considered when timetabling the curriculum:


  • The skills gained and outcomes achieved should be mapped temporally so that they are both achievable and complement the development of core learning in the student – e.g. an exposure to gynaecological skills prior to any gynaecology teaching is unstructured, unachievable and probably counter-productive (a minimal sketch of such a mapping check follows this list).
  • Learning outcomes and skills should be incrementally challenging and build upon existing skills. They should not be repetitive by returning to basic levels, nor should they assume a level that has not yet been attained. This is exemplified by the spiral curriculum model (Harden and Stamper 1999).
  • The SSC assigned time needs to be protected and equally valued by all, and not be interrupted by necessary attendance at ‘core’ teaching activities.
  • SSCs should not over-duplicate other opportunities, for instance professional and personal development, teamwork, integration between experiences, or mentoring.
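The sketch below illustrates the temporal mapping described in the first point above. The SSC names, core teaching blocks, term numbers and prerequisite links are invented for illustration; the idea is simply that an SSC should not be timetabled before the core teaching it builds on.

```python
# Hypothetical curriculum map: which core teaching block each SSC depends on,
# and the term in which each block or SSC is scheduled.
core_teaching_term = {"gynaecology block": 4, "cardiology block": 2}
ssc_schedule = {
    "Gynaecological skills SSC": {"term": 3, "requires": ["gynaecology block"]},
    "Echocardiography SSC": {"term": 5, "requires": ["cardiology block"]},
}

def sequencing_problems(ssc_schedule, core_teaching_term):
    """Flag SSCs timetabled at or before the core teaching they build on."""
    problems = []
    for ssc, details in ssc_schedule.items():
        for block in details["requires"]:
            if details["term"] <= core_teaching_term[block]:
                problems.append(f"{ssc} (term {details['term']}) precedes {block} "
                                f"(term {core_teaching_term[block]})")
    return problems

print(sequencing_problems(ssc_schedule, core_teaching_term))
# ['Gynaecological skills SSC (term 3) precedes gynaecology block (term 4)']
```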


The structure of individual SSCs can vary widely.

It is essential to bear in mind the structure of individual SSC courses which can be highly varied and include:


  • Large or small groups with students working on individual aspects of a defined activity
  • Students working collaboratively as a team on a defined project
  • Individual students working solo on a project


The staff roles within SSCs are important and must be clearly defined.

The amount of staff input and the staff roles in SSCs are also important and have to be clearly defined. Staff may act as tutors, facilitators, supervisors or mentors. They may communicate with students partially or entirely in the clinical setting, through group meetings or tutorials at different frequencies, or online in a virtual learning environment.


Various timetabling options can be considered.

To optimize the learning opportunities different timetable options should also be considered, which others have previously described in detail (Harden & Davis 1995; Hirsh et al. 2007):


  • Embedded with other teaching – e.g. “long and thin” – one day per week over a prolonged period of time
  • Embedded within a specific specialty
  • Using a modular structure, intermittent or sequential
  • In a local environment or at an associated peripheral attachment, or away at another institution, including remote or abroad



Further developments: how should SSCs continue to develop?



Is the SSC experiment in medical curriculum design working? There remain several unanswered questions about the outcomes resulting from integration of SSCs into medical curricula. Some of these questions and areas for future research on the roles of SSCs, their delivery and effectiveness are highlighted in Box 7.




The recent Tomorrow's Doctors 3 consultation document omits the earlier requirement that 25-33% of the timetable be devoted to SSCs; this had been criticised as unrealistic, and around 20% of curriculum time is regarded as more sustainable.

Tomorrow's Doctors (GMC 1993) introduced the SSC element to the curriculum, and the updated Tomorrow's Doctors 3 (GMC 2008), currently published as a consultation document, leaves the discussion on their future open. This new version creates an opportunity for schools to concentrate on SSCs delivering a coherent, integrated and focussed learning programme that complements the whole curriculum, capable of delivering agreed core learning outcomes as well as creating optional learning activities for students. The consultation document has omitted the previous requirement for 25-33% of the timetable to be dedicated to SSCs. This original time requirement has been regarded as unrealistic (Ellershaw et al. 2007), and around 20% of curriculum time may be a more sustainable commitment for present-day SSCs. The response from medical schools as to how they will use SSCs in the future, and whether there is an inexorable return to a situation whereby core learning outcomes occupy most if not all of SSC time, remains to be seen. Monitoring of these facets of SSCs is important for the future, and specific timetabling and curriculum mapping of learning outcomes, and of where they are to be achieved, is essential.


SSCs offer a proving ground within the medical curriculum for new, sometimes radical ideas, such as new methods of assessing clinical skills or professionalism.

SSCs offer an ideal proving ground within the medical curriculum for new, even somewhat radical ideas, including newer assessment methodologies to assess professionalism and / or clinical skills. To continue to improve SSCs, there is a necessity to have a more coherent and regular dialogue between medical schools, national and international, and with regulatory bodies to identify and share good practice and research opportunities.


SSCs have evolved since their inception, yet the problem of content overload in medical curricula remains.

Since the inception of SSCs, their purpose and outcomes can be described as having undergone something of an evolutionary process, responding to pressures from medical educators, faculty, students, local academic institutions, care providers and regulatory bodies, to be purposeful and deliver defined learning, whilst adapting to local circumstances. Nevertheless, the problem of content overload in medical curricula remains, which medical curriculum designers and policy makers need to continue to recognise; SSCs may be one way of resolving this difficult issue by providing opportunity for the basic sciences as well as the clinical sciences. However, taking cognisance of the original purpose of SSCs (or SSMs, as they were originally called), which was to create options for student learning, remains important, if only to prevent a return to the original overloaded medical curricula.


Can SSCs remain a successful element of the medical curriculum? Taking the limited literature and commentary together with extensive student evaluation data, the answer appears to be 'yes'.

Are SSCs a successful element of the medical curriculum? When the limited literature reports and commentaries on SSC programmes, together with extensive evaluation data from students, are taken into consideration, it would seem appropriate to indicate 'yes'. However, some scepticism does remain, perhaps where SSCs are poorly integrated into the curriculum, faculty and students are under-supported, or the opportunities presented are not fully appreciated.



Conclusions



The original aims and objectives of SSMs as detailed in Tomorrow's Doctors in 1993 have evolved, from the ideals of one third of the curriculum giving the student choice and diversity. Subsequently, SSCs in Tomorrow's Doctors 2003 became more constrained, and their learning objectives better defined. SSCs are now well embedded and integrated in most medical schools in the UK, and to a lesser extent internationally, providing opportunities for innovation and learner-centred medical education. SSCs are now delivering core professional and personal skills in an environment of choice for the student. This element of choice allows our high quality course entrants some flexibility and opportunity to develop and utilise these skills, as well as explore future career options. Internationally they may even address shortages in certain specialties. At present there remains little empirical evidence on the longer-term benefits of SSCs, and this still remains an important challenge in medical education research, if only to support their continued existence and form.
























 2009 Oct;31(10):885-94. doi: 10.3109/01421590903261096.

Student Selected Components (SSCs): AMEE Guide No 46.

Author information

  • 1Obstetrics and Gynaecology Centre for Reproductive Biology, Queen's Medical Research Institute, College of Medicine and Veterinary Medicine, University of Edinburgh, Edinburgh, UK. Simon.C.Riley@ed.ac.uk

Abstract

Student Selected Components (SSCs) are one of the more innovative recent developments in medical education. Initially established in the UK in the 1990s in response to the General Medical Council's recommendations in Tomorrow's Doctors (1993), they provide students with a significant element of choice and depth of study in the curriculum. SSCs have become an integral part of medical curricula throughout the UK, and to a limited extent the rest of the world. In most cases they contribute to the delivery of learning outcomes broadly encompassing personal, professional and research skills, whilst creating opportunities for students to explore future career options. This AMEE Guide is written for developers of new medical curricula, where SSC-like initiatives offering choice and depth of study, in conjunction with core learning, are being considered. Its aim is to provide insight into the structure of an SSC programme and its various important component parts. It is also relevant for those already involved in SSC development by offering insight into effectively managing, assessing and improving existing programmes, to deliver effective, coherent and core-integrated teaching valued by students and faculty alike.



Adult learning theories: Implications for learning and teaching in medical education: AMEE Guide No. 83

DAVID C. M. TAYLOR1 & HOSSAM HAMDY2

1University of Liverpool, UK, 2University of Sharjah, United Arab Emirates




Introduction


Constructivists such as Vygotsky

The more we read, the more we realise that there are many different ways of explaining how adults learn (Merriam et al. 2007). None of the individual theories fully explain what is happening when an aspiring health professional is engaged in learning. In this Guide, it will become clear that the authors hold a broadly constructivist view. Constructivists, like Vygotsky (1997), consider that learning is the process of constructing new knowledge on the foundations of what you already know. We will explain a constructivist schema, which we feel has an evidence base and forms a theoretical basis to help curriculum development, learning and teaching strategies, student assessment and programme evaluation.


Malcolm Knowles and andragogy

Malcolm Knowles (1988) considered that adults learn in different ways from children. He introduced the term “andragogy” to differentiate adult learning from pedagogy; this differentiation now seems to be artificial. Many of the principles of andragogy can be applied equally to children's learning. It is probably more appropriate to think in terms of a learning continuum, which stretches throughout life, with different emphases, problems and strategies at different times.



Categories of adult learning theories

They can be grouped as follows.

Our task is complicated by the observation that the theories of learning flow partly from psychological theories of learning and partly from pragmatic observation. It is also important to remember that “learning” includes the acquisition of three domains: knowledge, skills and attitudes; any theories should ideally account for learning in each of these three domains.


In broad terms, theories of adult learning can be grouped into, or related to, several categories. There is quite a lot of overlap between the theories and the categories of theories, and here we give a simplified overview:


  • Instrumental learning theories:
    These focus on individual experience, and include the behaviourist and cognitive learning theories.
    • Behavioural theories are the basis of many competency-based curricula and training programmes (Thorndike 1911; Skinner 1954). A stimulus in the environment leads to a change in behaviour. Applying these theories usually results in learning that promotes standardisation of the outcome. This leads to the main issue with behavioural theories – namely, who determines the outcomes and how are they measured?
    • Cognitive learning theories locate learning in the mental and psychological processes of the mind, not in behaviour. They are concerned with perception and the processing of information (Piaget 1952; Bruner 1966; Ausubel 1968; Gagne et al. 1992).
    • Experiential learning has influenced adult education by making educators responsible for creating, facilitating access to and organising experiences in order to facilitate learning; both Bruner's (1966) discovery learning and Piaget's (1952) theory of cognitive development support this approach. Experiential learning has been criticised for focusing essentially on developing individual knowledge and limiting the social context (Hart 1992). Its application in medical education is relevant because it focuses on developing competences and practising skills in a specific context (behaviour in practice: Yardley et al. 2012).

  • Humanistic theories: These theories promote individual development and are more learner-centred. The goal is to produce individuals who have the potential for self-actualisation, and who are self-directed and internally motivated.
    • Knowles (1988) supported this theory by popularising the concept of “andragogy”. Although it explains the motivation to learn, its main limitation is the exclusion of context and the social mechanism of constructing meaning and knowledge. We now know that context and social factors are crucial in professional education (Durning & Artino 2011).
    • Self-directed learning suggests that adults can plan, conduct and evaluate their own learning. It has often been described as the goal of adult education, emphasising autonomy and individual freedom in learning. Although it is axiomatic to adult learning, there are doubts about the extent to which self-directed learning, rather than directed self-learning, is truly achievable (Norman 1999; Hoban et al. 2005). A limitation of the concept is its failure to take into consideration the social context of learning. It has also implicitly underestimated the value of other forms of learning, such as collaborative learning.

  • Transformative learning theory: This theory explores the way in which critical reflection can be used to challenge the learner's beliefs and assumptions (Mezirow 1978, 1990, 1995). The process of perspective transformation includes:
    • A disorienting dilemma, which is the catalyst/trigger to review one's own views/perspectives – “knowing that you don't know”
    • The context, which includes personal, professional and social factors
    • Critical reflection. Mezirow (1990) identifies different forms of reflection in the transformation of meanings, structures, context, process and premise. Premise reflection involves the critical re-examination of long-held presuppositions (Brookfield 2000).

  • Social theories of learning: The two elements that are crucial to social theories of learning are context and community (Choi & Hannafin 1995; Durning & Artino 2011). These concepts have been developed by Etienne Wenger (Lave & Wenger 1991; Wenger 1998), who emphasises the importance of “communities of practice” in guiding and encouraging the learner. Land and colleagues consider the way that learners enter the community of practice (Land et al. 2008). The way in which a learner's experience is shaped by their context and community is developed in situativity theory and is discussed by Durning & Artino (2011). Situated cognition theories are based on three main assumptions:
    • Learning and thinking are social activities
    • Thinking and learning are structured by the tools available in specific situations
    • Thinking is influenced by the setting in which learning takes place (Wilson 1993).

  • Motivational models: Any theoretical model that attempts to explain adult learning and relate it to an educational theory must have two critical elements – motivation and reflection. One such theory is self-determination theory (Ryan & Deci 2000; ten Cate et al. 2011; Kusurkar & ten Cate 2013). The theory recognises the importance of intrinsic motivation, and considers that three basic needs must be fulfilled to sustain it: autonomy, competence and a feeling of belonging – or “relatedness”.
    • One of the issues about learning is that a low expectation of success will result in poor motivation to learn, unless the perceived value of success is overwhelming. This is partly explained by Maslow's theory of needs (Maslow 1954; Peters 1966), but it probably does not capture the balance between the different competing drives of hopes and expectation of learning as opposed to the time and effort needed to engage with the process. The expectancy-valence theory (Weiner 1992) incorporates both the “value” of success and the expectancy of success (a simple gloss of this relationship is sketched after this list).
    • The Chain of Response model concerns participation by adults in learning projects (Cross 1981). In this model three internal motivating factors are inter-related: self-evaluation, the learner's attitude towards education, and the importance of goals and expectations. The main external barriers to motivation are life events and transitions, opportunities, and barriers to learning or obtaining information.

  • Reflective models: The reflection-change models consider that reflection leads to action and then change. Reflective learning (Schön 1983, 1987) has important relevance to medical education, and more widely in society (Archer 2012). The role of deliberate practice (Duvivier et al. 2011), using reflection and feedback as tools to develop both knowledge and skills, is starting to provide very valuable insights for educators helping students develop autonomous learning.
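Expectancy-valence (expectancy-value) models are commonly summarised, in their simplest form, as a multiplicative relationship. The Guide itself gives no formula, so the expression below is only an illustrative gloss of the idea referred to in the motivational models above.

```latex
% Illustrative gloss only (not a formula from the Guide): motivation to engage
% with a learning task rises with both the expectancy of success and the value
% attached to that success, and collapses when either is close to zero.
\[
  \text{Motivation} \;\propto\; \text{Expectancy of success} \times \text{Value of success}
\]
```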

Even this brief consideration of types of theory applicable to adult learning will lead one to realise that they each have their strengths, and are each incomplete without the others. Before addressing a model that attempts to draw the theories together, we need to consider how we arrived at where we are.



Historical aspects of adult learning theories

In the late seventeenth century, the pervading view was that all knowledge derives from experience. Although he personally did not use the term, John Locke (Locke 1690) considered that the mind was a tabula rasa or “blank slate” at birth and that all acquired knowledge was derived from experience of the senses. These ideas were reworked and developed until the early twentieth century, when Edward Thorndike derived his laws (Thorndike 1911), principally the law of effect – which stated that learning occurred if it had a positive effect on the individual – and the law of exercise – which meant that repetition strengthened the learning.


This was further developed by behaviourists such as Skinner (1954), who demonstrated that some forms of learning could be demonstrated by a simple stimulus-response paradigm, so that a reward could be used to ensure an appropriate response to a stimulus. Skinner showed that there were three elements that strengthened learning, namely frequency (the number of times a stimulus was presented), contiguity (the time delay between the response and the reward) and contingency (the continued link between the stimulus and the reward). 


Chomsky (1975) considers that the type of experiments favoured by behaviourists do not explain the acquisition of higher-order skills, such as the learning of language. Chomsky argued that our brains are programmed to acquire higher-order skills, which we develop and modify by experience. While some were looking at the potential neural mechanisms that underlie the acquisition of learning, others were considering the factors that can make it more effective.


Piaget, a cognitive constructivist, considered the different types of knowledge that could be acquired at different stages in a young person's life (Piaget 1952). This stream of thought continues to the present day in the work of people like William Perry (1999), who studied the way in which college students change from dualism (ideas are either true or false; the teacher is always right) to multiplicity (truth depends on context; the teacher is not necessarily the arbiter).


Social constructivists, like Vygotsky (1978), focus on the way that the learning community supports learning. A key idea in social constructivism is that of the Zone of Proximal Development, whereby a learner can only acquire new knowledge if they can link it in with existing knowledge. Conversations between learners and teachers articulating what is already known can extend the zone of proximal development by putting new ideas in the context of current understanding. This strand of thought has been taken forward in the social learning theories of Bandura (1977), and in a remarkable way by Wenger in the concept of learning communities or “Communities of Practice” (Wenger 1998).




Andragogy and pedagogy: Knowles' views and related learning models

Towards the end of the twentieth century, there was a body of research that suggested that adults learn differently from children and that “andragogy” was a better term for this process than “pedagogy”. The key difference between adults and children is said to be that adults are differently motivated to learn. Although the arguments no longer seem quite so clear, the line described by Knowles (Knowles et al. 2005) was that adult learners differ from child learners in six respects:


  • The need to know (Why do I need to know this?)
  • The learners’ self-concept (I am responsible for my own decisions)
  • The role of the learners’ experiences (I have experiences which I value, and you should respect)
  • Readiness to learn (I need to learn because my circumstances are changing)
  • Orientation to learning (Learning will help me deal with the situation in which I find myself)
  • Motivation (I learn because I want to)


These observations, in association with David Kolb's experiential learning model (Kolb 1984; see Figure 1), have allowed the consideration of learning and teaching strategies appropriate for adult learners.






In Kolb's scheme, 

  • the learner has a concrete experience
  • upon which they reflect. Through their reflection 
  • they are able to formulate abstract concepts, 
  • and make appropriate generalisations. 
  • They then consolidate their understanding by testing the implications of their knowledge in new situations. 
  • This then provides them with a concrete experience, and the cycle continues. 


Learners with different learning preferences will have strengths in different quadrants of the (Kolb) cycle. In Kolb's terminology

  •  “Activists” feel and do,
  •  “Reflectors” feel and watch,
  •  “Theorists” watch and think and
  •  “Pragmatists” think and do. 


Although it is often quoted and easily understood, the learning style inventory developed from the Kolb cycle is known to have poor reliability and validity.

From the educator's point of view it is important to design learning activities that allow the cycle to be followed, engaging each of the quadrants. Although it is often quoted, and easily understood, the learning style inventory developed from the Kolb cycle has poor reliability and validity (Coffield et al. 2004).


From a broadly constructivist standpoint, what matters is the learner's prior experience and the dissonance between it and the new experience. Reflecting on this difference is reflection in action; it allows abstract concepts to be formulated and new material to be made sense of, and it prompts testing of existing knowledge through direct experimentation or through debate and discussion. What is missing from this account, however, is reflection on action: the learner thinking about the processes they have used and how rigorous or appropriate they were, which is essential to learning.

Of particular importance to those who follow a broadly constructivist line (but lacking in the original model), will be the prior experience/knowledge of the individual, and the dissonance between this and the concrete experience that is provided as the learning opportunity. When we see something new, attend a lecture, or talk with a patient, we compare what we are seeing with what we already know, and reflect upon the difference (reflection in action, (Schön 1983)). This enables us to formulate abstract concepts that make sense of the new data. In turn this will lead us to propose tests of our knowledge, through direct experimentation or through debate and discussion. This is a familiar process to all acquainted with the scientific/clinical method; however at least one key element is missing, and this is reflection on action. It is crucial that the learner thinks about the processes they have used, and the extent to which they were rigorous or appropriate in the use of the material; this is fundamental to learning.


Connections between existing and new information are made through elaboration, refinement and restructuring.

The next issue is the way in which new knowledge becomes integrated into the existing knowledge base. Proponents of the transformative learning approach consider that meaningful learning occurs when connections are made between new and existing information (Regan-Smith et al. 1994). Norman & Schmidt (1992) suggest that there are three main elements to this process: elaboration, refinement and, finally, restructuring:

  • Elaboration is linking in new knowledge with what we already know. It is important, however, that the linkages are precise rather than general (Stein et al. 1984). 
  • Refinement is the act of sifting and sorting through the information to retain those elements that make sense. 
  • Finally, restructuring is the development of new knowledge maps (schemata) which arguably allow one to become an expert or demonstrate expertise (Norman et al. 2006).



Learning outcomes and scaffolding from Bloom's taxonomy to Miller's pyramid

Educators can help learners by providing advance organisers. These are of two kinds: models and metaphors, and scaffolding.

The processes of acquiring new knowledge, relating it to what is already known and developing new understanding is complicated and difficult but educators can help the learners by providing advance organisers (Ausubel 1968). There are two types of advance organisers: models and metaphors, which we will consider later, and scaffolding.


Scaffolding refers to the structures teachers put in place to guide learners through the teaching and learning material. Faced with a sheer volume of complex material, learners can be left standing at the threshold, a state known as liminality.

Scaffolding refers to the structural things that teachers do to guide learners through the teaching and learning material. They are necessary because the sheer volume and complexity of knowledge to be acquired often leaves the learner standing on the threshold (in a state of liminality), rather than stepping into the world of learning.


It is easy to underestimate the problem of liminality. We need someone to lead us over the threshold. As we start to build knowledge and understanding, we need some idea of where things fit, how they fit together, and how the individual parts belong to a greater whole; “scaffolding” provides this. Scaffolding mostly takes the form of programme-level organisers, which include the syllabus, lectures, planned experiential learning and reading lists. Most commonly these days, it also includes providing a list of intended learning outcomes.

It is easy to underestimate the problem of liminality. It is described well by Ray Land (Land et al. 2008; Meyer et al. 2010), but it refers to the sense of discomfort we feel when we do not quite understand the rules or the context of a new situation. We need someone to lead us over the threshold, introduce us to the new ideas, and probably explain some of the language (Bernstein 2000). As we start to build our knowledge and understanding, we need to have some idea of where things fit, how they fit together, and some idea of how the individual pieces are part of a greater whole. “Scaffolding” provides that perspective. Scaffolding includes programme level organisers, which are dependent on both the content and the context in which it is being learned. Programme organisers include the syllabus, lectures, planned experiential learning and reading lists. Most commonly, these days scaffolding includes providing learners with a list of intended learning outcomes. It is important to remember that it also includes the induction that students receive when they enter the programme or a new clinical environment.



Learning outcomes can be further refined using Bloom's taxonomy (Bloom et al. 1956), which has been revised by several authors, including Anderson (Anderson & Kratwohl 2001). In Figure 2, Bloom's taxonomy is shown in the pyramid itself, and Anderson's development of it in the side panels.





Anderson placed 'creating' above 'evaluating', and both versions emphasise that the learner does things with knowledge. Learning outcomes should therefore be associated with verbs rather than lists of things to learn.

Anderson's modifications indicate a belief that “creating” is a higher attribute than “evaluating”, but they are also important in emphasising that the learner does things with knowledge. Learning outcomes, therefore, should be associated with verbs, rather than lists of things to learn. The difficulty with the model is highlighted by the differences between Bloom's and Anderson's versions. In reality, the elements of the pyramid are arranged in a cycle: evaluation leads to developing a new idea, which is then applied, analysed, evaluated, and so on.
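To make the "verbs, not lists" point concrete, here is a small sketch pairing each level of the revised taxonomy with outcome verbs that might be used when writing learning outcomes. The verb choices and the example outcome are common illustrations, not taken from the Guide.

```python
# Illustrative pairing of revised-taxonomy levels with outcome verbs
# (ordered from lower- to higher-order; the verb choices are examples only).
taxonomy_verbs = {
    "remember":   ["define", "list", "recall"],
    "understand": ["explain", "summarise", "classify"],
    "apply":      ["demonstrate", "calculate", "use"],
    "analyse":    ["compare", "differentiate", "examine"],
    "evaluate":   ["appraise", "justify", "critique"],
    "create":     ["design", "formulate", "propose"],
}

def write_outcome(level, verb, content):
    """Compose a learning outcome as verb + content, checking the verb fits the level."""
    assert verb in taxonomy_verbs[level], f"'{verb}' is not listed for level '{level}'"
    return f"The learner will be able to {verb} {content}."

print(write_outcome("apply", "calculate", "a drug dose adjusted for renal function"))
```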


There are several variants of Bloom's taxonomy; Miller's pyramid is one of them. Knowledge, however, is only the foundation of the pyramid, not the pyramid itself.

Bloom's original work led to several variants. In medical education, the most frequently encountered is Miller's pyramid (Miller, 1990; Figure 3), which can be used as a guide for planning and assessing within a curriculum. The pyramid is important, because in training students for the healthcare professions it is essential to remember that the outcome of training is intended to be a graduate who can take their place in the workforce (Action). Knowledge is the foundation of the pyramid – but not the pyramid itself.








Guided discovery learning and students’ learning strategies

In a structured learning environment new knowledge is sufficiently similar to the existing knowledge to allow its relevance to be perceived. A more challenging condition applies in real life, when the relevance of information is often far from apparent. The variants of this situation are described by the Johari Window (Figure 4), named after its originators Joseph Luft and Harry Ingham in the 1950s (Luft & Ingham 1955).






Two things become apparent here: discussion between individuals increases the amount of practical knowledge, and some things will remain unknown.

Two things are immediately apparent from this construction – namely that (1) discussion between individuals will increase the amount of practical knowledge, and that (2) some things remain a mystery until we talk to someone else with a different range of knowledge or understanding. It follows that the more diverse a learning group's membership is, the more likely the individuals within the group are to learn. There will always be “unknown unknowns”, but teachers can help students move into those areas through a careful choice of task, resources and, of course, patients. Before we look at the ways in which we can assist learning, there are two other considerations; both of which relate to the way that the learner thinks about knowledge.


There are several different learning styles.

Newble, Entwistle and their colleagues, in a number of studies (Newble & Clarke 1986; Newble & Entwistle 1986), have shown that there are several different learning styles, and that learners have different learning preferences. There is a real and active debate about whether learning styles are fixed or flexible, and the extent to which they are determined by the context (Coffield et al. 2004). It does seem clear that some learners prefer to work towards a deep understanding of what they are learning; others prefer to acquire the facts, an approach known as surface learning. A moment's reflection will show that each can be an appropriate strategy. Sometimes deep understanding is needed, and sometimes it is enough to know “the facts” – the surface. It is important to know normal blood gas values or electrolyte levels, and this surface learning triggers appropriate clinical action. However, to sort out a patient with acidosis requires a deeper understanding of how the various physiological systems interact. The ability to be strategic about the sort of learning we engage in is important, but it can be affected by the assessment system. So, if an assessment system tests for recall of facts, then the successful learner will employ surface learning. If the system rewards deep thought, understanding and reasoning, then the successful learner will aim for that. There is a difference of opinion about whether “strategic” is a third learning style or not (Newble & Entwistle 1986; Biggs et al. 2001). 


Recognising the different styles matters, because lectures will generally appeal to surface learners whereas extended project work will appeal to deep learners.

Recognising the different styles is important, as (most) lectures will appeal more to surface learners and extended project work will appeal more to deep learners. Some subject material actually needs to be known and rapidly recalled (blood gas values, electrolyte levels), while other material needs to be deeply understood to allow appropriate interventions (coping with acid base disturbances, or circulatory shock).


Learners develop from dualism towards multiplicity.

In a series of studies on American students in their college years, Perry (1999) noted that students change in their approach to learning as they progress through their college years. Typically students develop from an approach based on “duality”, with a clear view that the teacher will tell them the difference between right and wrong, towards “multiplicity”, where they recognise that context is important, and that they, their colleagues and the environment are valuable sources of knowledge and experience. Together with this change in focus comes a greater confidence in coping with uncertainty. This work was based on a relatively able, affluent and homogeneous population of undergraduates and was subsequently extended by Perry's colleagues to a wider cross-section of society. They (Belenky et al. 1997) uncovered a group of “silent” learners, who did not recognise their own rights to question or construct knowledge. Belenky and colleagues also extended the scale beyond receiving and understanding knowledge, to being co-constructors of knowledge (Belenky et al. 1997).


Some studies, however, suggest that learners do not always develop in this direction.

Some recent work by Maudsley (2005) shows that medical students develop in the way they learn, but that the progression is not always from duality to multiplicity. There are two explanations for this paradox: one is that learners tend towards more strategic learning styles in order to cope with the demands of the assessment system; the alternative explanation is rather more complex and relates to the business of becoming a new member of the profession.


Learning means not just acquiring knowledge, but making sense of it and being able to use it.

The process of learning new things is not just about acquiring knowledge (surface learning); it includes being able to make sense of it, and hopefully making use of it. But being able to do these things means that you have to acquire an understanding of where things fit. A novice stands at the threshold, not quite knowing what to expect, and sometimes not even knowing what they are supposed to be looking at. This is a state of liminality, and the learner needs to have some threshold concepts so that they can move further (Land et al. 2008; Meyer et al. 2010). Frequently the difficulty is in the vocabulary or the way that language is used (Bernstein 2000), but it can also be troublesome concepts (Meyer & Land 2006), or just becoming part of the “team” and assuming a new identity (Wenger 1998). The role of the teacher is to help the learner over the threshold and, as discussed above, help them until it starts to make sense. If we follow Wenger's arguments (Wenger 1998), then we will see that the whole community has a role in leading the novice over the threshold, and helping them to take their place in the community of practice, that is, in this case, the healthcare profession.




How adults learn: a multi-theories model

We propose five stages.

It will be clear by now that there are several different theories about, and approaches to, learning. In the section that follows we introduce a model that encapsulates them and can be used to structure, plan and deliver successful learning experiences. We propose that there are five stages in the learning experience, which the learner needs to go through. The learner and the teacher will have particular responsibilities at each stage. We shall outline the model first, describe the responsibilities and then discuss each element in greater detail.


Outline

All learning starts with the learner's existing knowledge, which will be more or less sophisticated in any given domain (Figure 5).






  • The dissonance phase exists when the learner's existing knowledge is challenged and found to be incomplete. The challenge can be internal, when a learner is thinking things through, or it can be external, provided by a teacher or patient. There are several things that influence whether the learner will engage with the dissonance phase. These include the nature of the task, the available resources, the motivation of the learner, and the learner's stage of development and their preferred learning style. It ends with the learner reflecting and determining their personal learning outcomes.
  • During the refinement phase, the learner seeks out a number of possible explanations or solutions to a problem (elaboration), and through completing tasks, research, reflection and discussion refines the new information into a series of concepts which are, for the learner, new.
  • The organisation phase is where the learner develops or restructures their ideas to account for the increased information they have acquired. There are at least two elements to this: reflection in action, where the learner tests and re-tests hypotheses to make sense of the information, and the organisation of the information into schemata which (for the learner, at least) make sense.
  • The feedback phase is arguably the most crucial, as it is where the learner articulates their newly acquired knowledge and tests it against what their peers and teachers believe. The feedback will either reinforce their schema, or oblige the learner to reconsider it in the light of new information.
  • During the consolidation phase the learner reflects upon the process they have undergone, looking back over the learning cycle and identifying what they have learned from it, both in terms of increasing their knowledge base, but also in terms of the learning process itself (reflection on action).
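A minimal sketch representing the five phases as an ordered structure follows. The one-line descriptions are condensed from the text above; the data structure and cycling helper are illustrative only.

```python
from collections import OrderedDict

# The five phases of the learning experience, in order, each with a condensed
# description of what the learner is doing (condensed from the text above).
LEARNING_PHASES = OrderedDict([
    ("dissonance",    "existing knowledge is challenged and found incomplete"),
    ("refinement",    "possible explanations are sought and refined into new concepts"),
    ("organisation",  "ideas are developed or restructured into schemata"),
    ("feedback",      "new knowledge is articulated and tested against peers and teachers"),
    ("consolidation", "the learner reflects on what and how they have learned"),
])

def next_phase(current):
    """Return the phase that follows the current one (cycling back to dissonance)."""
    phases = list(LEARNING_PHASES)
    return phases[(phases.index(current) + 1) % len(phases)]

print(next_phase("feedback"))        # consolidation
print(next_phase("consolidation"))   # dissonance, and the cycle continues
```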




Adult learning model in action

In each of the five phases there are specific roles for learners and teachers.

During each of these phases, we propose that there are specific roles for teachers and learners.





The model given here can be applied in a number of ways to help in the design of learning activities, whether in one-to-one discussions, small group work, seminars or large lectures. The same principles apply to planning curricula, at short course, module or programme level. Whether working with an individual learner, or planning a major programme, the educator needs to recognise that the learner needs to move through a cycle in order to truly understand and learn. We also need to be explicit that educator and learner have specific responsibilities at each stage of the learning process.



Adult learning model “expanded”:


The dissonance phase

The key to success as an educator is probably providing the advance organisers. We need to know what we want the learner to learn, and how it fits into the greater scheme. That means that we must have clearly defined outcomes, at the appropriate levels of one of the modifications of Bloom's taxonomy (Figure 2). We may need a student to gain new knowledge, apply their knowledge or create a new hypothesis, for instance. Once we know our intended outcome we are in a position to start thinking about the best way of helping the learner to acquire, and demonstrate that they have acquired, the learning outcomes.


We usually start by thinking about what we want students to do, but the following five considerations, which define the most appropriate task, should come first.

When we plan an educational intervention, we usually start with an idea of the task we want the students to be involved in (attend a lecture, take a history from a patient, write an essay, or whatever). There are, however, five considerations that define the most appropriate task, and they should come first.


(1) Consider how the learner can be encouraged to articulate their prior knowledge

The entire learning process starts with what a learner already knows. In any intervention, we need to make sure that the learner has the possibility to articulate what they already know about something. There are many possible techniques, for instance “buzz groups” in lectures (Jaques 2003), the early phases of the PBL process where learners discuss what they already know (Taylor & Miflin 2008), or discussing something on the ward before performing an examination or obtaining a history from the patient. This stage helps the learner anchor the new knowledge in what they already understand, and places them on the first stage of the learning cycle. It also highlights to the learner where the gaps or uncertainties are in their knowledge.


(2) Consider learning styles and their implications

If the aim of the educational intervention is simply to present the learner with new knowledge, then surface learning is the most appropriate learning style. It is not the most appropriate learning style, though, if the learner is required to understand, or later elaborate on the knowledge (Newble & Entwistle 1986; Biggs et al. 2001). Elaboration, and the later stages of Bloom's taxonomy require an increasing depth of understanding. There are complicating factors, since many learners are strategic in choosing surface learning styles before they enter University courses, so they may appear to show a preference for surface learning. Even at graduate level, if students know that they will be tested on their acquisition of facts, rather than their understanding, they will naturally choose a surface learning style. If the educator is aiming for a deeper level of understanding, then it will be necessary to make sure that the assessment process does not derail it.


It is possible, but challenging, to use lectures to provide more than surface knowledge. Deep learning comes through discussion, research and weighing up the evidence. Curricula that use PBL (Taylor & Miflin 2008), Team based learning (TBL: Michaelsen et al. 2002) and Case-based learning (Ferguson & Kreiter 2007) are designed with this in mind, but more traditional programmes can introduce elements of the more discursive styles, or require learners to complete particular tasks, such as research, small group work or preparing papers.


(3) Consider the stage of development of the learner

In the same way that surface learning has attractions for many learners, Perry's stage of duality has attractions for both the learner and the educator (Perry 1999). Lectures can reinforce a state of duality in which the learner accepts what the lecturer says. But learners need to be comfortable with uncertainty, dealing with a partial picture and recognising when they need to know more. It is not enough for a doctor just to know the right answers in a perfect situation; we rightly expect them to understand why they are the right answers, and how they are determined by circumstances. A senior clinician will have sufficient experience to recognise this, and it should come across in traditional bedside teaching. Learners can also develop their understanding of systems through well-facilitated PBL or case-based learning, where the facilitator encourages learners to think about the value they attribute to “facts”, and the way in which they think about them. Helping the learner shift from duality to early multiplicity, and look beyond the obvious first impressions, is crucial to bedside teaching, for instance, where test results or images have to be related to the patient's account of their problem.


(4) Consider the learner's motivation

Sobral's (2004) work has shown that students' motivation can be strongly influenced by the educational environment and their frame of mind towards learning. This is also central to self-determination theory (ten Cate et al. 2011; Kusurkar & ten Cate 2013). If that is the case, then early clinical contact that is both stimulating and relevant to the desired learning outcomes will be beneficial.


Although adult learners are expected to be self-motivated, they will also have a host of competing concerns. Balancing two or more imperatives is a normal state of affairs for both learner and educator. It is the responsibility of the educator to ensure that the task will engage the learner for long enough to allow the learner's enthusiasm to be captured. It is equally important not to squander the learner's energy and enthusiasm with poorly thought out tasks, or issues that are either trivial or too difficult.


There is more to consider here, particularly the dimensions of self-directed learning (Garrison 1997), which include motivation and self-regulation (Zimmerman 2002). There is some evidence that problem-based learning students are better at self-regulation (Sungur & Tekkaya 2006), which includes the ability to construct meaning. The goal, however, is self-directed learning, which transcends self-regulated learning to include motivation and, crucially, the ability to determine what should be learned (Loyens et al. 2008). Again, this is fostered by problem-based learning, but is easily destroyed by publishing, or giving students, detailed intended learning outcomes.


(5) Consider the resources

Naturally, we need to consider physical resources such as space, books, journals, and access to electronic resources. The most precious resource, for all of us, is time. Whenever an educational activity is planned there must be sufficient time devoted to preparation and planning, including planning the way in which the activity will be evaluated and assessed. Sufficient time must also be set aside for the educator(s) involved in the delivery, and in the evaluation and assessment processes. It is equally important that there is sufficient time for the learners to engage with the learning activity and complete any necessary additional work, such as reading, and of course reflecting upon the material and the way in which they have learned.


Finally, consider the task

The task the learners are set has to take into account all of the preceding considerations.


It needs to have learning outcomes which are aligned with the curriculum as a whole and which are specific enough to be reasonably achievable within the allocated time. No one could learn the anatomy and physiology of the nervous system in a couple of days, but they might be able to master the anatomy and physiology that underlie the crossed extensor reflex.


Opinions are divided about whether every task should be assessed, but it is widely asserted that “assessment drives learning” (Miller 1990), so attention needs to be paid to the assessment opportunities, and the material covered should be included in the assessment blueprint (Hamdy 2006).



The elaborate and refine phase

The dissonance provided by the task has been sufficient to introduce new possibilities, facts and concepts to the learner. They must now start to make sense of them. The first stage in this process is to consider as many possible explanations for the new information as they can. This is equivalent to the brainstorming phase in problem-based learning and has two main advantages.

    • The first is that it helps ensure that connections are made between the new information and previous knowledge, ensuring that everything is learnt in the context of what is already known. 
    • The second is that it reinforces our natural tendency to be appropriately inventive and to think widely. This skill will be crucial for the future healthcare professional, where the obvious explanation for a patient's symptoms may be wrong. Shortness of breath, for instance, may have a respiratory or a cardiovascular origin.


Elaboration without refinement will just lead to confusion, so once a number of possible explanations for a scenario have been determined, it is necessary to refine them into the most plausible solutions. This will come after some research, reflection and discussion, or, in the clinical environment, after reading the patient's notes or seeing the results of appropriate tests. In this phase we are mirroring the scientific and clinical method, which is a valuable exercise in and of itself. The outcome of this phase is the generation of a working hypothesis.


Most of what happens in the elaboration and refinement phase is internal to the learner, but the success of the venture will stem from the nature of the task they were set, and the provision of appropriate resources. The task must be such that it requires some thought and engagement to complete it, and the resources need to be appropriate to the task and the understanding of the learner. This phase is the key part of problem-based learning, but can also arise out of clinical and bedside teaching when the educator is aware of the possibilities and careful to exploit them.



The organisation phase

During this phase the learner looks at a problem from all angles, testing and retesting the hypothesis against what they already know. Part of this phase is fitting the information into what the learner already knows, and part of it is constructing the new information into a story that makes sense to the learner. This is a complex task and involves the learner reflecting in action, challenging themselves to reflect critically.


The educator has two roles in supporting the learner. 

  • The first role is to provide them with scaffolding, a skeleton to support their ideas and give them coherence and structure. This may be the framework of the programme, with a series of themes, or it might be a lecture or lecture series, or it could even be a syllabus. The danger with scaffolding is that if it is too detailed it removes any freedom or responsibility from the learner. It then becomes very difficult to determine whether true understanding (rather than simple recall) has been achieved. It also means that the learner will not know, until too late, whether they truly understand the subject.
  • The second role for the educator is to encourage critical reflection. At its best the educator will model this in tutorials or the supervising clinician in bedside teaching, but it is perfectly possible to model one's way of thinking about a problem in a lecture or seminar. Given that so much of our knowledge base changes, critical thinking is probably the most important skill we can give our students.

It is essential that we provide students with opportunities to test their reflective skills. There are many possible ways, including discussion with each other, informally or in small groups, with the educator, or with critical friends. Although the idea of critical friends (Baskerville & Goldblatt 2009) is usually associated with teachers/researchers, there is no reason why it would not work between students, although they would need training and support in the first instance.



Feedback

There are two elements to feedback.

  • The first is articulating what has been learned. All educators know that the real test of understanding something is explaining it to other learners. So the newly acquired material needs to be explained, or used in some way.
  • The second element of feedback is the role of the educator, together with other learners: to point out the strengths and weaknesses of any argument, and to ask further questions, until learner and educator are satisfied that the outcome has been met. In any facilitated small group session or bedside teaching session, this is part of the role of the facilitator – it is perfectly possible and acceptable to challenge constructively without handing out the correct answer or humiliating the student. In a group that is working well (whether a formal, structured group or a self-formed study group) other group members will pose questions and seek clarification. This is a combination of feedback and discussion, and can lead to co-construction of knowledge (Belenky et al. 1997). It is also relatively simple to provide feedback in a lecture theatre – either through team-based learning activities, or through instant feedback devices such as “clickers”, or, dare one say, the raising of hands!


Although feedback is best given in frequent, small, doses, there are clearly times when it is crucial. The most obvious example is when the learner is being assessed. This is when learners realise the extent to which they have acquired and can demonstrate new knowledge. Any effective assessment system will provide learners with an indication of where they are going wrong, and which areas they should focus on for clarification of their understanding.


There are two further elements of the feedback phase that are often ignored. 

  • The first is the duty of the educator to seek and reflect upon the feedback they obtain about their own performance. In this way we can develop and hone our skills to become better at what we do.
  • The second relates to epistemology. Educator and learner also need to reflect upon the way that they have been learning, and the relative highs and lows of the experience. This is to ensure that we can work smarter (rather than harder) next time.



The consolidation phase

The learner faces two challenges in this phase. 

  • The first is to reflect on what has been learned in the light of what was known before.

Does it all make some sort of sense, or is there a logical inconsistency that needs to be thought through? 

How does the new knowledge help to explain the bigger picture and increase our understanding?


If the exercise has been subject to assessment, this is where the learner should ideally think about their assessment results, and their areas of relative strength and weakness, so as to ascribe confidence levels to what they think they know.


  • The learner will already have articulated (in the previous phase) how they felt the learning process worked. The second challenge, in this consolidation phase, is to consider the extent to which they took personal responsibility for their learning.

How far are they along the continuum towards co-constructing knowledge? 

To what extent were they personally responsible for any breakdown in the process? 

What should they do differently next time?


The role of the educator in this phase is to provide encouragement for reflection on action. This might be through the provision of written feedback about examinations, highlighting areas of relative strength and weakness, or it could be through an appraisal or portfolio process. The key is to move from a right/wrong type of feedback to one where the possibilities for future development are made explicit. The educator's role, after all, is to lead the learner towards a deeper understanding.



Institutional implications and applications of adult learning theory in medical education

At an institutional level, connecting adult learning theory with practice is challenging. Some theories, or aspects of a theory, will be more relevant and helpful than others in a particular context. In exactly the same way that clinicians are expected to adopt practices on the basis of the best available evidence, educators should make use of the best available evidence to guide their educational decisions. Medical education institutions should rationalise, and be explicit about, their mission, vision, programme and curricula development, learning strategies, student assessment and programme evaluation, guided by adult education theories and their particular socio-cultural context.


Institutional mission, vision and curriculum outcome

Many health care education programmes will have mission or vision statements describing graduates who have knowledge, skills and attitudes that allow them to respond to the health needs of the population with a high degree of moral and social responsibility. In outcome-based education one can expect a variety of strategies, each relying on one or more different educational theories. Understanding how people learn is important, and both learners and educators need to remember that learning is a process through which they weigh their knowledge against a critical examination of alternative possibilities (Ahlquist 1992). This understanding is basic to problem-based learning and the majority of clinical practice.


Although knowledge is the easiest and most public domain, more than half of the outcome domains of medical education relate to attitudes, e.g. lifelong learning, empathy, utilitarianism, communication with patients and colleagues, ethics and professionalism. Transformative and experiential learning theories constitute an important theoretical frame for learning strategies suited to these outcomes. The institution should be ready to embark on changes to its educational and cultural environment in order to operationalise these concepts.


Learning and teaching

Applying adult learning principles in medical education will probably necessitate changing educators’ and learners’ perceptions of their roles. Adult educators may consider adopting a view of themselves as both learners and educators. The learner's role is not only to receive knowledge but also to search, challenge, construct knowledge and change their own perception, views and beliefs.


Applications of these strategies necessitate significant institutional culture changes, active faculty development and increased learner autonomy and self-direction. To develop these skills all learners (including faculty members) should be trained to ask questions, critically appraise new information, identify their learning needs and gaps in their knowledge and most importantly to reflect and express their views on their learning process and outcomes.


The clinical environment is challenging for the learner and the educator. Clinical educators, students and patients interact together within the context of a hospital, clinics and community at large not just in a classroom. Time is at a premium, and the stakes for the patient are often high. Because of this it is important to make the best use of learning theories when helping people to learn.


Self-directed and experiential learning are key strategies, but feedback is crucial to help the learner make the best use of their contact time. Clinical reasoning, hypothesis generation and testing are essential skills for good clinical practice. The model of adult learning we have illustrated (Figure 5) shows that perception, insight, meaning-making and mental networking are interlinked and essential for good reasoning abilities. Clinical teachers should explain how they arrive at a diagnosis or make a management decision, exploring with the learner the mental processes in both the teacher's and the learner's minds by which “the implicit becomes explicit”.


Self-directed learning and student goal-setting should always be encouraged and supported but they should also be discussed, monitored and recorded. Portfolios, logbooks and reflective journals are particularly important tools for this. The key for successful implementation is for them to be more than “tick box” exercises, and we have found that using them as a basis for discussion makes them more effective.


Ethics and professional behaviours can be, and often are, taught, but understanding of them is demonstrated and consolidated within the clinical environment. Asking students to observe, record and discuss incidents that have ethical and professional implications is crucial to this development (Maudsley & Taylor 2009). Perspective transformation theory (Mezirow 1978) is most appropriate for acquiring these competencies. It supports reflection, and examination of the learner's and teacher's assumptions and beliefs, in the hope that this may lead to individual and social change. An off-shoot of adult learning theories is situated cognition (Wilson 1993), developed by Wenger (1998) into the theory of communities of practice, which is directly applicable to the clinical environment. Learning and thinking are social activities, structured and influenced by the setting and the tools available in a specific situation (Lave & Wenger 1991). Learning and teaching approaches at the bedside differ from those in the operating room, the emergency department or the community (Durning & Artino 2011; Yardley et al. 2012). Each context has its own educational power and value. Observing the performance and behaviour of a trainer as a role model, reflection in and on action, and feedback on performance are important educational principles to consider in teaching and learning in clinical settings.


Student assessment and programme evaluation

Awareness of adult learning theories is needed to develop and select evaluation systems and instruments that can measure the expected competencies and outcomes. What to measure, how, when and by whom are key questions, and the answers are not always easy. The assessment should be tied to specific learning outcomes, and the learner should be given whatever feedback will help them develop or consolidate their knowledge, skills or attitudes. Time constraints mean that some elements of the feedback will need to come from the learner's self- and peer-evaluation, but this should not be seen as a problem. Encouraging discussion, debate and reflection will increase learning opportunities. It is important to allow time, and provide a structure, for these activities if they are to be properly integrated into the learning/assessment system.


As mentioned above, a well thought through portfolio/log book with elements of reflection will allow for the learner's progress to be documented for themselves, and, importantly, for the educator/assessor.


By applying adult learning theories consistently and carefully, the educator can be sure of helping learners become part of the healthcare profession, and lay the foundations for a career of life-long development.



Summary

Adult learning theories are related to several educational, social, philosophical and psychological theories. Most accessibly, these were clustered by Knowles under the term “andragogy”, which clarifies how adults learn best and their attitudes towards learning.


A simple model is proposed which draws on different aspects of adult learning theories and their implications for the roles of learner and teacher. Although the model is presented as a cycle, learner and teacher can in fact enter the cycle at any point.


Adult learning theories should influence all aspects of health profession education, from mission and vision statements and outcomes to implementation and evaluation.


The clinical teaching and learning environment is an ideal field for using adult learning theories and demonstrating their utility. Reinforcing clear thinking in both teacher and learner, and taking these theories into account, should improve clinical learning, and perhaps even clinical outcomes.












 2013 Nov;35(11):e1561-72. doi: 10.3109/0142159X.2013.828153. Epub 2013 Sep 4.

Adult learning theories: implications for learning and teaching in medical education: AMEE Guide No. 83.

Author information

  • 1University of Liverpool, UK.

Abstract

There are many theories that explain how adults learn and each has its own merits. This Guide explains and explores the more commonly used ones and how they can be used to enhance student and faculty learning. The Guide presents a model that combines many of the theories into a flow diagram which can be followed by anyone planning learning. The schema can be used at curriculum planning level, or at the level of individual learning. At each stage of the model, the Guide identifies the responsibilities of both learner and educator. The role of the institution is to ensure that the time and resources are available to allow effective learning to happen. The Guide is designed for those new to education, in the hope that it can unravel the difficulties in understanding and applying the common learning theories, whilst also creating opportunities for debate as to the best way they should be used.

PMID: 24004029 [PubMed - indexed for MEDLINE]

