Program evaluation models and related theories: AMEE Guide No. 67 (Med Teach, 2012)
ANN W. FRYE¹ & PAUL A. HEMMER²
¹Office of Educational Development, University of Texas Medical Branch, 301 University Boulevard, Galveston, Texas 77555-0408, USA; ²Department of Medicine, Uniformed Services University of the Health Sciences, F. Edward Hebert School of Medicine, Bethesda, MD, USA
Introduction
Much of the literature covers the "how to" of program evaluation.
Several detailed and well written articles, guides, and textbooks about educational program evaluation provide overviews and focus on the ‘‘how to’’ of program evaluation (Woodward 2002; Goldie 2006; Musick 2006; Durning et al. 2007; Frechtling 2007; Stufflebeam & Shinkfield 2007; Hawkins & Holmboe 2008; Cook 2010; Durning & Hemmer 2010; Patton 2011). Medical educators should be familiar with these and have some of them available as resources.
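A focus on change
Educational programs are fundamentally about change, not only in learners but in everyone else involved. Effective program evaluation should therefore focus, at least in part, on change: Is change occurring? What is the nature of the change? Is the change successful?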
We believe that educational programs are fundamentally about change. While a program’s focus on change is perhaps most evident for learners, everyone else involved with that program also participates in change. Therefore, effective program evaluation should focus, at least in part, on change: Is change occurring? What is the nature of the change? Is the change deemed ‘‘successful’’?
Evaluation should look at both intended and unintended changes.
Program evaluation should look for both intended and unintended changes.
In this way, program evaluation can become an integral part of the educational change process.
In that way, the program evaluation becomes an integral part of the educational change process.
In the past, educational program evaluation was approached from a simple linear (cause-effect) perspective. As more scholarship has accumulated, however, programs have come to be described as complex systems with nonlinear relationships between their elements and program-related changes.
In the past, educational program evaluation practices often assumed a simple linear (cause-effect) perspective when assessing program elements and outcomes. More recent evaluation scholarship describes educational programs as complex systems with nonlinear relationships between their elements and program-related changes.
Program evaluation defined
At the most basic level, evaluation means making a value judgment about the information one has. Educational program evaluation therefore uses information to make a decision about the value or worth of an educational program. More formally, educational program evaluation is the "systematic collection and analysis of information related to the design, implementation, and outcomes of a program, for the purpose of monitoring and improving the quality and effectiveness of the program" (ACGME 2010a).
At the most fundamental level, evaluation involves making a value judgment about information that one has available (Cook 2010; Durning & Hemmer 2010). Thus educational program evaluation uses information to make a decision about the value or worth of an educational program (Cook 2010). More formally defined, the process of educational program evaluation is the ‘‘systematic collection and analysis of information related to the design, implementation, and outcomes of a program, for the purpose of monitoring and improving the quality and effectiveness of the program.’’ (ACGME 2010a).
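As this definition makes clear, program evaluation means understanding the program through routine, systematic, deliberate gathering of information, in order to uncover or identify what contributes to the program's "success" and what actions must be taken to address the findings of the evaluation process.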
As is clear in this definition, program evaluation is about understanding the program through a routine, systematic, deliberate gathering of information to uncover and/or identify what contributes to the ‘‘success’’ of the program and what actions need to be taken in order to address the findings of the evaluation process (Durning & Hemmer 2010).
The information needed for program evaluation is typically gathered through measurement processes. This Guide distinguishes between assessment and evaluation.
Information necessary for program evaluation is typically gathered through measurement processes. In this Guide, we define
-
"assessments" as measurements (assessment = assay) or the strategies chosen to gather information needed to make a judgment.
-
Evaluation, as noted earlier, is about reviewing, analyzing, and judging the importance or value of the information gathered by all these assessments.
Reasons for program evaluation
A strong program evaluation process strengthens accountability, lets educators build useful knowledge about their program, and sustains ongoing program development.
A strong program evaluation process supports accountability while allowing educators to gain useful knowledge about their program and sustain ongoing program development (Goldie 2006).
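Evaluation models have not always met such a range of needs. For many years evaluation experts concentrated on simply measuring outcomes. Newer models support learning about programs as dynamic processes and add a focus on program improvement.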
Evaluation models have not always supported such a range of needs. For many years evaluation experts focused on simply measuring program outcomes (Patton 2011). Newer evaluation models support learning about the dynamic processes within the programs, allowing an additional focus on program improvement (Stufflebeam & Shinkfield 2007; Patton 2011).
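Theories that inform educational program evaluation models
Reductionism
Many approaches to educational evaluation have their roots in the Enlightenment, when understanding of the world was shifting from divine intervention to a model of experimentation and investigation. Underlying this was an assumption of order: as knowledge accumulated, there would be movement from disorder to order. Phenomena were understood by reducing them to their component parts and examining those parts. Because order was the norm, outcomes could be predicted with some precision and processes could be determined (controlled or predicted), since they would flow along a predefined, orderly pathway. The legacy of this thinking is evident in how many educational programs are organized and in our approaches to teaching.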
Many of the commonly used approaches to educational evaluation have their roots in the Enlightenment, when understanding of the world shifted from a model of divine intervention to one of experimentation and investigation (Mennin 2010c). Underlying this was an assumption of order: as knowledge accumulated, it was expected that there would be movement from disorder to order. Phenomena could be reduced into and understood by examining their component parts. Because order was the norm, one would be able to predict an outcome with some precision, and processes could be determined (controlled or predicted) because they would flow along defined and orderly pathways (Geyer et al. 2005). The legacy of this thinking is evident in the way many medical education programs are organized and can even be seen in our approaches to teaching (Mennin 2010c).
From this reductionist viewpoint, the whole (or an outcome) can be understood, and thus predicted, by investigating and understanding the contribution of each constituent part. This view is a core part of the scientific approach that has characterized medicine for five centuries, and it has also dominated educational evaluation for much of its 80-year history. The cause-effect approach assumes linearity in the relationships among program elements: a change in a given program element will have a predictable effect on the outcome, with small changes producing small effects and large changes producing large effects. The assumption of linearity is evident in several well-known program evaluation models, such as the Logic Model and the Before, During, and After model.
The reductionist view, that the whole (or an outcome) can be understood and thus predicted by investigating and understanding the contribution of the constituent parts, is an integral part of the scientific approach that has characterized Medicine for five centuries. The reductionist perspective also dominated educational evaluation throughout a major portion of its short 80-year history as a formal field of practice (Stufflebeam & Shinkfield 2007). This cause-effect approach to analysis requires an assumption of linearity in program elements' relationships. That is, changes in certain program elements are expected to have a predictable impact on the outcome. A small change would be expected to have a small impact, a large change a large impact. The assumption of linearity is evident in some popular program evaluation models such as the Logic Model (Frechtling 2007) and the Before, During, and After model (Durning et al. 2007; Durning & Hemmer 2010).
In the reductionist, or linear, way of thinking, once the factors contributing to an outcome are known, a program's success or failure in achieving that outcome can be explained. This cause-effect paradigm has influenced several evaluation models.
The reductionist or linear way of thinking suggests that once the factors contributing to an outcome are known, program success or lack of success in achieving those outcomes can be explained. The cause-and-effect paradigm’s impact on several of the evaluation models we describe is clear.
System theory
Although the reductionist approach brought great advances in medicine and medical education, its limitations were recognized as far back as Aristotle and the dictum that "the whole is greater than the sum of its parts." In other words, the final product we see is not simply the sum of its components. The recognition that an outcome is explained not by the component parts alone, but also by the relationships among those parts and the environment (context) surrounding them, gave rise to system theory.
Although the reductionist approach brought great advances in medicine and even medical education, concern with the approach's limitations can be traced back to at least Aristotle and the dictum that the "whole is greater than the sum of its parts." In other words, what we see as a final product (an educational program, a human being, the universe) is more than simply a summation of the individual component parts. The appreciation that an outcome is not explained simply by component parts but that the relationships between and among those parts and their environment (context) are important eventually led to formulation of a system theory.
In the 20th century this idea is often attributed to Bertalanffy. How he viewed systems:
In the 20th century, this is often attributed to Bertalanffy, a biologist who proposed a general system theory in the 1920s (Bertalanffy 1968, 1972). Bertalanffy proposed that "the fundamental character of the living thing is its organization, the customary investigation of the single parts and processes cannot provide a complete explanation of the vital phenomena. This investigation gives us no information about the coordination of parts and processes." (Bertalanffy 1972). Bertalanffy viewed a system as "a set of elements standing in interrelation among themselves and with the environment." (Bertalanffy 1972).
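Put another way, a system consists of its parts, the organization of those parts, the relationships among the parts, and the relationships between the parts and the environment; these relationships are not fixed but dynamic and changing.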
Stated another way, the system comprises the parts, the organization of the parts, and the relationships among those parts and the environment; these relationships are not static but dynamic and changing.
In this Guide, an educational program is a "social system composed of component parts, with interactions and interrelations among the component parts, all existing within, and interacting with, the program's environment." Understanding an educational program's system requires an evaluation approach consistent with system theory.
In the context of this Guide, an educational program is a social system composed of component parts, with interactions and interrelations among the component parts, all existing within, and interacting with, the program's environment. To understand an educational program's system would require an evaluation approach consistent with system theory.
Bertalanffy's view of science moved away from reductionism and looked for commonalities across systems and disciplines. Thus, although his ideas about a General System Theory were originally rooted in biology, 20th-century work in mathematics, physics, and the social sciences underscored the approach he proposed: "across a variety of disciplines and science, there are common underlying principles."
Bertalanffy’s proposal (re)presented a way of viewing science, moving away from reductionism, and looking for commonalities across disciplines and systems. Thus, while his ideas about a General System Theory were initially rooted in biology, 20th century work in mathematics, physics, and the social sciences underscored the approach that Bertalanffy proposed: across a variety of disciplines and science, there are common underlying principles.
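Finally, General System Theory holds that change is an inherent part of a system. Bertalanffy described systems as closed or open, and he believed that living systems are open systems. "Equilibrium" in a system means that nothing is changing and may even indicate a system that is dying. An open system at steady-state, by contrast, is one in which elements and interrelationships are in balance while remaining active, even if sometimes acting in opposite or opposing directions.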
Finally, General System Theory embraces the idea that change is an inherent part of a system. Bertalanffy described systems as either being
-
‘‘closed’’, in which nothing either enters or leaves the system, or
-
‘‘open’’, in which exchange occurs among component parts and the environment.
He believed that living systems were open systems. Equilibrium in a system means that nothing is changing and, in fact, could represent a system that is dying. In contrast, an open system at steady-state is one in which the elements and interrelationships are in balance: still active, perhaps even in opposite or opposing directions, but active nonetheless (Bertalanffy 1968).
Furthermore, open systems exhibit equifinality: the final state or outcome can be reached from a variety of starting points and by a variety of paths. This differs from a closed system, in which the outcome is predetermined by the starting point and conditions. We believe this view of an open system fits what happens in educational programs: it is an open system, perhaps sometimes at steady-state, but active.
Furthermore, in an open system, there is equifinality: the final state or outcome can be reached from a variety of starting points and in a variety of ways (much like a student becoming a physician by going through medical school) as contrasted with a closed system in which the outcome might be predetermined by knowing the starting point and the conditions. We believe this view of an open system is consistent with what occurs in an educational program: it is an open system, perhaps sometimes at steady-state, but active.
As General System Theory developed, other theories emerged that address these principles.
Since the advent of General System Theory, a number of other theories have arisen to attempt to address the principles across a variety of systems. One such theory, Complexity Theory, is growing in prominence in medical education.
Complexity theory
Educational programs, however, are rarely in equilibrium. Medical education programs are affected by factors both internal and external to the program.
Educational programs, however, are rarely in equilibrium. Medical education programs are affected by many factors both internal and external to the program:
-
program participants’ characteristics,
-
influence of stakeholders or regulators,
-
the ever-changing nature of the knowledge on which a discipline is based,
-
professional practice patterns, and
-
the environment in which the educational program functions, to name only a few (Geyer et al. 2005).
Medical education programs are therefore best characterized as complex systems, made up of diverse components and the interactions among them. The overall system cannot be explained by examining each component separately. In this sense, the program's whole is greater than its parts: more is going on than can be explained by studying each element in isolation. This may explain the phenomenon in educational research in which much of the variance in the outcome of interest is not explained by factors that can be identified within the system or program.
Medical education programs are therefore best characterized as complex systems, given that they are made up of diverse components with interactions among those components. The overall system cannot be explained by separately examining each of its individual components (Mennin 2010b). In a sense, the program's whole is greater than the sum of its parts: there is more going on in the program (the complex system) than can be explained by studying each component in isolation. This might, in fact, explain the phenomenon in educational research in which much of the variance in the outcome of interest is not explained by factors identified in the system or program.
Complexity theory and complexity science embrace the richness and diversity of systems, in which ambiguity and uncertainty are to be expected.
Complexity theory and complexity science are attempts to embrace the richness and diversity of systems in which ambiguity and uncertainty are expected.
-
"Complexity 'science' then is the study of nonlinear dynamical interactions among multiple agents in open systems that are far from equilibrium." (Mennin 2010c)
-
‘‘Complexity concepts and principles are well suited to the emergent, messy, nonlinear uncertainty of living systems nested one within the other where the relationship among things is more than the things themselves.’’ (Mennin 2010a)
Complexity theory lets us accommodate uncertainty and ambiguity when we evaluate educational programs. It helps us understand such natural ambiguity as a normal part of systems like medical education programs. Ambiguity and uncertainty are neither good nor bad, simply expected. Evaluating an educational program should therefore include exploring these uncertainties. Indeed, complexity theory urges educators not to rely on overly simple models to understand and explain complex educational events. "To think complexly is to adopt a relational, a system(s) view. That is to look at any event or entity in terms, not of itself, but of its relations." (Doll & Trueit 2010)
Complexity theory allows us to accommodate the uncertainty and ambiguity in educational programs as we think about evaluating them. It actually promotes our understanding of such natural ambiguity as a normal part of the systems typical of medical educational programs. Ambiguity and uncertainty are neither good nor bad but simply expected and anticipated. Evaluating an educational program would therefore include exploring for those uncertainties. In fact, complexity theory invites educators to cease relying on overly simple models to explain or understand complex educational events. "To think complexly is to adopt a relational, a system(s) view. That is to look at any event or entity in terms, not of itself, but of its relations." (Doll & Trueit 2010)
In other words, a program's success depends not only on elements related to the program's participants but also on the relationships among participants, the environment in which they act, and how that environment affects them.
In other words, examining a program's success must not only include references to elements related to program participants but also to the relationships of participants with each other and with the environment in which they act and how that environment may affect the participants.
Complexity theory also helps in choosing a program evaluation model. For example, the relationships among program elements are prominent in the CIPP model.
Complexity theory can inform our choice of program evaluation models. For example, the concept of program elements’ relationship is prominent in the CIPP evaluation model in which
-
Context studies play a critical role in shaping the approach to evaluating program effectiveness and in which program
-
Process studies are separate but of equal importance (Stufflebeam & Shinkfield 2007).
The Before, During, After evaluation model can also be interpreted from the perspective of complexity theory.
The Before, During, After evaluation model (Durning et al. 2007; Durning & Hemmer 2010), described in the literature but not discussed in this Guide, can also be interpreted from the perspective of complexity theory.
Doll puts it as follows.
Doll suggests that ‘‘...the striving for certainty, a feature of western intellectual thought since the times of Plato and Aristotle, has come to an end. There is no one right answer to a situation, no formula of best practices to follow in every situation, no assurance that any particular act or practice will yield the results we desire.’’ (Doll & Trueit 2010)
Common evaluation models
The experimental/quasi-experimental models
Experimental and quasi-experimental designs were among the earliest designs and came into wide use in the 1960s. They explicitly isolate individual program elements, consistent with the classic reductionist approach. These designs were enormously useful in advancing the life sciences over the last century but have proven less useful in complex environments. Tightly controlled experimental designs are difficult to implement in educational programs as complex as those in medical education. Educators usually want to compare a new way of doing things with the old way (rather than with doing nothing), so the outcomes of an experimental study typically measure only a marginal increment. Quasi-experimental designs are used more often than true experimental designs.
Experimental and quasi-experimental designs were some of the earliest designs applied as educational evaluation came into common use in the mid-1960s (Stufflebeam & Shinkfield 2007). These designs explicitly isolate individual program elements for study, consistent with the classic reductionist approach to investigation. The familiar experimental and quasi-experimental designs were enormously useful in advancing the biological sciences over the last century (Stufflebeam & Shinkfield 2007). They have proven less useful in the complex environments of educational programs: true experimental, tightly controlled designs are typically very difficult to implement in educational programs as complex as those in medical education. Educators usually need to compare a new way of doing things to the old way of doing things rather than to "doing nothing", so the experimental study's outcomes are usually measures of a marginal increment in value. Quasi-experimental designs are used more often than the true experimental designs that are simply not feasible.
The most commonly used quasi-experimental designs are as follows.
We now describe and comment on the most commonly used quasi-experimental designs seen in evaluation studies:
-
In the Intact-Group Design, learners are randomly assigned to membership in one of two groups. The program being evaluated is used by one of the two groups; the other gets the usual (unchanged) program. The use of randomization is intended to control all factors operating within the groups’ members that might otherwise affect program outcomes.
-
Limitations and cautions: For optimal use of this evaluation design, the intact-groups study should be repeated multiple times. If repetition is not feasible, the evaluator/experimenter must continually be alert for unexpected differences that develop between the groups that are not due to the planned program implementation.
-
Evaluators who choose a Time-Series Experimental Design study the behavior of a single person or group over time. By observing the learner(s) or group(s) before a new program is implemented, then implementing the program, and finally conducting the same observations after the program, the evaluator can compare the pre- and post-program behaviors as an assessment of the program’s effects. This design is similar to the pre/post test design well-known to educators.
-
Limitation: The design does not separate the effects that are actually due to the program being evaluated from effects due to factors external to the program, e.g. learner maturation, learning from concurrent courses or programs, etc.
-
Usefulness and justification of the design: The usefulness of this design is limited by the number of design elements that must be logically defended, including assumptions of linear relationships between program elements and desired outcomes, stability of outcome variables observed over a short time period, or (in the case of using different learner groups) sufficient comparability of comparison groups on outcome-related variables.
-
An evaluation conducted after all program implementation and data collection are complete; participant-related information must be available.
The Ex Post Facto Experiment design, though criticized by some evaluation experts, may be useful in some limited contexts. In this design the evaluator does not use random assignment of learners to groups or conditions. In fact, the evaluator may be faced with a completed program for which some data have been collected but for which no further data collection is feasible. Realizing the weakness of the design, its appropriate use requires analyzing outcome variables after every conceivable covariate has been included in the analysis model (Lieberman et al. 2010). The evaluator must therefore have access to relevant pre-program participant data to use as covariates.
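The limitation noted above for the Time-Series design, that a pre/post change mixes program effects with outside influences such as learner maturation, can be illustrated with a small numerical sketch. All scores below are invented for illustration, and the `mean_change` helper is a hypothetical name, not something from the Guide. The comparison group is only informative under the comparability assumption the text describes.

```python
from statistics import mean


def mean_change(pre: list[float], post: list[float]) -> float:
    """Average post-program score minus average pre-program score."""
    return mean(post) - mean(pre)


# Hypothetical scores for learners observed before and after the new program.
program_pre = [60, 65, 70]
program_post = [70, 72, 80]

# Hypothetical scores for a comparison group that kept the usual program.
comparison_pre = [61, 64, 70]
comparison_post = [64, 67, 73]

# The raw pre/post change is what a single-group time-series design reports.
observed_change = mean_change(program_pre, program_post)

# Change seen without the new program (maturation, concurrent courses, etc.).
background_change = mean_change(comparison_pre, comparison_post)

print("pre/post change with program:", observed_change)
print("pre/post change without program:", background_change)
```

In this toy data some of the observed change also appears in the group that never received the new program, which is why the single-group design by itself cannot attribute the whole change to the program.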
What can evaluators expect to gain from experimental and quasi-experimental models?
The reductionist perspective is familiar to medical educators, so experimental and quasi-experimental evaluation studies offer the comfort of familiar designs. These designs assume linear causal relationships between program elements and outcomes, and the complexity of educational programs can make it hard to show that this assumption is appropriate. Such studies are also hard to carry out in medical education because educational institutions are not set up like research environments and rarely support randomization. Random assignment raises ethical considerations. In many educational settings even quasi-experimental designs are difficult to implement.
Reductionist approaches are familiar to most medical educators, so experimental and quasi-experimental evaluation studies offer the comfort of familiar designs. The designs do require assumption of linear causal relationships between educational elements and outcomes, although the complexity of educational programs can make it difficult to document the appropriateness of those assumptions. It can also be difficult simply to implement studies of this type in medical education because learning institutions are not constructed like research environments: they rarely support the randomization upon which true experimental designs are predicated. Ethical considerations must be honored when random assignment would keep learners from a potentially useful or improved learning experience. In many educational situations, even quasi-experimental designs are difficult to implement.
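Kirkpatrick's four-level evaluation model
Kirkpatrick's approach is a well-known model for evaluating learner outcomes in training programs. Its contributions to educational evaluation are its clear focus on program outcomes and its clear description of outcomes beyond simple satisfaction.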
Kirkpatrick’s four-level approach has enjoyed wide-spread popularity as a model for evaluating learner outcomes in training programs (Kirkpatrick 1996). Its major contributions to educational evaluation are the clarity of its focus on program outcomes and its clear description of outcomes beyond simple learner satisfaction.
The Kirkpatrick model has been criticized, however, for not taking into account the variables that intervene in learning, the relationships between program elements and context, and the effectiveness of resource use.
Kirkpatrick's model has been criticized for what it does not take into account, namely intervening variables that affect learning (e.g. learner motivation, variable entry levels of knowledge and skills), relationships between important program elements and the program's context, the effectiveness of resource use, and other important questions (Holton 1996).
What can evaluators gain from using the Kirkpatrick four-level approach?
It offers a taxonomy of program outcomes, but not data on why a program works.
Kirkpatrick's approach defines a useful taxonomy of program outcomes (Holton 1996). By itself, however, the Kirkpatrick model is unlikely to guide educators into a full evaluation of their educational program (Bates 2004) or provide data to illuminate why a program works.
Table 1. Comparison of evaluation models.
The logic model
Under the influence of system theory, the Logic Model approach pays careful attention to the relationships among program components and between those components and the program's context. Although it is often used in program planning rather than solely for evaluation, its structure strongly supports a rational evaluation plan. Like the evaluation models already discussed, the Logic Model can be strongly linear in its approach to educational planning and evaluation.
The influence of system theory on the Logic Model approach to evaluation can be seen in its careful attention to the relationships between program components and the components' relationships to the program's context (Frechtling 2007). Though often used during program planning instead of solely as an evaluation approach, the Logic Model structure strongly supports a rational evaluation plan. The Logic Model, similar to the evaluation models already discussed, can be strongly linear in its approach to educational planning and evaluation.
The Logic Model's structure shares features with the CIPP evaluation model but differs in focusing on the change process and the system within which an educational innovation takes place. Its structural simplicity appeals to novice and experienced educators alike, but the model rests on the assumption that the relationships between the program's educational methods and its desired outcomes are clearly understood. In its simplest form, the Logic Model may therefore oversimplify the nonlinear complexity found in most educational contexts. It is most useful when educators clearly understand their program as a dynamic system and plan to document both intended and unintended outcomes.
The Logic Model's structure shares characteristics with Stufflebeam's CIPP evaluation model (Table 1) but focuses on the change process and the system within which the educational innovation is embedded. Though its structural simplicity makes it attractive to both novice and experienced educators, this approach is grounded in the assumption that the relationships between the program's educational methods and the desired outcomes are clearly understood. The simplest form of the Logic Model approach may therefore oversimplify the nonlinear complexity of most educational contexts. The Logic Model works best when educators clearly understand their program as a dynamic system and plan to document both intended and unintended outcomes.
When using a Logic Model for program planning, it is convenient to start from the desired outcomes and work backward through the other components.
When using a Logic Model for program planning, most find it useful to begin with the desired Outcomes and then work backwards through the other components (Frechtling 2007).
Inputs.
Inputs cover all relevant resources.
A Logic Model’s Inputs comprise all relevant resources, both material and intellectual, expected to be or actually available to an educational project or program. Inputs may include funding sources (already on hand or to be acquired), facilities, faculty skills, faculty time, staff time, staff skills, educational technology, and relevant elements of institutional culture (e.g. Departmental or Dean’s support).
The CIPP model's Input section looks at program "inputs" in more detail and can be used to expand the Logic Model's inputs.
The CIPP model’s Input section is a more detailed way of looking at program ‘‘inputs’’ and can be used to expand the construction of the Logic Model’s input section.
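Activities.
Activities are the "treatments", strategies, innovations, or changes planned for the educational program. They are expected to occur in the order specified in the model. Making that order explicit acknowledges that an earlier activity will influence the activity that follows it.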
The second component of a Logic Model details the Activities, the set of ‘‘treatments’’, strategies, innovations or changes planned for the educational program. Activities are typically expected to occur in the order specified in the Model. That explicit ordering of activities acknowledges that a subsequent activity may be influenced by what happens after or during implementation of a preceding activity.
Outputs.
Outputs are defined as "indicators that one of the program's activities or parts of an activity is underway or completed and that something (a 'product') happened". The Logic Model structure dictates that each Activity has at least one Output, and a single Output may result from two or more Activities. Outputs vary in "size" or importance and are sometimes hard to distinguish from Outcomes. In an educational program, Outputs might include the following.
Outputs, the Logic Model’s third component, are defined as indicators that one of the program’s activities or parts of an activity is underway or completed and that something (a ‘‘product’’) happened. The Logic Model structure dictates that each Activity must have at least one Output, though a single Output may be linked to more than one Activity. Outputs can vary in ‘‘size’’ or importance and may sometimes be difficult to distinguish from Outcomes, the fourth Logic Model component. In educational programs, Outputs might include
-
the number of learners attending a planned educational event (the activity),
-
the characteristics of faculty recruited to contribute to the program (if, for example, ‘‘recruit faculty with appropriate expertise’’ were a program activity) or
-
the number of educational modules created or tested (if, for example, ‘‘create educational modules’’ were an activity).
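The structural rule described above (every Activity needs at least one Output, while one Output may be linked to several Activities) can be sketched as a small data structure with a consistency check. This is only an illustrative sketch: the `LogicModel` class, its field names, and the example entries for inputs, the third activity, and the outcome are assumptions made for the example, not part of the Guide.

```python
from dataclasses import dataclass, field


@dataclass
class LogicModel:
    """Hypothetical container for the four basic Logic Model components."""
    inputs: list[str] = field(default_factory=list)
    activities: list[str] = field(default_factory=list)
    # Maps each output to the activity or activities it is linked to.
    outputs: dict[str, list[str]] = field(default_factory=dict)
    outcomes: list[str] = field(default_factory=list)

    def activities_without_output(self) -> list[str]:
        """Return activities that break the 'at least one Output' rule."""
        covered = {a for linked in self.outputs.values() for a in linked}
        return [a for a in self.activities if a not in covered]


model = LogicModel(
    inputs=["faculty time", "funding"],  # illustrative inputs
    activities=[
        "recruit faculty with appropriate expertise",
        "create educational modules",
        "hold planned educational event",  # illustrative activity
    ],
    outputs={
        "characteristics of faculty recruited": [
            "recruit faculty with appropriate expertise"
        ],
        "number of educational modules created or tested": [
            "create educational modules"
        ],
    },
    outcomes=["learners apply new skills"],  # illustrative outcome
)

# The event activity has no linked output yet, so the check flags it.
missing = model.activities_without_output()
print(missing)
```

Adding an output such as "number of learners attending" linked to the event activity would make the check return an empty list. A check like this only verifies the model's bookkeeping; it says nothing about whether the assumed links between activities and outcomes actually hold.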
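Outcomes.
Outcomes are defined as the short-, medium-, and long-term changes intended as a result of the program's activities. They can be specified at the level of individuals, groups, or an organization. The Product section of the CIPP model can offer additional ideas for the Logic Model's Outcomes.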
Outcomes define the short-term, medium-term, and longer range changes intended as a result of the program’s activities. Outcomes may be specified at the level of individuals, groups or an organization (e.g. changes in a department’s infrastructure to support education). Cross-referencing to Stufflebeam’s CIPP model’s Product section may provide additional ideas for the Outcomes section of a Logic Model (Table 1).
Beyond the four basic Logic Model elements, a complete Logic Model is referenced to the program's context and impact.
In addition to the four basic Logic Model elements, a complete Logic Model is carefully referenced to the program’s Context and its Impacts.
-
Context refers to important elements of the environment in which the program takes place,
-
Impact comprises both intended and unintended changes that occur after a program or intervention. Long-term outcomes with a very wide reach (e.g. improving health outcomes for a specific group) might be better defined as "impacts" than outcomes in a Logic Model approach.
The Logic Model can support an effective evaluation design if educators stay properly cautious about its assumption of linear relationships.
The Logic Model approach can support the design of an effective evaluation if educators are appropriately cautious of its linear relationship assumptions. Typical evaluation questions that might be used in a Logic Model approach include questions like these:
What should educators expect to gain from using the Logic Model approach?
Because educational planners must explicitly define the intended links among Inputs, Activities, Outputs, and Outcomes, the model keeps the educational program, once implemented, focused on its intended outcomes.
Because it requires that educational planners explicitly define the intended links between the program resources (Inputs), program strategies or treatments (Activities), the immediate results of program activities (Outputs), and the desired program accomplishments (Outcomes), using the Logic Model can assure that the educational program, once implemented, actually focuses on the intended outcomes.
The Logic Model is especially useful when two or more people take part in planning, executing, and evaluating a program. Team members' differing expertise, and their differing views on how change relates to the program's activities and desired outcomes, can inform the program's design.
Logic Models have proven especially useful when more than one person is involved in planning, executing, and evaluating a program. Team members’ varied areas of expertise and their different perspectives on the theory of change pertinent to the program’s activities and desired outcomes can inform the program’s design during this process.
A pitfall of the Logic Model is its inherent linearity: an evaluator who follows the model blindly may miss unanticipated outcomes or fail to adapt flexibly to changes that arise as the program proceeds.
Some potential pitfalls of using the Logic Model should be considered, however. Its inherent linearity (Patton 2011) can focus evaluators on blindly following the Model during program implementation without looking for unanticipated outcomes or flexibly accommodating mid-stream program changes.
The Logic Model is most effective when the program director or team has a well-developed understanding of how change occurs. A program's Logic Model is built on stakeholders' shared understanding of which strategies are most likely to produce the intended outcomes (changes) and why, so its users should draw on prior research and their own experience to hypothesize how change will occur in the program.
The Logic Model approach works best when the program director or team has a well-developed understanding of how change works in the educational program being evaluated. A program’s Logic Model is built on the stakeholders’ shared understandings of which strategies are most likely to result in desired outcomes (changes) and why, so users should draw on research and their own experience as educators to hypothesize how change will work in the program being evaluated.
The CIPP (context/input/process/product) model
Created by Daniel Stufflebeam.
The CIPP set of approaches to evaluation is described by Daniel Stufflebeam, its creator, as his response to and improvement on the dominant experimental design model of its time (Stufflebeam & Shinkfield 2007).
It shares elements with the Logic Model but is not hampered by the linear-relationship assumption that constrains the Logic Model. Those who understand educational programs as complex, nonlinear relationships among dynamic elements will find CIPP very useful.
Its elements share labels with the Logic Model (Table 1), but the CIPP model is not hampered by the assumption of linear relationships that constrains the Logic Model. An evaluator who understands an educational program in terms of its dynamic elements’ complex and often nonlinear relationships will find the CIPP model a powerful approach to evaluation.
CIPP's components accommodate both the ever-changing nature of educational programs and educators' appetite for program-improvement data. The alternating focus on Context, Inputs, Process, and Products addresses every phase of an educational program.
CIPP components accommodate the ever-changing nature of most educational programs as well as educators’ appetite for program-improvement data. By alternately focusing on program Context, Inputs, Process, and Products (CIPP), the CIPP model addresses all phases of an education program: planning, implementation, and a summative or final retrospective assessment if desired.
Context evaluation study
Usually conducted when a new program is starting. It can also be conducted when decisions about cutting existing programs must be made.
A CIPP Context evaluation study is typically conducted when a new program is being planned. A new leader taking over an existing program, for example, may find thinking through a Context evaluation study helpful. Context studies can also be conducted when decisions about cutting existing programs are necessary.
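A Context evaluation assesses needs, problems, assets, and opportunities to set the program's goals and priorities. It later provides a useful baseline for evaluating Products. When seeking external funding, a good Context study can strengthen the proposal. Because it covers potential impediments and assets, it is more inclusive than a conventional 'needs assessment'.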
A CIPP Context evaluation study identifies and defines program goals and priorities by assessing needs, problems, assets, and opportunities relevant to the program. The Context study’s findings provide a useful baseline for evaluating later outcomes (Products). When preparing a request for external funding, a program’s planning or leadership team can use a good Context study to strengthen the proposal. Because questions about potential impediments and assets are included, a Context evaluation is more inclusive than a conventional ‘‘needs assessment’’, though it does include that essential element.
Methods are chosen from among the following.
The evaluator might select from among the following methods:
-
Document review
-
Demographic data analysis
-
Interviews
-
Surveys
-
Records analysis (e.g. test results, learner performance data)
-
Focus groups
Input evaluation study
An Input evaluation is useful when resource allocation is needed at the planning or proposal stage of an educational program. It assesses the feasibility or cost-effectiveness of alternative or competing approaches (staffing, other resources, etc.).
A CIPP model Input evaluation study is useful when resource allocation (e.g. staff, budget, time) is part of planning an educational program or writing an educational proposal. An Input evaluation study assesses the feasibility or cost-effectiveness of alternative or competing approaches to the educational need, including various staffing plans and ways to allocate other relevant resources.
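Building on the Context evaluation, the Input evaluation focuses on how best to bring about the needed change. A well-designed Input evaluation helps educators explain clearly why and how an approach was chosen and which alternatives were considered.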
Building on the associated Context evaluation study, a CIPP model Input evaluation study focuses on how best to bring about the needed changes. A well-conducted Input evaluation study prepares educators to explain clearly why and how a given approach was selected and what alternatives were considered.
When a new program is being started, an Input evaluation justifies the assignment of funding. Applied to an existing program, it allows current practice to be compared with potential practice.
When used to plan a new program, an Input evaluation study can also set up clear justification for assigning grant funding or other critical resources to a new program. When applied to a program already in place, an Input evaluation study can help the educator to assess current educational practices against other potential practices.
Input evaluation methods
An Input study might involve any of the following methods:
-
Literature review
-
Visiting exemplary programs
-
Consulting experts
-
Inviting proposals from persons interested in addressing the identified needs
Process evaluation study
It prepares the evaluator to interpret the program's outcomes.
A CIPP Process evaluation study prepares the evaluator to interpret the program’s outcomes (see Product study) by focusing attention on the program elements associated with those outcomes.
It can be conducted more than once to obtain formative information for in-process revision.
A Process evaluation study can be conducted one or more times as a program runs to provide formative information for guiding in-process revisions.
It can also be conducted after the program ends to understand how the program actually ran. A Process evaluation also explicitly recognizes that an educational model or program adopted at one site may not be implementable at another: contextual differences dictate minor to major adaptations. The Process evaluation elicits information about the program as actually implemented.
This kind of evaluation study can also be conducted after a program concludes to help the educator understand how the program actually worked. A CIPP Process study explicitly recognizes that an educational model or program adopted from one site can rarely be implemented with fidelity in a new site: contextual differences usually dictate minor to major adaptations to assure effectiveness. The Process evaluation study elicits information about the program as actually implemented.
It is valuable for accountability to stakeholders. It also enables data collection for the program's continual improvement. Lessons recorded in a Process study are useful to other educators.
The CIPP model’s Process evaluation study is invaluable for supporting accountability to program stakeholders. It also allows for the data collection necessary for a program’s continual improvement. The ‘‘lessons learned’’ about programmatic processes documented in a Process study are often useful to other educators.
Methods
The evaluator might choose from among these methods:
-
Observation
-
Document review
-
Participant interviews
Product evaluation study
The Product evaluation is familiar because it focuses on outcomes, but its breadth is very wide.
The CIPP model’s Product evaluation study will seem familiar to most educators because of its focus on program outcomes. What may be more surprising is the breadth of that focus (Table 2). The CIPP Product evaluation study is the one most closely aligned to the traditional ‘‘summative’’ program evaluation found in other models, but it is more expansive. This type of evaluation study aims to identify and assess the program outcomes, including both positive and negative outcomes, intended and unintended outcomes, short-term and long-term outcomes. It also assesses, where relevant, the impact, the effectiveness, the sustainability of the program and/or its outcomes, and the transportability of the program. A CIPP model Product evaluation study also examines the degree to which the targeted educational needs were met.
Products are best interpreted alongside the Process evaluation. For example, poor implementation can produce poor outcomes.
Program outcomes (Products) are best interpreted with the findings of the Process evaluation studies in hand: it is possible, for example, that poor implementation (a process issue) might cause poor or unintended outcomes. The art of the Product evaluation study is in designing a systematic search for unanticipated outcomes, positive or negative.
-
Stakeholders’ judgments of the project or program
-
Comparative studies of outcomes with those of similar projects or programs
-
Assessment of achievement of program objectives
-
Group interviews about the full range of program outcomes
-
Case studies of selected participants’ experiences
-
Surveys
-
Participant reports of project effects
What should be expected?
What should educators expect if they choose to use the CIPP model?
It can be used both formatively and summatively.
CIPP model studies can be used both formatively (during the program’s processes) and summatively (retrospectively).
- Careful attention to the educational context of the program is supported, including
- what comes before, after, or concurrently for learners and others involved in the program,
- how ‘‘mature’’ the program is (first run versus a program of long standing, etc.), and
- the program’s dependence on or independence from other educational elements.
- The CIPP model incorporates attention to multiple ‘‘inputs’’:
- learners’ characteristics, variability, and preparation for learning;
- faculty’s preparation in terms of content expertise and relevant teaching skills, the number of faculty available at the right time for the program;
- learning opportunities, including patient census and characteristics and other resources;
- adequacy of funding to support program needs and leadership support.
- The CIPP model allows educators to consider the processes involved in the program or to understand why the program’s products or outcomes are what they are.
- It incorporates the necessary focus on program products or outcomes, informed by what was learned in the preceding studies of the program, but focuses on improvement rather than on proving something about the program. It can provide multiple stakeholders information about the program’s improvement areas, interpretation of program outcomes, and continuous information for accountability.
Multiple data collection methods are usually required to do a good job with CIPP studies, and each data set must be analyzed with methods appropriate to the data and to the evaluation questions being addressed.
Educational programs are inherently about change.
-
The reductionist theory’s strict linearity, reflected in the familiar experimental and quasi-experimental evaluation models, may be too limiting to accommodate the known complexity of educational programs.
-
Kirkpatrick’s four-level model of learner outcomes also draws on the assumption of linear relationships between program components and outcomes but may be useful in helping evaluators to identify relevant learner outcomes.
-
The Logic Model, often informative during program planning, specifies the intended relationships between its evaluation components and may require constant updating as a program evolves. The Logic Model’s grounding in systems theory prompts adopters to incorporate the program’s context in evaluation studies, making it more inclusive than earlier evaluation models.
-
Stufflebeam’s CIPP model is consistent with systems theory and, to some degree, with complexity theory: it is flexible enough to incorporate the studies that support ongoing program improvement as well as summative studies of a completed program’s outcomes.
ACGME. 2010a. Accreditation Council for Graduate Medical Education: Glossary of terms. Accreditation Council for Graduate Medical Education. Available from: http://www.acgme.org/acWebsite/about/ab_ACGMEglossary.pdf
Cook DA. 2010. Twelve tips for evaluating educational programs. Med Teach 32:296–301.
Durning SJ, Hemmer P, Pangaro LN. 2007. The structure of program evaluation: An approach for evaluating a course, clerkship, or components of a residency or fellowship training program. Teach Learn Med 19:308–318.
Musick DW. 2006. A conceptual model for program evaluation in graduate medical education. Acad Med 81:759–765.
Goldie J. 2006. AMEE education guide no. 29: Evaluating educational programmes. Med Teach 28:210–224.
Program evaluation models and related theories: AMEE Guide No. 67.
Author information: Office of Educational Development, University of Texas Medical Branch, 301 University Boulevard, Galveston, Texas 77555-0408, USA. awfrye@utmb.edu
PMID: 22515309
DOI: 10.3109/0142159X.2012.668637