Using Kirkpatrick’s four level evaluation model for program evaluation in higher education (Educ Asse Eval Acc, 2010)

Adaptation of Kirkpatrick’s four level model of training criteria to assessment of learning outcomes and program evaluation in Higher Education

Ludmila Praslova







1.1 Definitions and purposes of assessment


Definitions of assessment


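The term assessment is used in many contexts and with different meanings. It can mean awarding grades to individual students, or collecting aggregated data on student attainment to check how well program-level or institution-level goals are being met. Assessment takes place at multiple levels (classroom, course, program, general education, institution).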

The term assessment is used in various contexts and has somewhat different connotations. For example, it is commonly used to describe the processes used to certify individual students or even to award grades (Ewell 2001). On the other hand, for accreditation purposes, assessment refers to the collection and use of aggregated data about student attainment to examine the degree to which program or institution-level learning goals are being achieved (Ewell 2001). Thus, assessment takes place at multiple levels: the classroom, course, program, general education, and institution (Bers 2008).


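Ewell describes assessment as a set of systematic methods for collecting valid and reliable evidence of what students know and can do at each stage of their studies, governed by formal statements of student learning outcomes.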

Ewell (2006) suggests that “Assessment comprises a set of systematic methods for collecting valid and reliable evidence of what students know and can do at various stages in their academic careers . . . governed by formal statements of student learning outcomes” (Ewell 2006, p. 10).



Stakeholder emphasis on assessment


Stakeholder interest in assessment is growing worldwide.

Growing stakeholder interest in assessment appears to be a global phenomenon.



Assessment as vital institutional feedback



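According to Allen, ongoing external demand for assessment is only one reason for its growing importance. A more important reason is that higher education is increasingly focused on learning rather than teaching, and so emphasizes student outcomes. Assessment of educational outcomes is therefore not done only to satisfy external stakeholders. It is also how institutions of higher education obtain feedback on their core educational mission.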

According to Allen (2006), an external requirement for continuous assessment is only one of the reasons that underlie the growing importance of assessment. Perhaps an even more important reason is the overall movement of Higher Education toward being learning-focused and emphasizing student outcomes, as opposed to being teaching-focused. Thus, assessment of educational outcomes is not something that should only be done to satisfy external stakeholder requirements. Assessment of student learning is also a way for Institutions of Higher Education to receive feedback regarding the effectiveness of their core educational mission.


According to systems theory, institutions of higher education can be viewed as open systems connected to their environment in multiple ways.

According to the systems theory, institutions of Higher education, like other social organizations, can be understood as open systems connected to their environment in multiple ways, including input, output, and feedback (Katz and Kahn 1966).


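Information about how an organization or institution functions in relation to its environment arrives as feedback. That feedback lets it identify and make needed changes, function properly, and ultimately survive as a system.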

Information regarding organizational or institutional functioning in relation to the environment, in the form of feedback, is essential to adjustment and making needed changes and thus to proper functioning and, ultimately, to the survival of the system (Katz and Kahn 1966).



1.2 Institutional struggles with assessment


Despite the internal and external importance of assessment, many universities still struggle to understand assessment and to use its results to improve teaching and learning. They also struggle with the technical side of assessing student outcomes, such as clarifying learning criteria and selecting appropriate instruments.

Despite both external and internal importance of assessment, many colleges and universities still struggle with understanding assessment and using assessment results to improve learning and teaching (Bers 2008), as well as with technical aspects of assessing student outcomes, including clarification of learning criteria and selecting appropriate measures and instruments (Allen 2006; Bers 2008; Brittingham et al. 2008; Ewell 2001).



1.3 Overview of proposed approach to assessment


Kirkpatrick’s four level model is a classic framework for evaluating training.

The four level model is a classic framework for assessing training effectiveness in organizational contexts.


Alliger et al. refined the model further, for example by renaming behavior criteria as transfer criteria and by dividing reaction criteria into affective reactions and utility judgments.

Alliger et al. (1997) proposed some augmentation to the framework and further refined terminology and criteria, for example, by referring to behavior criteria as transfer criteria, and by specifying affective reactions and utility judgments as subtypes of reaction criteria.



2 Adaptation of the four level model of training evaluation criteria to assessment in Higher Education



Reaction and learning criteria are internal: they focus on what happens within the program. Behavior and results criteria concern what happens after the program and are therefore considered external.

Reaction and learning criteria are considered internal, because they focus on what occurs within the training program. Behavioral and results criteria focus on changes that occur outside (and typically after) the program, and are thus seen as external criteria.


2.1 Reaction criteria


Alliger et al. divided reaction criteria into affective reactions (how much trainees enjoyed the program) and utility judgments (how much they believe they learned).

Alliger et al. (1997) proposed the distinction between trainee’s reports regarding how much they enjoyed the training (affective reactions) and how much they believe they have learned (utility judgments) within the reaction criteria.


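Many researchers have pointed out that reaction criteria are poorly related to the other three levels. Alliger et al. found no relationship between affective reactions and the other levels, and only a weak relationship between utility judgments and the other levels.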

Many researchers have pointed out the lack of relationship between reaction criteria and the other three levels of criteria (learning, behavior and results), and the meta-analytic study by Alliger et al. (1997) found no relationship between affective reactions and other levels, and only a weak relationship between utility judgments and the other levels of criteria.


Yet although many researchers caution against relying on reactions alone, reaction criteria remain the most frequently assessed level.

However, despite the fact that many researchers caution against the use of reactions alone for the assessment of learning, reaction level criteria remain the most often assessed (Alliger et al. 1997; Arthur et al. 2003a; Dysvik and Martinsen 2008; Van Buren and Erskine 2002).



2.2 Learning criteria


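Learning criteria measure learning outcomes, typically through various kinds of knowledge tests, and also through performance and skill tests given immediately after training.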

Learning criteria are measures of the learning outcomes, typically assessed by using various forms of knowledge tests, but also by immediate post-training measures of performance and skill demonstration in the training context (Alliger et al. 1997).


Alliger et al. proposed dividing learning criteria into immediate knowledge, knowledge retention, and behavior/skill demonstration, but the proposal has received limited support.

Alliger et al. (1997) proposed specifying immediate knowledge, knowledge retention and behavior/skill demonstration measured within training as subtypes of learning criteria, but this idea received relatively limited support.


2.3 Behavioral criteria


Behavioral criteria are also called transfer criteria.

Behavioral criteria are also referred to as transfer criteria, a terminology change proposed by Alliger et al. (1997).


In organizations, behavioral criteria are assessed through supervisor ratings or objective indicators of performance.

In organizations, behavioral criteria are typically operationalized as supervisor ratings or objective indicators of performance such as job outputs (Alliger et al. 1997; Arthur et al. 2003a; Landy and Conte 2007).


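Although learning and behavior would be expected to be conceptually related, research shows only a modest relationship between the two. This may be because the post-training environment does not always provide an opportunity to apply the learned material or skills. This potential constraint should be considered when designing assessment instruments and when collecting and interpreting data.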

Although learning criteria and behavioral criteria conceptually are expected to be related, research has found relatively modest relationship between the two (Alliger et al. 1997; Arthur et al. 2003a). This is typically attributed to the fact that post-training environments may or may not provide opportunities for the learned material or skills to be demonstrated (Arthur et al. 2003a). This potential constraint needs to be considered in design of assessment instruments, and in collection and interpretation of behavioral data.


2.4 Results criteria


Results criteria are highly important but also very difficult to assess. In organizational settings they appear as productivity gains, increased customer satisfaction, increased employee morale following management training, or increased profitability of organizations. They are assessed far less often than the other levels. Alliger et al. caution that organizational constraints make results-level data hard to collect, and that sponsors of training may hold unrealistic expectations.

Results criteria are both highly desirable and most difficult to evaluate. In organizational settings, they are operationalized by productivity gains, increased customer satisfaction, increased employee morale following management training, or increase in profitability of organizations (Arthur et al. 2003a; Landy and Conte 2007). Results are often difficult to estimate and results criteria are used considerably less frequently than assessments of any other level of Kirkpatrick’s model. Alliger et al. (1997) caution that organizational constraints substantially limit opportunities for collecting results data and remind that sponsors of training may have unrealistic expectations with regard to results level outcomes.


Results criteria in Higher Education and multiple stakeholders of education


At least two parties benefit from education: a) the student and b) society.

Thus, it appears that there are at least two parties that are to profit from education: a) the student, who should develop skills useful for the workplace and life in general, and b) the society, which is interested in college graduates who are competent and responsible contributors to local and global communities.


Results criteria should therefore cover a wide range of outcomes, most of which benefit both the individual and society.

Thus, results criteria in education may include a wide range of outcomes, such as 

  • alumni employment and workplace success, 
  • graduate school admission, 
  • service to underprivileged groups or work to promote peace and justice, 
  • literary or artistic work, 
  • personal and family stability, and 
  • responsible citizenship. 

Moreover, most of these outcomes benefit both individual and the society.




Alliger, G. M., Tannenbaum, S. I., Bennett, W., Jr., Traver, H., & Shotland, A. (1997). A meta-analysis of relations among training criteria. Personnel Psychology, 50, 341–358.


Katz, D., & Kahn, R. L. (1966). The social psychology of organizations. New York: Wiley.














Received: 7 July 2009 / Accepted: 11 May 2010 / Published online: 25 May 2010

Springer Science+Business Media, LLC 2010


Abstract Assessment of educational effectiveness provides vitally important feedback to Institutions of Higher Education. It also provides important information to external stakeholders, such as prospective students, parents, governmental and local regulatory entities, professional and regional accrediting organizations, and representatives of the workforce. However, selecting appropriate indicators of educational effectiveness of programs and institutions is a difficult task, especially when criteria of effectiveness are not well defined. This article proposes a comprehensive and systematic approach to aligning criteria for educational effectiveness with specific indicators of achievement of these criteria by adapting a popular organizational training evaluation framework, the Kirkpatrick’s four level model of training criteria (Kirkpatrick 1959; 1976; 1996), to assessment in Higher Education. The four level model consists of reaction, learning, behavior and results criteria. Adaptation of this model to Higher Education helps to clarify the criteria and create plans for assessment of educational outcomes in which specific instruments and indicators are linked to corresponding criteria. This provides a rich context for understanding the role of various indicators in the overall mosaic of assessment. It also provides Institutions of Higher Education rich and multilevel feedback regarding the effectiveness of their effort to serve their multiple stakeholders. The importance of such feedback is contextualized both in the reality of stakeholder pressures and in theoretical understanding of colleges and universities as open systems according to the systems theory (Katz and Kahn 1966). Although the focus of this article is on Higher Education, core principles and ideas will be applicable to different types and levels of educational programs.


Keywords: Assessment, Evaluation, Program evaluation, Higher Education, Education, Criteria
