A model of the pre-assessment learning effects of summative assessment in medical education (Adv in Health Sci Educ, 2012)
Francois J. Cilliers • Lambert W. T. Schuwirth • Nicoline Herman • Hanelie J. Adendorff • Cees P. M. van der Vleuten
Abbreviations
CPA Cognitive processing activities
MRA Metacognitive regulation activities
HE Higher education
HSE Health sciences education
LESA Learning effects of summative assessment
SA Summative assessment
Introduction
Summative assessment (SA) carries inescapable consequences for students and defines a major component of the learning environment’s impact on student learning (Becker et al. 1968; Snyder 1971). Consequently, better utilization of assessment to influence learning has long been a goal in higher education (HE), though not one that has been met with great success (Gijbels et al. 2009; Heijne-Penninga et al. 2008; Nijhuis et al. 2005).
Dochy et al. (2007) distinguish pre-, post- and pure learning effects of assessment.
Pre-assessment effects impact learning before assessment takes place and are addressed in literature on exam preparation (e.g., van Etten et al. 1997) and test expectancy (e.g., Hakstian 1971).
Post-assessment effects impact after assessment and are addressed in literature referring to feedback (e.g., Gibbs and Simpson 2004) and the relationship of assessment with student achievement (e.g., Sundre and Kitsantas 2004).
Pure assessment effects impact during assessment and are reported more rarely (Tillema 2001). The testing effect (e.g., Roediger and Butler 2011) could be classified as a pure or a post-assessment effect, depending on whether the effect on the learning process or on subsequent achievement is considered.
Two major sets of effects can be distinguished: those related to perceived demands of the assessment task and those related to the design of the assessment system.
1. Perceived task demands
Learning is influenced by students’ perceptions of the demands of an assessment task which may accrue from explicit and implicit information from lecturers, from fellow students, past papers and students’ own experience of assessment (Entwistle and Entwistle 1991; Frederiksen 1984; van Etten et al. 1997).
Two types of demands may be distinguished: content demands and processing demands.
Content demands
Content demands relate to the knowledge required to respond to an assessment task (Broekkamp and van Hout-Wolters 2007). These influence what resources students utilize to prepare for assessment by way of cues inferred from the assessor and the assessment task (Entwistle and Entwistle 1991; Frederiksen 1984; Newble and Jaeger 1983; Säljö 1979). They also influence the selection of what content to learn from selected resources. Students cover more content for selected response items than for constructed response items (Sambell and McDowell 1998) and tend to focus on smaller units of information for selected response assessments than for essays (Hakstian 1971, quoting various studies).
Processing demands
Processing demands relate to “skills required for processing … knowledge in order to generate the requested response” (Broekkamp and van Hout-Wolters 2007). These influence students’ approach to learning by way of cues inferred from the assessor (Ramsden 1979) and from the assessment task.
Constructed response items and open-ended assessments are more likely to engender a transformative or deep approach to learning; selected response items and closed assessments, a reproductive or surface approach (Laurillard 1984; Ramsden 1979; Sambell and McDowell 1998; Sambell et al. 1997; Scouller 1998; Tang 1994; Thomas and Bain 1984; van Etten et al. 1997; Watkins 1982).
Surprisingly, however, closed-book tests promoted a deep approach to learning more than open-book tests (Heijne-Penninga et al. 2008).
Tang (1994) speculated that students’ degree of familiarity with an assessment method influenced their approach to learning, while Watkins and Hattie (cited by Scouller 1998) speculated that past success with surface strategies may encourage a perception that “deep level learning strategies are not required to satisfy examination requirements” (p. 454).
2. System design
The mere fact of assessment motivates students to learn and influences the amount of effort expended on learning (van Etten et al. 1997). The amount of time students spend studying increases, up to a point, with the volume of material to be studied and, independently of that, with its degree of difficulty (van Etten et al. 1997). High workloads also drive students to be more selective about what content to study and to adopt low-level cognitive processing tactics (Entwistle and Entwistle 1991; Ramsden 1984; van Etten et al. 1997). The scheduling of assessment in a course and across courses impacts the distribution of learning effort, as do competing interests, e.g., family, friends and extracurricular activities (Becker et al. 1968; Miller and Parlett 1974; Snyder 1971; van Etten et al. 1997).
Theoretical underpinnings
Little previous work on LESA has invoked theory, nor are there many models offering insight into why assessment has the impact it does.
Methods
This study was conducted at a South African medical school with a six-year, modular curriculum. Phases One and Two comprised three semesters of preclinical theoretical modules; Phase Three, semesters four to nine, alternating clinical theory and clinical practice modules; Phase Four, semesters 10–12, clinical practice modules only.
A process theory approach (Maxwell 2004) informed this study. We adopted grounded theory as our research strategy, making a deliberate decision to start with a clean slate and thus utilized in-depth interviews (Charmaz 2006; DiCicco-Bloom and Crabtree 2006; Kvale 1996). This approach offers the advantage of potentially discovering constructs and relationships not previously described. Interviews were not structured beyond exploring three broad themes i.e., how respondents learned, what assessment they had experienced and how they adapted their learning to assessment, all across the entire period of their studies up to that point. Detailed information about the facets of assessment to which respondents adapted their learning and the facets of learning that they adapted in response to assessment were sought throughout, using probing questions where appropriate. When new themes emerged in an interview, these were explored in depth. Evidence was sought in subsequent interviews both to confirm and disconfirm the existence and nature of emerging constructs and relationships. In keeping with the grounded theory strategy used, data analysis commenced even as interviews proceeded, with later interviews being informed by preliminary analysis of earlier interviews.
All interviews were audio recorded, transcribed verbatim and reviewed as a whole, along with field notes. Data analysis was inductive and iterative. Emerging constructs and relationships were constantly compared within and across interviews and refined (Charmaz 2006; Dey 1993; Miles and Huberman 1994). Initial open coding was undertaken by one investigator; subsequent development, revision and refinement of categories and linkages took place through discussions among the team members. Once the codebook was finalized, focused coding of the entire dataset was undertaken. No new constructs emerged from the analysis of interviews 13–18.
Results
Analysis revealed two sources of impact and two LESA in this setting (Fig. 1). Combining these data with a previously proposed mechanism of impact (Cilliers et al. 2010) allows the construction of the model proposed in the figure.
As assessment becomes more imminent (“when it comes to the last week, last week and a half of a block”), impact likelihood (“it’s unavoidable”) and impact severity (“just that you can pass the exam”) are considered, along with response value (success in assessment increasing, patient care decreasing in value as assessment looms). These factors, together with task type (“they’re not testing your understanding of the concept. They’re testing ‘can you recall ten facts in this way?’”) and response efficacy (“You just try and cram, try and get as many of those facts into your head just that you can pass the exam”) considerations, generate an impact on the nature of cognitive processing activities (CPA) (“So then you just learn five facts rather than trying to understand the core concepts”).
Pattern of scheduling and imminence (row SF2a, Table 2)
CPA: Respondents adopted higher-order CPA when assessment was more distant and lower-order CPA as assessment became more imminent (cf. Quote 1).
Effort: While the pattern of scheduling of assessment had the beneficial effect of ensuring that respondents regularly allocated effort to learning, they adopted a periodic rather than a continuous pattern of study. In an effort to devote attention to other aspects of their lives, respondents devoted little or no effort to learning at the start of each module. Interests and imperatives other than learning were relegated to a back seat as assessment loomed, however, and learning effort escalated dramatically.
Resources: Concurrently, though, as assessment became more imminent, the range of resources respondents utilized shrank.
Content: Cue-seeking behavior and responsiveness to cues both typically intensified as assessment grew more imminent.
Persistence: While regular, periodic assessment led to exhaustion, imminent assessment helped motivate respondents to persist in allocating time and effort to learning despite growing fatigue.
Prevailing workload (row SF2b, Table 2)
CPA: Where workload was manageable, higher-order CPA were adopted. Where workload was unmanageable, even respondents who preferred adopting higher-order CPA would utilize lower-order CPA.
Effort: The higher the prevailing workload, the greater the likelihood that effort would be allocated to studies rather than other aspects of respondents’ lives. More effort was also expended, distributed more evenly across the duration of the module.
Resources: A high workload inhibited the sourcing and utilization of resources other than those provided by lecturers. Only where resources provided by lecturers were considered inadequate did respondents source and utilize other resources, workload notwithstanding.
Content: Where workload was manageable, respondents studied content they considered relevant and material promoting understanding and clinical reasoning. Where workload was unmanageable, respondents focused on material more likely to ensure success in assessment, even if this selection conflicted with what they would have learned to satisfy longer-term clinical practice goals.
Monitoring and adjustment: While it ensured that respondents devoted appropriate amounts of effort to studying, a high workload could be accompanied by a disorganized rather than systematic approach to metacognitive regulation activities (MRA).
Nature of CPA (column EF1, Table 2)
Task type: Respondents inferred processing demands directly from the item type to be used or indirectly based on the complexity of the cognitive challenge posed (cf. Quotes 1, 11) and adjusted their CPA accordingly.
Assessment criteria: Where respondents perceived marking to be inflexibly done according to a predetermined memorandum, they responded with rote memorization to try to ensure exact reproduction of responses.
Nature of assessable material: Where material was perceived to be understandable and logical, respondents adopted higher-order CPA. Where material was less understandable or where the level of detail required to understand the logic was too deep, respondents adopted superficial CPA.
Lecturers: Lecturing that used PowerPoint to present lists of facts, rather than in a manner that helped respondents develop their understanding of a topic, cued memorization as a learning response.
Student grapevine: Peers identified certain modules as making higher-order cognitive demands, others as requiring only extensive memorization of material. Respondents geared their CPA accordingly.
Choice of resources (column EF2b, Table 2)
Task type: Assessment incorporating small projects resulted in respondents sourcing and utilizing resources they would not otherwise have used, e.g., textbooks in the library, the internet generally and literature databases more specifically. However, apart from promoting the use of past papers as a resource, most other assessment tasks cued the utilization of less, rather than more, diverse resources.
Past papers: The more any given lecturer utilized a particular question type or repeated questions from one assessment event to another, the more respondents utilized past papers to plan their learning and select material to learn.
Lecturers: The resources lecturers provided or utilized were perceived to delineate what content was more likely to feature in assessment. Much planning effort was devoted to obtaining copies of PowerPoint slides used, or handouts provided, by lecturers. Some lecturers were perceived as being tied to a particular resource, e.g., a prescribed textbook, which respondents then focused on. Equally, use was often not made of textbooks as other resources were perceived to be more appropriate for assessment purposes.
Student grapevine: Cues obtained ahead of or early in the course of a module about the likely content of assessment influenced the resources respondents opted to use in preparation for assessment (cf. Quote 13).
Choice of content (column EF2c, Table 2)
Task type: Respondents sought out material they perceived could be asked using any given task type and omitted information they perceived could not (cf. also Quotes 1, 11). Information about the overall extent of assessment, the number of marks devoted to each section of the work and the magnitude of questions also influenced choice of content. For example, if respondents knew there would be no question longer than 10 marks in an assessment, they omitted tracts of work they perceived could only be part of a longer question.
Past papers: Past papers were used to determine not only what topics but also what kind of material to study or omit (cf. also Quote 13).
Lecturers: Direct cues from lecturers included general comments in class like “this will (or won’t) be in the exam” and specific “spots” provided to students. Respondents attended to such cues even if they perceived the content identified as important to be irrelevant to later clinical practice.
Student grapevine: Guidance about assessment from senior students and peers influenced respondents’ choice of content, even if they considered the material covered by the cues to be irrelevant to their longer-term goal of becoming a good clinician.
Lack of cues: Where respondents could not discern cues about what to expect in assessment, they typically tried to learn their work more comprehensively, but at the cost of increased anxiety.
Discussion
Together with other findings (Cilliers et al. 2010), it has been possible to propose a theoretical model not only describing what the pre-assessment LESA are, but also explaining why students interact in the way that they do with assessment (Fig. 1). Self-regulation theory has previously been invoked when discussing the link between assessment and learning (Ross et al. 2006, 2003; Sundre and Kitsantas 2004; van Etten et al. 1997). Our findings suggest that self-regulation does indeed play a role, but that it is part of a broader framework. Our findings also lend empirical support to some aspects of the model proposed by Broekkamp and van Hout-Wolters (2007).
Our model makes no claim for a solo role for SA as a determinant of learning behavior. Instead, it emphasizes how SA influences learning behavior prior to an assessment event and fleshes out the line linking assessment and learning in other models (e.g., Biggs 1978 fig. 1, p. 267; Ramsden 2003 fig. 5.1, p. 82). However, given the profound consequences associated with SA, there can be little doubt that the factors described in this report play a significant role in the overall picture.
Along with other reports of attempts to influence learning using assessment (e.g., Gijbels et al. 2009; Heijne-Penninga et al. 2008), we see our study as a cautionary tale to those who would wield assessment to influence learning. In the curriculum that preceded the one upon which this report is based, students studied multiple theoretical modules concurrently and wrote tests on multiple modules at four pre-determined times during the year. This resulted in what was considered an undesirable pattern of learning i.e., little learning effort for 2 or 3 months, followed by binge-learning for a couple of weeks prior to the tests. The modular design of the present curriculum was in part an attempt to induce more continuous and effective learning. However, while students do allocate time to learning more frequently than in the past, the impact appears to have been simply the induction of shorter cycles of binge-learning than had characterized the previous curriculum.
Author information: Centre for Teaching and Learning, Stellenbosch University, Private Bag X1, Matieland, 7602, South Africa. fjc@sun.ac.za
PMID: 21461880
PMCID: PMC3274672
DOI: 10.1007/s10459-011-9292-5