We explored the efficacy of 10 learning techniques (listed in Table 1) that students could use to improve their success across a wide variety of content domains.

We limited our choices to techniques that could be implemented by students without assistance (e.g., without requiring advanced technologies or extensive materials that would have to be prepared by a teacher).

Table 2. Materials pertain to the specific content that students are expected to learn, remember, or comprehend. Learning conditions pertain to aspects of the context in which students are interacting with the to-be-learned materials.

Any number of student characteristics could also influence the effectiveness of a given learning technique.

The degree to which the efficacy of each learning technique obtains across long retention intervals and generalizes across different criterion tasks is of critical importance.

Reviewing the Learning Techniques

1 Elaborative interrogation

1.1 General description of elaborative interrogation and why it should work.

The prevailing theoretical account of elaborative-interrogation effects is that elaborative interrogation enhances learning by supporting the integration of new information with existing prior knowledge. During elaborative interrogation, learners presumably “activate schemata . . . These schemata, in turn, help to organize new information which facilitates retrieval” (Willoughby & Wood, 1994, p. 140). Although the integration of new facts with prior knowledge may facilitate the organization (Hunt, 2006) of that information, organization alone is not sufficient—students must also be able to discriminate among related facts to be accurate when identifying or using the learned information (Hunt, 2006). Consistent with this account, note that most elaborative-interrogation prompts explicitly or implicitly invite processing of both similarities and differences between related entities (e.g., why a fact would be true of one province versus other provinces). 

As we highlight below, processing of similarities and differences among to-be-learned facts also accounts for findings that elaborative-interrogation effects are often larger when elaborations are precise rather than imprecise, when prior knowledge is higher rather than lower (consistent with research showing that preexisting knowledge enhances memory by facilitating distinctive processing; e.g., Rawson & Van Overschelde, 2008), and when elaborations are self-generated rather than provided (a finding consistent with research showing that distinctiveness effects depend on self-generating item-specific cues; Hunt & Smith, 1996).

In one of the earliest systematic studies of elaborative interrogation, Pressley, McDaniel, Turnure, Wood, and Ahmad (1987) presented undergraduate students with a list of sentences, each describing the action of a particular man (e.g., “The hungry man got into the car”). In the elaborative-interrogation group, for each sentence, participants were prompted to explain “Why did that particular man do that?” Another group of participants was instead provided with an explanation for each sentence (e.g., “The hungry man got into the car to go to the restaurant”), and a third group simply read each sentence. On a final test in which participants were cued to recall which man performed each action (e.g., “Who got in the car?”), the elaborative-interrogation group substantially outperformed the other two groups (collapsing across experiments, accuracy in this group was approximately 72%, compared with approximately 37% in each of the other two groups). From this and similar studies, Seifert (1993) reported average effect sizes ranging from 0.85 to 2.57.

As illustrated above, the key to elaborative interrogation involves prompting learners to generate an explanation for an explicitly stated fact. The particular form of the explanatory prompt has differed somewhat across studies—examples include “Why does it make sense that…?”, “Why is this true?”, and simply “Why?” However, the majority of studies have used prompts following the general format, “Why would this fact be true of this [X] and not some other [X]?”
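The general prompt format quoted above can be turned into a small template. The following is a minimal sketch only: the function name `elaboration_prompt` and its parameters are hypothetical and were not used in any of the studies cited, and the example fact is the sentence from Pressley et al. (1987) mentioned earlier.

```python
def elaboration_prompt(fact: str, entity_type: str) -> str:
    """Build a 'why' prompt in the general format used by most studies.

    `fact` is an explicitly stated fact; `entity_type` fills the [X] slot
    (e.g., "man", "province"). Both names are illustrative assumptions.
    """
    return (
        f"{fact} "
        f"Why would this fact be true of this {entity_type} "
        f"and not some other {entity_type}?"
    )


# Example using the sentence from the study described earlier.
prompt = elaboration_prompt("The hungry man got into the car.", "man")
print(prompt)
```

Keeping the [X] slot explicit preserves the contrast with "some other [X]", which is the part of the prompt that invites processing of differences as well as similarities.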

1.2 How general are the effects of elaborative interrogation?

1.3 Effects in representative educational contexts.

1.4 Issues for implementation.

One possible merit of elaborative interrogation is that it apparently requires minimal training. In the majority of studies reporting elaborative-interrogation effects, learners were given brief instructions and then practiced generating elaborations for 3 or 4 practice facts (sometimes, but not always, with feedback about the quality of the elaborations) before beginning the main task. In some studies, learners were not provided with any practice or illustrative examples prior to the main task. Additionally, elaborative interrogation appears to be relatively reasonable with respect to time demands. Almost all studies set reasonable limits on the amount of time allotted for reading a fact and for generating an elaboration (e.g., 15 seconds allotted for each fact). In one of the few studies permitting self-paced learning, the time-on-task difference between the elaborative-interrogation and reading-only groups was relatively minimal (32 minutes vs. 28 minutes; B. L. Smith et al., 2010). Finally, the consistency of the prompts used across studies allows for relatively straightforward recommendations to students about the nature of the questions they should use to elaborate on facts during study.

With that said, one limitation noted above concerns the potentially narrow applicability of elaborative interrogation to discrete factual statements. As Hamilton (1997) noted, “elaborative interrogation is fairly prescribed when focusing on a list of factual sentences. However, when focusing on more complex outcomes, it is not as clear to what one should direct the ‘why’ questions” (p. 308). For example, when learning about a complex causal process or system (e.g., the digestive system), the appropriate grain size for elaborative interrogation is an open question (e.g., should a prompt focus on an entire system or just a smaller part of it?). Furthermore, whereas the facts to be elaborated are clear when dealing with fact lists, elaborating on facts embedded in lengthier texts will require students to identify their own target facts. Thus, students may need some instruction about the kinds of content to which elaborative interrogation may be fruitfully applied. Dosage is also of concern with lengthier text, with some evidence suggesting that elaborative-interrogation effects are substantially diluted (Callender & McDaniel, 2007) or even reversed (Ramsay, Sperling, & Dornisch, 2010) when elaborative-interrogation prompts are administered infrequently (e.g., one prompt every 1 or 2 pages).

1.5 Elaborative interrogation: Overall assessment.

2 Self-explanation

In the seminal study on self-explanation, Berry (1983) explored its effects on logical reasoning using the Wason card-selection task. In this task, a student might see four cards labeled “A,” “4,” “D,” and “3” and be asked to indicate which cards must be turned over to test the rule “if a card has A on one side, it has 3 on the other side” (an instantiation of the more general “if P, then Q” rule). Students were first asked to solve a concrete instantiation of the rule (e.g., flavor of jam on one side of a jar and the sale price on the other); accuracy was near zero. They then were provided with a minimal explanation about how to solve the “if P, then Q” rule and were given a set of concrete problems involving the use of this and other logical rules (e.g., “if P, then not Q”). For this set of concrete practice problems, one group of students was prompted to self-explain while solving each problem by stating the reasons for choosing or not choosing each card. Another group of students solved all problems in the set and only then were asked to explain how they had gone about solving the problems. Students in a control group were not prompted to self-explain at any point. Accuracy on the practice problems was 90% or better in all three groups. However, when the logical rules were instantiated in a set of abstract problems presented during a subsequent transfer test, the two self-explanation groups substantially outperformed the control group (see Fig. 2). In a second experiment, another control group was explicitly told about the logical connection between the concrete practice problems they had just solved and the forthcoming abstract problems, but they fared no better (28%).
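The logic of the abstract card problem can be made concrete with a short sketch. It assumes the standard form of the task, in which every card has a letter on one side and a number on the other; that assumption, and the names `LETTERS`, `NUMBERS`, `rule_holds`, and `must_turn`, are illustrative and not taken from Berry (1983). A card needs to be turned over only if some possible hidden face could falsify the rule.

```python
# Assumption: each card has a letter on one side and a number on the other.
LETTERS = ["A", "D"]
NUMBERS = ["3", "4"]


def rule_holds(letter: str, number: str) -> bool:
    """The rule 'if a card has A on one side, it has 3 on the other side'."""
    return not (letter == "A" and number != "3")


def must_turn(visible: str) -> bool:
    """True if some possible hidden face would violate the rule."""
    if visible in LETTERS:
        # A letter is showing, so the hidden face is a number.
        return any(not rule_holds(visible, n) for n in NUMBERS)
    # A number is showing, so the hidden face is a letter.
    return any(not rule_holds(l, visible) for l in LETTERS)


cards = ["A", "4", "D", "3"]
to_turn = [card for card in cards if must_turn(card)]
print(to_turn)  # ['A', '4']
```

Only the P card ("A") and the not-Q card ("4") can reveal a violation. The "3" card is the tempting but uninformative choice, because the rule says nothing about what must be on the other side of a 3.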

As illustrated above, the core component of self-explanation involves having students explain some aspect of their processing during learning. Consistent with basic theoretical assumptions about the related technique of elaborative interrogation, self-explanation may enhance learning by supporting the integration of new information with existing prior knowledge. However, compared with the consistent prompts used in the elaborative-interrogation literature, the prompts used to elicit self-explanations have been much more variable across studies. Depending on the variation of the prompt used, the particular mechanisms underlying self-explanation effects may differ somewhat. The key continuum along which self-explanation prompts differ concerns the degree to which they are content-free versus content-specific. For example, many studies have used prompts that include no explicit mention of particular content from the to-be-learned materials (e.g., “Explain what the sentence means to you. That is, what new information does the sentence provide for you? And how does it relate to what you already know?”). On the other end of the continuum, many studies have used prompts that are much more content-specific, such that different prompts are used for different items (e.g., “Why do you calculate the total acceptable outcomes by multiplying?” “Why is the numerator 14 and the denominator 7 in this step?”). For present purposes, we limit our review to studies that have used prompts that are relatively content-free. Although many of the content-specific prompts do elicit explanations, the relatively structured nature of these prompts would require teachers to construct sets of specific prompts to put into practice, rather than capturing a more general technique that students could be taught to use on their own. Furthermore, in some studies that have been situated in the self-explanation literature, the nature of the prompts is functionally more closely aligned with that of practice testing.

Even within the set of studies selected for review here, considerable variability remains in the self-explanation prompts that have been used. Furthermore, the range of tasks and measures that have been used to explore self-explanation is quite large. Although we view this range as a strength of the literature, the variability in self-explanation prompts, tasks, and measures does not easily support a general summative statement about the mechanisms that underlie self-explanation effects.

2.2 How general are the effects of self-explanation?

2.3 Effects in representative educational contexts.

2.4 Issues for implementation.

As noted above, a particular strength of the self-explanation strategy is its broad applicability across a range of tasks and content domains. Furthermore, in almost all of the studies reporting significant effects of self-explanation, participants were provided with minimal instructions and little to no practice with self-explanation prior to completing the experimental task. Thus, most students apparently can profit from self-explanation with minimal training.

However, some students may require more instruction to successfully implement self-explanation. In a study by Didierjean and Cauzinille-Marmèche (1997), ninth graders with poor algebra skills received minimal training prior to engaging in self-explanation while solving algebra problems; analysis of think-aloud protocols revealed that students produced many more paraphrases than explanations. Several studies have reported positive correlations between final-test performance and both the quantity and quality of explanations generated by students during learning, further suggesting that the benefit of self-explanation might be enhanced by teaching students how to effectively implement the self-explanation technique (for examples of training methods, see Ainsworth & Burcham, 2007; R. M. F. Wong et al., 2002). However, in at least some of these studies, students who produced more or better-quality self-explanations may have had greater domain knowledge; if so, then further training with the technique may not have benefited the more poorly performing students. Investigating the contribution of these factors (skill at self-explanation vs. domain knowledge) to the efficacy of self-explanation will have important implications for how and when to use this technique.

An outstanding issue concerns the time demands associated with self-explanation and the extent to which self-explanation effects may have been due to increased time on task. Unfortunately, few studies equated time on task when comparing self-explanation conditions to control conditions involving other strategies or activities, and most studies involving self-paced practice did not report participants’ time on task. In the few studies reporting time on task, self-paced administration usually yielded nontrivial increases (30–100%) in the amount of time spent learning in the self-explanation condition relative to other conditions, a result that is perhaps not surprising, given the high dosage levels at which self-explanation was implemented. For example, Chi et al. (1994) prompted learners to self-explain after reading each sentence of an expository text, which doubled the amount of time the group spent studying the text relative to a rereading control group (125 vs. 66 minutes, respectively). With that said, Schworm and Renkl (2006) reported that time on task was not correlated with performance across groups, and Ainsworth and Burcham (2007) reported that controlling for study time did not eliminate effects of self-explanation.
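To make the size of the time cost concrete, the study-time figures reported for Chi et al. (1994) can be expressed as a relative increase. This is a minimal sketch of the arithmetic only; the helper name `percent_increase` is hypothetical.

```python
def percent_increase(treatment_minutes: float, control_minutes: float) -> float:
    """Extra study time in the treatment group, as a percentage of control."""
    return 100.0 * (treatment_minutes - control_minutes) / control_minutes


# Study times reported above: self-explanation vs. rereading control.
increase = percent_increase(125, 66)
ratio = 125 / 66
print(f"{increase:.1f}% more time, {ratio:.2f}x the control group's time")
```

The result is an increase of roughly 89%, or about 1.9 times the control group's study time, which sits near the top of the 30–100% range reported for self-paced administration.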

Within the small number of studies in which time on task was equated, results were somewhat mixed. Three studies equating time on task reported significant effects of self-explanation (de Bruin et al., 2007; de Koning, Tabbers, Rikers, & Paas, 2011; O’Reilly, Symons, & MacLatchy-Gaudet, 1998). In contrast, Matthews and Rittle-Johnson (2009) had one group of third through fifth graders practice solving math problems with self-explanation and a control group solve twice as many practice problems without self-explanation; the two groups performed similarly on a final test. Clearly, further research is needed to establish the bang for the buck provided by self-explanation before strong prescriptive conclusions can be made.

2.5 Self-explanation: Overall assessment.

3 Summarization

As an introduction to the issues relevant to summarization, we begin with a description of a prototypical experiment. Bretzing and Kulhavy (1979) had high school juniors and seniors study a 2,000-word text about a fictitious tribe of people. Students were assigned to one of five learning conditions and given up to 30 minutes to study the text. After reading each page, students in a summarization group were instructed to write three lines of text that summarized the main points from that page. Students in a note-taking group received similar instructions, except that they were told to take up to three lines of notes on each page of text while reading. Students in a verbatim-copying group were instructed to locate and copy the three most important lines on each page. Students in a letter-search group copied all the capitalized words in the text, also filling up three lines. Finally, students in a control group simply read the text without recording anything. (A subset of students from the four conditions involving writing were allowed to review what they had written, but for present purposes we will focus on the students who did not get a chance to review before the final test.) Students were tested either shortly after learning or 1 week later, answering 25 questions that required them to connect information from across the text. On both the immediate and delayed tests, students in the summarization and note-taking groups performed best, followed by the students in the verbatim-copying and control groups, with the worst performance in the letter-search group (see Fig. 3).

Bretzing and Kulhavy’s (1979) results fit nicely with the claim that summarization boosts learning and retention because it involves attending to and extracting the higher-level meaning and gist of the material. The conditions in the experiment were specifically designed to manipulate how much students processed the texts for meaning, with the letter-search condition involving shallow processing of the text that did not require learners to extract its meaning (Craik & Lockhart, 1972). Summarization was more beneficial than that shallow task and yielded benefits similar to those of note-taking, another task known to boost learning (e.g., Bretzing & Kulhavy, 1981; Crawford, 1925a, 1925b; Di Vesta & Gray, 1972). More than just facilitating the extraction of meaning, however, summarization should also boost organizational processing, given that extracting the gist of a text requires learners to connect disparate pieces of the text, as opposed to simply evaluating its individual components (similar to the way in which note-taking affords organizational processing; Einstein, Morris, & Smith, 1985). 

One last point should be made about the results from Bretzing and Kulhavy (1979), namely, that summarization and note-taking were both more beneficial than was verbatim copying. Students in the verbatim-copying group still had to locate the most important information in the text, but they did not synthesize it into a summary or rephrase it in their notes. Thus, writing about the important points in one’s own words produced a benefit over and above that of selecting important information; students benefited from the more active processing involved in summarization and note-taking (see Wittrock, 1990, and Chi, 2009, for reviews of active/generative learning). These explanations all suggest that summarization helps students identify and organize the main ideas within a text.

So how strong is the evidence that summarization is a beneficial learning strategy? One reason this question is difficult to answer is that the summarization strategy has been implemented in many different ways across studies, making it difficult to draw general conclusions about its efficacy. Pressley and colleagues described the situation well when they noted that “summarization is not one strategy but a family of strategies” (Pressley, Johnson, Symons, McGoldrick, & Kurita, 1989, p. 5). Depending on the particular instructions given, students’ summaries might consist of single words, sentences, or longer paragraphs; be limited in length or not; capture an entire text or only a portion of it; be written or spoken aloud; or be produced from memory or with the text present.

A lot of research has involved summarization in some form, yet whereas some evidence demonstrates that summarization works (e.g., L. W. Brooks, Dansereau, Holley, & Spurlin, 1983; Doctorow, Wittrock, & Marks, 1978), T. H. Anderson and Armbruster’s (1984) conclusion that “research in support of summarizing as a studying activity is sparse indeed” (p. 670) is not outmoded. Instead of focusing on discovering when (and how) summarization works, by itself and without training, researchers have tended to explore how to train students to write better summaries (e.g., Friend, 2001; Hare & Borchardt, 1984) or to examine other benefits of training the skill of summarization. Still others have simply assumed that summarization works, including it as a component in larger interventions (e.g., Carr, Bigler, & Morningstar, 1991; Lee, Lim, & Grabowski, 2010; Palincsar & Brown, 1984; Spörer, Brunstein, & Kieschke, 2009). When collapsing across findings pertaining to all forms of summarization, summarization appears to benefit students, but the evidence for any one instantiation of the strategy is less compelling.

The focus on training students to summarize reflects the belief that the quality of summaries matters. If a summary does not emphasize the main points of a text, or if it includes incorrect information, why would it be expected to benefit learning and retention? Consider a study by Bednall and Kehoe (2011, Experiment 2), in which undergraduates studied six Web units that explained different logical fallacies and provided examples of each. Of interest for present purposes are two groups: a control group who simply read the units and a group in which students were asked to summarize the material as if they were explaining it to a friend. Both groups received the following tests: a multiple-choice quiz that tested information directly stated in the Web unit; a short-answer test in which, for each of a list of presented statements, students were required to name the specific fallacy that had been committed or write “not a fallacy” if one had not occurred; and, finally, an application test that required students to write explanations of logical fallacies in examples that had been studied (near transfer) as well as explanations of fallacies in novel examples (far transfer). Summarization did not benefit overall performance, but the researchers noticed that the summaries varied a lot in content; for one studied fallacy, only 64% of the summaries included the correct definition. Table 3 shows the relationships between summary content and later performance. Higher-quality summaries that contained more information and that were linked to prior knowledge were associated with better performance.

3.2 How general are the effects of summarization?

3.3 Effects in representative educational contexts.

3.4 Issues for implementation.

Summarization would be feasible for undergraduates or other learners who already know how to summarize. For these students, summarization would constitute an easy-to-implement technique that would not take a lot of time to complete or understand. The only concern would be whether these students might be better served by some other strategy, but certainly summarization would be better than the study strategies students typically favor, such as highlighting and rereading (as we discuss in the sections on those strategies below). A trickier issue would concern implementing the strategy with students who are not skilled summarizers. Relatively intensive training programs are required for middle school students or learners with learning disabilities to benefit from summarization. Such efforts are not misplaced; training has been shown to benefit performance on a range of measures, although the training procedures do raise practical issues (e.g., Gajria & Salvia, 1992: 6.5–11 hours of training used for sixth through ninth graders with learning disabilities; Malone & Mastropieri, 1991: 2 days of training used for middle school students with learning disabilities; Rinehart et al., 1986: 45–50 minutes of instruction per day for 5 days used for sixth graders). Of course, instructors may want students to summarize material because summarization itself is a goal, not because they plan to use summarization as a study technique, and that goal may merit the efforts of training.

However, if the goal is to use summarization as a study technique, our question is whether training students would be worth the amount of time it would take, both in terms of the time required on the part of the instructor and in terms of the time taken away from students’ other activities. For instance, in terms of efficacy, summarization tends to fall in the middle of the pack when compared to other techniques. In direct comparisons, it was sometimes more useful than rereading (Rewey, Dansereau, & Peel, 1991) and was as useful as note-taking (e.g., Bretzing & Kulhavy, 1979) but was less powerful than generating explanations (e.g., Bednall & Kehoe, 2011) or self-questioning (A. King, 1992).

3.5 Summarization: Overall assessment.

4 Highlighting and underlining

As an introduction to the relevant issues, we begin with a description of a prototypical experiment. Fowler and Barker (1974, Exp. 1) had undergraduates read articles (totaling about 8,000 words) about boredom and city life from Scientific American and Science. Students were assigned to one of three groups: a control group, in which they only read the articles; an active-highlighting group, in which they were free to highlight as much of the texts as they wanted; or a passive-highlighting group, in which they read marked texts that had been highlighted by yoked participants in the active-highlighting group. Everyone received 1 hour to study the texts (time on task was equated across groups); students in the active-highlighting condition were told to mark particularly important material. All subjects returned to the lab 1 week later and were allowed to review their original materials for 10 minutes before taking a 54-item multiple-choice test. Overall, the highlighting groups did not outperform the control group on the final test, a result that has unfortunately been echoed in much of the literature (e.g., Hoon, 1974; Idstein & Jenkins, 1972; Stordahl & Christensen, 1956).

However, results from more detailed analyses of performance in the two highlighting groups are informative about what effects highlighting might have on cognitive processing. First, within the active-highlighting group, performance was better on test items for which the relevant text had been highlighted (see Blanchard & Mikkelson, 1987; L. L. Johnson, 1988 for similar results). Second, this benefit to highlighted information was greater for the active highlighters (who selected what to highlight) than for passive highlighters (who saw the same information highlighted, but did not select it). Third, this benefit to highlighted information was accompanied by a small cost on test questions probing information that had not been highlighted.

To explain such findings, researchers often point to a basic cognitive phenomenon known as the isolation effect, whereby a semantically or phonologically unique item in a list is much better remembered than its less distinctive counterparts (see Hunt, 1995, for a description of this work). For instance, if students are studying a list of categorically related words (e.g., “desk,” “bed,” “chair,” “table”) and a word from a different category (e.g., “cow”) is presented, the students will later be more likely to recall it than they would if it had been studied in a list of categorically related words (e.g., “goat,” “pig,” “horse,” “chicken”). The analogy to highlighting is that a highlighted, underlined, or capitalized sentence will “pop out” of the text in the same way that the word “cow” would if it were isolated in a list of words for types of furniture. Consistent with this expectation, a number of studies have shown that reading marked text promotes later memory for the marked material: Students are more likely to remember things that the experimenter highlighted or underlined in the text (e.g., Cashen & Leicht, 1970; Crouse & Idstein, 1972; Hartley, Bartlett, & Branthwaite, 1980; Klare, Mabry, & Gustafson, 1955; see Lorch, 1989 for a review).

Actively selecting information should benefit memory more than simply reading marked text (given that the former would capitalize on the benefits of generation, Slamecka & Graf, 1978, and active processing more generally, Faw & Waller, 1976). Marked text draws the reader’s attention, but additional processing should be required if the reader has to decide which material is most important. Such decisions require the reader to think about the meaning of the text and how its different pieces relate to one another (i.e., organizational processing; Hunt & Worthen, 2006). In the Fowler and Barker (1974) experiment, this benefit was reflected in the greater advantage for highlighted information among active highlighters than among passive recipients of the same highlighted text. However, active highlighting is not always better than receiving material that has already been highlighted by an experimenter (e.g., Nist & Hogrebe, 1987), probably because experimenters will usually be better than students at highlighting the most important parts of a text.

More generally, the quality of the highlighting is likely crucial to whether it helps students to learn (e.g., Wollen, Cone, Britcher, & Mindemann, 1985), but unfortunately, many studies have not contained any measure of the amount or the appropriateness of students’ highlighting. Those studies that have examined the amount of marked text have found great variability in what students actually mark, with some students marking almost nothing and others marking almost everything (e.g., Idstein & Jenkins, 1972). Some intriguing data came from the active-highlighting group in Fowler and Barker (1974). Test performance was negatively correlated (r = –.29) with the amount of text that had been highlighted in the active-highlighting group, although this result was not significant given the small sample size (n = 19).
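The point about sample size can be checked with the standard t test for a Pearson correlation, t = r√(n − 2) / √(1 − r²). The sketch below assumes a two-tailed test at α = .05, which the source does not state; the critical value for 17 degrees of freedom is taken from a standard t table, and the function name `t_from_r` is hypothetical.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Values reported above for the active-highlighting group.
r, n = -0.29, 19
t = t_from_r(r, n)

# Assumption: two-tailed test, alpha = .05, df = n - 2 = 17.
CRITICAL_T_DF17 = 2.110
significant = abs(t) > CRITICAL_T_DF17

print(f"t({n - 2}) = {t:.2f}, significant: {significant}")
```

Under these assumptions the statistic is about −1.25, well short of the critical value, so a correlation of this size cannot be distinguished from zero with only 19 participants.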

Marking too much text is likely to have multiple consequences. First, overmarking reduces the degree to which marked text is distinguished from other text, and people are less likely to remember marked text if it is not distinctive (Lorch, Lorch, & Klusewitz, 1995). Second, it likely takes less processing to mark a lot of text than to single out the most important details. Consistent with this latter idea, benefits of marking text may be more likely to be observed when experimenters impose explicit limits on the amount of text students are allowed to mark. For example, Rickards and August (1975) found that students limited to underlining a single sentence per paragraph later recalled more of a science text than did a no-underlining control group. Similarly, L. L. Johnson (1988) found that marking one sentence per paragraph helped college students in a reading class to remember the underlined information, although it did not translate into an overall benefit.

4.2 How general are the effects of highlighting and underlining?

4.3 Effects in representative educational contexts.

4.4 Issues for implementation.

Students already are familiar with and spontaneously adopt the technique of highlighting; the problem is that the way the technique is typically implemented is not effective. Whereas the technique as it is typically used is not normally detrimental to learning (but see Peterson, 1992, for a possible exception), it may be problematic to the extent that it prevents students from engaging in other, more productive strategies.

One possibility that should be explored is whether students could be trained to highlight more effectively. We located three studies focused on training students to highlight. In two of these cases, training involved one or more sessions in which students practiced reading texts to look for main ideas before marking any text. Students received feedback about practice texts before marking (and being tested on) the target text, and training improved performance (e.g., Amer, 1994; Hayati & Shariatifar, 2009). In the third case, students received feedback on their ability to underline the most important content in a text; critically, students were instructed to underline as little as possible. In one condition, students even lost points for underlining extraneous material (Glover, Zimmer, Filbeck, & Plake, 1980). The training procedures in all three cases involved feedback, and they all had some safeguard against overuse of the technique. Given students’ enthusiasm for highlighting and underlining (or perhaps overenthusiasm, given that students do not always use the technique correctly), discovering fail-proof ways to ensure that this technique is used effectively might be easier than convincing students to abandon it entirely in favor of other techniques.

4.5 Highlighting and underlining: Overall assessment.

5 The keyword mnemonic

Imagine a student struggling to learn French vocabulary, including words such as la dent (tooth), la clef (key), revenir (to come back), and mourir (to die). To facilitate learning, the student uses the keyword mnemonic, which is a technique based on interactive imagery that was developed by Atkinson and Raugh (1975). To use this mnemonic, the student would first find an English word that sounds similar to the foreign cue word, such as dentist for “la dent” or cliff for “la clef.” The student would then develop a mental image of the English keyword interacting with the English translation. So, for la dent–tooth, the student might imagine a dentist holding a large molar with a pair of pliers. Raugh and Atkinson (1975) had college students use the keyword mnemonic to learn Spanish-English vocabulary (e.g., gusano–worm): the students first learned to associate each experimenter-provided keyword with the appropriate Spanish cue (e.g., “gusano” is associated with the keyword “goose”), and then they developed interactive images to associate the keywords with their English translations. In a later test, the students were asked to generate the English translation when presented with the Spanish cue (e.g., “gusano”–?). Students who used the keyword mnemonic performed significantly better on the test than did a control group of students who studied the translation equivalents without keywords.

Beyond this first demonstration, the potential benefits of the keyword mnemonic have been extensively explored, and its power partly resides in the use of interactive images. In particular, the interactive image involves elaboration that integrates the words meaningfully, and the images themselves should help to distinguish the sought-after translation from other candidates. For instance, in the example above, the image of the “large molar” distinguishes “tooth” (the target) from other candidates relevant to dentists (e.g., gums, drills, floss). As we discuss next, the keyword mnemonic can be effectively used by students of different ages and abilities for a variety of materials. Nevertheless, our analysis of this literature also uncovered limitations of the keyword mnemonic that may constrain its utility for teachers and students. Given these limitations, we did not separate our review of the literature into separate sections that pertain to each variable category (Table 2) but instead provide a brief overview of the most relevant evidence concerning the generalizability of this technique.

5.2 a–d How general are the effects of the keyword mnemonic?

5.3 Effects in representative educational contexts.

5.4 Issues for implementation.

The majority of research on the keyword mnemonic has involved at least some (and occasionally extensive) training, largely aimed at helping students develop interactive images and use them to subsequently retrieve targets. Beyond training, implementation also requires the development of keywords, whether by students, teachers, or textbook designers. The effort involved in generating some keywords may not be the most efficient use of time for students (or teachers), particularly given that at least one easy-to-use technique (i.e., retrieval practice, Fritz, Morris, Acton, Voelkel, & Etkind, 2007) benefits retention as much as the keyword mnemonic does.

5.5 The keyword mnemonic: Overall assessment.

6 Imagery use for text learning

In one demonstration of the potential of imagery for enhancing text learning, Leutner, Leopold, and Sumfleth (2009) gave tenth graders 35 minutes to read a lengthy science text on the dipole character of water molecules. Students either were told to read the text for comprehension (control group) or were told to read the text and to mentally imagine the content of each paragraph using simple and clear mental images. Imagery instructions were also crossed with drawing: Some students were instructed to draw pictures that represented the content of each paragraph, and others did not draw. Soon after reading, the students took a multiple-choice test that included questions for which the correct answer was not directly available from the text but needed to be inferred from it. As shown in Figure 5, the instructions to mentally imagine the content of each paragraph significantly boosted the comprehension-test performance of students in the mental-imagery group, in comparison to students in the control group (Cohen’s d = 0.72). This effect is impressive, especially given that (a) training was not required, (b) the text involved complex science content, and (c) the criterion test required learners to make inferences about the content. Finally, drawing did not improve comprehension, and it actually negated the benefits of imagery instructions. The potential for another activity to interfere with the potency of imagery is discussed further in the subsection on learning conditions (6.2a) below.

A variety of mechanisms may contribute to the benefits of imaging text material on later test performance. Developing images can enhance one’s mental organization or integration of information in the text, and idiosyncratic images of particular referents in the text could enhance learning as well (cf. distinctive processing; Hunt, 2006). Moreover, using one’s prior knowledge to generate a coherent representation of a narrative may enhance a student’s general understanding of the text; if so, the influence of imagery use may be robust across criterion tasks that tap memory and comprehension. Despite these possibilities and the dramatic effect of imagery demonstrated by Leutner et al. (2009), our review of the literature suggests that the effects of using mental imagery to learn from text may be rather limited and not robust.

6.2 How general are the effects of imagery use for text learning?

6.3 Effects in representative educational contexts.

6.4 Issues for implementation.

The majority of studies have examined the influence of imagery by using relatively brief instructions that encouraged students to generate images of text content while studying. Given that imagery does not appear to undermine learning (and that it does boost performance in some conditions), teachers may consider instructing students (third grade and above) to attempt to use imagery when they are reading texts that easily lend themselves to imaginal representations. How much training would be required to ensure that students consistently and effectively use imagery under the appropriate conditions is unknown.

6.5 Imagery use for learning text: Overall assessment.

7 Rereading

In an early study by Rothkopf (1968), undergraduates read an expository text (either a 1,500-word passage about making leather or a 750-word passage about Australian history) zero, one, two, or four times. Reading was self-paced, and rereading was massed (i.e., each presentation of a text occurred immediately after the previous presentation). After a 10-minute delay, a cloze test was administered in which 10% of the content words were deleted from the text and students were to fill in the missing words. As shown in Figure 6, performance improved as a function of number of readings.

Why does rereading improve learning? Mayer (1983; Bromage & Mayer, 1986) outlined two basic accounts of rereading effects. According to the quantitative hypothesis, rereading simply increases the total amount of information encoded, regardless of the kind or level of information within the text. In contrast, the qualitative hypothesis assumes that rereading differentially affects the processing of higher-level and lower-level information within a text, with particular emphasis placed on the conceptual organization and processing of main ideas during rereading. To evaluate these hypotheses, several studies have examined free recall as a function of the kind or level of text information. The results have been somewhat mixed, but the evidence appears to favor the qualitative hypothesis. Although a few studies found that rereading produced similar improvements in the recall of main ideas and of details (a finding consistent with the quantitative hypothesis), several studies have reported greater improvement in the recall of main ideas than in the recall of details (e.g., Bromage & Mayer, 1986; Kiewra, Mayer, Christensen, Kim, & Risch, 1991; Rawson & Kintsch, 2005).

7.2 How general are the effects of rereading?

7.3 Effects in representative educational contexts.

7.4 Issues for implementation.

One advantage of rereading is that students require no training to use it, other than perhaps being instructed that rereading is generally most effective when completed after a moderate delay rather than immediately after an initial reading. Additionally, compared with some other learning techniques, rereading is relatively economical with respect to time demands (e.g., in those studies permitting self-paced study, the amount of time spent rereading has typically been less than the amount of time spent during initial reading). However, in head-to-head comparisons of learning techniques, rereading has not fared well against some of the more effective techniques discussed here. For example, direct comparisons of rereading to elaborative interrogation, self-explanation, and practice testing (described in the Practice Testing section below) have consistently shown rereading to be an inferior technique for promoting learning.

7.5 Rereading: Overall assessment.

8 Practice testing

As an illustrative example of the power of testing, Runquist (1983) presented undergraduates with a list of word pairs for initial study. After a brief interval during which participants completed filler tasks, half of the pairs were tested via cued recall and half were not. Participants completed a final cued-recall test for all pairs either 10 minutes or 1 week later. Final-test performance was better for pairs that were practice tested than pairs that were not (53% versus 36% after 10 minutes, 35% versus 4% after 1 week). Whereas this study illustrates the method of comparing performance between conditions that do and do not involve a practice test, many other studies have compared a practice-testing condition with more stringent conditions involving additional presentations of the to-be-learned information. 

For example, Roediger and Karpicke (2006b) presented undergraduates with a short expository text for initial study followed either by a second study trial or by a practice free-recall test. One week later, free recall was considerably better among the group that had taken the practice test than among the group that had restudied (56% versus 42%). As another particularly compelling demonstration of the potency of testing as compared with restudy, Karpicke and Roediger (2008) presented undergraduates with Swahili-English translations for cycles of study and practice cued recall until items were correctly recalled once. After the first correct recall, items were presented only in subsequent study cycles with no further testing, or only in subsequent test cycles with no further study. Performance on a final test 1 week later was substantially greater after continued testing (80%) than after continued study (36%).

Why does practice testing improve learning? Whereas a wealth of studies have established the generality of testing effects, theories about why it improves learning have lagged behind. Nonetheless, theoretical accounts are increasingly emerging to explain two different kinds of testing effects, which are referred to as direct effects and mediated effects of testing (Roediger & Karpicke, 2006a). Direct effects refer to changes in learning that arise from the act of taking a test itself, whereas mediated effects refer to changes in learning that arise from an influence of testing on the amount or kind of encoding that takes place after the test (e.g., during a subsequent restudy opportunity).

Concerning direct effects of practice testing, Carpenter (2009) recently proposed that testing can enhance retention by triggering elaborative retrieval processes. Attempting to retrieve target information involves a search of long-term memory that activates related information, and this activated information may then be encoded along with the retrieved target, forming an elaborated trace that affords multiple pathways to facilitate later access to that information. In support of this account, Carpenter (2011) had learners study weakly related word pairs (e.g., “mother”–“child”) followed either by additional study or a practice cued-recall test. On a later final test, recall of the target word was prompted via a previously unpresented but strongly related word (e.g., “father”). Performance was greater following a practice test than following restudy, presumably because the practice test increased the likelihood that the related information was activated and encoded along with the target during learning.

Concerning mediated effects of practice testing, Pyc and Rawson (2010, 2012b) proposed a similar account, according to which practice testing facilitates the encoding of more effective mediators (i.e., elaborative information connecting cues and targets) during subsequent restudy opportunities. Pyc and Rawson (2010) presented learners with Swahili-English translations in an initial study block, which was followed by three blocks of restudy trials; for half of the participants, each restudy trial was preceded by practice cued recall. All learners were prompted to generate and report a keyword mediator during each restudy trial. When tested 1 week later, compared with students who had only restudied, students who had engaged in practice cued recall were more likely to recall their mediators when prompted with the cue word and were more likely to recall the target when prompted with their mediator.

Recent evidence also suggests that practice testing may enhance how well students mentally organize information and how well they process idiosyncratic aspects of individual items, which together can support better retention and test performance (Hunt, 1995, 2006). Zaromb and Roediger (2010) presented learners with lists consisting of words from different taxonomic categories (e.g., vegetables, clothing) either for eight blocks of study trials or for four blocks of study trials with each trial followed by a practice free-recall test. Replicating basic testing effects, final free recall 2 days later was greater when items had received practice tests (39%) than when they had only been studied (17%). Importantly, the practice test condition also outperformed the study condition on secondary measures primarily tapping organizational processing and idiosyncratic processing.

8.2 How general are the effects of practice testing?

8.3 Effects in representative educational contexts.

8.4 Issues for implementation.

Practice testing appears to be relatively reasonable with respect to time demands. Most research has shown effects of practice testing when the amount of time allotted for practice testing is modest and is equated with the time allotted for restudying. Another merit of practice testing is that it can be implemented with minimal training. Students can engage in recall-based self-testing in a relatively straightforward fashion. For example, students can self-test via cued recall 

More structured forms of practice testing (e.g., multiple-choice, short-answer, and fill-in-the-blank tests) are often readily available to students via practice problems or questions included at the end of textbook chapters or in the electronic supplemental materials that accompany many textbooks. With that said, students would likely benefit from some basic instruction on how to most effectively use practice tests, given that the benefits of testing depend on the kind of test, dosage, and timing. As described above, practice testing is particularly advantageous when it involves retrieval and is continued until items are answered correctly more than once within and across practice sessions, and with longer as opposed to shorter intervals between trials or sessions.
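One way a student might put the dosage recommendation into practice is a self-testing loop that keeps retesting each item until it has been recalled correctly more than once. The sketch below is only an illustration of that idea, not a procedure taken from any of the studies reviewed here. The function name, the criterion of two correct recalls, the simulated learner, and the two French vocabulary cards are all assumptions made for the example.

```python
from collections import deque
from typing import Callable, Dict


def self_test_session(
    cards: Dict[str, str],
    recall: Callable[[str], str],
    criterion: int = 2,
    max_trials: int = 1000,
) -> Dict[str, int]:
    """Cycle through cue-target pairs until each is recalled `criterion` times.

    Returns the number of test trials each cue needed within the session.
    """
    correct = {cue: 0 for cue in cards}
    trials = {cue: 0 for cue in cards}
    queue = deque(cards)
    total = 0
    while queue and total < max_trials:
        cue = queue.popleft()
        trials[cue] += 1
        total += 1
        if recall(cue).strip().lower() == cards[cue].lower():
            correct[cue] += 1
        else:
            # Feedback: present the correct answer after an error.
            print(f"Feedback: {cue} -> {cards[cue]}")
        if correct[cue] < criterion:
            queue.append(cue)  # retest later, after the other items
    return trials


# Hypothetical materials and a simulated learner who fails the first
# attempt at each cue and succeeds on every later attempt.
cards = {"la dent": "tooth", "la clef": "key"}
attempted = set()


def simulated_learner(cue: str) -> str:
    if cue in attempted:
        return cards[cue]
    attempted.add(cue)
    return ""


result = self_test_session(cards, simulated_learner)
print(result)
```

Sending an unmastered item to the back of the queue places other items between its successive tests, a simple way to lengthen the interval between trials within a session. Spacing across sessions would require repeating the procedure on later days.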

Concerning the effectiveness of practice testing relative to other learning techniques, a few studies have shown benefits of practice testing over concept mapping, note-taking, and imagery use (Fritz et al., 2007; Karpicke & Blunt, 2011; McDaniel et al., 2009; Neuschatz, Preston, Toglia, & Neuschatz, 2005), but the most frequent comparisons have involved pitting practice testing against unguided restudy. The modal outcome is that practice testing outperforms restudying, although this effect depends somewhat on the extent to which practice tests are accompanied by feedback involving presentation of the correct answer. Although many studies have shown that testing alone outperforms restudy, some studies have failed to find this advantage (in most of these cases, accuracy on the practice test has been relatively low). In contrast, the advantage of practice testing with feedback over restudy is extremely robust. Practice testing with feedback also consistently outperforms practice testing alone.

Another reason to recommend the implementation of feedback with practice testing is that it protects against perseveration errors when students respond incorrectly on a practice test. For example, Butler and Roediger (2008) found that a multiple-choice practice test increased intrusions of false alternatives on a final cued-recall test when no feedback was provided, whereas no such increase was observed when feedback was given. Fortunately, the corrective effect of feedback does not require that it be presented immediately after the practice test. Metcalfe et al. (2009) found that final-test performance for initially incorrect responses was actually better when feedback had been delayed than when it had been immediate. Also encouraging is evidence suggesting that feedback is particularly effective for correcting high-confidence errors (e.g., Butterfield & Metcalfe, 2001).

Finally, we note that the effects of practice-test errors on subsequent performance tend to be relatively small, often do not obtain, and are heavily outweighed by the positive benefits of testing (e.g., Fazio et al., 2010; Kang, Pashler, et al., 2011; Roediger & Marsh, 2005). Thus, potential concerns about errors do not constitute a serious issue for implementation, particularly when feedback is provided.

Finally, although we have focused on students’ use of practice testing, in keeping with the purpose of this monograph, we briefly note that instructors can also support student learning by increasing the use of low-stakes or no-stakes practice testing in the classroom. Several studies have also reported positive outcomes from administering summative assessments that are shorter and more frequent rather than longer and less frequent (e.g., one exam per week rather than only two or three exams per semester), not only for learning outcomes but also on students’ ratings of factors such as course satisfaction and preference for more frequent testing (e.g., Keys, 1934; Kika, McLaughlin, & Dixon, 1992; Leeming, 2002; for a review, see Bangert-Drowns, Kulik, & Kulik, 1991).

8.5 Practice testing: Overall assessment.

9 Distributed practice

To illustrate the issues involved, we begin with a description of a classic experiment on distributed practice, in which students learned translations of Spanish words to criterion in an original session (Bahrick, 1979). Students then participated in six additional sessions in which they had the chance to retrieve and relearn the translations (feedback was provided). Figure 10 presents results from this study. In the zero-spacing condition (represented by the circles in Fig. 10), the learning sessions were back-to-back, and learning was rapid across the six massed sessions. In the 1-day condition (represented by the squares in Fig. 10), learning sessions were spaced 1 day apart, resulting in slightly more forgetting across sessions (i.e., lower performance on the initial test in each session) than in the zero-spacing condition, but students in the 1-day condition still obtained almost perfect accuracy by the sixth session. In contrast, when learning sessions were separated by 30 days, forgetting was much greater across sessions, and initial test performance did not reach the level observed in the other two conditions, even after six sessions (see triangles in Fig. 10). The key point for our present purposes is that the pattern reversed on the final test 30 days later, such that the best retention of the translations was observed in the condition in which relearning sessions had been separated by 30 days. That is, the condition with the most intersession forgetting yielded the greatest long-term retention. Spaced practice (1 day or 30 days) was superior to massed practice (0 days), and the benefit was greater following a longer lag (30 days) than a shorter lag (1 day).
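The three spacing conditions can be laid out as calendars to make the design concrete. The following is a rough sketch under stated assumptions: the start date is arbitrary, the function name is hypothetical, and the final test is assumed to fall 30 days after the last relearning session in every condition. It reproduces only the timing of the sessions, not any of the results.

```python
from datetime import date, timedelta
from typing import List


def relearning_schedule(
    original_session: date, lag_days: int, n_relearning: int = 6
) -> List[date]:
    """Dates of the relearning sessions that follow the original session."""
    return [
        original_session + timedelta(days=lag_days * k)
        for k in range(1, n_relearning + 1)
    ]


start = date(2024, 1, 1)  # hypothetical date of the original learning session
FINAL_TEST_DELAY = timedelta(days=30)  # assumed to run from the last session

schedules = {}
for lag in (0, 1, 30):  # zero-spacing, 1-day, and 30-day conditions
    sessions = relearning_schedule(start, lag)
    final_test = sessions[-1] + FINAL_TEST_DELAY
    schedules[lag] = (sessions, final_test)
    print(
        f"lag {lag:>2} days: last session on day "
        f"{(sessions[-1] - start).days}, final test on day "
        f"{(final_test - start).days}"
    )
```

Seen this way, the cost of the longest lag is plain: under these assumptions the 30-day condition spreads the same six relearning sessions over roughly half a year, whereas the zero-spacing condition completes them on the day of original learning.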

Many theories of distributed-practice effects have been proposed and tested. Consider some of the accounts currently under debate (for in-depth reviews, see Benjamin & Tullis, 2010; Cepeda et al., 2006). One theory invokes the idea of deficient processing, arguing that the processing of material during a second learning opportunity suffers when it is close in time to the original learning episode. Basically, students do not have to work very hard to reread notes or retrieve something from memory when they have just completed this same activity, and furthermore, they may be misled by the ease of this second task and think they know the material better than they really do (e.g., Bahrick & Hall, 2005). 

Another theory involves reminding; namely, the second presentation of to-be-learned material serves to remind the learner of the first learning opportunity, leading it to be retrieved, a process well known to enhance memory (see the Practice Testing section above). Some researchers also draw on consolidation in their explanations, positing that the second learning episode benefits from any consolidation of the first trace that has already happened. Given the relatively large magnitude of distributed-practice effects, it is plausible that multiple mechanisms may contribute to them; hence, particular theories often invoke different combinations of mechanisms to explain the effects.

9.2 How general are the effects of distributed practice?

9.3 Effects in representative educational contexts.

9.4 Issues for implementation.

Several obstacles may arise when implementing distributed practice in the classroom. Dempster and Farris (1990) made the interesting point that many textbooks do not encourage distributed learning, in that they lump related material together and do not review previously covered material in subsequent units. At least one formal content analysis of actual textbooks (specifically, elementary school mathematics textbooks; Stigler, Fuson, Ham, & Kim, 1986) supported this claim, showing that American textbooks grouped to-be-worked problems together (presumably at the end of chapters) as opposed to distributing them throughout the pages. These textbooks also contained less variability in sets of problems than did comparable textbooks from the former Soviet Union. Thus, one issue students face is that their study materials may not be set up in a way that encourages distributed practice.

A second issue involves how students naturally study. Michael (1991) used the term procrastination scallop to describe the typical study pattern—namely, that time spent studying increases as an exam approaches. Mawhinney, Bostow, Laws, Blumenfield, and Hopkins (1971) documented this pattern using volunteers who agreed to study in an observation room that allowed their time spent studying to be recorded. With daily testing, students studied for a consistent amount of time across sessions. But when testing occurred only once every 3 weeks, time spent studying increased across the interval, peaking right before the exam (Mawhinney et al., 1971). In other words, less frequent testing led to massed study immediately before the test, whereas daily testing effectively led to study that was distributed over time. The implication is that students will not necessarily engage in distributed study unless the situation forces them to do so; it is unclear whether this is because of practical constraints or because students do not understand the memorial benefits of distributed practice.

With regard to the issue of whether students understand the benefits of distributed practice, the data are not entirely definitive. Several laboratory studies have investigated students’ choices about whether to mass or space repeated studying of paired associates (e.g., GRE vocabulary words paired with their definitions). In such studies, students typically choose between restudying an item almost immediately after learning (massing) or restudying the item later in the same session (spacing). Although students do choose to mass their study under some conditions (e.g., Benjamin & Bird, 2006; Son, 2004), they typically choose to space their study of items (Pyc & Dunlosky, 2010; Toppino, Cohen, Davis, & Moors, 2009). 

This bias toward spacing does not necessarily mean that students understand the benefits of distributed practice per se (e.g., they may put off restudying a pair because they do not want to see it again immediately), and one study has shown that students rate their overall level of learning as higher after massed study than after spaced study, even when the students had experienced the benefits of spacing (e.g., Kornell & Bjork, 2008). Other recent studies have provided evidence that students are unaware of the benefits of practicing with longer, as opposed to shorter, lags (Pyc & Rawson, 2012b; Wissman et al., 2012).

In sum, because of practical constraints and students’ potential lack of awareness of the benefits of this technique, students may need some training and some convincing that distributed practice is a good way to learn and retain information. Simply experiencing the distributed-practice effect may not always be sufficient, but a demonstration paired with instruction about the effect may be more convincing to students (e.g., Balch, 2006).

9.5 Distributed practice: Overall assessment.

10 Interleaved practice

Interleaved practice, as opposed to blocked practice, is easily understood by considering a method used by Rohrer and Taylor (2007), which involved teaching college students to compute the volumes of different geometric solids. Students had two practice sessions, which were separated by 1 week. During each practice session, students were given tutorials on how to find the volume for four different kinds of geometric solids and completed 16 practice problems (4 for each solid). After the completion of each practice problem, the correct solution was shown for 10 seconds. Students in a blocked-practice condition first read a tutorial on finding the volume of a given solid, which was immediately followed by the four practice problems for that kind of solid. Practice solving volumes for a given solid was then followed by the tutorial and practice problems for the next kind of solid, and so on. Students in an interleaved-practice group first read all four tutorials and then completed all the practice problems, with the constraint that every set of four consecutive problems included one problem for each of the four kinds of solids. One week after the second practice session, all students took a criterion test in which they solved two novel problems for each of the four kinds of solids. Students’ percentages of correct responses during the practice sessions and during the criterion test are presented in Figure 13, which illustrates a typical interleaving effect: During practice, performance was better with blocked practice than interleaved practice, but this advantage dramatically reversed on the criterion test, such that interleaved practice boosted accuracy by 43%.
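
The two practice orders described above can be made concrete with a short sketch. This is only an illustration of the scheduling constraint, not the procedure actually used by Rohrer and Taylor (2007): the labels `solid_3` and `solid_4` are placeholders, the function names are hypothetical, "every set of four consecutive problems" is read here as successive non-overlapping sets of four, and the random order within each set is an assumption.

```python
import random

# "wedge" and "spheroid" are named in the text; the other two labels are placeholders.
SOLIDS = ["wedge", "spheroid", "solid_3", "solid_4"]
PROBLEMS_PER_SOLID = 4  # 16 practice problems per session, 4 for each solid


def blocked_schedule(solids=SOLIDS, per_solid=PROBLEMS_PER_SOLID):
    """Tutorial for one solid, then all of its problems, then the next solid."""
    schedule = []
    for solid in solids:
        schedule.append(("tutorial", solid))
        schedule.extend(("problem", solid) for _ in range(per_solid))
    return schedule


def interleaved_schedule(solids=SOLIDS, per_solid=PROBLEMS_PER_SOLID, seed=0):
    """All tutorials first, then problems in sets that each contain every solid once."""
    rng = random.Random(seed)  # seeded so the illustration is reproducible
    schedule = [("tutorial", solid) for solid in solids]
    for _ in range(per_solid):
        one_of_each = list(solids)
        rng.shuffle(one_of_each)  # assumed: order within a set is random
        schedule.extend(("problem", solid) for solid in one_of_each)
    return schedule


if __name__ == "__main__":
    print("Blocked:    ", [s for kind, s in blocked_schedule() if kind == "problem"])
    print("Interleaved:", [s for kind, s in interleaved_schedule() if kind == "problem"])
```

Both schedules contain the same tutorials and the same 16 problems; only the order differs, which is the manipulation of interest.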

One explanation for this impressive effect is that interleaving gave students practice at identifying which solution method (i.e., which of several different formulas) should be used for a given solid (see also Mayfield & Chase, 2002). Put differently, interleaved practice helps students to discriminate between the different kinds of problems so that they will be more likely to use the correct solution method for each one.

Accuracy during practice was greater for students who had received blocked practice than for students who had received interleaved practice, both for partial problems (99% vs. 68%, respectively) and for full problems (98% vs. 79%). By contrast, accuracy 1 day later was substantially higher for students who had received interleaved practice (77%) than for students who had received blocked practice (38%). As with Rohrer and Taylor (2007), a plausible explanation for this pattern is that interleaved practice helped students to discriminate between various kinds of problems and to learn the appropriate formula to apply for each one. This explanation was supported by a detailed analysis of errors the fourth graders made when solving the full problems during the criterion task. Fabrication errors involved cases in which students used a formula that was not originally trained (e.g., b × 8), whereas discrimination errors involved cases in which students used one of the four formulas that had been practiced but was not appropriate for a given problem. As shown in Figure 14, the two groups did not differ in fabrication errors, but discrimination errors were more common after blocked practice than after interleaved practice. Students who received interleaved practice apparently were better at discriminating among the kinds of problems and consistently applied the correct formula to each one.

How does interleaving produce these benefits? One explanation is that interleaved practice promotes organizational processing and item-specific processing because it allows students to more readily compare different kinds of problems. For instance, in Rohrer and Taylor (2007), it is possible that when students were solving for the volume of one kind of solid (e.g., a wedge) during interleaved practice, the solution method used for the immediately prior problem involving a different kind of solid (e.g., a spheroid) was still in working memory and hence encouraged a comparison of the two problems and their different formulas. Another possible explanation is based on the distributed retrieval from long-term memory that is afforded by interleaved practice. In particular, for blocked practice, the information relevant to completing a task (whether it be a solution to a problem or memory for a set of related items) should reside in working memory; hence, participants should not have to retrieve the solution. So, if a student completes a block of problems solving for volumes of wedges, the solution to each new problem will be readily available from working memory. By contrast, for interleaved practice, when the next type of problem is presented, the solution method for it must be retrieved from long-term memory. So, if a student has just solved for the volume of a wedge and then must solve for the volume of a spheroid, he or she must retrieve the formula for spheroids from memory. Such delayed practice testing would boost memory for the retrieved information (for details, see the Practice Testing section above). This retrieval-practice hypothesis and the discriminative-contrast hypothesis are not mutually exclusive, and other mechanisms may also contribute to the benefits of interleaved practice.

10.2 How general are the effects of interleaved practice?

10.3 Effects in representative educational contexts.

10.4 Issues for implementation.

Not only is the result from Mayfield and Chase (2002) promising, their procedure offers a tactic for the implementation of interleaved practice, both by teachers in the classroom and by students regulating their study (for a detailed discussion of implementation, see Rohrer, 2009). In particular, after a given kind of problem (or topic) has been introduced, practice should first focus on that particular problem. 

After the next kind of problem is introduced (e.g., during another lecture or study session), that problem should first be practiced, but it should be followed by extra practice that involves interleaving the current type of problem with others introduced during previous sessions. As each new type of problem is introduced, practice should be interleaved with practice for problems from other sessions that students will be expected to discriminate between (e.g., if the criterion test will involve a mixture of several types of problems, then these should be practiced in an interleaved manner during class or study sessions). 
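
The tactic just described (practice the newly introduced problem type first, then mix it with types from earlier sessions) can be sketched as a small planning routine. This is a hedged illustration rather than a procedure taken from Mayfield and Chase (2002) or Rohrer (2009): the function name `session_plan`, the parameters `focus_n` and `mixed_rounds`, and their default values are all assumptions chosen for the example.

```python
import random


def session_plan(earlier_topics, new_topic, focus_n=4, mixed_rounds=2, seed=0):
    """Return an ordered list of problem types for one study session.

    The session opens with `focus_n` problems on the new type only, then adds
    `mixed_rounds` interleaved rounds, each containing one problem of every
    type introduced so far (earlier types plus the new one) in shuffled order.
    """
    rng = random.Random(seed)  # seeded so the illustration is reproducible
    plan = [new_topic] * focus_n  # initial practice focused on the new type
    if earlier_topics:  # nothing to interleave with in the very first session
        pool = list(earlier_topics) + [new_topic]
        for _ in range(mixed_rounds):
            mixed = list(pool)
            rng.shuffle(mixed)
            plan.extend(mixed)
    return plan


if __name__ == "__main__":
    introduced = []
    for topic in ["type_A", "type_B", "type_C"]:  # hypothetical problem types
        print(topic, "->", session_plan(introduced, topic))
        introduced.append(topic)
```

With these assumed defaults, the first session contains only focused practice, and each later session ends with mixed rounds that grow as more problem types accumulate.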

Interleaved practice may take a bit more time to use than blocked practice, because solution times often slow during interleaved practice; even so, such slowing likely indicates the recruitment of other processes—such as discriminative contrast—that boost performance. Thus, teachers and students could integrate interleaved practice into their schedules without too much modification.

10.5 Interleaved practice: Overall recommendations.

Closing Remarks

Relative utility of the learning techniques

Implications for research on learning techniques

Implications for students, teachers, and student achievement

Pressley and colleagues (Pressley, 1986; Pressley, Goodchild, et al., 1989) developed a good-strategy-user model, according to which being a sophisticated strategy user involves “knowing the techniques that accomplish important life goals (i.e., strategies), knowing when and how to use those methods . . . and using those methods in combination with a rich network of nonstrategic knowledge that one possesses about the world” (p. 302). However, Pressley, Goodchild, et al. (1989) also noted that “many students are committed to ineffective strategies . . . moreover, there is not enough professional evaluation of techniques that are recommended in the literature, with many strategies oversold by proponents” (p. 301). We agree and hope that the current reviews will have a positive impact with respect to fostering further scientific evaluation of the techniques.

Concerning students’ commitment to ineffective strategies, recent surveys have indicated that students most often endorse the use of rereading and highlighting, two strategies that we found to have relatively low utility. Nevertheless, some students do report using practice testing, and these students appear to benefit from its use. For instance, Gurung (2005) had college students describe the strategies they used in preparing for classroom examinations in an introductory psychology course. The frequency of students’ reported use of practice testing was significantly correlated with their performance on a final exam (see also Hartwig & Dunlosky, 2012). Given that practice testing is relatively easy to use, students who do not currently use this technique should be able to incorporate it into their study routine.

Why don’t many students consistently use effective techniques? One possibility is that students are not instructed about which techniques are effective or how to use them effectively during formal schooling. Part of the problem may be that teachers themselves are not told about the efficacy of various learning techniques. Given that teachers would most likely learn about these techniques in classes on educational psychology, it is revealing that most of the techniques do not receive sufficient coverage in educational-psychology textbooks. We surveyed six textbooks (cited in the Introduction), and, except for mnemonics based on imagery (e.g., the keyword mnemonic), none of the techniques was covered by all of the books. Moreover, in the subset of textbooks that did describe one or more of these techniques, the coverage in most cases was relatively minimal, with a brief description of a given technique and relatively little guidance on its use, effectiveness, and limitations. Thus, many teachers are unlikely to be getting a sufficient introduction to which techniques work best and how to train students to use them.

A second problem may be that a premium is placed on teaching students content and critical thinking skills, whereas less time is spent teaching students to develop effective techniques and strategies to guide learning. As noted by McNamara (2010), “there is an overwhelming assumption in our educational system that the most important thing to deliver to students is content” (p. 341, italics in original). One concern here is that students who do well in earlier grades, in which learning is largely supervised, may struggle later, when they are expected to regulate much of their own learning, such as in high school or college. Teaching students to use these techniques would not take much time away from teaching content and would likely be most beneficial if the use of the techniques was consistently taught across multiple content areas, so that students could broadly experience their effects on learning and class grades. 

Even here, however, recommendations on how to train students to use the most effective techniques would benefit from further research. One key issue concerns the earliest age at which a given technique could (or should) be taught. Teachers can expect that upper elementary students should be capable of using many of the techniques, yet even these students may need some guidance on how to most effectively implement them. Certainly, identifying the age at which students have the self-regulatory capabilities to effectively use a technique (and how much training they would need to do so) is an important objective for future research. Another issue is how often students will need to be retrained or reminded to use the techniques to ensure that students will continue to use them when they are not instructed to do so. Given the promise of some of the learning techniques, research on professional development that involves training teachers to help students use the techniques would be valuable.

Beyond training students to use these techniques, teachers could also incorporate some of them into their lesson plans. For instance, 

Many students are being left behind by an educational system that some people believe is in crisis. Improving educational outcomes will require efforts on many fronts, but a central premise of this monograph is that one part of a solution involves helping students to better regulate their learning through the use of effective learning techniques. Fortunately, cognitive and educational psychologists have been developing and evaluating easy-to-use learning techniques that could help students achieve their learning goals. In this monograph, we discuss 10 learning techniques in detail and offer recommendations about their relative utility. We selected techniques that were expected to be relatively easy to use and hence could be adopted by many students. Also, some techniques (e.g., highlighting and rereading) were selected because students report relying heavily on them, which makes it especially important to examine how well they work. The techniques include elaborative interrogation, self-explanation, summarization, highlighting (or underlining), the keyword mnemonic, imagery use for text learning, rereading, practice testing, distributed practice, and interleaved practice. To offer recommendations about the relative utility of these techniques, we evaluated whether their benefits generalize across four categories of variables: learning conditions, student characteristics, materials, and criterion tasks. Learning conditions include aspects of the learning environment in which the technique is implemented, such as whether a student studies alone or with a group. Student characteristics include variables such as age, ability, and level of prior knowledge. Materials vary from simple concepts to mathematical problems to complicated science texts. Criterion tasks include different outcome measures that are relevant to student achievement, such as those tapping memory, problem solving, and comprehension. 

We attempted to provide thorough reviews for each technique, so this monograph is rather lengthy. However, we also wrote the monograph in a modular fashion, so it is easy to use. In particular, each review is divided into the following sections:

1. General description of the technique and why it should work
2. How general are the effects of this technique?
   2a. Learning conditions
   2b. Student characteristics
   2c. Materials
   2d. Criterion tasks
3. Effects in representative educational contexts
4. Issues for implementation
5. Overall assessment

The review for each technique can be read independently of the others, and particular variables of interest can be easily compared across techniques.

To foreshadow our final recommendations, the techniques vary widely with respect to their generalizability and promise for improving student learning. Practice testing and distributed practice received high utility assessments because they benefit learners of different ages and abilities and have been shown to boost students' performance across many criterion tasks and even in educational contexts. Elaborative interrogation, self-explanation, and interleaved practice received moderate utility assessments. The benefits of these techniques do generalize across some variables, yet despite their promise, they fell short of a high utility assessment because the evidence for their efficacy is limited. For instance, elaborative interrogation and self-explanation have not been adequately evaluated in educational contexts, and the benefits of interleaving have just begun to be systematically explored, so the ultimate effectiveness of these techniques is currently unknown. Nevertheless, the techniques that received moderate-utility ratings show enough promise for us to recommend their use in appropriate situations, which we describe in detail within the review of each technique.

Five techniques received a low utility assessment: summarization, highlighting, the keyword mnemonic, imagery use for text learning, and rereading. These techniques were rated as low utility for numerous reasons. Summarization and imagery use for text learning have been shown to help some students on some criterion tasks, yet the conditions under which these techniques produce benefits are limited, and much research is still needed to fully explore their overall effectiveness. The keyword mnemonic is difficult to implement in some contexts, and it appears to benefit students for a limited number of materials and for short retention intervals. Most students report rereading and highlighting, yet these techniques do not consistently boost students' performance, so other techniques should be used in their place (e.g., practice testing instead of rereading). Our hope is that this monograph will foster improvements in student learning, not only by showcasing which learning techniques are likely to have the most generalizable effects but also by encouraging researchers to continue investigating the most promising techniques. Accordingly, in our closing remarks, we discuss some issues for how these techniques could be implemented by teachers and students, and we highlight directions for future research.


