Toward Authentic Clinical Evaluation: Pitfalls in the Pursuit of Competency (Acad Med, 2010)

Shiphra Ginsburg, MD, MEd, Jodi McIlroy, PhD, Olga Oulanova, MA, Kevin Eva, PhD, and Glenn Regehr, PhD




Medical educators have struggled for decades with the question of how best to evaluate the clinical competence of residents. Interestingly, most evaluations of clinical performance still rely extensively on evaluators making judgments about trainees’ behaviors. The dominant solution to this conundrum has been to try to mitigate these subjective effects through standardization, so that there is some consensus about

  • what is being evaluated (e.g., what specific knowledge, attitudes, or skills are being assessed in a domain such as communication) and

  • what constitutes various levels of performance (e.g., what is meant by such terms as “outstanding performance,” “exceeds expectations,” and “needs improvement”).


At the same time, medical educators (and society) have moved toward the development of a more authentic representation of what it means to be a “good doctor.”


(CanMEDS)

(ACGME)


The project was also meant to assist programs to develop “useful, reliable, and valid methods for assessing attainment of the competencies.”


Despite these goals, a recent systematic review of the literature found no assessment methods that can reliably measure the competencies separately from one another as independent constructs.5 The authors concluded that it is not that the competencies themselves are “wrong” but that assessment measures do not correspond neatly with the framework. In addition, some of the competencies (like systems-based practice) are so dependent on other individuals and external forces that it may not be possible to evaluate a resident separate from the system in which the resident is functioning.


It may be that medical educators have blurred the distinction between using competencies as an educational framework to organize and guide learning, and attempting to translate them directly into evaluation tools. With this in mind, we sought to better understand the apparent tensions that exist between competency frameworks and faculty’s experience in the day-to-day evaluation of residents.



Method


Participants and interviews


Potential participants included all clinical faculty at two Canadian universities (University of Toronto and McMaster University) who had at least two years of experience in teaching and evaluating residents in internal medicine. Sampling was purposive, in that we initially targeted faculty in general internal medicine who attended on the general medical wards at any of our five main teaching hospitals, as they would likely have the most experience in the areas we were exploring.


Faculty attendings were invited to participate by e-mail. Each attending was interviewed for 30 to 60 minutes by the same trained research assistant according to a script developed by the research group. One pilot interview was conducted to test the script; some refinements were made, and that interview was not used in our analysis. During the interviews, attendings were asked to describe (without mentioning names) first a specific outstanding resident they had supervised, then a problematic resident, and finally an average resident. These descriptions could be about any aspect of performance, and there was no attempt to encourage discussion of any particular area. However, descriptions had to be of actual residents rather than generalized opinions. Probes were used where necessary to promote specific descriptions of behaviors (e.g., if the attending stated that the resident was “very professional,” the research assistant would ask, “How was that displayed?” or “What did you observe that led to that opinion?”). Probes were also used where necessary to identify areas in which excellent residents revealed deficiencies and problematic residents showed strength. The interviews were audiotaped and transcribed verbatim, with any potentially identifying features removed.



Analysis


Analysis of the interviews began alongside data collection...

    • to ensure the interviews were effectively eliciting the types of descriptions we had anticipated and 

    • to determine when theoretical saturation had been reached.6 

This occurred after 15 interviews were done at the first university and 4 at the second, resulting in a final sample of 19 interviews that were analyzed using grounded theory. We chose grounded theory for this analysis because we were attempting to develop a theoretical framework to describe how faculty actually thought—and talked—about their residents.7 Each researcher read the initial transcripts during the open coding process. We then met repeatedly as a group and refined the coding using constant comparison, where categories were further defined, merged, or deleted. Agreement was achieved through consensus, and discussions proceeded until the coding structure was deemed stable. It was then entered into NVivo software, which was used by the research assistant to code all 19 transcripts.8




Results


Analysis of the transcripts resulted in the identification of eight major domains, or themes, that together reflect what faculty attendings consider when forming opinions about their residents: knowledge, professionalism, patient interactions, team interactions, systems, disposition, trust, and impact on staff. Definitions and examples of these domains can be seen in Table 1, and the frequencies with which each was mentioned are presented graphically in Figure 1.




Domains of performance and how they were discussed


Our first major finding related to the nature of the domains of competence discussed and how they were incorporated into the overall impression of the resident. However, in their individual descriptions, attendings did not discuss every domain for every resident.


More interestingly, a domain could take on variable importance, depending on other areas of performance for that resident. Each of the themes could be discussed in either positive or negative terms, but this was not necessarily dependent on the type of resident being discussed.



Interestingly, despite such comments as “To be outstanding you have to have outstanding knowledge base, I think. You can be outstanding in everything else but if you don’t know enough internal medicine you can’t,” these relative deficiencies were most often in the area of knowledge base or knowledge translation (n = 9).


Furthermore, because knowledge itself was seen as being easily accessible (“You don’t know what it is, you Google it, you go on any of the online resources; most people have them on a handheld”), it was not considered by most to be a true marker of who is excellent.



Interestingly, three attendings brought up concerns about excellent residents who seemed “too invested” in their work and at risk of burning out.


In sum, attendings seemed to overlook, or excuse, deficiencies in residents they thought of as being outstanding, whereas competence or even excellence in some domains did not “save” other residents from being thought of as problematic. Attendings’ impressions did not result from a linear sum of dimensions; further, what was weighted most or least heavily in any one description seemed to be variable and idiosyncratic.


Relative prominence of themes


Our second finding relates to the relative frequencies of the themes, as depicted in Figure 1. Work ethic was by far the most frequently used code in the entire data set and was especially prominent when attendings discussed excellent residents.


Another stated, “He was available, he would always respond. He was proactive in anticipating problems. He did not wait for them to happen; he expected them to develop.”


“Noncompetency” constructs


Our third major finding was that attendings elaborated several constructs that affected their opinions of residents that were not in fact competencies at all. Consider, for example, the theme of disposition. Attendings frequently commented on residents’ attitudes and personality characteristics, as typified by this explanation of why one attending thought a resident was problematic:


Similarly, the theme of impact on staff evolved to capture comments attendings made in which their opinion of a resident was shaped by how that resident affected the faculty member’s life. Again, these comments did not describe a particular area of performance or competency but, rather, were offered as support or as explanation for attendings’ stated opinions.




Discussion


Developing assessment instruments to evaluate these “core competencies” has been difficult, as recently reported by Lurie et al.5 It seems the individual competencies cannot be evaluated separately from one another, and most assessments probably measure a single construct (or several that do not map neatly onto the framework, as supported by our findings).


One possible reason for these difficulties relates to a growing recognition that many of the desired competencies are in some ways socially determined. For example, an individual’s performance related to the ACGME competencies of practice-based learning or systems-based practice is dependent on interactions with other people and the environment. An individual’s contribution cannot be easily teased out.5 Perhaps more important, however, an underlying presupposition still seems to exist that there is a “true score” within an individual that can be measured accurately once the right tools are found.



That may be true for certain situations (like written exams to test knowledge), but the choice of assessment method should be determined by the educational context or by the purpose of the testing situation, not by a blind desire to be as objective or standardized as possible. Perhaps some of the difficulties in evaluating competencies in a clinical setting arise from the fact that the starting point is usually the competency one wants to assess, rather than the context in which it is being observed.



Second, others have suggested that faculty supervisors conceptualize trainees’ performance according to a set of meta-competencies, within which they consider an individual’s performance. For example, Bogo et al10 found that, as supervisors discussed their outstanding and problematic social work trainees, they would elevate (or discount) the relative importance of a particular domain, depending on their overall opinion of a given trainee.


In Bogo and colleagues’ study,10 these descriptions were framed as “but statements”; for instance, an exemplary student’s skills in a particular area needed work but the supervisor excused it, believing it was simply the result of a lack of formal training in that area. This can be explained by attribution theory, as the supervisor in this example attributed the deficiency to a lack of training.


Thus, as supported by our data, a weakness does not necessarily preclude a learner from being considered outstanding. As a corollary to this process, attendings were often dismissive of adequate (or even well-developed) areas of performance in learners they think of as problematic. Thus, consistent with research comparing scores from checklists versus global ratings,12 the overall impression of the resident is far from a simple linear addition of the various dimensions being assessed, and even a weighting of these dimensions would be unlikely to adequately capture the supervisor’s sense of the resident as a clinician-in-training.


We explicitly encouraged attendings to discuss residents’ performance in their own language, the way they would speak, for example, with their colleagues. They did not, therefore, address every construct for every resident. In contrast, evaluation instruments are usually designed so that the competencies are presented in a set order, giving approximately equal visual space to each. This order may reveal the residency program’s implicit beliefs about the relative importance of each competency, and the equal spacing implies that each should be considered equally for each resident. Our findings suggest that this visual rhetoric is inconsistent with the way faculty actually conceptualize and express their opinions about the performance of their residents.


Another critical theme that arose in our analysis was a resident’s impact on the attending.


Returning to the concerns of van der Vleuten et al about pitfalls in the pursuit of objectivity, in the setting of clinical teaching units, a more subjective approach to evaluation may actually be desirable. In an effort to objectify in this setting, we risk the loss of authenticity. We agree, therefore, that competency frameworks may best be thought of as “outside the realm of evaluation”; they are certainly very useful in guiding education, but they may not be the best place to start from for evaluation purposes. It is not that the competency frameworks are unimportant in assessment, but evaluation is more subtle than a sum of the various dimensions.


Further, as Hodges14 has suggested, any model of education and evaluation may result in hidden “side effects.” By overemphasizing what we explicitly choose to measure and count, we may fail to recognize, or in some cases may even create, incompetence.



The issues described in the preceding paragraphs cannot be resolved with simple tweaks to the evaluation forms. Differentially weighting the scales, for example, which is often suggested, will not work because it is not the case that one competency is always more important than another. The relative importance of a domain depends not only on the particular individual being described, but also on the particular evaluator, as it has also been shown that idiosyncrasies exist in terms of what individual faculty attendings value.15


Further, the act of abstracting from observations to interpretations and then translating into numbers on scales has been shown to be problematic, with a resulting loss of authenticity.16 Promising research in social work has found that evaluations using standardized narrative descriptions of residents’ performance, written in the language that clinical supervisors actually use, may be better at picking up borderline performance than traditional, structured evaluation forms.17





Conclusions



Assessment of residents’ performance in the clinical setting is still, despite concerted efforts to promote standardized competency frameworks, heavily influenced by the subjective. But this should not be considered a failure. Along with others, we have shown that, as faculty attendings, we cannot separate ourselves as human beings from the role we play as supervisors. Whether it is our demonstrated overreliance on person factors and underappreciation of the situation19,20 or the subjective opinions and emotional reactions we have about our learners,13,21 what affects us as human beings affects us as evaluators.


Further, as suggested by Leach,22 the relevance of evaluation is “dependent on an integrated version of the competencies, whereas measurement relies on a speciated version of the competencies. The paradox cannot be resolved easily. The more the competencies are specified, the less relevant to the whole they become.”









14 Hodges B. Medical education and the maintenance of incompetence. Med Teach. 2006;28:690–696.





 2010 May;85(5):780-6. doi: 10.1097/ACM.0b013e3181d73fb6.

Toward authentic clinical evaluation: pitfalls in the pursuit of competency.

Author information

Wilson Centre for Research in Education, University Health Network, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada. shiphra.ginsburg@utoronto.ca

Abstract

PURPOSE:

The drive toward competency-based education frameworks has created a tension between competing desires: for quantified, standardized measures on one hand, and for an authentic representation of what it means to be a good doctor on the other. The purpose of this study was to better understand the tensions that exist between competency frameworks and faculty's real-life experiences in evaluating residents.

METHOD:

Interviews were conducted with 19 experienced internal medicine attendings at two Canadian universities in 2007. Attendings each discussed a specific outstanding, average, and problematic resident they had supervised. Interviews were analyzed using grounded theory.

RESULTS:

Eight major themes emerged reflecting how faculty conceptualize residents' performance: knowledge, professionalism, patient interactions, team interactions, systems, disposition, trust, and impact on staff. Attendings' impressions of residents did not seem to result from a linear sum of dimensions; rather, domains idiosyncratically took on variable degrees of importance depending on the resident. Relative deficiencies in outstanding residents could be overlooked, whereas strengths in problematic residents could be discounted. Some constructs (e.g., impact on staff) were not competencies at all; rather, they seem to act as explanations or evidence of attendings' opinions. Standardized evaluation forms might constrain authentic depictions of residents' performance.

CONCLUSIONS:

Despite concerted efforts to create standardized, objective, competency-based evaluations, the assessment of residents' clinical performance still has a strong subjective influence. Attendings' holistic impressions should not be considered invalid simply because they are subjective. Instead, assessment methods should consider novel ways of accommodating these impressions to improve evaluation.

PMID: 20520025

DOI: 10.1097/ACM.0b013e3181d73fb6

