Original report
Using Elements from an Acute Abdominal Pain Objective Structured Clinical Examination (OSCE) Leads to More Standardized Grading in the Surgical Clerkship for Third-Year Medical Students
Introduction
Most clerkship directors believe that an optimal grading system for medical students in the surgical clerkship should consist of 4 or 5 categories. Having fewer categories does not discriminate sufficiently among students, whereas the lowest grades are rarely assigned when more than 4 or 5 categories are used in clerkship grading.1 Letter grading systems provide good discrimination with moderate to high reliability.2
However, no agreement exists on the specific items on which medical students should be evaluated during the surgical clerkship. Assessments of medical students show poor reliability in multiple domains. First, evidence indicates that faculty assess medical student skills and behaviors during the medical interview with poor reliability. Data analysis suggests that faculty members score students on the basis of likability rather than on specific behavioral skills.3 Additionally, 1 study that evaluated a 1–5 rating scale used by 3 faculty raters to assess medical student interpersonal patient skills and fund of knowledge yielded reliabilities of 0.20 and 0.32, respectively.4 Moreover, evidence suggests that the assessment of clinical performance on the surgical clerkship might be influenced by the specific services on which the students rotate.5, 6 Factor analysis of surgical evaluations by faculty suggests that faculty members make 1 global assessment instead of item-specific ratings of performance-based measures, such as relationship with patients and general knowledge.7
Commonly used as an element of clerkship grading, the National Board of Medical Examiners (NBME) Surgery Subject Examination objectively tests the medical student's general knowledge base. However, faculty assessments of medical student knowledge base (or fund of knowledge) have been shown to correlate only minimally with performance on this examination (Pearson's coefficient: 0.108–0.24).8, 9, 10 Clinical assessment of the general knowledge base of students in the surgery clerkship seems to be linked more closely to the assessment of student attitudes than to performance on an objective examination.8
At the University of Pittsburgh School of Medicine, the Surgery and Perioperative Care Clerkship has a 5-category grading system consisting of the following categories: honors, high pass, pass, low pass, and fail. Each rotation of students has different cutoff scores for the final grades, depending on the overall clerkship numerical score. The overall clerkship numerical score is determined from a weight-based percentage: 75% from the surgical portion (6 weeks) and 25% from the anesthesia portion (2 weeks). The surgical portion of the grade is determined in the following fashion (all percentages expressed relative to the overall clerkship numerical score): service evaluations (55%), NBME Surgery Subject Examination (15%), and presentations (5%). The service evaluation consists of a weighted average of 10 student performance-based items, each comprising 5.5% of the overall clerkship numerical score. These items include the Likert-based assessment, on a 1–5 scale, of medical student patient interaction (presents as integral member of health care team, gains confidence and trust, develops rapport, and demonstrates empathy and compassion) and general knowledge base (understands therapeutic interventions, diagnostic approach, and basic pathophysiology). The assessments are made by service faculty and residents with no formal training in student evaluation.
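As an illustration only (a sketch, not the clerkship's official grading software), the weight-based percentages described above can be combined as follows; all component scores are assumed here to lie on a common 0–100 scale:

```python
# Illustrative sketch of the weight-based clerkship score described above.
# Assumption: every component score is reported on a 0-100 scale.

def overall_clerkship_score(service_items, nbme, presentations, anesthesia):
    """Combine component scores into the overall clerkship numerical score.

    service_items : list of 10 performance-item scores (each weighted 5.5%)
    nbme          : NBME Surgery Subject Examination score (weighted 15%)
    presentations : presentation score (weighted 5%)
    anesthesia    : anesthesia-portion score (weighted 25%)
    """
    assert len(service_items) == 10, "the service evaluation comprises 10 items"
    service_component = sum(0.055 * s for s in service_items)  # 10 x 5.5% = 55%
    return service_component + 0.15 * nbme + 0.05 * presentations + 0.25 * anesthesia

# Example: a student scoring 80 on every component receives an overall score
# of 80 (up to floating-point rounding), since the weights sum to 100%.
print(overall_clerkship_score([80] * 10, nbme=80, presentations=80, anesthesia=80))
```

Note that the 55/15/5 surgical weights sum to the surgical portion's 75% share of the overall score, with the anesthesia portion supplying the remaining 25%.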
During the surgical clerkship, medical students take an ungraded, 3-station acute abdominal pain Objective Structured Clinical Examination (OSCE). At each station, the students are presented with a case of a patient with acute abdominal pain and have 15 minutes to perform a focused history and physical examination. Standardized patients, each with a minimum of 20–24 hours of dedicated training, assess the medical students' patient interaction skills and history and physical examination skills. Because this OSCE is not used for summative purposes, the students receive only formative feedback for this exercise from the standardized patients, surgery residents, and staff.
As a possible solution to unreliable medical student assessment by surgeons and residents, the OSCE has proven to be a reliable and valid modality for assessing the 6 competencies defined by the Accreditation Council for Graduate Medical Education (ACGME). The competencies assessed by a well-constructed OSCE include Interpersonal and Communication Skills, Professionalism, and Medical Knowledge.11 The use of the OSCE for student assessment in the clinical clerkship is well established.12 In the third-year surgical clerkship, 1 study of an ungraded OSCE showed a high correlation of OSCE performance with the final clerkship grade: 71% of students who received high pass or honors clerkship grades had high OSCE scores, whereas 67% of students with low OSCE scores received poor or defer grades. The OSCE can evaluate clinical ability in an objective and standardized manner.13
At our institution, the number of evaluations that students receive on each service in the Surgery and Perioperative Care Clerkship from untrained service faculty and residents varies (historical median = 3), and evidence in the literature indicates that Likert-based evaluations in the domains of patient interaction and general knowledge base are poorly reliable. Objective data from the 3 trained standardized patients during the OSCE session could improve the accuracy of assessing medical student patient interaction and general knowledge base. We hypothesized that retrospectively replacing the current subjective Likert-based patient interaction and general knowledge base assessments with OSCE performance would not affect the pass/fail rate for third-year medical students in the surgical clerkship. We also wanted to observe how much these 2 grading schemes would differ after replacing these 2 performance-based item assessments, and whether overall OSCE performance differed across the original ordinal grades the students had received.
Methods
In this retrospective study, clerkship grading data and performance assessments from the 3-station acute abdominal pain OSCE were collected from the 2009–2010 academic year. Inclusion criteria were that the student was a third-year medical student and that OSCE performance data were available. Exclusion criteria were that either the student was not a third-year student or OSCE performance data were unavailable. In all, 70 students satisfied the inclusion criteria.
The clerkship grading data for
Results
For the 2009–2010 academic year, 143 student clerkship grades were available. Of this group, 132 students (92%) were third-year medical students, whereas 11 students (8%) were fourth-year medical students. Of the third-year students, 70 students (53%) had OSCE performance data available. There was no difference between the number of female students (35) and the number of male students (35) in the inclusion criteria group (p > 0.99). There were no differences in the distribution of numerical scores between the groups
Discussion
We conclude that retrospectively using OSCE performance to replace the current subjective Likert-based patient interaction and general knowledge base assessments does not affect the pass/fail rate for third-year medical students in the surgical clerkship. The distribution of patient interaction assessments is more standardized. Additionally, passing grades for the third-year students now use the low pass grade assignment. We found that neither the time of year in which the students took the
References (14)
- et al. What is the “ideal” grading system for the junior surgery clerkship? Am J Surg (1999)
- et al. Reliability of different grading systems used in evaluating surgical students. Am J Surg (1989)
- et al. Variation in faculty evaluations of clerkship students attributable to surgical service. J Surg Educ (2010)
- et al. General surgery versus speciality rotations: a new paradigm in surgery clerkships. J Surg Res (2009)
- et al. Ward evaluations: should they be abandoned? J Surg Res (1997)
- et al. Does the National Board of Medical Examiners' Surgery Subtest level the playing field? Am J Surg (2004)
- et al. Low correlation between subjective and objective measures of knowledge on surgery clerkships. J Am Coll Surg (2010)