
Journal of Surgical Education

Volume 68, Issue 5, September–October 2011, Pages 408-413

Original report
Using Elements from an Acute Abdominal Pain Objective Structured Clinical Examination (OSCE) Leads to More Standardized Grading in the Surgical Clerkship for Third-Year Medical Students

https://doi.org/10.1016/j.jsurg.2011.05.008

Background

There is poor reliability in the Likert-based assessments of patient interaction and general knowledge base for medical students in the surgical clerkship. The Objective Structured Clinical Examination (OSCE) can be used to assess these competencies.

Objective

We hypothesize that using OSCE performance to replace the current Likert-based patient interaction and general knowledge base assessments will not affect the pass/fail rate for third-year medical students in the surgical clerkship.

Methods

In this retrospective study, third-year medical student clerkship data from a three-station acute abdominal pain OSCE were collected from the 2009–2010 academic year. New patient interaction and general knowledge base assessments were derived from the performance data and substituted for original assessments to generate new clerkship scores and ordinal grades. Two-sided nonparametric statistics were used for comparative analyses, using an α = 0.05.

Results

Seventy third-year medical students (50.0% female) were evaluated. A sign test showed a difference between the original (4.45/5) and the new (4.20/5) median patient interaction scores (p < 0.01). A sign test did not show a difference between the original (4.00/5) and the new (4.11/5) median general knowledge base scores (p = 0.28). Nine clerkship grades changed between the two grading schemes (p = 0.045), with an overall agreement of 87.1% and a kappa statistic of 0.81. There were no differences in the pass/fail rate (p > 0.99).
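
For readers who want to see the shape of these comparisons, the following Python sketch is ours, not the authors': a two-sided sign test on paired item scores, and percent agreement with Cohen's kappa on paired ordinal grades. The library calls (scipy.stats.binomtest, sklearn.metrics.cohen_kappa_score) and all values are illustrative assumptions; whether the study used a weighted or unweighted kappa is not stated here.

```python
# Minimal sketch of the comparisons described above; all data are illustrative.
import numpy as np
from scipy.stats import binomtest              # requires SciPy >= 1.7
from sklearn.metrics import cohen_kappa_score


def sign_test(original, new):
    """Two-sided sign test on paired scores; tied pairs are dropped."""
    diffs = np.asarray(new) - np.asarray(original)
    n_pos = int(np.sum(diffs > 0))
    n_neg = int(np.sum(diffs < 0))
    return binomtest(n_pos, n_pos + n_neg, p=0.5, alternative="two-sided").pvalue


# Hypothetical paired patient-interaction scores (original vs OSCE-derived)
original_pi = [4.5, 4.4, 4.6, 4.5, 4.3, 4.5]
new_pi = [4.2, 4.1, 4.3, 4.2, 4.0, 4.2]
print("sign test p =", sign_test(original_pi, new_pi))

# Hypothetical ordinal grades under the two grading schemes
grades_original = ["honors", "high pass", "pass", "high pass", "pass", "honors"]
grades_new = ["honors", "high pass", "low pass", "high pass", "pass", "honors"]
agreement = float(np.mean(np.array(grades_original) == np.array(grades_new)))
kappa = cohen_kappa_score(grades_original, grades_new)
print(f"agreement = {agreement:.1%}, unweighted kappa = {kappa:.2f}")
```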

Conclusions

We conclude that there is no difference in the pass/fail rate, but the new scheme yields a more standardized distribution of patient interaction assessments and makes use of the full spectrum of possible passing grades. We recommend that the current patient interaction assessment for third-year medical students in the surgical clerkship be replaced with the assessment obtained from trained standardized patients in this three-station acute abdominal pain OSCE.

Introduction

Most clerkship directors believe that an optimal grading system for medical students in the surgical clerkship should consist of 4 or 5 categories. Having fewer categories does not sufficiently discriminate among students, and the lowest grades are rarely assigned when more than 4 or 5 categories are used in clerkship grading.1 Letter grading systems provide good discrimination with high to moderate reliability.2

However, no agreement exists on the specific items on which medical students should be evaluated during the surgical clerkship. Poor reliability is found in the assessments of medical students across multiple domains. First, evidence indicates that faculty have poor reliability in assessing medical student skills and behaviors during the medical interview; data analysis suggests that faculty members score students on the basis of likability rather than on specific behavioral skills.3 Additionally, 1 study that used a 1–5 rating scale with 3 faculty raters to evaluate medical student interpersonal patient skills and fund of knowledge yielded reliabilities of 0.20 and 0.32, respectively.4 Moreover, evidence suggests that the assessment of clinical performance on the surgical clerkship may be related to and influenced by the specific services on which the students rotate.5, 6 Factor analysis of surgical evaluations by faculty suggests that faculty members make 1 global assessment rather than item-specific ratings of performance-based measures, such as relationship with patients and general knowledge.7

The National Board of Medical Examiners (NBME) Surgery Subject Examination is commonly used as an element of clerkship grading to test medical student general knowledge base objectively. However, medical student knowledge base (or fund of knowledge) assessments by faculty have been shown to correlate only minimally with performance on this examination (Pearson's coefficient: 0.108–0.24).8, 9, 10 Clinical assessment of general knowledge base for students in the surgery clerkship appears to be linked more closely to the assessment of student attitudes than to performance on an objective examination.8

At the University of Pittsburgh School of Medicine, the Surgery and Perioperative Care Clerkship has a 5-category grading system: honors, high pass, pass, low pass, and fail. Each rotation of students has different cutoff scores for the final grades, depending on the overall clerkship numerical score. The overall clerkship numerical score is a weighted percentage: 75% from the surgical portion (6 weeks) and 25% from the anesthesia portion (2 weeks). The surgical portion is determined as follows: service evaluations (55%), NBME Surgery Subject Examination (15%), and presentations (5%). The service evaluation is a weighted average of 10 student performance-based items, each comprising 5.5% of the overall clerkship numerical score. These items include Likert-based assessments, on a 1–5 scale, of medical student patient interaction (presents as an integral member of the health care team, gains confidence and trust, develops rapport, and demonstrates empathy and compassion) and general knowledge base (understands therapeutic interventions, diagnostic approach, and basic pathophysiology). The assessments are made by service faculty and residents with no formal training in student evaluation.
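
To make the weighting concrete, the following is a minimal sketch, not the clerkship's actual grading software. The normalization of each component to a 0–100 scale before weighting, and all example values, are assumptions for illustration.

```python
# Sketch of the overall clerkship numerical score described above (0-100 scale assumed):
# 75% surgical portion = service evaluations (55%) + NBME exam (15%) + presentations (5%);
# 25% anesthesia portion. Each of the 10 Likert items carries 5.5% of the overall score.

def clerkship_score(service_items, nbme, presentations, anesthesia):
    """service_items: ten 1-5 Likert ratings; nbme/presentations/anesthesia: 0-100."""
    if len(service_items) != 10:
        raise ValueError("expected 10 service-evaluation items")
    service = sum(item / 5 * 100 for item in service_items) / 10  # mean item, rescaled to 0-100
    return (0.55 * service        # 10 items x 5.5% each
            + 0.15 * nbme         # NBME Surgery Subject Examination
            + 0.05 * presentations
            + 0.25 * anesthesia)  # anesthesia portion (2 weeks)

# Hypothetical student: the first 2 items stand in for the patient interaction and
# general knowledge base ratings that the study replaces with OSCE-derived scores.
items = [4.5, 4.0, 4.2, 4.6, 4.8, 4.1, 4.3, 4.4, 4.0, 4.2]
print(round(clerkship_score(items, nbme=78, presentations=90, anesthesia=85), 1))
```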

During the surgical clerkship, medical students take an ungraded, 3-station acute abdominal pain Objective Structured Clinical Examination (OSCE). The students are presented a case of a patient with acute abdominal pain at each station. Students have 15 minutes to perform a focused history and physical examination. Standardized patients, with a minimum of 20–24 hours of dedicated training, assess the medical students regarding their patient interaction skills and history and physical examination skills. The students receive only formative feedback for this exercise from the standardized patients, surgery residents, and staff, as this OSCE is not used for summative purposes.

As a possible solution to unreliable medical student assessment by surgeons and residents, the OSCE has proven to be a reliable and valid modality to assess the 6 competencies defined by the Accreditation Council for Graduate Medical Education (ACGME). The competencies assessed by a well-constructed OSCE include Interpersonal and Communication Skills, Professionalism, and Medical Knowledge.11 The use of the OSCE for student assessment in the clinical clerkship is well established.12 In the third-year surgical clerkship, 1 study of an ungraded OSCE showed a high correlation of OSCE performance with the final clerkship grade: 71% of students who received high pass or honors clerkship grades had high OSCE scores, whereas 67% of students with low OSCE scores received poor or defer grades. The OSCE can evaluate clinical ability in an objective and standardized manner.13

At our institution, variability exists in the number of evaluations that students receive on each service in the Surgery and Perioperative Care Clerkship from untrained service faculty and residents (historical median = 3), and evidence in the literature indicates that Likert-based evaluations in the domains of patient interaction and general knowledge base are poorly reliable. Objective data from the 3 trained standardized patients during the OSCE session would lead to improved accuracy in assessing medical student patient interaction and general knowledge base. We hypothesize that retrospectively using OSCE performance to replace the current subjective Likert-based patient interaction and general knowledge base assessments will not affect the pass/fail rate for third-year medical students in the surgical clerkship. We also wanted to observe how much these 2 grading schemes would differ after replacing these 2 performance-based item assessments, and whether overall OSCE performance differed across the original ordinal grades the students received.
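
The planned comparison can be summarized in a small, self-contained sketch. The grade cutoffs and scores below are invented for illustration (the clerkship's actual cutoffs vary by rotation and are not reported here); the sketch only shows the shape of the regeneration step: map each regenerated numerical score to an ordinal grade and compare the resulting pass/fail rates.

```python
# Hypothetical sketch of regenerating ordinal grades from new numerical scores
# and comparing pass/fail rates; cutoffs and scores are illustrative only.
CUTOFFS = [(90.0, "honors"), (85.0, "high pass"), (78.0, "pass"), (70.0, "low pass")]

def to_grade(score):
    for cutoff, grade in CUTOFFS:
        if score >= cutoff:
            return grade
    return "fail"

def pass_rate(scores):
    grades = [to_grade(s) for s in scores]
    return sum(g != "fail" for g in grades) / len(grades)

original_scores = [91.2, 86.4, 79.3, 88.0, 72.5]   # from the original Likert items
new_scores      = [90.1, 85.2, 77.8, 87.6, 71.9]   # after substituting OSCE-derived items
print(pass_rate(original_scores), pass_rate(new_scores))
print([to_grade(s) for s in original_scores])
print([to_grade(s) for s in new_scores])
```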

Section snippets

Methods

In this retrospective study, clerkship grading data and performance assessments from the 3-station acute abdominal pain OSCE were collected from the 2009–2010 academic year. Inclusion criteria were that the student was a third-year medical student and that OSCE performance data were available. Exclusion criteria were that either the student was not a third-year student or OSCE performance data were unavailable. In all, 70 students satisfied the inclusion criteria.

The clerkship grading data for

Results

For the 2009–2010 academic year, 143 student clerkship grades were available. Of this group, 132 (92%) students were third-year medical students, whereas 11 students (8%) were fourth-year medical students. Of the third-year students, 70 students (53%) had OSCE performance data available. There were no differences between the number of female (n = 35) and male (n = 35) students in the inclusion criteria group (p > 0.99). There were no differences in the distribution of numerical scores between the groups

Discussion

We conclude that retrospectively using OSCE performance to replace the current subjective Likert-based patient interaction and general knowledge base assessments does not affect the pass/fail rate for third-year medical students in the surgical clerkship. The distribution of patient interaction assessments is more standardized. Additionally, passing grades for the third-year students now use the low pass grade assignment. We found that neither the time of year in which the students took the
