Inter-Rater Reliability of Select Emergency Medicine Milestones in Simulation

Objectives: In 2012, the ACGME established the Milestones in emergency medicine (EM) training to provide competency-based benchmarks for residency training. Small observational studies have shown variable correlation between faculty assessment and resident self-assessment. Using simulated clinical scenarios, we sought to determine (1) the correlation between resident self-assessment and faculty assessment of clinical competency using selected Milestones; and (2) the inter-rater reliability between EM faculty using both Milestone scoring and a critical actions checklist.

Methods: This is an observational study in which second-year EM residents at an urban academic medical center were assessed with two simulation cases focusing on management of cardiogenic shock and sepsis. Twenty-three residents completed both cases; they were assessed by two EM faculty in eight select Milestones (scored 1-5, in increments of 0.5) and with a checklist of critical actions to perform (each scored 0 or 1). Intra-class correlation coefficients (ICC) were used to compare Milestone scoring between faculty and to assess correlation between resident self-assessment and faculty scoring. Faculty checklist inter-observer agreement was assessed using kappa statistics. Correlation between Milestone achievement and checklist performance was assessed using Spearman and Pearson correlation coefficients.
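
For readers unfamiliar with the kappa statistic named above, the following is a minimal sketch of how agreement between two raters on binary checklist items can be computed. It assumes Cohen's kappa for two raters, which the abstract does not specify. The function name `cohen_kappa` and the example scores are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items (illustrative sketch)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, the product of the two raters' marginal rates.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category throughout; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical checklist scores (1 = action performed, 0 = not performed); not study data.
faculty_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
faculty_2 = [1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
print(round(cohen_kappa(faculty_1, faculty_2), 3))  # 0.524
```

In this made-up example the raters agree on 8 of 10 items (0.80 observed agreement), while 0.58 agreement would be expected by chance, giving kappa = (0.80 - 0.58) / (1 - 0.58), or about 0.52.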

Results: The ICCs for inter-rater agreement between faculty for Milestone level were 0.12 and 0.15 for the cardiogenic shock and sepsis cases, respectively. The ICC comparing resident self-assessment with the average of faculty Milestone level scoring for each case was 0.00. The inter-rater agreement on checklist items for the cardiogenic shock and sepsis cases had kappa coefficients of 0.83 and 0.78, respectively. Pearson and Spearman correlation coefficients comparing Milestone scoring and checklist items in the cardiogenic shock case were 0.27 and 0.29; in the sepsis case, 0.085 and -0.021.

Conclusion: When compared to critical action checklists, use of Milestones lacks consistency between faculty raters for simulation-based competency assessment. Resident self-assessment shows no correlation with faculty assessments.

Author(s): Kathleen Wittels, Michael E Abboud, Yuchiao Chang, Alexander Sheng, and James K Takayesu
