A new initiative at Penn Medicine is aiming to use artificial intelligence (AI) to improve the training of medical students and residents. The project, called Clinical Reasoning Insights for Shaping Performance (CRISP), will leverage AI-enabled systems to provide data-driven feedback on clinical reasoning skills.
The CRISP project is led by faculty from the Perelman School of Medicine, including Jessica Dine, MD, MSHP; Janae Heath, MD, MSCE; Jennifer Kogan, MD; and Ilene Rosen, MD, MSCE. It has received a four-year, $1.1 million grant from the American Medical Association’s Transforming Lifelong Learning Through Precision Education Grant Program.
“Clinical reasoning is a pivotal piece of excellent patient care,” said Heath, associate program director for the Internal Medicine Residency Program and assistant professor of Pulmonary, Allergy and Critical Care. “And so an improvement in those skills, resulting in higher levels of expertise, is going to directly benefit our patients.”
Precision education seeks to tailor training to individual learners rather than applying a uniform approach. This aligns with Penn Medicine’s goal of personalizing learning throughout medical education.
“It’s incredibly exciting because not only is this the first true precision education at Penn, but I think it’s a really unique way to do precision education,” said Dine, associate dean of Assessment, Evaluation and Medical Education Research and a professor of Pulmonary, Allergy and Critical Care.
Penn Medicine has already implemented various AI tools aimed at improving clinician workflow and patient care. These include technology that synthesizes patient histories quickly and systems that assist with replying to patient messages or note-taking during visits.
Teaching clinical reasoning can be difficult since it often depends on unpredictable patient encounters and subjective assessments. Faculty typically cannot observe every interaction directly. New technologies now allow for more comprehensive assessment methods.
“Clinical reasoning is a really hard skill to capture well,” said Heath. “But over the last two years, there’s been an explosion of new technologies available...that have created this new opportunity to capture clinical reasoning in different domains.”
The CRISP initiative will involve undergraduate and graduate medical learners (medical students and residents) in Internal Medicine, Emergency Medicine, Surgery and Radiology. The objective is to develop a scalable tool that supports individualized coaching across specialties.
Penn Medicine’s integrated educational structure provides an environment conducive to testing these novel assessment models. Project leaders highlighted the collaborative culture across specialties as beneficial for innovation.
“Penn Medicine is thinking about education across the continuum and through a really innovative lens,” Heath said. “We have team members across the institution...who are pivotal to this project.”
During the first year of funding, researchers will create prototypes tailored to each specialty’s workflow. AI will be used in different ways depending on the department: analyzing ambient audio from clinician conversations or interviews in some departments, examining documentation in radiology, or tracking electronic health record interactions in surgery.
“This application of generative AI will allow us to both map the pondered clinical facts from those conversations to medical knowledge graphs as well as analyze linguistic markers of critical thinking,” said Danielle Mowery, PhD, MS, MS. “This framework will enable the system to provide both clinically grounded and data-driven feedback...about their diagnostic and therapeutic reasoning as well as their case review and presentation.”
Researchers plan to create profiles reflecting diagnostic skills at different stages of training, while keeping direct observation as part of instruction. They hope this approach will offer clearer insight into learner progress.
“The idea is to really make a difference in how learners progress through their clinical education and think about their own growth over time...” said Rosen.
Part of the skill analysis will look at word count and semantic richness in speech or documentation, since experienced clinicians tend toward concise summaries. It will also account for contextual factors such as work hours or team familiarity.
Kogan noted concerns about privacy: “It could feel very Big Brother pretty quickly...how do you deploy this in a way that is meant to give people useful information as opposed to feeling like you’re in a surveillance system?”
The team also aims for equity by monitoring potential biases within AI assessments. “We’ve built in a lot of equity monitoring,” Heath said. “We’re being really intentional about that piece...because we don’t want to contribute to problematic assessments.”
The project includes input from learners themselves via events such as an upcoming hackathon scheduled for March 13-15 at Penn.
“The learner voice is really critical,” Heath said. “Residents and medical students have a role...They’re instrumental in development and refining.”
Testing with several hundred trainees across the four specialties will begin in 2027, with broader implementation to follow based on initial feedback.
As part of its grant activities, CRISP participates in an AMA-led consortium in which teams regularly share updates with other grant recipients working on similar projects nationwide. The collaboration is ultimately intended to transform medical education standards across the country.
“By having those conversations we can accelerate that growth more than if we were all working in silos...” Kogan said.
Faculty hope CRISP may serve as a model for other institutions. “If our goal is to have a true learning health system then this is a great prototype...” Dine said.