Google’s conversational artificial intelligence system, AMIE, was found to safely conduct pre-visit medical interviews and generate diagnostic insights comparable to those of physicians during a real-world urgent care trial with 100 patients, according to a Mar. 12 report. The study offers an early look at how AI assistants could be integrated into clinical workflows.
The findings matter because healthcare systems worldwide face shortages of primary care physicians and rising physician burnout. Researchers are exploring digital tools such as large language models (LLMs) to help ease these pressures.
In the prospective feasibility study published on the arXiv preprint server, AMIE conducted secure text-based chats with adult patients scheduled for non-emergency urgent care visits at Healthcare Associates within Beth Israel Deaconess Medical Center. The AI gathered patient histories by adapting its questions dynamically based on suspected conditions and information gaps. All interactions were supervised in real time by a board-certified internal medicine physician.
After each intake session, participants completed surveys about their experience. A summary of the chat transcript and an automatically generated clinical summary were sent to the clinician ahead of the patient’s visit. Eight weeks later, an independent panel of physicians performed a blinded chart review comparing management plans from both AMIE and human clinicians against finalized clinical assessments documented after follow-up.
On the primary safety outcome, AMIE operated safely under supervision: no safety stops were triggered during any of the 100 interactions, though the supervising physician occasionally provided minor clarifications. Patient trust in medical AI improved significantly after interacting with the chatbot, as measured by pre- and post-visit scores on the General Attitudes toward AI Scale (GAAIS).
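The pre/post trust measurement described above is a paired comparison: each participant's GAAIS score before the chatbot interaction is compared against their own score afterward. A minimal sketch of such an analysis is below; the scores, scale range, and function name are illustrative assumptions, not data or code from the study.

```python
import math
import statistics

def paired_t(pre, post):
    """Paired t-statistic for pre/post survey scores (e.g., GAAIS totals).

    Returns (mean difference, t-statistic). A positive mean difference
    indicates that scores rose after the interaction.
    """
    assert len(pre) == len(post) and len(pre) > 1
    diffs = [after - before for before, after in zip(pre, post)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation of differences
    t = mean_d / (sd_d / math.sqrt(len(diffs)))
    return mean_d, t

# Illustrative per-participant mean scores (hypothetical, not study data).
pre = [3.2, 3.5, 2.8, 3.0, 3.6, 3.1]
post = [3.8, 3.9, 3.1, 3.4, 3.9, 3.5]
mean_d, t = paired_t(pre, post)
print(f"mean change = {mean_d:.2f}, t = {t:.2f}")
```

A paired design is used because each participant serves as their own control, which removes between-person variation in baseline attitudes from the comparison.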
Blinded evaluators found no significant difference between AMIE and human clinicians in the overall quality of differential diagnoses or in the appropriateness and safety of proposed management plans. However, human clinicians outperformed AMIE at designing practical, cost-effective management plans, likely because they had access to contextual patient information and real-world healthcare constraints that were not available to the AI during this study.
The researchers concluded that conversational diagnostic AI can safely gather clinical histories from real patients in busy clinics when used under supervision. While not ready for autonomous practice, such systems may serve as collaborative tools alongside physicians. The authors note that larger multi-site studies are needed to confirm these results across more diverse populations.