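A Google team ran a 9-month test of an AI symptom checker with nearly 14,000 Fitbit users. In blinded evaluation, clinicians ranked the AI diagnosis first in 53% of cases, significantly higher than the 24% for independent physicians. The core finding is not "AI beats doctors"; rather, it exposes a flaw in how current consumer LLMs (such as ChatGPT) answer directly from whatever the user types: diagnostic accuracy drops about 27% compared with a structured, AI-led interview. Wearables also picked up physiological changes such as rising heart rate and disrupted sleep days before users reported symptoms. This suggests that conversational AI that actively asks questions, combined with sensors that give early warning, is the direction for future medical diagnosis.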
Research scientists at Google just tested an AI symptom checker on 14,000 real patients over 9 months via Fitbit.
In blinded evaluation, clinicians ranked the AI diagnosis as #1 in 53% of cases. Independent physicians: 24%.
But the real finding isn't "AI beats doctors." It's that when users just type their symptoms and get an answer (the default mode of every consumer LLM right now), diagnostic accuracy drops ~27% compared to a structured AI-led interview.
ChatGPT, Claude, Gemini: none of them systematically interview users about their symptoms. They just respond. This study shows that's a measurable failure mode.
And then there's the second breakthrough: Fitbit data showed physiological shifts DAYS before users reported symptoms. Heart rate up, sleep disrupted, steps down, all visible before patients even opened the app.
Conversational AI that asks the right questions + wearable sensors that detect illness before you feel it. That's the exciting find here.