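Training Large Language Models for Clinical Event Prediction
This study extends the Foresight Learning method to clinical prediction. Its core contribution is converting longitudinal clinical notes from the MIMIC-III dataset into training examples structured as context-question-label triples, automatically generating 6,900 prediction instances that span medications, procedures, mortality risk, and other dimensions. A lightweight LoRA adapter trained on these examples markedly improves the model's predictive performance and calibration: expected calibration error falls from 0.1269 to 0.0398, and the Brier score falls from 0.199 to 0.145. The method demonstrates a feasible path for extracting reusable predictive supervision from clinical text without hand-built structured features or dedicated classifiers.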
Longitudinal clinical notes contain rich evidence of how patients evolve over time, but converting this signal into training supervision for clinical prediction remains challenging. We extend Foresight Learning to clinical prediction by converting time-ordered MIMIC-III notes into examples consisting of past patient context, a natural-language question about a possible future event, and a label resolved from later documentation. This process yields 6,900 prediction examples from 702 admissions across medications, procedures, organ support, microbiology, and mortality. A small LoRA adapter trained on these examples improves over the prompted base model, reducing expected calibration error from 0.1269 to 0.0398 and Brier score from 0.199 to 0.145, while slightly outperforming GPT-5 point estimates on held-out questions. The approach enables reusable clinical prediction supervision from longitudinal notes without hand-engineered structured features or endpoint-specific classifiers.
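For readers unfamiliar with the two calibration metrics quoted above, the following is a minimal sketch of how a Brier score and an expected calibration error (ECE) are commonly computed for binary questions. It is not taken from the paper: the function names, the use of 10 equal-width probability bins, and the toy inputs are all assumptions made for illustration, and the authors' exact binning scheme may differ.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and the same length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Binned ECE: weighted average gap between mean predicted probability
    and observed positive rate, over equal-width bins (an assumed scheme)."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and the same length")
    n = len(probs)
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_prob = sum(p for p, _ in bucket) / len(bucket)
        positive_rate = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / n) * abs(mean_prob - positive_rate)
    return ece


# Toy inputs (hypothetical, not data from the study).
toy_probs = [0.9, 0.8, 0.2, 0.1]
toy_labels = [1, 1, 0, 0]
toy_brier = brier_score(toy_probs, toy_labels)
toy_ece = expected_calibration_error(toy_probs, toy_labels)
```

Lower is better for both metrics. The Brier score penalizes every individual probability that lands far from the outcome, while ECE only measures whether predicted probabilities match observed frequencies on average within each bin.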