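The Databricks AI research team argues that building data agents is harder than building coding agents: the latter have verifiable tests, while the former must find the "truth" across massive numbers of tables, documents, and dashboards. Their agent, Genie, reached 91.6% accuracy on enterprise data analysis tasks, far above the 32% achieved by the leading coding agent. The key approach combines specialized knowledge search, parallel thinking, and a multi-LLM architecture. According to the team, Genie has significantly changed how Databricks users work with data, and its accuracy is three times that of general-purpose agents.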
Super cool work from the Databricks AI research team.
Data agents are harder than coding agents. Coding agents have verifiable tests. Data agents have to find "truth" across millions of tables, docs, dashboards.
Databricks Genie reached 91.6% accuracy on enterprise data analysis tasks, while the leading coding agent got only 32%.
Specialized knowledge search + Parallel Thinking + Multi-LLM are the key.
Databricks has an amazing research team, and I've been enjoying working with them!