Answer published 4 days ago · Last edited 4 days ago · 19 sources

AI Agents Can't Even Handle Basic Biology Tasks? An Experiment That Exposes a Crisis in Scientific Data Infrastructure

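A landmark study by Anthropic, NCBI, the Broad Institute, and the Chan Zuckerberg Initiative found that top AI models fail badly at retrieving viral sequence data: because public databases lack reproducible interfaces, accuracy fell as low as 16.9%. The core problem is the non-determinism of biological data infrastructure. The same query, run three times on Claude Sonnet 4, returned 106, 15, and 5 results, and that inconsistency led a phylogenetic analysis to wrongly date the origin of an Ebola outbreak to 1922.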



Figure: Abstract illustration of a DNA helix intersecting with digital circuitry and database nodes, symbolizing the infrastructure gap between AI and biological data. Caption: The gap between AI and biology is not a failure of intelligence but of infrastructure, a lesson made clear by new research from Anthropic and leading scientific institutions.

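A high-profile collaboration between Anthropic, NCBI, the Broad Institute, and the Chan Zuckerberg Initiative (CZI) has exposed a "dirty secret" of AI-driven science: today's most capable AI agents are thoroughly unreliable at a very simple task, fetching viral sequence data from public databases. The study, published in June 2026, found that models such as Claude Sonnet 4 achieved accuracy as low as 16.9% on this routine work.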

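The culprit, however, is not a lack of intelligence in the AI but a problem with the "plumbing." This infrastructure was designed for humans clicking through web forms, not for autonomous agents. By building a deterministic retrieval layer called gget virus, the team raised accuracy to nearly 100%, showing that fixing the data pipeline is the fastest route to trustworthy AI biology.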
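To make "deterministic" concrete, here is a minimal Python sketch of one way to check whether a retrieval step is reproducible: run the same query several times and compare an order-independent fingerprint of the returned accession IDs. Everything in it is an illustrative assumption. The function names, the fake accession IDs, and the query label are hypothetical; the two fetchers are simulations that never contact a database; and the code does not use or reproduce the gget virus interface. The simulated result sizes (106, 15, 5) simply echo the run-to-run variation the study reports.

```python
import hashlib
from typing import Callable, Iterable, List


def fingerprint(accessions: Iterable[str]) -> str:
    """Order-independent SHA-256 digest of a set of accession IDs."""
    canonical = "\n".join(sorted(set(accessions)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reproducibility_report(
    fetch: Callable[[str], List[str]], query: str, runs: int = 3
) -> dict:
    """Run the same query `runs` times and compare what comes back."""
    results = [fetch(query) for _ in range(runs)]
    digests = {fingerprint(r) for r in results}
    return {
        "counts": [len(set(r)) for r in results],
        # Reproducible only if every run returned the same set of IDs.
        "reproducible": len(digests) == 1,
    }


def make_flaky_fetch(sizes: List[int]) -> Callable[[str], List[str]]:
    """Hypothetical stand-in for a non-deterministic retrieval step.

    Each call returns a different number of made-up accession IDs,
    taken in order from `sizes`. No real database is queried.
    """
    remaining = iter(sizes)

    def fetch(query: str) -> List[str]:
        n = next(remaining)
        return [f"ACC{i:05d}" for i in range(n)]

    return fetch


def deterministic_fetch(query: str) -> List[str]:
    """Hypothetical stand-in for a deterministic retrieval layer."""
    return [f"ACC{i:05d}" for i in range(106)]


# The query string is an arbitrary label; the simulated fetchers ignore it.
print(reproducibility_report(make_flaky_fetch([106, 15, 5]), "Zaire ebolavirus"))
print(reproducibility_report(deterministic_fetch, "Zaire ebolavirus"))
```

Comparing a fingerprint of the sorted ID set, rather than only the counts, also catches the case where two runs return the same number of records but different ones.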

Why do AI agents stumble on biological databases?

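Laura Luebbert and her colleagues frame the problem with a vivid analogy: asking an AI agent to navigate biological data is like driving a modern car through a medieval city. The vehicle is technologically advanced, but the roads were never designed for it.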

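The collaboration tested several leading AI systems, including Claude, GPT-based models, Biomni Open Source, and Edison Analysis, on the seemingly simple task of retrieving viral sequence data from the NCBI Virus database. NCBI Virus is a go-to resource for virologists tracking outbreaks and developing diagnostics, yet the results were striking.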

