Answer published 3 days ago · Last edited 3 days ago · 19 sources

AI agents can't even handle basic biology? It turns out the data "plumbing" is what's broken

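A study by Anthropic, NCBI, the Broad Institute and CZI found that leading AI models are highly unreliable at retrieving viral sequence data: Claude Sonnet 4's accuracy can fall as low as 16.9%, and running the same query three times returned 106, 15 and 5 results, three completely different answers [3][16]. The culprit is not that the AI is stupid but that the biological data infrastructure as a whole lacks determinism. Databases such as NCBI Virus were designed for humans clicking buttons, not for programmatic access by AI, which leaves AI agents as awkward as someone "driving a Lamborghini through a medieval old town" [10][7].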


[Image: abstract illustration of a DNA helix intersecting with digital circuitry and database nodes, symbolizing the infrastructure gap between AI and biological data.] The gap between AI and biology is not a failure of intelligence but of infrastructure, a lesson made clear by new research from Anthropic and leading scientific institutions.

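While the whole world is busy hyping AI agents that can write code and do research on their own, a major study from Anthropic, NCBI, the Broad Institute and the Chan Zuckerberg Initiative (CZI) lands like a bucket of cold water over the head. The researchers found that today's top AI models, given a task that even a graduate student would consider simple (downloading viral DNA sequences from the public database NCBI Virus), got it wildly wrong, with accuracy dropping as low as 16.9%.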

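Even more striking, the problem is not that the AI is "stupid" but that the scientific community's data "plumbing" is badly out of date. The whole infrastructure is still designed for human researchers clicking through with a mouse, one click at a time; nobody planned for AI agents that need programmatic access. The research team offered an apt analogy: "Asking an AI agent to fetch data from a biological database is like driving a modern sports car through a medieval old town. The car is technically advanced, but the roads were never built for it."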

Why do AI agents hit a wall with biological databases?

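The research team led by Laura Luebbert tested several leading AI systems, including Claude, GPT, Biomni Open Source and Edison Analysis, on an everyday task for virologists: retrieving viral sequence data from the NCBI Virus database. The results were nothing short of a disaster.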
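To make the reported instability concrete, here is a minimal sketch that summarises how far repeated runs of one query disagree. It uses only the three result counts quoted in the summary (106, 15 and 5); the function name `run_spread` and the choice of metrics are illustrative assumptions, not the study's own evaluation method.

```python
def run_spread(counts):
    """Summarise how much repeated runs of the same query disagree.

    `counts` is the number of records returned by each run.
    Hypothetical helper for illustration; not the study's metric.
    """
    lo, hi = min(counts), max(counts)
    return {
        "min": lo,
        "max": hi,
        # Ratio of the largest to the smallest result set.
        "max_over_min": hi / lo if lo else float("inf"),
        # A deterministic pipeline would return the same count every time.
        "identical": lo == hi,
    }


# Record counts reported for one query run three times.
reported_counts = [106, 15, 5]
summary = run_spread(reported_counts)
print(summary)
```

With the reported figures, the largest run returns more than twenty times as many records as the smallest, which is the kind of spread a deterministic retrieval path should never produce.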

An interface built for humans leaves AI with no way in

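Like many other public biological databases, NCBI Virus has workflows designed for interactive, browser-based use. Scientists click options with a mouse, inspect results by eye and rely on visual cues to make judgments. That interface logic is a poor fit for the structured, programmatic instructions an AI agent needs.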
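As a rough illustration of what "structured, programmatic" access means in contrast to clicking through a browser, the sketch below builds an explicit search request in which every filter is a named parameter. It uses NCBI's public E-utilities `esearch` endpoint purely as an example; the article does not say the tested agents used this route, nor that it returns the same records as the NCBI Virus web interface. The helper name `build_search_url` and the example organism are assumptions.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint (used here only as an example;
# not stated in the article as the route the agents took).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(organism, retmax=20, retstart=0):
    """Build a search URL for nucleotide records of one organism.

    Hypothetical helper: every filter is spelled out as a parameter,
    so the same inputs always produce the same request.
    """
    params = {
        "db": "nuccore",                    # nucleotide database
        "term": f'"{organism}"[Organism]',  # Entrez organism field tag
        "retmode": "json",
        "retmax": retmax,                   # page size
        "retstart": retstart,               # page offset
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = build_search_url("Zika virus")
print(url)
```

The block only constructs the request and makes no network call. The point is that the query is fully described by its parameters, so it can be logged, repeated and compared across runs, which a sequence of mouse clicks cannot.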
