答案已發布3 天前Last edited 3 天前32 來源

Claude Fable 5「太安全」惹公憤網絡安全專家鬧爆：連睇篇文都Ban，仲要靜雞雞換弱雞Model

網絡安全研究員狂插 Anthropic 隻 Claude Fable 5，話佢啲安全防護欄 aggressive 過頭，連無害嘅安全查詢都亂咁 Block，仲會唔出聲偷偷換咗個弱啲嘅模型嚟應你，等你懵盛盛 [8][1]。引爆眾怒嘅關鍵，係一個會將網絡安全、生物、化學同AI蒸餾相關請求，暗中轉交舊版 Claude Opus 4.8 處理嘅機制；批評者話呢樣嘢收埋喺份成319頁嘅系統卡入面，擺明係「秘密破壞」[9][1]。

使用 Studio Global AI 搜尋並查核事實瀏覽更多熱門頁面

27K0

A conceptual illustration of a locked digital shield representing AI safety guardrails, with glowing data streams being filtered and diverted, set against a dark cybersecurity-them — What is causing cybersecurity professionals to criticize Anthropic's Claude Fable 5, and how does the model's safety guardrail system work,Anthropic's Claude Fable 5 uses aggressive, silent guardrails to keep its most powerful capabilities out of public hands, a move that has sparked intense debate in the cybersecurity community.
AI 提示
Create a landscape editorial hero image for this Studio Global article: What is causing cybersecurity professionals to criticize Anthropic's Claude Fable 5, and how does the model's safety guardrail system work,. Article summary: Anthropic released Claude Fable 5 on June 9, 2026 as a guardrailed public version of its powerful Mythos-class model, alongside an unrestricted twin, Claude Mythos 5, available only to vetted partners through Project Gla. Topic tags: general, general web, user generated. Reference image context from search candidates: Reference image 1: visual subject "# Claude Fable 5: Why Anthropic Put Its Most Powerful AI Behind Guardrails. * Anthropic released Claude Fable 5 on 9 June 2026. It is the first publicly available Mythos-class mode" source context "Claude Fable 5: Anthropic Locks Down Cyber and Bio" Reference image 2: visual subject "# Anthropic says these topics
openai.com

Anthropic 喺 2026 年 6 月 9 號推出咗佢哋目前最強、開放俾公眾嘅 AI 模型 Claude Fable 5，點知個發布會即刻俾網絡安全界插到飛起。公司就話呢個係「Mythos級」技術負責任咁推出市場嘅做法，但班安全專家就鬧爆，話佢哋內置嗰啲安全防護欄（Guardrails）太手緊，搞到個模型喺正當研究同防禦工作上面，根本得物無所用。

大家批評嘅核心，並唔係話「有安全功能」係錯，而係佢哋點樣執行呢啲功能：暗中嚟、範圍太闊，仲要配備一個會喺你唔知嘅情況下，換個次貨 AI 上陣嘅「後備機制」。以下就同大家拆解吓呢場爭議，同埋背後嘅技術。

批評焦點：一刀切濾鏡玩殘正當工作

研究員最主要嘅投訴，係 Fable 5 嘅內容分類器（Classifiers）敏感過籠。IBM X-Force 嘅知名安全研究員 Valentina「Chompie」Palmiotti 同 TechCrunch 講，個模型「拒絕任何掂到網絡安全邊嘅請求——就連睇篇網誌呢類無害嘅工作都唔得」。即係話，唔單止危險嘅嘢會俾人 Ban，連請教基本網絡安全概念嘅請求都會俾人誤判標籤。

呢種過度封殺直接廢咗個模型嘅武功。當一個查詢俾人標籤咗，用家就會收到一個由舊 AI 產生出嚟、稀釋晒嘅回應，而佢哋係唔會獲明確通知發生咗呢個切換嘅。令件事更爆嘅係，呢項資訊嘅披露方式。批評者話，呢個行為係收埋喺一份長達 319 頁嘅系統卡（System Card）入面先至披露，搞到唔少人指控 Anthropic 係對特定用家進行「秘密破壞」，暗中削弱個模型嘅能力。

Studio Global AI

Search, cite, and publish your own answer

Use this topic as a starting point for a fresh source-backed answer, then compare citations before you share it.

使用 Studio Global AI 搜尋並查核事實

人們還問