
Why Cybersecurity Experts Say Claude Fable 5 Is Too "Safe" to Use

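Cybersecurity researchers are sharply criticizing Claude Fable 5's guardrails for catching innocent requests: asking the model to read a blog post can trigger a safety flag, and even basic cybersecurity concepts get filtered. At the heart of the dispute is a "silent switch" mechanism: any request touching cybersecurity, biology, chemistry, or AI extraction is quietly answered by a weaker, older version, with no notice to the user. Anthropic simultaneously released the public, restricted Fable 5 and the full-strength Mythos 5, which is available only to trusted organizations. This model of tiering AI deployment by credentials is becoming the industry's new normal.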


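Anthropic released Claude Fable 5 on June 9, 2026, billing it as its most powerful AI model available to the public. The release quickly drew a strong backlash from the cybersecurity community. While the company framed it as a "responsible release," security experts argue that the built-in safety guardrails are so aggressive that the model is close to useless even for legitimate network defense and security research.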

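The criticism targets not the guardrails themselves but fatal flaws in how they are implemented: the filter is far too broad, and when it triggers, it silently swaps in a weaker, older model without telling the user. The controversy and its technical details are summarized below: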

The backlash: one-size-fits-all "safety" stifles legitimate research

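The most prominent voice is Valentina "Chompie" Palmiotti, a well-known researcher at IBM X-Force, who told TechCrunch that Fable 5 refuses "anything cyber-related, even just reading a blog post." In other words, the filter can block not only malicious requests but also attempts to understand and study basic cybersecurity concepts.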

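This false-positive rate is all the harder to forgive because of what it does to the user experience. When a query is flagged, the model does not refuse outright. Instead, it quietly hands the request to an older, less capable AI model to generate the answer, and the user is never told. Worse, the mechanism was not disclosed at launch. It was buried deep in a 319-page system card, and it provoked outrage only after researchers dug it out. Critics accuse Anthropic of "covert sabotage."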
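To make the complaint concrete, here is a minimal Python sketch of what a silent fallback router could look like. It is purely illustrative and is not Anthropic's implementation: the keyword lists, the model names `primary-model` and `older-fallback-model`, and the `disclose` switch are all hypothetical, and the article does not describe how the real filter classifies requests.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword lists. The article names the four topic areas,
# but says nothing about how the real filter detects them.
FLAGGED_TOPICS = {
    "cybersecurity": ("exploit", "malware", "vulnerability"),
    "biology": ("pathogen",),
    "chemistry": ("nerve agent",),
    "ai_extraction": ("distill", "model weights"),
}


@dataclass
class RoutingDecision:
    served_by: str                   # which (hypothetical) model answers
    flagged_topic: Optional[str]     # topic that triggered the filter, if any
    notice_to_user: Optional[str]    # None means the user is told nothing


def find_flagged_topic(prompt: str) -> Optional[str]:
    """Return the first topic whose keyword appears in the prompt."""
    lowered = prompt.lower()
    for topic, keywords in FLAGGED_TOPICS.items():
        if any(keyword in lowered for keyword in keywords):
            return topic
    return None


def route(prompt: str, disclose: bool = False) -> RoutingDecision:
    """Pick a model for the prompt; flagged prompts go to the fallback."""
    topic = find_flagged_topic(prompt)
    if topic is None:
        return RoutingDecision("primary-model", None, None)
    # The behavior critics object to is disclose=False: the request is
    # rerouted, but nothing in the response says so.
    notice = f"Answered by a fallback model (topic: {topic})." if disclose else None
    return RoutingDecision("older-fallback-model", topic, notice)
```

In this sketch, a harmless request such as "Summarize this blog post about a vulnerability disclosure" is rerouted simply because it contains a listed keyword, which mirrors the over-broad matching researchers describe. Calling `route(..., disclose=True)` shows the alternative critics are asking for: the same reroute, but with the user informed.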
