
Why Is Anthropic's Claude Fable 5 Being Criticized as "Too Safe"?

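Anthropic's newly released AI model, Claude Fable 5, has drawn a wave of criticism from cybersecurity researchers, who say it refuses even to read a harmless blog post and is unusable for research [8]. At the core of the problem is a mechanism that, on detecting topics such as cybersecurity or biology, silently switches to the older Claude Opus 4.8 without notifying the user. Critics point out that the existence of this mechanism was buried in a sprawling 319-page system card [1][9].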


[Image: conceptual illustration of a locked digital shield representing AI safety guardrails, with data streams being filtered and diverted.] Anthropic's Claude Fable 5 uses aggressive, silent guardrails to keep its most powerful capabilities out of public hands, a move that has sparked intense debate in the cybersecurity community.

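On June 9, 2026, Anthropic released Claude Fable 5 as its most powerful AI model for the general public, but the launch has provoked a strong backlash from the cybersecurity industry. The company frames the release as a "responsible" way of making Mythos-class technology available, while security professionals argue that the built-in safety guardrails are so aggressive that even legitimate research and defensive work have become effectively impossible.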

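The target of the criticism is not the existence of safety features as such. The problem lies in how they are implemented: the filters are applied broadly, the user is never told, and requests are switched behind the scenes to a less capable model. The details of the controversy and the technology behind it are summarized below.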

An "Overly Broad Filter" Is Holding Research Back

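The biggest problem researchers report is that Fable 5's content classification is extremely oversensitive. Valentina "Chompie" Palmiotti, a prominent security researcher at IBM X-Force, told TechCrunch that the model refuses any request that looks even slightly related to cybersecurity, including harmless tasks such as reading a blog post. In other words, the filter catches not only dangerous questions but also queries that simply ask for help understanding basic cybersecurity concepts.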

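This overzealous classification severely undermines the model's usefulness. When a query is flagged, the user receives a weaker answer produced by the older model, and the switch is not explicitly disclosed. The way the behavior was disclosed added fuel to the criticism: it is documented only deep inside the 319-page system card, prompting accusations that Anthropic is engaging in "secret sabotage" by quietly downgrading the model's capabilities for certain users.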
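To make the reported behavior concrete, here is a minimal Python sketch of a topic-triggered fallback router. It is purely illustrative and is not Anthropic's implementation: the keyword lists, the `detect_topic` and `route` functions, the `RoutingDecision` type, and the model labels used as strings are all assumptions introduced for this example. A production system would presumably rely on a trained classifier, not substring matching.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword lists standing in for a real topic classifier.
SENSITIVE_KEYWORDS = {
    "cybersecurity": ("exploit", "malware", "vulnerability"),
    "biology": ("pathogen", "toxin"),
}

# Labels taken from the article's model names; not verified API identifiers.
PRIMARY_MODEL = "claude-fable-5"
FALLBACK_MODEL = "claude-opus-4.8"


@dataclass
class RoutingDecision:
    model: str                    # model that will answer the request
    flagged_topic: Optional[str]  # topic that triggered the fallback, if any
    disclosed: bool               # whether the user is told about the switch


def detect_topic(prompt: str) -> Optional[str]:
    """Return the first sensitive topic whose keyword appears in the prompt."""
    text = prompt.lower()
    for topic, keywords in SENSITIVE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return None


def route(prompt: str, disclose: bool = False) -> RoutingDecision:
    """Send flagged prompts to the fallback model; silent unless disclose=True."""
    topic = detect_topic(prompt)
    if topic is None:
        return RoutingDecision(PRIMARY_MODEL, None, disclosed=False)
    return RoutingDecision(FALLBACK_MODEL, topic, disclosed=disclose)
```

In this sketch, a harmless request such as `route("Summarize this blog post about a patched vulnerability")` is routed to the fallback model simply because it contains the word "vulnerability", which mirrors the false positives researchers describe. The `disclose` flag marks the design choice at the center of the criticism: with the default `False`, nothing in the decision tells the user that a switch occurred.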
