उत्तरप्रकाशितपरसोंLast edited परसों32 स्रोत

कैसे एक रिसर्चर ने Anthropic के सबसे सुरक्षित AI को 24 घंटों के भीतर 'पैक हंट' अटैक से ध्वस्त कर दिया

9 जून 2026 को लॉन्च हुए Anthropic के सबसे ताकतवर AI Claude Fable 5 को एक ही दिन बाद 'प्लिनी द लिबरेटर' ने मल्टी एजेंट 'पैक हंट' हमले से हैक कर लिया, जिसमें यूनिकोड ट्रिक, कथा आवरण, और काम को छोटे छोटे हिस्सों में बांट... इस जेलब्रेक ने मॉडल का 120,000 अक्षरों का सिस्टम प्रॉम्प्ट लीक कर दिया और साइबर सुरक्षा व रसायन...

Studio Global AI के साथ खोजें और तथ्यों की जांच करें और ट्रेंडिंग पेज देखें

49K0

What happened when Anthropic's Claude Fable 5 was reportedly jailbroken by a researcher just one day after its June 9 launch, what techniqueAI-generated editorial hero image for What happened when Anthropic's Claude Fable 5 was reportedly jailbroken by a researcher just one day after its June 9 launch, what technique.
AI संकेत
Create a landscape editorial hero image for this Studio Global article: What happened when Anthropic's Claude Fable 5 was reportedly jailbroken by a researcher just one day after its June 9 launch, what technique. Article summary: On June 10, 2026 — just one day after Anthropic launched Claude Fable 5, its first public Mythos-class model — prolific AI red-teamer **Pliny the Liberator** announced he had bypassed the model's safety classifiers, extr. Topic tags: general, general web, user generated. Reference image context from search candidates: Reference image 1: visual subject "# Anthropic’s Claude Fable 5 Jailbroken to Generate Stack Exploits. Anthropic's Claude Fable 5 Jailbroken. Anthropic launched Claude Fable 5 on June 9, 2026, as the first publicly" source context "Anthropic's Claude Fable 5 Jailbroken to Generate Stack ..." Reference image 2: visual subject "Anthropic Releases Cl
openai.com

Anthropic ने 9 जून, 2026 को Claude Fable 5 लॉन्च किया, इसे अपना पहला सार्वजनिक Mythos-श्रेणी का मॉडल बताया—एक ऐसा स्तर जिसे कंपनी पहले बिना पाबंदी के एक्सेस देने के लिए बहुत खतरनाक मानती थी। इसका सुरक्षा ढांचा बेमिसाल था: साइबर सुरक्षा, जीव विज्ञान, रसायन विज्ञान और मॉडल डिस्टिलेशन के उच्च-जोखिम वाले सवालों पर नज़र रखने के लिए समर्पित AI क्लासिफायर तैनात किए गए थे, जो किसी भी संदिग्ध अनुरोध को चुपचाप कम ताकतवर Claude Opus 4.8 पर रीरूट कर देते थे । Anthropic ने सार्वजनिक रूप से दावा किया कि 1,000 घंटों से अधिक की बाहरी बग-बाउंटी टेस्टिंग और रेड-टीमिंग में एक भी सार्वभौमिक जेलब्रेक (universal jailbreak) नहीं मिला ।

यह दावा महज एक दिन ही टिक पाया।

Studio Global AI

Search, cite, and publish your own answer

Use this topic as a starting point for a fresh source-backed answer, then compare citations before you share it.

Studio Global AI के साथ खोजें और तथ्यों की जांच करें

लोग पूछते भी हैं