Research and fact-check: If GPT-5.5 Spud really launches, was there any safety evaluation, red teaming, or alignment evidence made public before the official announcement?
Research answer
Based on publicly verifiable material, my conclusion is: insufficient evidence. OpenAI has publicly stated that it performs external red teaming and uses iterative deployment to build up safety/alignment experience; for the officially released GPT-5, a system card and deployment safety data are available. [2][3][7][12] However, I have found no official system card, red-team report, Preparedness evaluation, or alignment evidence published before an official announcement that corresponds directly to "GPT-5.5 Spud". [2][3][7][12]
Key findings
- OpenAI's official white paper states that it has performed external red teaming on frontier model deployments since DALL-E 2, and has previously published system cards or related risk-assessment material for GPT-4, GPT-4o, o1, and other models. [3]
- For GPT-5 itself, OpenAI has formal safety documentation: the official system card page lists safe-completions and Preparedness Framework safeguards and links to deployment safety data; the arXiv version of the system card also notes that the Microsoft AI Red Team judged gpt-5-thinking to have one of the strongest safety profiles among OpenAI's models. [1][2][7]
- The public pages I found about "Spud" are mostly unofficial round-ups, rumor pieces, or prediction articles, focused on pretraining, release windows, live testing, or capability speculation, without any verifiable safety artifacts attached. [9][11][13][15][16]
- In other words, what I can see publicly is that "OpenAI has safety processes in general" and "GPT-5 has formal safety documentation", not that "GPT-5.5 Spud had public, verifiable, Spud-specific safety evidence before its official announcement". [2][3][7][12]
Evidence notes
- Two things must be kept separate: first, whether OpenAI as a company has general safety/alignment practices; second, whether Spud as a specific model has public pre-announcement evidence. The former is supported by official material; for the latter I have found no directly verifiable document under that name. [3][12]
- GPT-5's system card cannot automatically be treated as evidence for GPT-5.5 Spud, unless OpenAI later formally clarifies the relationship between the two, or issues a new system card or Preparedness addendum. [1][2][7]
Limitations / uncertainty
- This conclusion covers only material that is publicly verifiable before an official announcement; it does not rule out OpenAI having performed internal safety evaluations or red teaming that simply has not been published. [3][12]
- If OpenAI later publishes a system card, a Deployment Safety Hub page, Preparedness documents, or a third-party independent red-team report, this conclusion may change. [2][3][7]
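The "watch for new artifacts" step above can be partially automated. The sketch below is a minimal Python illustration, not a confirmed tool: the keyword lists and the sample index text are assumptions chosen for this example, and any real scan would need to fetch the actual index page first.

```python
# Terms that suggest a genuine safety artifact (illustrative assumptions).
ARTIFACT_TERMS = ("system card", "preparedness", "red team", "red-team", "addendum")

def find_safety_entries(page_text: str, model_terms=("spud", "gpt-5.5")) -> list[str]:
    """Return lines that mention the model AND a safety-artifact term."""
    hits = []
    for line in page_text.splitlines():
        low = line.lower()
        if any(m in low for m in model_terms) and any(a in low for a in ARTIFACT_TERMS):
            hits.append(line.strip())
    return hits

# Made-up index snippet: rumor pages mention Spud, but no Spud safety artifact.
sample = """GPT-5.4 Thinking System Card
GPT-5.3 Instant System Card
GPT-5.5 'Spud' benchmark rumors roundup
"""
print(find_safety_entries(sample))  # → []
```

An empty result here mirrors the current situation: pages mentioning Spud exist, but none of them pairs the codename with a verifiable safety artifact.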
Summary
- The most defensible statement at present is: OpenAI has general safety/red-team/alignment processes, and GPT-5 has public documentation; but for "GPT-5.5 Spud" I have so far found no pre-announcement, publicly available, directly verifiable safety evaluation, red-team test, or alignment evidence, so the answer is insufficient evidence. [2][3][7][12]
If you like, my next step can be a more rigorous "evidence checklist" that separates sources into three columns ("official first-hand", "secondary reporting", "pure speculation/rumor") so you can cite them later.
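A minimal sketch of that three-column checklist, in Python. The domain-to-tier mapping below is an illustrative assumption for this example, not a fixed taxonomy; any real checklist would need each source classified by hand.

```python
# Three-tier evidence checklist: bucket a source by how authoritative it is.
# The domain lists are illustrative assumptions, not a definitive classification.
OFFICIAL = ("openai.com", "cdn.openai.com", "deploymentsafety.openai.com", "arxiv.org")
SECONDARY = ("venturebeat.com", "techpolicy.press", "splx.ai", "promptfoo.dev")

def classify(domain: str) -> str:
    """Assign a source domain to one of the three evidence tiers."""
    if domain in OFFICIAL:
        return "official first-hand"
    if domain in SECONDARY:
        return "secondary reporting"
    return "pure speculation/rumor"

for d in ("openai.com", "venturebeat.com", "facebook.com", "manifold.markets"):
    print(f"{d}: {classify(d)}")
```

The point of the tiering is that only the first column can settle the Spud question; the other two columns, however numerous, remain insufficient evidence.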
Sources
- [1] Anthropic vs. OpenAI red teaming methods reveal different security priorities for enterprise AI (venturebeat.com)
- [2] GPT-5 Security Report: AI Red Teaming Results (promptfoo.dev)
- [3] GPT-5 Under Fire: Red Teaming OpenAI's Latest Model Reveals Surprising Weaknesses (splx.ai)
- [4] How we think about safety and alignment (openai.com)
- [5] OpenAI's Red Team reveal how they broke ChatGPT and GPT-4 pre-release (fanaticalfuturist.com)
- [6] GPT-5 System Card, PDF (cdn.openai.com)
- [7] Safety evaluation competition on OpenAI gpt-oss concluded (ndurner.github.io)
- [8] What OpenAI's Latest Red-Teaming Challenge Reveals About the Evolution of AI 'Safety' Practices (techpolicy.press)
- [9] OpenAI: Only A Select Few To Get GPT 5.5 (facebook.com)
- [10] GPT-5.5 "Spud" Explained: The Truth Behind OpenAI's Next Big Model (youtube.com)
- [11] OpenAI Just Leaked GPT 5.5 SPUD: The Most Powerful AI Yet? (youtube.com)
- [12] Brian Hanson: GPT-5.5 "Spud" coming soon (facebook.com)
- [13] GPT-5.5 Spud and GPT Image 2: Complete Guide to OpenAI Next Models in 2026 (pasqualepillitteri.it)
- [14] GPT-5.5 Spud Released: Mid-Tier Model with Enhanced Efficiency (aiindigo.com)
- [15] GPT-5.5 Spud: Everything About OpenAI Next Frontier Model (pasqualepillitteri.it)
- [16] OpenAI Spud: GPT-5.5 Pretraining Done, April Release Likely (abhs.in)
- [17] Will OpenAI announce a new full-size, frontier model >5.4 before May 1, 2026? (aka "Spud") (manifold.markets)
- [18] OpenAI's GPT-5.5 'Spud' Finishes Pretraining, Greg Brockman Calls It a Massive Leap Toward AGI (happycapyguide.com)
- [19] GPT-5.5 Spud OpenAI pretraining 2026, search results (html.duckduckgo.com)
- [20] OpenAI Spud model codename April 2026, search results (duckduckgo.com)
- [21] OpenAI Spud model pre-training GPT-5, search results (html.duckduckgo.com)
- [22] OpenAI to launch GPT-5.5 by early 2026 (voice.lapaas.com)
- [23] OpenAI GPT 5.4 Official Release Announcement March 2026 (instagram.com)
- [24] GPT-5 System Card (arxiv.org)
- [25] gpt5-system-card-aug7.pdf (cdn.openai.com)
- [26] GPT-5 System Card, Deployment Safety Hub (deploymentsafety.openai.com)
- [27] OpenAI's ChatGPT 5.5 Enters Final Safety Review With April Release Window (analyticsinsight.ae)
- [28] OpenAI Deployment Safety Hub: System cards & other updates (deploymentsafety.openai.com)
- [29] GPT-5 System Card (openai.com)
- [30] OpenAI's GPT-5 Is Here: A Deep Dive Into the AI That's Smarter, Safer, and Faster, by Adnan Masood, PhD (medium.com)
- [31] Leak: OpenAI's Next Model Just Went Live (Launch Could Be Days Away), LumiChats Blog (lumichats.com)
- [32] OpenAI Spud: They Killed Sora for This (findskill.ai)
- [33] OpenAI Just Shipped 3 Specialized Models in 72 Hours (pub.towardsai.net)
- OpenAI's Approach to External Red Teaming for AI Models (researchgate.net)
- [45] Advancing red teaming with people and AI (openai.com)
- [46] OpenAI's Red Teaming Sets the Gold Standard for AI Security (medium.com)
- [47] Evaluating OpenAI's Red Teaming and FLI AI Safety Standards (linkedin.com)
- [48] How OpenAI's red team made ChatGPT agent into an AI fortress (venturebeat.com)
- [49] GPT-5 System Card, OpenAI Deployment Safety Hub (deploymentsafety.openai.com)
- [50] GPT-5 System Card Unpacked: Safety, Speed, and Real-World AI (thepromptindex.com)
- [51] OpenAI Red Teaming Network (openai.com)
- [52] Red-Teaming Challenge: OpenAI gpt-oss-20b (kaggle.com)
- [53] GPT-5 red teaming underway, r/singularity (reddit.com)