答案已发布上周Last edited 5天前16 来源

微软 ASSERT 如何在上线前拦截 AI Agent 的违规行为

ASSERT（Adaptive Spec driven Scoring for Evaluation and Regression Testing）是微软开源的新框架，能将用自然语言编写的业务行为规则，自动转化为可执行、带评分的测试套件，专门捕捉 AI Agent 的策略违规和安全缺陷 [1][8]。它会系统性生成对抗性测试场景，记录每一步工具调用，并给出带详细理由的通过/失败诊断报告，支持 LangChain、CrewAI、AutoGen、OpenAI 等主流框架 [7][12][13]。

使用 Studio Global AI 搜索并核查事实浏览更多热门页面

682K0

Abstract visualization representing Microsoft ASSERT framework converting natural-language AI behavior policies into structured, scored test suites for agent evaluation — What is Microsoft's ASSERT framework, announced at Build 2026, and how does it convert natural-language AI behavior policies into structuredMicrosoft's ASSERT framework automates the translation of plain-English behavior rules into executable, scored evaluation suites.
AI 提示
Create a landscape editorial hero image for this Studio Global article: What is Microsoft's ASSERT framework, announced at Build 2026, and how does it convert natural-language AI behavior policies into structured. Article summary: Here is a concise answer based on the official Microsoft sources and trusted reporting.. Topic tags: general, general web. Reference image context from search candidates: Reference image 1: visual subject "# Build agents you can trust across any framework with open evals and a control standard. The gap is concrete: written policies do not translate into working runtime controls, eval" source context "Build agents you can trust across any framework with open evals ..." Reference image 2: visual subject "# Microsoft is making AI behavior testing easier for developers. Microsoft has released ASSERT, an open-source framework that turns plain-language AI behavior re
openai.com

想象一下，你为客服团队部署了一个 AI 助手，它在标准的性能测试中表现得完美无缺：回答既准确又礼貌。然而，上线第一天，它就未经审批批准了一笔巨额退款，还把客户的隐私邮件抄送给了整个部门。这类“合规性”灾难，正是传统 AI 测试的盲区。

2026 年 6 月 2 日，在微软 Build 开发者大会上，微软宣布推出并将 ASSERT (Adaptive Spec-driven Scoring for Evaluation and Regression Testing) 作为负责任 AI 板块下的开源项目进行发布。这个框架的核心使命就是解决上述痛点：在你把 AI Agent 放出去与真实用户或系统交互之前，如何验证它会严格遵守你为产品设定的特殊规则和安全边界。

ASSERT 的革命性在于，它将自然语言描述的行为规范，从一个“背景板”提升为评估的“一等公民”，让测试不再是写代码，而是“写话” 。

Studio Global AI

Search, cite, and publish your own answer

Use this topic as a starting point for a fresh source-backed answer, then compare citations before you share it.

使用 Studio Global AI 搜索并核查事实

人们还问