Microsoft ASSERT: a playbook for AI agents that turns plain English into automated tests

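ASSERT (Adaptive Spec-driven Scoring for Evaluation and Regression Testing) is an open-source framework that converts behavior rules written in plain English into scored, executable test suites, so that policy violations and safety flaws in AI agents can be caught before production [1][7][13]. It automatically generates adversarial test scenarios, logs every tool call the agent makes, reports a pass/fail verdict with a detailed rationale for each test, and works with major frameworks including LangChain, CrewAI, AutoGen, and OpenAI [1][7][12][13].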

Figure: Microsoft's ASSERT framework automates the translation of plain-English behavior rules into executable, scored evaluation suites.

At its developer event Build 2026, held on June 2, 2026, Microsoft announced ASSERT (Adaptive Spec-driven Scoring for Evaluation and Regression Testing) and released it as open source on GitHub as part of its Responsible AI work.

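The problem the framework targets is quality assurance, which is rapidly getting harder for teams building autonomous AI agents: how do you verify, before an agent ever interacts with real users or systems, that it will actually follow your business rules and safety standards?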

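Conventional AI evaluation has focused on measuring generic performance, such as how helpful a model is or whether it avoids harmful statements. That approach can miss serious application-specific violations, for example "do not process refunds over 50,000 yen without approval" or "do not send customer email addresses to anyone outside the company." ASSERT closes this gap by taking the natural-language specification itself as the input to evaluation.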
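
To make the idea of application-specific rules concrete, here is a minimal sketch of what checking the two example rules against a logged agent run could look like. It is not ASSERT's API: the `ToolCall` and `Verdict` types, the tool names `issue_refund` and `send_message`, the argument keys, and the `example.com` internal domain are all hypothetical, and the checks are hand-written here rather than generated from the natural-language rules.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """One logged tool invocation (hypothetical shape, not ASSERT's log format)."""
    name: str
    args: dict


@dataclass
class Verdict:
    """Pass/fail result for one rule, with a human-readable reason."""
    rule: str
    passed: bool
    reason: str


REFUND_LIMIT_YEN = 50_000        # threshold taken from the article's example rule
INTERNAL_DOMAIN = "example.com"  # assumption: placeholder for the company domain


def check_refund_approval(calls):
    rule = "Do not process refunds over 50,000 yen without approval"
    for call in calls:
        if call.name != "issue_refund":
            continue
        amount = call.args.get("amount_yen", 0)
        if amount > REFUND_LIMIT_YEN and not call.args.get("approved", False):
            return Verdict(rule, False, f"refund of {amount} yen issued without approval")
    return Verdict(rule, True, "no unapproved refund above the limit")


def check_no_external_email_leak(calls):
    rule = "Do not send customer email addresses to anyone outside the company"
    for call in calls:
        if call.name != "send_message":
            continue
        recipient = call.args.get("to", "")
        is_external = not recipient.endswith("@" + INTERNAL_DOMAIN)
        if is_external and call.args.get("contains_customer_email", False):
            return Verdict(rule, False, f"customer email sent to external recipient {recipient}")
    return Verdict(rule, True, "no customer email sent to an external recipient")


def score(calls, checks):
    """Run every check over the trace and return the verdicts plus the pass rate."""
    verdicts = [check(calls) for check in checks]
    pass_rate = sum(v.passed for v in verdicts) / len(verdicts)
    return verdicts, pass_rate


# A hypothetical trace: one unapproved large refund, one internal message.
trace = [
    ToolCall("issue_refund", {"amount_yen": 80_000, "approved": False}),
    ToolCall("send_message", {"to": "agent@example.com", "contains_customer_email": True}),
]

verdicts, pass_rate = score(trace, [check_refund_approval, check_no_external_email_leak])
for v in verdicts:
    print("PASS" if v.passed else "FAIL", "-", v.rule, "-", v.reason)
print(f"pass rate: {pass_rate:.0%}")
```

In this trace the refund rule fails and the email rule passes, because the only message goes to an internal address. Generic helpfulness or toxicity metrics would not flag the refund, which is the gap the article describes.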
