The APIs support grounding across multiple content types, including web pages, news, images, and videos. This positions Web IQ as a comprehensive web intelligence layer rather than a narrow text search tool.
Microsoft is making aggressive performance claims about Web IQ. Jordi Ribas, Microsoft's President of Search and AI, stated in an interview that the system achieves sub-165 millisecond P95 latency, meaning it responds to 95% of requests in less than 165 ms. The company also claims the system is roughly 2.5 times faster than the next best alternative on the market.
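To make the P95 figure concrete, the sketch below computes a nearest-rank percentile over a list of request latencies. The function name and the sample values are invented for illustration; they are not Microsoft measurements, and Microsoft has not said which percentile method it uses.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Invented latencies in milliseconds (illustrative only, not measured data).
latencies_ms = [80, 95, 110, 120, 90, 105, 130, 150, 100, 115,
                85, 125, 140, 98, 102, 118, 135, 160, 92, 400]

p95 = percentile_nearest_rank(latencies_ms, 95)
print(f"P95 = {p95} ms")
```

In this invented sample a single 400 ms outlier does not move the P95 above 165 ms, which is why a P95 claim says nothing about the slowest 5% of requests.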
On token efficiency, the design choice to return passages and structured evidence rather than full web pages is itself a significant optimization. Microsoft frames this as delivering "the best quality answers at the lowest cost", though the sources reviewed here include no specific token-savings benchmarks against competitors.
Web IQ is already in production, both inside and outside Microsoft. The APIs form the web grounding layer for Microsoft Copilot and also power web search grounding in OpenAI's ChatGPT. Jordi Ribas confirmed both integrations in media interviews around the Build 2026 launch, though he declined to name additional future customers at the time.
The API is part of Microsoft IQ, a broader intelligence layer that is now generally available across GitHub Copilot, Microsoft Foundry, and Copilot Studio. This means developers building agents on Microsoft's platform can tap into Web IQ for live web grounding alongside the other IQ pillars.
Web IQ is one of four interconnected capabilities under Microsoft IQ, a unified context layer designed to ground agents in both world knowledge and enterprise intelligence.
This platform approach means developers can build once and reuse trusted organizational context everywhere their agents run. An agent might use Work IQ to understand someone's email history, Fabric IQ to query a sales database, and Web IQ to pull in the latest news or market data, all through a consistent grounding layer.
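One way to picture a "consistent grounding layer" is a single retrieval interface that every pillar implements. The sketch below is a minimal illustration under that assumption: `Evidence`, `GroundingSource`, `StubSource`, and `ground` are hypothetical names, the sources are in-memory stubs with made-up records, and none of this reflects the actual Microsoft IQ SDK.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Evidence:
    source: str  # which pillar produced this evidence (hypothetical label)
    text: str    # short passage handed to the model


class GroundingSource(Protocol):
    """Hypothetical shared interface; not a real Microsoft IQ type."""
    name: str

    def retrieve(self, query: str) -> list[Evidence]: ...


class StubSource:
    """In-memory stand-in for a grounding pillar; not a real client."""

    def __init__(self, name: str, records: dict[str, str]):
        self.name = name
        self._records = records

    def retrieve(self, query: str) -> list[Evidence]:
        # Naive keyword match: return records whose key appears in the query.
        q = query.lower()
        return [Evidence(self.name, text)
                for key, text in self._records.items() if key in q]


def ground(query: str, sources: list[GroundingSource]) -> list[Evidence]:
    """Collect evidence from every source through the same interface."""
    evidence: list[Evidence] = []
    for source in sources:
        evidence.extend(source.retrieve(query))
    return evidence


# Made-up records standing in for email, sales, and web data.
sources = [
    StubSource("work", {"acme": "Email thread: Acme renewal is due next month."}),
    StubSource("fabric", {"acme": "Sales table: Acme spend is up year over year."}),
    StubSource("web", {"acme": "News: Acme announced a new product line."}),
]

for item in ground("Prepare a briefing on Acme", sources):
    print(f"[{item.source}] {item.text}")
```

Because the agent code depends only on the shared interface, adding or swapping a source does not change how evidence is collected.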
One of Web IQ's most consequential design decisions is what gets returned by the API. Traditional search returns documents. Web IQ returns passages and structured evidence objects.
Microsoft's reasoning is straightforward: "Models do not need documents, they need the right evidence". By stripping away everything except the relevant information, Web IQ reduces the token overhead of each retrieval call. This matters especially for agentic workflows, where a single task might involve dozens of sequential web lookups, each retrieving only the precise passage needed rather than a full page.
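A back-of-the-envelope calculation shows why this adds up across a chain of lookups. Every number below is an assumption picked for illustration (the lookup count, the page size, and the passage size); Microsoft has not published these figures, so treat the result as a shape, not a benchmark.

```python
def retrieval_token_cost(lookups: int, tokens_per_result: int) -> int:
    """Total tokens added to the model context by a chain of lookups."""
    if lookups < 0 or tokens_per_result < 0:
        raise ValueError("inputs must be non-negative")
    return lookups * tokens_per_result


# All three numbers are assumptions chosen for illustration only.
LOOKUPS_PER_TASK = 30        # stands in for "dozens of sequential web lookups"
TOKENS_PER_FULL_PAGE = 6000  # assumed size of one full page as text
TOKENS_PER_PASSAGE = 300     # assumed size of one targeted passage

full_page_cost = retrieval_token_cost(LOOKUPS_PER_TASK, TOKENS_PER_FULL_PAGE)
passage_cost = retrieval_token_cost(LOOKUPS_PER_TASK, TOKENS_PER_PASSAGE)

print(f"full pages: {full_page_cost} tokens")
print(f"passages:   {passage_cost} tokens")
print(f"reduction:  {full_page_cost // passage_cost}x")
```

Under these assumptions the ratio depends only on page size versus passage size, but the absolute number of tokens saved grows with every additional lookup in the task.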
The retrieval pipeline includes its own intelligence layer that reasons about how to search: what query variations to run, how many results to fetch, and when to stop deepening. This is a departure from simpler RAG implementations that treat search as a one-shot keyword-to-document pipeline.
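The three decisions named above (which query variations to run, how many results to fetch, when to stop) can be sketched as a small loop. This is a toy illustration, not Web IQ's actual pipeline: the corpus, `search`, `query_variations`, and `adaptive_retrieve` are all hypothetical, and the "planner" is a trivial word-splitting heuristic standing in for whatever reasoning the real system performs.

```python
# Tiny in-memory corpus; stands in for a real web index.
CORPUS = {
    "doc1": "Web grounding supplies models with fresh passages from the live web.",
    "doc2": "P95 latency is the value below which 95 percent of requests complete.",
    "doc3": "Agentic workflows may chain many sequential web lookups per task.",
    "doc4": "Structured evidence can reduce token overhead for retrieval calls.",
}


def search(query: str, limit: int) -> list[str]:
    """Toy keyword search over CORPUS; returns at most `limit` doc ids."""
    terms = set(query.lower().split())
    hits = [doc_id for doc_id, text in CORPUS.items()
            if terms & set(text.lower().rstrip(".").split())]
    return hits[:limit]


def query_variations(question: str) -> list[str]:
    """Hypothetical planner: the full question, then one query per longer word."""
    words = [w for w in question.lower().split() if len(w) > 4]
    return [question] + words


def adaptive_retrieve(question: str, needed: int = 2, limit: int = 2,
                      max_queries: int = 4) -> tuple[list[str], int]:
    """Run query variations until `needed` distinct documents are found
    or the query budget is spent. Returns (doc ids, queries used)."""
    found: list[str] = []
    used = 0
    for query in query_variations(question)[:max_queries]:
        used += 1
        for doc_id in search(query, limit):
            if doc_id not in found:
                found.append(doc_id)
        if len(found) >= needed:  # enough evidence: stop deepening
            break
    return found, used


docs, queries_used = adaptive_retrieve("token overhead latency", needed=2, limit=1)
print(docs, queries_used)
```

A one-shot pipeline would stop after the first query regardless of what came back; here the loop issues a second, narrower query only because the first returned too little evidence.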
Microsoft retired the Bing Search API v7 and Bing Custom Search APIs on August 11, 2025. After that date, existing instances were fully decommissioned and new signups were blocked.
The initial replacement path was Grounding with Bing Search inside Azure AI Agents, which wrapped Bing results inside a Microsoft-managed agent, a fundamentally different architecture from the old standalone REST API. Developers who needed direct search API access were pointed to third-party alternatives like Brave, DuckDuckGo, and Firecrawl.
Web IQ represents the next generation of that pivot. Rather than simply redirecting developers into the Azure AI Agent ecosystem, it provides a purpose-built grounding stack that repackages Bing's web crawling and indexing infrastructure for AI-native consumption. It is both a spiritual successor to the retired Bing APIs and an architectural departure from their human-oriented design.
Web IQ enters a market where multiple companies are racing to build the best web grounding infrastructure for AI systems, including Google, Brave, DuckDuckGo, Firecrawl, and Perplexity. Microsoft's bet, articulated through Web IQ, is that Bing's existing web-scale index, combined with a retrieval stack rebuilt specifically for AI consumption, can provide a competitive advantage in speed, token efficiency, and grounding quality.
The launch positions Microsoft not just as a provider of AI models through Azure and Copilot, but as a provider of the data infrastructure that AI systems need to stay connected to the live web. That infrastructure decision—whether to use Web IQ, an alternative provider, or in-house retrieval—will shape how agentic applications handle real-time information for years to come.