Resource · Evaluation criteria

How to evaluate an AI control plane

A buyer's checklist for AI governance. The criteria a complete control plane has to meet across Connect, Control, Secure, and Observe, and where each category of tooling stops short.

By Cameron McClellan, Growth Engineer
Published

Every enterprise is now shopping for AI governance, and the market is a maze. LLM gateways, MCP gateways, identity providers, observability platforms, point security tools, and hyperscaler controls all claim to govern AI. Each one covers a different slice of the same problem, and most of them cover only one.

The useful question is not “which AI gateway.” It is whether a system gives a single governed path across every model call, every tool call, and every identity, with one correlated view of all three. That is the job of an AI control plane, and it is the standard this checklist evaluates against.

Use it to assess any vendor, or the patchwork of tools already in the building. The criteria are grouped by what the layer has to do (Connect, Control, Secure, and Observe), followed by how it deploys and how the vendor secures itself.

What is an AI control plane?

An AI control plane is the governing layer between every AI agent in an organization and every system it can reach, putting each model call, tool call, and identity on a single controlled path. It closes the gaps that open when AI spreads ungoverned: tools no one approved, prompts and tool calls that leak sensitive data or carry injection attacks, and compliance questions no team can answer.

In practice it does four things across that path:

  • connects every agent and tool onto one plane
  • controls who is allowed to do what
  • secures the content of every prompt and tool call
  • observes all of it as audit-grade evidence.

The payoff is a correlated view that no single-layer tool has: one place that ties together a model call, a tool call, and the identity behind them.

For the full reference architecture, see “What is an AI control plane?” The rest of this page is the checklist for evaluating one.

The AI control plane checklist

This checklist is for companies evaluating a vendor or reviewing a homegrown setup. The bullets below are a non-exhaustive list of the guardrails a company would typically want in place to govern its internal use of AI.

Connect: bring every agent and tool onto one plane

  • A central, per-team registry for MCP servers and tools
  • A catalog of pre-approved MCP servers for common SaaS tools
  • Support for importing existing MCP servers
  • Support for the creation of custom MCP servers
  • Token-efficient tool exposure, code-mode or similar
  • Open protocols: MCP, OAuth 2.1, OIDC, and SAML
  • AI agent support
    • Claude Code
    • Claude Desktop
    • Cursor
    • Codex
  • Browser-based assistants support
    • ChatGPT
    • Claude.ai
    • Gemini
  • AI embedded in SaaS (Slack AI, Notion AI)
  • Integration with provider compliance APIs (Anthropic, OpenAI) to govern hosted chat

Control: enforce who is allowed to do what

  • SSO-integrated identity for every agent, with SCIM provisioning
  • Role-based access with least-privilege tool scoping
  • Block and allow policies at the MCP server and individual tool level
  • Graduated responses, not just hard blocks, including warn-and-continue user nudges
  • Human-in-the-loop approval for high-risk or irreversible actions
  • Just-in-time, time-bound access grants
  • Versioned access policy enforced at runtime, with conditions on tool arguments, network, and request context
  • Session isolation, so an agent cannot expand access or diverge mid-session
  • Credential and secret management
  • Enforced per-team budgets and rate limits, not just metrics
  • Access revoked the moment an identity changes
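
Several of the Control criteria above (block and allow policies at the server and tool level, graduated responses, human-in-the-loop approval, and conditions on tool arguments) can be pictured as one small rule evaluator. The sketch below is illustrative only and is not any vendor's policy engine: the `Rule`, `ToolCall`, and `Decision` names, the first-match ordering, and the server and tool names ("github", "postgres", "run_query") are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"
    WARN = "warn"                          # warn-and-continue nudge
    REQUIRE_APPROVAL = "require_approval"  # human-in-the-loop
    BLOCK = "block"


@dataclass
class ToolCall:
    user_roles: set[str]
    server: str
    tool: str
    arguments: dict


@dataclass
class Rule:
    server: str          # MCP server name, or "*" for any server
    tool: str            # tool name, or "*" for any tool on that server
    roles: set[str]      # roles this rule applies to
    decision: Decision
    # Optional check on the tool arguments; defaults to "always matches".
    condition: Callable[[dict], bool] = lambda args: True


def evaluate(call: ToolCall, rules: list[Rule]) -> Decision:
    """Return the decision of the first matching rule; block if none match."""
    for rule in rules:
        if rule.server not in ("*", call.server):
            continue
        if rule.tool not in ("*", call.tool):
            continue
        if not (rule.roles & call.user_roles):
            continue
        if rule.condition(call.arguments):
            return rule.decision
    return Decision.BLOCK  # least privilege: anything not granted is denied


# Hypothetical rule set; specific rules are listed before broad ones.
RULES = [
    Rule("github", "delete_repo", {"engineer", "admin"}, Decision.REQUIRE_APPROVAL),
    Rule("postgres", "run_query", {"engineer"}, Decision.WARN,
         condition=lambda args: "prod" in args.get("database", "")),
    Rule("postgres", "run_query", {"engineer"}, Decision.ALLOW),
    Rule("github", "*", {"engineer"}, Decision.ALLOW),
]

call = ToolCall({"engineer"}, "postgres", "run_query", {"database": "prod-eu"})
print(evaluate(call, RULES).value)
```

The default of `BLOCK` when nothing matches is one way to express least-privilege scoping. A real system would also need versioning, time-bound grants, and network and request context, which this sketch leaves out.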

Secure: catch threats and data leaks

  • Real-time inspection of every prompt and tool call
  • Detection rules that apply in every direction: user input, assistant output, tool input, and tool result
  • A configurable detection rule catalogue, with custom rules alongside built-ins
  • Sensitive data detection, blocking, and masking
    • PII
    • Cardholder data (CHD)
    • Protected health information (PHI)
  • Prompt-injection detection and blocking
  • Destructive command detection and blocking
  • Shadow MCP detection and blocking
  • Tool-definition change detection and version pinning, to prevent tool poisoning
  • Risk scoring with severity ratings for MCP servers, tools, and skills
  • In-platform alerting on high-risk and malicious activity, with forwarding to a SIEM, Slack, and incident or ticketing tools (PagerDuty, Jira) by webhook
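
To make "detection rules that apply in every direction" concrete, here is a minimal sketch of a rule catalogue in which each rule declares which of the four directions it inspects and whether a hit is masked or blocked. Everything in it is an assumption for illustration: the `DetectionRule` shape, the `inspect` function, and the two regular expressions, which are far cruder than the detectors a production tool would use.

```python
import re
from dataclasses import dataclass

DIRECTIONS = ("user_input", "assistant_output", "tool_input", "tool_result")


@dataclass
class DetectionRule:
    name: str
    pattern: re.Pattern
    action: str                     # "mask" or "block"
    directions: tuple = DIRECTIONS  # by default a rule applies everywhere


# Hypothetical catalogue: one sensitive-data rule and one destructive-command rule.
RULES = [
    DetectionRule("email_address", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "mask"),
    DetectionRule("destructive_sql", re.compile(r"\bDROP\s+TABLE\b", re.IGNORECASE),
                  "block", directions=("tool_input",)),
]


def inspect(text: str, direction: str) -> tuple[str, str, list[str]]:
    """Return (verdict, text after masking, names of the rules that fired)."""
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction}")
    verdict, fired = "allow", []
    for rule in RULES:
        if direction not in rule.directions or not rule.pattern.search(text):
            continue
        fired.append(rule.name)
        if rule.action == "block":
            return "block", "", fired  # stop at the first blocking rule
        text = rule.pattern.sub(f"[{rule.name.upper()}]", text)
        verdict = "masked"
    return verdict, text, fired


print(inspect("Email alice@example.com the report", "user_input"))
print(inspect("DROP TABLE users", "tool_input"))
```

Scoping the destructive-command rule to `tool_input` while leaving the sensitive-data rule on all four directions shows why direction matters: the same text can be harmless in a prompt and dangerous as an argument to a tool.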

Observe: produce audit-grade evidence of AI activity

  • Full session capture: user input, assistant output, tool input, and tool result
  • Every interaction attributed to a user, MCP client, MCP server, and the skill used
  • Corporate versus personal account distinguished on each connection
  • A structured, queryable audit log that is tamper-evident, retained to policy, and exportable
  • Audit evidence for SOC 2 and the EU AI Act
  • Adoption and cost metrics by team, tool, and user
  • Cross-layer incident reconstruction, correlating the prompt, identity, tool call, and underlying API behavior in one place
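
"Tamper-evident" can be implemented in several ways. One common technique, shown here as a sketch and not as a description of any particular product, is a hash chain: each log entry stores the hash of the previous entry, so editing or deleting an earlier record breaks every hash after it. The `append` and `verify` helpers and the event fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list[dict], event: dict) -> None:
    """Append an event, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev_hash": prev_hash,
                "hash": _digest(prev_hash, event)})


def verify(log: list[dict]) -> bool:
    """Recompute the chain and report whether every entry is intact."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log: list[dict] = []
append(log, {"user": "alice@example.com", "server": "postgres", "tool": "run_query"})
append(log, {"user": "alice@example.com", "server": "github", "tool": "list_issues"})
print(verify(log))                       # chain is intact

log[0]["event"]["tool"] = "drop_table"   # simulate tampering with an old record
print(verify(log))                       # chain no longer verifies
```

A chain like this only detects edits if the latest hash is also stored somewhere the log writer cannot rewrite. Otherwise someone with write access could recompute the whole chain.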

Architecture and deployment criteria

  • Integration via agent plugins
  • Integration via MDM providers
  • Fail-open behavior when the service is unavailable
  • Support for OpenTelemetry
  • API integrations available
  • Self-hosting available
  • Regional deployments available
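
Fail-open behavior is easiest to evaluate when it is clear what it means in code. The sketch below assumes a hypothetical policy client that raises `PolicyServiceUnavailable` when the control plane cannot be reached. With `fail_open=True` the call proceeds and the gap is logged, and with `fail_open=False` it is denied. The function and exception names are invented for the example.

```python
import logging
from typing import Callable

logger = logging.getLogger("control_plane")


class PolicyServiceUnavailable(Exception):
    """Raised by the (hypothetical) policy client when the service is unreachable."""


def check_tool_call(check: Callable[[dict], bool], call: dict,
                    fail_open: bool = True) -> bool:
    """Return True if the call may proceed.

    A normal allow or deny from the policy check is always respected.
    fail_open only decides what happens when the check cannot run at all.
    """
    try:
        return check(call)
    except PolicyServiceUnavailable:
        logger.warning("policy service unavailable for tool %r; fail_open=%s",
                       call.get("tool"), fail_open)
        return fail_open


def unreachable(call: dict) -> bool:
    # Stand-in for a policy client during an outage.
    raise PolicyServiceUnavailable()


print(check_tool_call(unreachable, {"tool": "run_query"}))                   # proceeds
print(check_tool_call(unreachable, {"tool": "run_query"}, fail_open=False))  # denied
```

When comparing vendors, it helps to ask whether this behavior is configurable per tool or per risk level, and whether calls that passed during an outage are still recorded.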

Vendor security and compliance

  • ISO 27001 certification
  • SOC 2 Type 2 report
  • A recent third-party penetration test report
  • HIPAA coverage
  • GDPR coverage
  • Certificate of cyber insurance
  • Documented security policies
  • A published list of sub-processors
  • A bug bounty or responsible-disclosure program
  • A public status page with uptime history

The current ecosystem

Today, the job of an AI control plane is split across a fragmented set of point tools: an AI gateway, an MCP gateway, an identity provider, an observability platform, a content-inspection tool. The result is governance with seams in it. Policy is enforced inconsistently from one tool to the next, an incident that crosses layers leaves traces in several systems with no common thread, and the platform team is left to wire it all together and keep it in sync as the AI stack keeps changing.

We believe AI is better governed by a single integrated system, one that sees the model call, the tool call, and the identity behind them as parts of the same request. That end-to-end view is the only way to enforce policy consistently and catch threats wherever they surface.
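
As a rough illustration of what "parts of the same request" means in data terms, the sketch below groups events from three layers by a shared request ID and flags requests where a layer is missing. The event shape, the `request_id` field, and the layer names are assumptions. Separate point tools would each have to emit the same identifier for this join to work at all.

```python
from collections import defaultdict

REQUIRED_LAYERS = ("identity", "model_call", "tool_call")

# Hypothetical events, as they might arrive from three separate layers.
EVENTS = [
    {"request_id": "req-1", "layer": "identity", "user": "alice@example.com"},
    {"request_id": "req-1", "layer": "model_call", "model": "example-model"},
    {"request_id": "req-1", "layer": "tool_call", "server": "postgres", "tool": "run_query"},
    {"request_id": "req-2", "layer": "model_call", "model": "example-model"},
]


def correlate(events: list[dict]) -> dict:
    """Group events by request ID, then by layer."""
    view = defaultdict(lambda: defaultdict(list))
    for event in events:
        details = {k: v for k, v in event.items()
                   if k not in ("request_id", "layer")}
        view[event["request_id"]][event["layer"]].append(details)
    return {rid: dict(layers) for rid, layers in view.items()}


def missing_layers(view: dict) -> dict:
    """List, per request, the required layers that produced no events."""
    gaps = {}
    for rid, layers in view.items():
        missing = [layer for layer in REQUIRED_LAYERS if layer not in layers]
        if missing:
            gaps[rid] = missing
    return gaps


view = correlate(EVENTS)
print(view["req-1"])
print(missing_layers(view))
```

Here `req-2` has a model call with no identity or tool call attached, which is the kind of gap that is hard to see when each layer keeps its own log.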

Architecture · prompt lifecycle

Which tool covers which stage

Traditional security governs identity, endpoint, and network, then goes blind. The AI control plane is the layer that covers the model call, the tool call, and the audit trail.
[Figure: the prompt lifecycle across five layers (identity & access, endpoint / device, network / infra, application / AI, data). Stages shown: prompt submission; SSO / identity gate (verifies identity, issues tokens); agent on device (executes AI sessions, calls tools); firewall / ZTNA (egress policy by destination); LLM gateway (inspects prompts, model-call policy); model call (Anthropic · OpenAI · Google); MCP gateway (authenticates agents, tool policy); production data (databases · APIs · file systems); SIEM / audit (event bus). Traditional enterprise security (identity provider such as Okta or Entra, endpoint / MDM device posture, firewall / ZTNA egress) is blind to prompts, tool calls, and data access. The AI control plane (LLM gateway for model calls, MCP gateway for tool calls, policy & threat for content, observability for the audit trail) gives one correlated view across model calls, tool calls, and identity.]

How Speakeasy measures up

Speakeasy is building a fully integrated AI control plane. We started with the connection and identity layer (an MCP gateway and registry), which is the first place companies get stuck, and have been extending across the four functions since. Policy is enforced server-side, every tool call is logged with the identity behind it, and the gateway runs inside the customer’s own VPC.

The reference architecture in our AI control plane guide describes what a complete system looks like, function by function, and the coverage map shows where we are today against it. If a platform or security team is using this checklist to evaluate options, that is where the conversation starts.

Frequently asked questions

What is an AI control plane?

An AI control plane is the governing layer between every AI agent in an organization and every system it can reach. It unifies connection, identity, policy enforcement, and observability so that every prompt, response, and tool call flows through a single controlled path, producing one correlated view across model calls, tool calls, and identity.

How is an AI control plane different from an AI gateway?

An AI gateway, or LLM gateway, sits at the model-call layer and handles routing, API keys, rate limiting, and cost tracking across providers. It has no visibility into the tool call an agent makes or the identity behind it. An AI control plane covers the model call, the tool call, and the identity together, which is what produces a correlated view and lets policy be enforced end to end.

What should I look for when evaluating an AI control plane?

Coverage across every layer first: model calls, tool calls, and identity, with one correlated view. Then the four functions: Connect (a central registry and SSO-integrated identity), Control (role-based policy enforced at the tool-call boundary), Secure (real-time inspection and blocking), and Observe (audit-grade logging and cost metrics). Finally, the deployment criteria: in-VPC deployment, low latency overhead, and open standards such as OAuth 2.1, OpenTelemetry, and MCP.

Why can't existing tools govern AI on their own?

Each category was built to own one layer. LLM gateways see the model call, MCP gateways see the tool call, identity providers see the user, observability tools record outcomes, and point tools inspect content. Governing AI requires owning the path between those layers, and none of the single-layer tools is positioned to do that without rebuilding significant parts of its architecture.

Does an AI control plane help with compliance?

Yes. The structured audit log, human oversight through policy enforcement, and in-VPC data handling that a control plane provides are the kinds of controls that the EU AI Act, SOC 2, and the NIST AI Risk Management Framework call for. Centralizing them in one layer makes an audit a query against the log instead of a cross-team investigation.
