What is Agent Compliance

AI agents do compliance-relevant work at machine speed. Where the record lives depends on which door the AI came through.

By Cameron McClellan, Growth Engineer
Definition

Agent compliance

The practice of proving that the AI agents operating inside an organization stay within the controls a framework requires. The proof is a runtime record: what each agent saw, what it produced, and what it did, on whose authority and against which system.



An AI agent does two kinds of compliance-relevant work, and each carries its own obligations. The first is the conversation: the data the agent is given and the output it produces. Data-handling, confidentiality, and retention obligations attach here, the moment information reaches the model, whether or not the agent goes on to act. The second is the action: the tool calls the agent makes to read a record, write a row, or invoke an API, under whatever authority it held at the start of the session. The obligations that govern changing a system of record attach here. Both happen at machine speed, across systems that were never built to report what an agent saw or did to a compliance team.

So when an auditor asks how the organization governs its AI, they are asking two things: where is the record, and what does it contain? The answer changes with the vendor and the license tier. A managed Claude or ChatGPT Enterprise tenant exposes a compliance API that can export the chat record. A developer running Cursor or Copilot produces almost no exportable record of what the model generated. An employee on a personal ChatGPT Plus or Claude Pro subscription sits outside all of it. The sections below cover what compliance requires, the three gaps that block a complete record today, and how to close them.

What does agent compliance require?

ISO/IEC 42001, the first certifiable standard for managing AI, is mostly a management system: policies, risk processes, and accountability an organization documents before anything ships. A handful of its Annex A controls are different. They describe what has to be true while the AI runs, and they resolve into evidence only at runtime.

The runtime controls, and the evidence they need

| Requirement | Where it appears | Evidence it needs |
| --- | --- | --- |
| Event logs | ISO 42001 A.6.2.8, SOC 2 CC7, EU AI Act Art. 12 | A reconstructable record of system operation: what the model was given, what it produced, and what it did |
| Operation and monitoring | ISO 42001 A.6.2.6, SOC 2 CC7 | Continuous monitoring of the deployed system while it operates in production |
| Intended use | ISO 42001 A.9.4 | The system is used the way it was approved to be used, demonstrably |
| Data handling | ISO 42001 A.7, SOC 2 CC6, GDPR | Defined controls over the data flowing through the system |

These controls share one hard property. The evidence is a byproduct of the system running, and it exists only if something recorded it as it happened. For an agent that means capturing both surfaces, the conversation and the action, and whether either one gets recorded is decided by the tool the agent runs inside. Most tools capture at most one.

The three gaps that block agent compliance

The evidence these controls need is capturable in principle. In practice, three gaps stand between an organization and a complete record:

  1. The compliance APIs that agent vendors ship are new and uneven.
  2. A lot of AI usage runs on individual licenses the organization never sees.
  3. The GRC platforms that own the audit cannot collect any of it.

Vendor compliance APIs are new and uneven

The major vendors fall into two camps, and neither one is complete.

The enterprise chat products expose a compliance API that hands the conversation record to downstream tooling. Anthropic’s Compliance API is the most complete of the set. Running under https://api.anthropic.com/v1/compliance/*, it exposes an activity feed retained for six years, the underlying chat, file, and project content for claude.ai organizations, the directory of users and roles, and the effective settings for each linked organization. Content endpoints support both retrieve and delete, so a legal team can pull a conversation or honor a deletion request programmatically. Access is gated to Claude Enterprise; Team plans get a narrower CSV audit-log export, and Pro and Free get neither.

OpenAI’s Compliance Platform, launched in July 2024, exposes time-stamped interactions as immutable log files: conversations including prompt and response text, uploaded files, custom GPT configuration and metadata, memories, and workspace users. Eight eDiscovery and DLP vendors built integrations at launch, among them Microsoft Purview, Relativity, Smarsh, Netskope, and Zscaler. It is available on ChatGPT Enterprise and Edu only.

The coding tools log the administrative shell and leave the content out. Cursor offers audit logs and an Admin API on its Enterprise plan, but both carry administrative and usage metadata only, and the documentation is explicit that “we do not log agent responses or generated code content.” GitHub Copilot is the same shape. Its audit log captures seat assignment, policy changes, and configuration, and the docs state plainly that it “does not include client session data, such as the prompts a user sends to Copilot locally.” For the tools where an agent reads your codebase and writes changes, there is no native export of what it generated.

What each tool can hand a downstream system

| Tool | Compliance interface | Tier required | What it exports |
| --- | --- | --- | --- |
| Claude | Compliance API | Enterprise | Activity feed plus full chat, file, and project content; retrieve and delete |
| ChatGPT | Compliance Platform | Enterprise, Edu | Conversations, files, GPT configs, memories, users, as immutable logs |
| Cursor | Audit logs, Admin API | Enterprise | Admin and usage metadata. Responses and generated code are not logged |
| GitHub Copilot | Audit log API, metrics API | Business, Enterprise | Config events and aggregate usage. Prompts and suggestions are excluded |
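
The table above can also be kept as data, so that a gap report is computed rather than maintained by hand. The following is a minimal sketch under stated assumptions: the names `ToolCoverage`, `COVERAGE`, `content_gaps`, and `has_content_export` are hypothetical, and the values simply restate the table rather than come from any vendor API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ToolCoverage:
    """One row of the table above. All names here are hypothetical."""
    tool: str
    interface: str
    tiers: Tuple[str, ...]   # license tiers on which the interface is offered
    exports_content: bool    # True if prompts and responses are exportable


# Values restate the table; they are not fetched from any vendor.
COVERAGE = [
    ToolCoverage("Claude", "Compliance API", ("Enterprise",), True),
    ToolCoverage("ChatGPT", "Compliance Platform", ("Enterprise", "Edu"), True),
    ToolCoverage("Cursor", "Audit logs, Admin API", ("Enterprise",), False),
    ToolCoverage("GitHub Copilot", "Audit log API, metrics API",
                 ("Business", "Enterprise"), False),
]


def content_gaps(coverage: List[ToolCoverage]) -> List[str]:
    """Return the tools whose native export omits what the model generated."""
    return [row.tool for row in coverage if not row.exports_content]


def has_content_export(tool: str, tier: str,
                       coverage: List[ToolCoverage] = COVERAGE) -> bool:
    """True only if the tool exports content and the given tier qualifies."""
    return any(
        row.tool == tool and tier in row.tiers and row.exports_content
        for row in coverage
    )


print(content_gaps(COVERAGE))
```

Checking a tool and a tier together reflects the point made in the text: the same vendor can be fully exportable on one license and dark on another.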

Individual licenses sit outside every agent platform’s control

Everything above assumes the AI came through a door the organization controls. Much of it does not. When an employee uses a personal ChatGPT Plus subscription, a Claude Pro account, or Cursor on the free tier, none of the enterprise machinery applies: no admin console, no compliance API, no audit log, no retention control, no SSO. The compliance API bought at the Enterprise tier sees nothing of it, because the activity belongs to a tenant the company does not own. This is the core of shadow AI, and it is the hardest part for compliance: the evidence an organization can produce is bounded by the licenses it centrally manages, and everything bought on a personal card is dark.

Same vendor, two doors

| | Individual license | Managed enterprise tenant |
| --- | --- | --- |
| Admin visibility | None. No console, no roster. | Admin console, SSO, SCIM |
| Compliance API | Not available | Available (chat vendors); admin and usage metadata only (coding tools) |
| Training on your data | On by default, opt-out required | Excluded by default |
| GRC evidence | Invisible to the organization | User roster via access connector; usage via DLP archive |

GRC platforms never see AI usage

Teams often assume their compliance platform already has this covered. It does not. The vendor compliance APIs above export to eDiscovery and data-loss-prevention archives. They do not directly export to Vanta or Drata.

The AI-vendor connectors that GRC platforms do ship are identity connectors. Vanta’s OpenAI integration and Drata’s OpenAI and Anthropic integrations sync the user roster and roles so that AI accounts can be folded into access reviews and deprovisioning. They pull who can log in. They pull nothing about how the AI is used: no conversations, no tool calls, no model configuration, no AI-specific audit trail.

The diagram below traces where each surface of agent activity actually lands. Read it as three rows: what gets generated on the left, the path it travels in the middle, and where it comes to rest on the right.

Closing the gaps with an AI control plane

Closing these gaps means recording agent activity at the boundaries the organization owns, rather than inside each vendor’s tenant. An AI control plane is the governing layer that does this: it sits between every AI agent in an organization and every system it can reach, and keeps the record the vendor APIs and GRC connectors leave behind, in a form built to export into the GRC platform itself. Three capabilities map onto the three gaps above.

Each gap, and the capability that closes it

| The gap that stays open | The capability that closes it |
| --- | --- |
| Each vendor API covers one vendor's conversation surface in one format, the coding tools export nothing, and the action surface is recorded nowhere | Cross-agent capture: one uniform record of both the conversation and the tool calls, across every agent |
| Individual licenses sit outside every admin console and compliance API | A device agent: records usage at the endpoint, so a personal account is captured like a managed seat |
| Consumer AI runs in the browser, with no API and nothing to install vendor-side | Web AI monitoring: captures the AI tools an employee opens in a tab |

Cross-agent capture in one AI audit log

The vendor APIs each cover a single vendor’s conversation surface in a single vendor’s format, and the coding tools export nothing at all. A control plane records every agent the same way, so Claude, ChatGPT, Cursor, Copilot, and whatever ships next produce one uniform, queryable log instead of four partial exports stitched together by hand. Because the record is generated on the path rather than inside a vendor’s tenant, it captures both surfaces: the conversation, what the model saw and produced, and the action, what each agent did, with what arguments, against which system, under whose identity, and what came back. That is the same record ISO 42001 A.6.2.8 asks for, in one shape regardless of which model or tool generated it.
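
To make "one uniform, queryable log" concrete, here is a minimal sketch of what a single record shape and a query over it might look like. Everything in it is an assumption for illustration: the `AgentEvent` fields, the `actions_against` helper, and the sample events are hypothetical and do not describe any vendor's or control plane's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class AgentEvent:
    """One row of a uniform log. All field names are hypothetical."""
    at: datetime
    agent: str              # which tool produced the event, e.g. "Cursor"
    identity: str           # the user or service account the agent acted as
    surface: str            # "conversation" or "action"
    system: Optional[str]   # target system; None for conversation events
    detail: Dict[str, Any]  # prompt and response, or tool, arguments, result


def actions_against(log: List[AgentEvent], system: str,
                    identity: Optional[str] = None) -> List[AgentEvent]:
    """Return action events that touched `system`, optionally for one identity."""
    return [
        e for e in log
        if e.surface == "action"
        and e.system == system
        and (identity is None or e.identity == identity)
    ]


# Made-up sample events, for illustration only.
now = datetime.now(timezone.utc)
log = [
    AgentEvent(now, "ChatGPT", "alice@example.com", "conversation", None,
               {"prompt": "Summarize Q3 churn", "response": "..."}),
    AgentEvent(now, "Cursor", "bob@example.com", "action", "billing-db",
               {"tool": "sql.write", "arguments": {"table": "invoices"},
                "result": "1 row"}),
    AgentEvent(now, "Claude", "alice@example.com", "action", "billing-db",
               {"tool": "sql.read", "arguments": {"table": "invoices"},
                "result": "42 rows"}),
]

for event in actions_against(log, "billing-db"):
    print(event.agent, event.identity, event.detail["tool"])
```

Because every agent writes the same shape, one filter answers a question such as "which agents touched this system, and as whom" without per-vendor parsing.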

A device agent that captures shadow AI on any license

The hole individual licenses open is that the activity belongs to a tenant the company does not own, so no admin console or compliance API can reach it. A device agent installed on the endpoint closes it. It captures AI usage at the machine, which means a personal ChatGPT Plus or Claude Pro account is recorded the same as a managed enterprise seat, and the evidence an organization can produce stops being bounded by the licenses it centrally manages. Delivered through the MDM the fleet already runs, the agent is in place before the employee opens the laptop.

How Speakeasy’s AI control plane fits

Speakeasy is building the AI control plane. The MCP gateway routes and governs every agent-to-tool connection, enforcing authentication and access policy server-side rather than trusting each laptop. Agent hooks record every tool invocation in a signed, append-only log: who called what, with what data, and what happened, across managed and unmanaged tools alike, in a form built to survive forensic review and to export into the GRC platform that owns the rest of the program.
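
One common way to make a log both signed and append-only is to chain entries with a keyed hash, so that each signature covers the previous one and any later edit breaks verification. The sketch below illustrates that general technique; it is not Speakeasy's implementation. The `ChainedLog` class and its fields are hypothetical, and it uses a shared-secret HMAC for brevity where a production system might prefer asymmetric signatures.

```python
import hashlib
import hmac
import json


class ChainedLog:
    """Tamper-evident log: each entry's signature covers the previous one."""

    def __init__(self, key: bytes):
        self._key = key
        self.entries = []

    def _sign(self, prev_sig: str, payload: str) -> str:
        message = (prev_sig + payload).encode("utf-8")
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def append(self, event: dict) -> dict:
        """Serialize the event canonically, sign it, and add it to the chain."""
        prev_sig = self.entries[-1]["sig"] if self.entries else ""
        payload = json.dumps(event, sort_keys=True)
        entry = {
            "event": json.loads(payload),  # snapshot, detached from the caller
            "prev": prev_sig,
            "sig": self._sign(prev_sig, payload),
        }
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every signature in order; False if anything was altered."""
        prev_sig = ""
        for entry in self.entries:
            payload = json.dumps(entry["event"], sort_keys=True)
            expected = self._sign(prev_sig, payload)
            if entry["prev"] != prev_sig:
                return False
            if not hmac.compare_digest(entry["sig"], expected):
                return False
            prev_sig = entry["sig"]
        return True


audit = ChainedLog(key=b"demo-key-not-for-production")
audit.append({"who": "alice@example.com", "tool": "sql.read", "system": "billing-db"})
audit.append({"who": "bob@example.com", "tool": "sql.write", "system": "billing-db"})
print(audit.verify())
```

Editing or removing an earlier entry changes the signature every later entry depends on, which is what lets a reviewer trust the sequence as well as the individual records.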

Speakeasy does not get a company certified, write its policies, or run its audit. It produces the runtime enforcement and the operational evidence that ISO 42001, SOC 2, and the EU AI Act require for agents, the part the compliance platform cannot auto-collect and the vendor APIs only partly reach. For a platform or security team mapping how AI flows through the organization before an auditor or a regulator asks, the Speakeasy AI control plane is where that record begins.

Frequently asked questions

Agent compliance is the practice of proving that the AI agents and assistants operating inside an organization stay within the controls a framework like ISO 42001, SOC 2, or the EU AI Act requires. The proof is a runtime record of two surfaces: the conversation, what the model saw and produced, and the action, the tool calls the agent made, on whose authority and against which system.

Yes. The Claude Compliance API runs under api.anthropic.com/v1/compliance and exposes an activity feed retained for six years, the underlying chat, file, and project content for claude.ai organizations, the directory of users and roles, and organization settings. Content endpoints support both retrieve and delete. It is gated to Claude Enterprise; Team plans get a narrower CSV audit-log export, and Pro and Free plans get neither.

Yes. The OpenAI Compliance Platform, launched in July 2024, exposes time-stamped interactions as immutable logs: conversations including prompt and response text, uploaded files, custom GPT configs, memories, and workspace users. Eight eDiscovery and DLP vendors built integrations at launch, including Microsoft Purview, Relativity, and Smarsh. It is available on ChatGPT Enterprise and Edu only, not Team, Plus, or Free.

No. Cursor offers audit logs on its Enterprise plan and an Admin API, but both carry administrative and usage metadata only. The documentation states explicitly that it does not log agent responses or generated code content, so there is no native export of what the model actually produced. GitHub Copilot is the same: its audit log excludes the prompts a user sends locally.

Only partially. Vanta and Drata ship identity connectors for AI vendors that sync the user roster and roles into access reviews, so they can prove who can log in to an AI account. They do not ingest the vendor compliance APIs, conversations, tool calls, or AI usage data. The runtime behavior of the agents themselves falls outside what GRC automation collects today and needs an evidence source in the request path.

When an employee uses a personal ChatGPT Plus, Claude Pro, or free Cursor account, none of the enterprise machinery applies: no admin console, no compliance API, no audit log, no retention control, and no SSO. The data is also used for training by default on consumer tiers. The evidence an organization can produce is bounded by the licenses it centrally manages, so anything bought on a personal card is invisible to both the vendor compliance API and the GRC platform.

Because every prompt, response, and tool call passes through it, an AI control plane records what each agent did, with what arguments, against which system, under whose identity, and what came back. That identity-attributed, append-only log is the same record ISO 42001 A.6.2.8 asks for. It is uniform across vendors and independent of license tier, and it is built to export into the SIEM and the GRC platform that map it across the compliance program.
