Speakeasy vs. Stainless vs. Postman: MCP server generation showdown
Nolan Sullivan
February 11, 2026 - 14 min read
AI & MCP
Note
This comparison reflects the state of each platform as of February 2026. All
three companies are evolving rapidly, especially around AI-agent integration.
If anything here needs updating, let us know!
The Model Context Protocol (MCP) has quickly become the standard for connecting AI agents to external APIs and data sources. Since Anthropic introduced MCP in November 2024, it has been adopted by OpenAI, Google DeepMind, and dozens of major developer tools companies. In December 2025, Anthropic donated MCP to the Agentic AI Foundation under the Linux Foundation, cementing it as a vendor-neutral open standard.
With MCP adoption accelerating, engineering and platform teams face a practical question: how do you generate production-ready MCP servers from your existing API specs?
This post compares the three most prominent approaches to MCP server generation: Speakeasy, Stainless, and Postman. We’ll evaluate each on automation, type-safety, deployment flexibility, protocol support, and developer experience so you can pick the right tool for your team.
What is MCP server generation?
MCP server generation is the automated creation of lightweight services that expose API operations as AI- or workflow-friendly tools, typically from OpenAPI specs. Instead of manually writing MCP tool definitions for each endpoint, a generator reads your API contract and produces a deployable MCP server with properly typed inputs, outputs, and documentation.
This matters because:
AI agents need structured tool access. LLMs like Claude and GPT interact with APIs through discrete tool calls. An MCP server translates your API surface into a format that AI agents understand natively.
Manual MCP servers drift from the API. Hand-maintained tool definitions fall out of sync as APIs evolve. Automated generation from OpenAPI keeps the MCP server as a direct reflection of the API contract.
Teams need repeatability at scale. As organizations expose dozens or hundreds of APIs to AI workflows, manually writing and maintaining MCP servers for each becomes untenable.
Before comparing generators, it’s worth addressing a common question: should you generate an MCP server at all, or write one by hand?
The honest answer is that a great MCP server is not a 1:1 mapping of your API. Raw API endpoints don’t always make good AI tools. A well-designed MCP server curates which operations are exposed, combines related endpoints into higher-level tools, tailors descriptions for LLM comprehension, and shapes inputs and outputs for the way agents actually work. A hand-written server built with this intentionality will outperform a naive one-endpoint-per-tool mapping every time.
That said, hand-writing MCP servers has real costs:
It doesn’t scale. If your organization has dozens or hundreds of services, hand-writing and maintaining an MCP server for each is a massive investment. Every API change requires a corresponding manual update to the MCP layer.
It’s error-prone. Hand-maintained tool definitions drift from the API over time. Parameter types go stale, new endpoints get missed, and documentation falls behind — exactly the conditions that cause LLM hallucinations.
It delays time-to-value. Writing an MCP server from scratch for a complex API can take days or weeks. Generation gives you a working server in minutes.
The pragmatic approach is to use generation as a starting point, then curate. Generate the initial MCP server from your OpenAPI spec to get correct types, authentication, and baseline tool coverage. Then refine: disable endpoints that don’t make sense as AI tools, add custom descriptions optimized for LLM comprehension, combine related operations into composite tools, and organize tools into scopes for different use cases.
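To make “generate, then curate” concrete, here is a minimal sketch of a hand-curated composite tool built with the official MCP TypeScript SDK. The API, endpoints, and tool name are hypothetical; the point is that one agent-friendly tool can wrap two related endpoints, which is the kind of refinement you layer on top of a generated baseline.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Hypothetical REST API exposed by the underlying service.
const API_BASE = "https://api.example.com";

const server = new McpServer({ name: "billing-mcp", version: "0.1.0" });

// A curated, composite tool: two related endpoints (GET /customers/{id} and
// GET /customers/{id}/invoices) exposed as a single higher-level operation,
// with a description written for LLM comprehension rather than copied
// verbatim from the OpenAPI spec.
server.tool(
  "get_customer_overview",
  "Fetch a customer's profile together with their ten most recent invoices. " +
    "Use this when the user asks about a customer's account or billing status.",
  { customerId: z.string().describe("The customer's unique identifier") },
  async ({ customerId }) => {
    const [customer, invoices] = await Promise.all([
      fetch(`${API_BASE}/customers/${customerId}`).then((r) => r.json()),
      fetch(`${API_BASE}/customers/${customerId}/invoices?limit=10`).then((r) => r.json()),
    ]);
    return {
      content: [
        { type: "text" as const, text: JSON.stringify({ customer, invoices }, null, 2) },
      ],
    };
  },
);

// Serve over STDIO so local MCP clients (e.g. Claude Desktop) can connect.
await server.connect(new StdioServerTransport());
```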
Key criteria for comparing MCP generation tools
Before diving into each platform, here are the benchmarks that matter most when evaluating MCP generation tools:
Protocol support: Does it handle HTTP, streaming (SSE), WebSockets, and the Streamable HTTP transport that MCP clients expect?
Hosting: Can the server run as a managed remote service, self-hosted on your own infrastructure, or both?
Token efficiency: How does the generated server minimize LLM token usage? Does it support techniques like tool scoping, response filtering, or code-mode execution?
Authentication: Does the generator produce servers with built-in OAuth token lifecycle management, or does the developer handle this manually?
Automation: Can the tool fit into automated build, test, and deploy pipelines without manual intervention?
MCP gateway: Does the platform provide centralized routing, observability, access control, and tool curation across multiple MCP servers?
Speakeasy: type-safe MCP servers generated from your OpenAPI spec
Speakeasy generates a standalone MCP server directly from your OpenAPI document using its CLI (speakeasy quickstart --mcp). This produces a full TypeScript MCP server with one tool per API endpoint, complete with typed inputs, Zod-validated outputs, and deployment configurations. The generated code is fully inspectable and customizable — every tool definition lives in its own file, and developers can modify behavior directly.
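As a rough illustration (not Speakeasy’s literal output), a generated per-endpoint tool takes roughly this shape, with a runtime-validated input schema derived from the OpenAPI operation:

```typescript
import { z } from "zod";

// Illustrative sketch only: the general shape of a generated per-endpoint
// tool, not Speakeasy's literal output. The input schema is derived from the
// OpenAPI operation's parameters, and each tool lives in its own file.
const listOrdersInput = z.object({
  status: z.enum(["open", "shipped", "cancelled"]).optional(),
  limit: z.number().int().min(1).max(100).default(20),
});

export const listOrdersTool = {
  name: "orders_list",
  description:
    "List orders for the authenticated account, optionally filtered by status.",
  inputSchema: listOrdersInput,
  async execute(rawArgs: unknown) {
    // Arguments are validated at runtime before the HTTP request is made, so
    // malformed LLM inputs fail fast instead of producing a bad API call.
    const args = listOrdersInput.parse(rawArgs);
    const query = new URLSearchParams({ limit: String(args.limit) });
    if (args.status) query.set("status", args.status);
    const res = await fetch(`https://api.example.com/orders?${query}`);
    return res.json();
  },
};
```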
Standout features
OAuth proxy generation. Most APIs don’t support MCP-native authentication out of the box. Speakeasy autogenerates an OAuth proxy that handles the full token lifecycle — acquisition, caching, refresh, and retry — so AI agents can authenticate seamlessly. For APIs without Dynamic Client Registration (DCR), the proxy bridges the gap automatically. Integrations with WorkOS, Auth0, Clerk, and Descope are supported out of the box.
Runtime validation with Zod. Every generated tool includes Zod schemas for runtime type checking. This is especially critical for AI use cases where LLMs may pass unexpected data types. Validation happens before the HTTP request is made, catching errors early.
Streaming and real-time protocol support. The generator handles streaming uploads and downloads, server-sent events (SSE), and webhook signature validation. Binary responses (images, audio) are automatically Base64-encoded for MCP transport; a sketch of what that looks like follows this feature list.
Scope-based tool filtering. Using the x-speakeasy-mcp OpenAPI extension, teams can organize tools into scopes and restrict which operations are exposed. This prevents tool explosion for APIs with hundreds of endpoints.
Optional sandbox execution. For complex multi-step workflows, Speakeasy supports code-mode execution where the LLM’s tool calls run in a sandboxed environment. This reduces token consumption by up to 100x by handling orchestration server-side rather than serializing everything back to the model.
Out-of-the-box third-party API support. Through Gram, teams can connect to a catalog of prebuilt MCP servers for popular third-party APIs — no OpenAPI spec required. Combine these with custom-generated servers to build toolsets that span internal and external services from a single gateway.
MCP app support. Speakeasy generates MCPB files for one-click installation in Claude Desktop and other MCP clients, removing the need for end users to edit JSON configs or install Node.js.
Air-gapped and on-prem deployment. The Speakeasy CLI is a standalone binary that runs without network egress. Teams can generate MCP servers entirely within locked-down infrastructure.
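To make the binary-handling point above concrete, here is a minimal sketch, using a hypothetical endpoint, of how a binary API response ends up as a Base64-encoded image content block in an MCP tool result:

```typescript
// Hypothetical handler for an endpoint that returns a PNG chart (GET /charts/{id}).
// MCP tool results carry binary payloads as Base64 strings inside an "image"
// content block, which is what automatic Base64 encoding of binary responses
// ultimately produces on the wire.
async function getChart(args: { chartId: string }) {
  const res = await fetch(`https://api.example.com/charts/${args.chartId}`);
  const base64 = Buffer.from(await res.arrayBuffer()).toString("base64"); // Node.js runtime assumed

  return {
    content: [
      {
        type: "image" as const,
        data: base64,
        mimeType: res.headers.get("content-type") ?? "image/png",
      },
    ],
  };
}
```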
Deployment options
Speakeasy MCP servers deploy to Cloudflare Workers, Docker containers, AWS Lambda, or any environment that runs Node.js. For teams that prefer a managed approach, Gram by Speakeasy provides hosted MCP servers with production-grade infrastructure, tool curation, a gateway for centralized routing and observability, and one-click deployment.
Stainless: SDK-bundled MCP with a code execution model
Stainless generates MCP servers as a subpackage within the TypeScript SDK at packages/mcp-server. Rather than creating a standalone repository, the MCP server ships alongside the SDK and is published as a separate npm package.
Architecture: code sandbox by default
Stainless is highly opinionated about MCP server architecture. Every generated server contains exactly two tools:
Code execution tool. Instead of exposing one MCP tool per API endpoint, Stainless gives the LLM a tool that accepts TypeScript code. The LLM writes a script using the generated SDK, and the server executes it in a Deno sandbox.
Docs search tool. A second tool lets the LLM query API documentation in Markdown format before writing code.
This means the LLM writes TypeScript against the SDK rather than calling individual API tools directly. There’s no option to generate discrete tools. Code mode is the only architecture Stainless supports.
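In practice, that means the agent’s “tool call” is a script. The sketch below shows the kind of code an LLM might submit to the execution tool; the SDK name and method calls are hypothetical stand-ins for the generated SDK:

```typescript
// The kind of script an agent might submit to the code execution tool.
// "example-sdk" and its methods are hypothetical stand-ins for the generated
// Stainless SDK; the real script would run inside the server's Deno sandbox.
import ExampleSDK from "example-sdk";

// Authentication is assumed to be injected by the server environment.
const client = new ExampleSDK();

// Intermediate results stay in the sandbox as ordinary variables instead of
// round-tripping through the model's context window between calls.
const customers = await client.customers.list({ plan: "enterprise" });

const overdue: { customer: string; count: number }[] = [];
for (const customer of customers.data) {
  const invoices = await client.invoices.list({
    customerId: customer.id,
    status: "overdue",
  });
  if (invoices.data.length > 0) {
    overdue.push({ customer: customer.name, count: invoices.data.length });
  }
}

// Only this compact summary is serialized back to the model.
console.log(JSON.stringify(overdue));
```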
Limitations of code mode
Code mode shines when an agent needs to chain multiple API calls: intermediate results stay in the sandbox as variables rather than feeding back through the LLM just to be copied into the next call’s inputs, saving time and tokens. But it’s overkill to architect every MCP server this way. Most MCP interactions are simple (one tool call, one result), and for those, code mode adds overhead: the LLM must first query documentation, then generate correct TypeScript, and retry when the code fails (roughly 15-20% of the time). The token cost isn’t eliminated; it’s shifted from tool definitions to documentation retrieval and code generation.
Mandating code mode also has practical costs. Observability suffers because you see “the LLM ran some TypeScript” rather than discrete, logged tool calls with clear inputs and outputs. And composability breaks down: MCP was designed so agents can freely mix tools from multiple servers, but code-mode servers force the LLM to context-switch between writing SDK scripts and calling traditional tools in the same workflow.
Postman: API network meets MCP
Postman approaches MCP from a fundamentally different angle. Rather than generating MCP servers from your own OpenAPI specs, Postman offers an MCP server generator built on top of its API Network of 100,000+ public APIs.
How it works
1. Select APIs from the Postman API Network (Salesforce, UPS, Stripe, etc.).
2. The generator produces a custom MCP server containing only the tools needed.
3. Download and run the server locally, or connect through the Postman remote MCP server at mcp.postman.com.
Postman also provides visual MCP tooling within its desktop app — MCP requests can be built, debugged, and evaluated in the same interface used for REST and GraphQL testing.
What Postman is (and isn’t)
Postman is a powerful API collaboration platform with over 30 million registered users. It excels at request/response testing, monitoring, and team workflows. However, it’s not a specialized MCP server or SDK generator in the same category as Speakeasy or Stainless:
No generation from your own OpenAPI specs. The MCP generator draws from the Postman API Network, not from arbitrary specs you provide.
No type-safe code output. Generated servers don’t include Zod or equivalent runtime validation.
No OAuth proxy generation. Authentication relies on Postman API keys rather than full OAuth lifecycle management.
No streaming protocol support. SSE, WebSocket, and gRPC support is planned but not yet available.
Platform-dependent. The MCP server is tied to the Postman ecosystem and API key infrastructure.
Postman is the right choice for teams that want quick MCP access to well-known public APIs without writing code. It’s not designed for teams generating MCP servers from their own API contracts.
Head-to-head comparison
MCP server generation comparison

| Feature | Speakeasy | Stainless | Postman |
|---|---|---|---|
| 1P MCP support | ✅ | ✅ | ❌ |
| 3P MCP support | ✅ | ❌ | ✅ (API Network) |
| Local MCP package | ✅ | ✅ | ❌ |
| Managed remote hosting | ✅ (Gram) | ✅ | ✅ (mcp.postman.com) |
| Per-operation server structure | ✅ (default) | ❌ (code mode only) | ✅ |
| Code-mode server structure | ✅ (optional) | ✅ (mandatory) | ❌ |
| Self-hosted | ✅ | ✅ | ❌ |
| Air-gapped / on-prem generation | ✅ | ❌ | ❌ |
| OAuth proxy generation | ✅ (auto) | ⚠️ (Cloudflare only) | ❌ |
| BYO OAuth | ✅ | ✅ | ❌ |
Protocol support
Speakeasy supports both MCP transports (Streamable HTTP and STDIO) alongside full SSE, streaming, and automatic Base64 encoding for binary responses. Stainless supports the core transports but uses a non-standard SSE implementation and lacks binary data handling. Postman supports STDIO for local use only.
Hosting
Speakeasy offers the broadest deployment surface: Gram for managed hosting, plus self-hosted deployment to Cloudflare Workers, Docker, AWS Lambda, or any Node.js environment. The CLI runs without network egress for air-gapped generation. Stainless provides managed hosting and Cloudflare Workers but requires cloud connectivity for generation. Postman offers a remote server and local download only.
Token efficiency
Speakeasy tackles token efficiency at multiple levels without forcing a single architecture. Scope-based filtering exposes only the tools relevant to a given use case. JQ filters strip unnecessary fields from responses. And for workflows that genuinely benefit from multi-step chaining, optional code-mode execution keeps intermediate results server-side. The key difference: Speakeasy lets teams choose the right approach per use case rather than mandating one architecture.
Stainless uses code-mode execution as its only architecture. This can reduce overhead for complex multi-step workflows, but for the majority of MCP interactions — single tool calls, simple lookups — code mode adds cost: the LLM must query documentation, generate TypeScript, and retry when the code fails (roughly 15-20% of the time). The token cost is restructured, not eliminated.
Authentication
Speakeasy autogenerates an OAuth proxy with full token lifecycle management (acquisition, refresh, retry) and out-of-the-box integrations with WorkOS, Auth0, Clerk, and Descope. Stainless requires manual OAuth implementation or a Cloudflare Workers-based proxy. Postman relies on API keys only.
Automation
The Speakeasy CLI is a standalone binary that integrates with any pipeline — GitHub Actions, GitLab CI, Jenkins, or fully air-gapped environments. Trusted publishing supports npm and PyPI without storing long-lived secrets. Stainless orchestrates generation through its cloud dashboard with optional GitHub hooks. Postman doesn’t support automated MCP generation.
MCP gateway
Gram functions as a full MCP gateway: centralized routing, unified OAuth across servers, real-time logging, and cross-server tool curation from a single control plane. Teams can combine prebuilt integrations with custom-generated servers into purpose-built toolsets. Neither Stainless nor Postman offers gateway functionality. For a detailed comparison, see Choosing an MCP gateway.
Use cases: when to pick each tool
Best-fit scenarios
| Scenario | Best fit |
|---|---|
| Automated CI/CD pipelines that regenerate MCP servers on every API change | Speakeasy |
| Air-gapped or regulated environments requiring on-prem generation and hosting | Speakeasy |
| Large API surfaces (100+ endpoints) needing scope-based tool curation and token optimization | Speakeasy |
| Real-time and streaming APIs using SSE, webhooks, or binary data | Speakeasy |
| Managed MCP hosting with one-click deployment and tool curation | Speakeasy (Gram) |
| Centralized gateway for routing, observability, and access control across multiple MCP servers | Speakeasy (Gram) |
| Teams already using Stainless for SDK generation who want code-execution-style MCP | Stainless |
| Dashboard-driven workflows with less need for CLI or custom automation | Stainless |
| Quick MCP access to public APIs without writing code | Postman |
| API workflow management, manual testing, and collaborative design | Postman |
Recommendations
For teams evaluating MCP server generation tools, the decision maps to six questions:
What protocols does your API use? If you need SSE, streaming, binary data, or webhooks, Speakeasy has the broadest protocol coverage. Stainless covers the basics. Postman doesn’t generate streaming code.
Where do you need to host? Speakeasy offers managed hosting (Gram), self-hosted deployment to five+ targets, and air-gapped generation. Stainless provides managed and Cloudflare Workers options. Postman ties you to its remote server or local download.
How large is your API surface? For APIs with dozens or hundreds of endpoints, token efficiency matters. Speakeasy’s scope-based filtering, JQ response transforms, and code mode keep token usage manageable. Stainless’s code execution sandbox is an alternative approach. Postman offers only basic control over which tools are included.
How does your API authenticate? If your API uses OAuth, Speakeasy handles the full lifecycle automatically. Stainless and Postman require manual implementation.
How automated is your workflow? Speakeasy’s CLI runs anywhere, including air-gapped pipelines. Stainless requires cloud connectivity. Postman isn’t designed for automated generation.
Do you need a gateway for multiple MCP servers? If you’re managing more than a handful of MCP servers, a centralized gateway for routing, auth, and observability saves significant infrastructure work. Gram is the only option here with built-in gateway capabilities.
For teams prioritizing automation, protocol breadth, token efficiency, and production-grade hosting, Speakeasy is the strongest choice. To get started, run speakeasy quickstart --mcp or explore Gram for managed MCP hosting.
Frequently asked questions
What is MCP server generation and why does it matter?
MCP server generation is the process of automatically creating MCP servers and their tool definitions from API specifications, enabling rapid integration with AI agents and automation tools. It matters because manually maintaining MCP tool definitions doesn’t scale as API surfaces grow, and automated generation keeps tools synchronized with the underlying API contract.
How does protocol support affect MCP server capabilities?
Protocol support determines what kinds of APIs your MCP server can expose. Basic HTTP request/response covers most REST APIs, but streaming protocols like SSE are essential for real-time use cases — LLM token streams, live event feeds, and file transfers. Without proper protocol support, teams must hand-write workarounds or forgo real-time capabilities entirely.
What is token efficiency and why does it matter for MCP?
Every MCP tool call consumes LLM tokens for the tool description, input schema, and response payload. As APIs grow, naive generation can flood the context window with hundreds of tool definitions, degrading model performance and increasing cost. Token-efficient generation uses techniques like scope-based filtering, response transformation, and code-mode execution to minimize overhead.
Why does authentication matter for MCP servers?
AI agents calling authenticated APIs need OAuth tokens to be acquired, cached, and refreshed automatically. If the MCP server doesn’t handle this lifecycle natively, developers must write and maintain authentication plumbing themselves — a common source of bugs and security vulnerabilities in production.
How does CI/CD automation improve MCP server maintenance?
Automation lets teams regenerate and redeploy MCP servers whenever API specs change, keeping tool definitions in sync with the actual API. Without automation, every endpoint change requires manual regeneration and redeployment — a process that’s error-prone and breaks down as the number of services grows.
What is an MCP gateway and when do I need one?
An MCP gateway centralizes routing, authentication, observability, and access control across multiple MCP servers. Without one, each server needs its own OAuth implementation, credential storage, and audit trail. Teams managing more than a few MCP servers benefit from a gateway to avoid reimplementing infrastructure for every new service. For a deeper look, see What is an MCP gateway and do I need one?.