Gram Agents API overview
The Gram Agents API is currently in early beta. The API surface may change as development continues.
The Gram Agents API provides an interface for executing cloud-based agent workflows with Gram tools. Designed for programmatic use, it allows applications to run agent tasks that leverage Gram toolsets alongside your preferred model provider.
The API is heavily inspired by the OpenAI Responses API.
Key features
- Sync and async execution: Run agent tasks synchronously for immediate results, or asynchronously to poll for completion
- Multi-turn conversations: Build conversational agents by passing `previous_response_id` to chain responses
- Sub-agents: Define specialized sub-agents with their own tools and instructions for complex workflows
- Configurable models: Select the model, temperature, and base instructions for each request
- Response storage control: Use the `store` flag to control whether agent run history is persisted
How it works
The Gram Agents API endpoint accepts a request with model configuration, instructions, input, and toolsets. The agent executes the workflow, calling tools as needed, and returns the result.
`POST https://app.getgram.ai/rpc/agents.response`

A basic request includes:

- `model`: The model to use (e.g., `openai/gpt-4o`)
- `instructions`: System prompt for the agent
- `input`: The user’s input or conversation context
- `toolsets`: Array of toolsets to make available to the agent
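Putting those fields together, a minimal request body might look like the following sketch. The field names come from this page, but the shape of each toolset entry (here `{"slug": ...}`) and the slug value are illustrative assumptions, not a confirmed schema:

```python
import json

# Minimal body for POST /rpc/agents.response.
# Field names are from the docs; the toolset entry shape
# and the "support-tools" slug are hypothetical.
request_body = {
    "model": "openai/gpt-4o",
    "instructions": "You are a concise support assistant.",
    "input": "Summarize the open tickets for account 42.",
    "toolsets": [{"slug": "support-tools"}],
}

payload = json.dumps(request_body)
```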
Request parameters reference
| Parameter | Type | Description |
|---|---|---|
| `model` | string | Model identifier (e.g., `openai/gpt-4o`) |
| `instructions` | string | System prompt for the agent |
| `input` | string or array | User input or conversation history |
| `toolsets` | array | Toolsets to make available |
| `sub_agents` | array | Sub-agent definitions |
| `async` | boolean | Enable async execution (default: `false`) |
| `store` | boolean | Store response history (default: `true`) |
| `previous_response_id` | string | Link to a previous response for multi-turn conversations |
| `temperature` | number | Model temperature setting |
Supported models
The following models are currently supported:
| Provider | Models |
|---|---|
| OpenAI | openai/gpt-4o, openai/gpt-4o-mini, openai/gpt-4.1, openai/gpt-5, openai/gpt-5.1, openai/gpt-5.1-codex |
| Anthropic | anthropic/claude-sonnet-4, anthropic/claude-sonnet-4.5, anthropic/claude-haiku-4.5, anthropic/claude-opus-4, anthropic/claude-opus-4.5, anthropic/claude-3.7-sonnet |
| Google | google/gemini-2.5-pro-preview, google/gemini-3-pro-preview |
| Mistral | mistralai/mistral-medium-3, mistralai/mistral-medium-3.1, mistralai/codestral-2501 |
| Kimi | moonshotai/kimi-k2 |
Execution modes
Synchronous execution
By default, requests execute synchronously and return the complete response when finished. This is ideal for quick tasks or when immediate results are needed.
Asynchronous execution
For longer-running tasks, set async: true in the request. The API returns immediately with a response ID that can be polled for status and results:
`GET https://app.getgram.ai/rpc/agents.response?response_id={id}`

Poll until `status` changes from `in_progress` to `completed` or `failed`.
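A polling loop for this pattern might look like the sketch below. The GET endpoint and the `in_progress`/`completed`/`failed` statuses are from this page; the bearer-token `Authorization` header and a top-level `status` field in the response JSON are assumptions:

```python
import json
import time
import urllib.request

BASE_URL = "https://app.getgram.ai/rpc/agents.response"

def is_terminal(status: str) -> bool:
    """True once a run has left in_progress."""
    return status in ("completed", "failed")

def poll_response(response_id: str, api_key: str, interval: float = 2.0) -> dict:
    """Poll the response endpoint until the run completes or fails.

    Assumes a bearer-token Authorization header and a top-level
    "status" field in the response body (both hypothetical).
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    while True:
        req = urllib.request.Request(
            f"{BASE_URL}?response_id={response_id}", headers=headers
        )
        with urllib.request.urlopen(req) as resp:
            body = json.loads(resp.read())
        if is_terminal(body.get("status", "")):
            return body
        time.sleep(interval)
```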
Multi-turn conversations
Build multi-turn agents by passing previous_response_id with each new request. This links responses together, allowing the agent to reference context from previous turns without manually managing conversation history.
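In payload terms, chaining a turn means adding `previous_response_id` to the next request. A small sketch (the response ID value is a placeholder, and omitting `instructions` on follow-up turns is an assumption, not documented behavior):

```python
def follow_up(previous_response_id: str, user_input: str,
              model: str = "openai/gpt-4o") -> dict:
    """Build a follow-up request chained to an earlier response."""
    return {
        "model": model,
        "input": user_input,
        "previous_response_id": previous_response_id,
    }

# "resp_abc123" is a placeholder for the id returned by the first turn.
turn_two = follow_up("resp_abc123", "Now summarize that in one sentence.")
```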
Sub-agents
For complex workflows, define sub-agents that specialize in specific tasks. Each sub-agent can have its own:
- `name` and `description`
- `instructions` (system prompt)
- `toolsets` and/or specific `tools` (tool URNs)
- `environment_slug` for credential management
The main agent orchestrates sub-agents as needed to complete the task.
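A request with sub-agents might be shaped like this sketch. The sub-agent fields are the ones listed above and the `sub_agents` key matches the parameter table, but the exact entry schema, slugs, and tool URN format are assumptions:

```python
request_body = {
    "model": "anthropic/claude-sonnet-4.5",
    "instructions": "Delegate specialized work to your sub-agents.",
    "input": "Refund order 1234 and notify the customer.",
    "sub_agents": [
        {
            "name": "billing",
            "description": "Handles refunds and invoices",
            "instructions": "Use the billing tools to process refunds.",
            "toolsets": [{"slug": "billing-tools"}],  # hypothetical slug
            "environment_slug": "production",         # hypothetical slug
        },
        {
            "name": "notifier",
            "description": "Sends customer emails",
            "instructions": "Draft and send a short notification email.",
            "tools": ["tools:example:send_email"],    # hypothetical tool URN
        },
    ],
}
```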
Response storage
The store flag controls whether agent run history is persisted:
- `store: true` (default): Response history is saved and can be retrieved later
- `store: false`: Response is deleted after completion
Setting `store: false` with `async: true` is not supported, as there would be no way to retrieve the result.
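Because that combination is rejected, a client can guard against it before sending the request; a minimal sketch (the function name and the idea of client-side validation are illustrative, not part of the API):

```python
def check_run_options(store: bool = True, async_: bool = False) -> None:
    """Reject the unsupported store=false + async=true combination,
    since there would be no stored result left to poll for."""
    if async_ and not store:
        raise ValueError("store: false with async: true is not supported")
```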
Stored responses can also be deleted via the API:
`DELETE https://app.getgram.ai/rpc/agents.response?response_id={id}`

Credential management
The Gram Agents API uses Gram environments for credential management; credentials cannot currently be passed directly in requests. Configure the necessary API keys and secrets in the Gram dashboard, then reference the appropriate `environment_slug` in toolset configurations.
Dashboard (coming soon)
A Gram dashboard view of agent runs (for responses where history was not deleted) will show execution details and results.