Gram Agents API overview
The Gram Agents API is currently in early beta. The API surface may change as development continues.
The Gram Agents API provides an interface for executing cloud-based agent workflows with Gram tools. Designed for programmatic use, it allows applications to run agent tasks that leverage Gram toolsets alongside your preferred model provider.
The API is heavily inspired by the OpenAI Responses API.
Key features
- Sync and async execution: Run agent tasks synchronously for immediate results, or asynchronously to poll for completion
- Multi-turn conversations: Build conversational agents by passing `previous_response_id` to chain responses
- Sub-agents: Define specialized sub-agents with their own tools and instructions for complex workflows
- Configurable models: Select the model, temperature, and base instructions for each request
- Response storage control: Use the `store` flag to control whether agent run history is persisted
How it works
The Gram Agents API endpoint accepts a request with model configuration, instructions, input, and toolsets. The agent executes the workflow, calling tools as needed, and returns the result.
`POST https://app.getgram.ai/rpc/agents.response`

A basic request includes:

- `model`: The model to use (e.g., `openai/gpt-4o`)
- `instructions`: System prompt for the agent
- `input`: The user’s input or conversation context
- `toolsets`: Array of toolsets to make available to the agent
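Putting those fields together, a minimal request body might look like the following sketch. The field names come from this page, but the shape of each toolset entry (here `{"slug": ...}`) and the slug value are illustrative assumptions, not a confirmed schema:

```python
import json

# Minimal body for POST /rpc/agents.response.
# Field names are from the docs; the toolset entry shape
# and the "support-tools" slug are hypothetical.
request_body = {
    "model": "openai/gpt-4o",
    "instructions": "You are a concise support assistant.",
    "input": "Summarize the open tickets for account 42.",
    "toolsets": [{"slug": "support-tools"}],
}

payload = json.dumps(request_body)
```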
Request parameters reference
| Parameter | Type | Description |
|---|---|---|
| `model` | string | Model identifier (e.g., `openai/gpt-4o`) |
| `instructions` | string | System prompt for the agent |
| `input` | string or array | User input or conversation history |
| `toolsets` | array | Toolsets to make available |
| `sub_agents` | array | Sub-agent definitions |
| `async` | boolean | Enable async execution (default: `false`) |
| `store` | boolean | Store response history (default: `true`) |
| `previous_response_id` | string | Link to a previous response for multi-turn conversations |
| `temperature` | number | Model temperature setting |
Supported models
The following models are currently supported:
| Provider | Models |
|---|---|
| OpenAI | openai/gpt-4o, openai/gpt-4o-mini, openai/gpt-4.1, openai/gpt-5, openai/gpt-5.1, openai/gpt-5.1-codex |
| Anthropic | anthropic/claude-sonnet-4, anthropic/claude-sonnet-4.5, anthropic/claude-haiku-4.5, anthropic/claude-opus-4, anthropic/claude-opus-4.5, anthropic/claude-3.7-sonnet |
| Google | google/gemini-2.5-pro-preview, google/gemini-3-pro-preview |
| Mistral | mistralai/mistral-medium-3, mistralai/mistral-medium-3.1, mistralai/codestral-2501 |
| Kimi | moonshotai/kimi-k2 |
Execution modes
Synchronous execution
By default, requests execute synchronously and return the complete response when finished. This is ideal for quick tasks or when immediate results are needed.
Asynchronous execution
For longer-running tasks, set async: true in the request. The API returns immediately with a response ID that can be polled for status and results:
`GET https://app.getgram.ai/rpc/agents.response?response_id={id}`

Poll until `status` changes from `in_progress` to `completed` or `failed`.
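A polling loop for this pattern might look like the sketch below. The GET endpoint and the `in_progress`/`completed`/`failed` statuses are from this page; the bearer-token `Authorization` header and a top-level `status` field in the response JSON are assumptions:

```python
import json
import time
import urllib.request

BASE_URL = "https://app.getgram.ai/rpc/agents.response"

def is_terminal(status: str) -> bool:
    """True once a run has left in_progress."""
    return status in ("completed", "failed")

def poll_response(response_id: str, api_key: str, interval: float = 2.0) -> dict:
    """Poll the response endpoint until the run completes or fails.

    Assumes a bearer-token Authorization header and a top-level
    "status" field in the response body (both hypothetical).
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    while True:
        req = urllib.request.Request(
            f"{BASE_URL}?response_id={response_id}", headers=headers
        )
        with urllib.request.urlopen(req) as resp:
            body = json.loads(resp.read())
        if is_terminal(body.get("status", "")):
            return body
        time.sleep(interval)
```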
Multi-turn conversations
Build multi-turn agents by passing previous_response_id with each new request. This links responses together, allowing the agent to reference context from previous turns without manually managing conversation history.
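In payload terms, chaining a turn means adding `previous_response_id` to the next request. A small sketch (the response ID value is a placeholder, and omitting `instructions` on follow-up turns is an assumption, not documented behavior):

```python
def follow_up(previous_response_id: str, user_input: str,
              model: str = "openai/gpt-4o") -> dict:
    """Build a follow-up request chained to an earlier response."""
    return {
        "model": model,
        "input": user_input,
        "previous_response_id": previous_response_id,
    }

# "resp_abc123" is a placeholder for the id returned by the first turn.
turn_two = follow_up("resp_abc123", "Now summarize that in one sentence.")
```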
Sub-agents
For complex workflows, define sub-agents that specialize in specific tasks. Each sub-agent can have its own:
- `name` and `description`
- `instructions` (system prompt)
- `toolsets` and/or specific `tools` (tool URNs)
- `environment_slug` for credential management
The main agent orchestrates sub-agents as needed to complete the task.
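A request with sub-agents might be shaped like this sketch. The sub-agent fields are the ones listed above and the `sub_agents` key matches the parameter table, but the exact entry schema, slugs, and tool URN format are assumptions:

```python
request_body = {
    "model": "anthropic/claude-sonnet-4.5",
    "instructions": "Delegate specialized work to your sub-agents.",
    "input": "Refund order 1234 and notify the customer.",
    "sub_agents": [
        {
            "name": "billing",
            "description": "Handles refunds and invoices",
            "instructions": "Use the billing tools to process refunds.",
            "toolsets": [{"slug": "billing-tools"}],  # hypothetical slug
            "environment_slug": "production",         # hypothetical slug
        },
        {
            "name": "notifier",
            "description": "Sends customer emails",
            "instructions": "Draft and send a short notification email.",
            "tools": ["tools:example:send_email"],    # hypothetical tool URN
        },
    ],
}
```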
Response storage
The store flag controls whether agent run history is persisted:
- `store: true` (default): Response history is saved and can be retrieved later
- `store: false`: Response is deleted after completion
Setting `store: false` with `async: true` is not supported, as there would be no way to retrieve the result.
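Because that combination is rejected, a client can guard against it before sending the request; a minimal sketch (the function name and the idea of client-side validation are illustrative, not part of the API):

```python
def check_run_options(store: bool = True, async_: bool = False) -> None:
    """Reject the unsupported store=false + async=true combination,
    since there would be no stored result left to poll for."""
    if async_ and not store:
        raise ValueError("store: false with async: true is not supported")
```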
Stored responses can also be deleted via the API:
`DELETE https://app.getgram.ai/rpc/agents.response?response_id={id}`

Credential management
The Gram Agents API uses Gram environments for credential management; credentials cannot currently be passed directly in requests. Configure the necessary API keys and secrets in the Gram dashboard, then reference the appropriate `environment_slug` in toolset configurations.
Dashboard (coming soon)
A Gram dashboard view of agent runs (for responses where history was not deleted) will show execution details and results.