Choosing an AI chat builder kit: CopilotKit vs OpenAI ChatKit vs Gram Elements
Nolan Sullivan
February 28, 2026 - 23 min read

Chatbots are great for automated support, but if you want users to talk to your app and take action, a chatbot alone isn’t enough. Chatbots are designed to answer, not to act. That’s where chat builder kits come in: they connect your UI to an AI agent (a model that can call functions), update state, and trigger workflows in response to natural language. The result is an interface where users can create tasks, move records, and kick off processes by typing, rather than clicking through forms.
Once you’ve decided a chat builder kit is the right approach, the next question is which one to use. CopilotKit, OpenAI ChatKit, and Gram Elements each integrate agents differently, and those differences ripple through every part of your stack.
To make the comparison concrete, we built the same task board app three times, once with each framework. A task board is a useful benchmark because it demands a meaningful range of agent capabilities:
- Creating records
- Moving records between states
- Assigning records
- Filtering records by priority
If a chat builder kit handles a task board well, it will handle most product workflows.
The task board application
The GitHub project directory includes three versions of the same task board application in React. Each version includes a task board with columns and a chat panel on the right side of the screen. Users type requests in the chat panel and watch the board update in real time.

The task board tests core agent capabilities across all three frameworks. Users can create tasks, move them between columns, update priorities, and assign work through conversation.
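As a rough sketch of what the benchmark exercises, the board can be modelled as a list of task records plus a few pure functions. The Task shape and the function names below are assumptions for illustration and are not taken from the sample project; the status and priority values match the ones used in the action definitions later in this article.

```typescript
// Hypothetical task model for illustration; not taken from the sample repo.
type TaskStatus = "backlog" | "todo" | "in_progress" | "done" | "canceled";
type TaskPriority = "low" | "medium" | "high";

interface Task {
  id: string;
  title: string;
  status: TaskStatus;
  priority: TaskPriority;
  assignee: string | null;
}

// Each operation returns a new array so a UI layer can detect the change.
function addTask(tasks: Task[], id: string, title: string): Task[] {
  const task: Task = { id, title, status: "todo", priority: "medium", assignee: null };
  return [...tasks, task];
}

// Moves a task to another column by changing its status.
function moveTask(tasks: Task[], id: string, status: TaskStatus): Task[] {
  return tasks.map((t) => (t.id === id ? { ...t, status } : t));
}

// Assigns a task to a person.
function assignTask(tasks: Task[], id: string, assignee: string): Task[] {
  return tasks.map((t) => (t.id === id ? { ...t, assignee } : t));
}

// Returns only the tasks with the given priority.
function filterByPriority(tasks: Task[], priority: TaskPriority): Task[] {
  return tasks.filter((t) => t.priority === priority);
}
```

Every framework in this comparison ultimately has to let the agent trigger operations like these; they differ in where the operations are defined and how the UI learns about the result.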

Feature matrix
| Feature | CopilotKit | ChatKit | Gram Elements |
|---|---|---|---|
| Setup complexity | Medium (self-host) / Low (cloud) | Medium (requires Agent Builder) | Low (MCP-first) |
| Bidirectional state sync | ✅ Yes (core feature) | ❌ No | ❌ No |
| Shared app state to agent | ✅ useCopilotReadable | ❌ Manual via MCP/functions | ❌ Manual via MCP |
| Agent triggering UI updates | ✅ useCopilotAction | ⚠️ ClientEffectEvent (manual) | ⚠️ Manual backend notification |
| Theming customization | ✅✅ CSS variables + component replacement | ⚠️ Limited (colors, fonts, spacing only) | ✅ ThemeConfig + modal config |
| Cost model | Free (open source) + LLM costs | Pay-as-you-go token costs | Tiered SaaS ($0–$29/mo) |
| Cost predictability | ✅ Transparent | ❌ Unpredictable (token burn) | ✅ Fixed monthly fee |
| Self-hosting option | ✅ Full control | ❌ Managed only | ❌ Managed only |
| Observability | ⚠️ Logs only (errors) | ✅ Usage, threads, tool calls | ✅✅ Sessions, insights, telemetry, scoring |
| Learning curve | Medium (hooks, actions) | Medium (Agent Builder) | Low (config-driven) |
| Vendor lock-in | ✅ Low (open source) | ⚠️ High (OpenAI ecosystem) | ⚠️ Medium (Gram platform) |
| Enterprise features | ❌ Contact sales | ✅ Available | ✅ SSO, audit logs, custom dataplane |
| Production-ready | ✅ Yes | ✅ Yes | ✅ Yes |
| MCP server support | ✅ Yes | ✅ Yes (native) | ✅✅ Core architecture |
| Conversation history | ✅ Yes | ✅✅ Rich (branching, context) | ✅ Yes |
| Human-in-the-loop | ✅ Yes | ⚠️ Via workflows | ✅ Yes |
| Real-time sync | ✅✅ Built-in | ❌ No | ❌ No |
| Message count limit | ❌ No (action limit: 20-30) | ✅ No | ✅ No |
Evaluation criteria
We evaluate each framework across six criteria. These categories surface where each framework creates friction and where it excels:
- Setup: We integrate each framework into a task board project with three pages to test SDK loading and configuration complexity across multiple routes.
- Chat performance: After setup, we test task creation through each chat interface to evaluate response speed, UI synchronization, and approval workflows.
- Customization: We assess how well we can customize the interfaces to match the application’s visual design. Users expect consistent colors, typography, and spacing across the entire product.
- State management: State management is where the three frameworks diverge most fundamentally. We check whether the agent knows about the app’s state, and whether it can trigger updates directly.
- Pricing: We compare pricing models across the three frameworks. Each operates on a different business model, which affects your entire cost structure.
- Observability: Logging and monitoring are essential for tracking usage, measuring error rates, and diagnosing issues in production. We review the tools available for each chat builder kit.
CopilotKit
CopilotKit is an open-source framework for building AI copilot interfaces into applications.
Setup
CopilotKit offers two deployment options:
- Self-hosted deployments run the agent runtime in your backend with an LLM API key you provide.
- Cloud deployments use CopilotKit’s hosted service. You add a public API key and agent name from the Copilot Cloud dashboard, and CopilotKit runs the runtime and LLM infrastructure.
We tested both approaches and found minimal configuration differences.
The cloud-hosted integration wraps your React application with the CopilotKit provider:
import "./globals.css";
import { ReactNode } from "react";
import { CopilotKit } from "@copilotkit/react-core";
export default function RootLayout({ children }: { children: ReactNode }) {
return (
<html lang="en">
<body>
{/* Use the public api key you got from Copilot Cloud */}
<CopilotKit publicApiKey="<your-copilot-cloud-public-api-key>">
{children}
</CopilotKit>
</body>
</html>
);
}
The self-hosted integration uses identical code. The only difference is in your backend configuration, not in the React layer.
CopilotKit exposes capabilities to the agent through actions, which you register in your application with the useCopilotAction React hook. The agent calls these actions to execute mutations and update UI state.
Creating a task requires defining a useCopilotAction hook with parameter schemas and a handler function:
// Create task action
useCopilotAction({
name: "createTask",
description:
"Create a new task on the Kanban board. Use this when the user wants to add a new task.",
parameters: [
{
name: "title",
type: "string",
description: "The title of the task",
required: true,
},
{
name: "description",
type: "string",
description: "A description of the task",
required: false,
},
{
name: "status",
type: "string",
description:
"The status column: backlog, todo, in_progress, done, or canceled",
required: false,
},
{
name: "priority",
type: "string",
description: "Priority level: low, medium, or high",
required: false,
},
{
name: "assignee",
type: "string",
description: "The name of the person to assign the task to",
required: false,
},
{
name: "project_id",
type: "string",
description:
"The UUID of the project to associate with. Check the projects list for available IDs.",
required: false,
},
],
handler: async ({
title,
description,
status,
priority,
assignee,
project_id,
}) => {
try {
await createTask({
title,
description: description || "",
status: (status || "todo") as "backlog" | "todo" | "in_progress" | "done" | "canceled",
priority: (priority || "medium") as "low" | "medium" | "high",
assignee: assignee || null,
project_id: project_id || null,
});
return `Task "${title}" created successfully`;
} catch (err) {
throw err;
}
},
});
The LLM calls these actions as tools. Ensure your parameter descriptions are precise, because the LLM uses them to understand which data to extract from user requests and how to structure the function call.
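Because the LLM fills these parameters from free text, the values that reach the handler are not guaranteed to match the allowed options, and the type casts in the handler above do not check them at runtime. A minimal sketch of a guard follows. The normalizeTaskArgs and pickAllowed helpers are hypothetical names introduced here for illustration and are not part of CopilotKit.

```typescript
// Allowed values, mirroring the parameter descriptions in the action above.
const STATUSES = ["backlog", "todo", "in_progress", "done", "canceled"] as const;
const PRIORITIES = ["low", "medium", "high"] as const;

type TaskStatus = (typeof STATUSES)[number];
type TaskPriority = (typeof PRIORITIES)[number];

interface RawTaskArgs {
  title: string;
  status?: string;
  priority?: string;
}

interface NormalizedTaskArgs {
  title: string;
  status: TaskStatus;
  priority: TaskPriority;
}

// Returns the matching allowed option (case-insensitive), or the fallback.
function pickAllowed<T extends string>(
  allowed: readonly T[],
  value: string | undefined,
  fallback: T
): T {
  const candidate = (value ?? "").trim().toLowerCase();
  for (const option of allowed) {
    if (option === candidate) {
      return option;
    }
  }
  return fallback;
}

// Hypothetical helper: validates LLM-supplied arguments before they reach the API.
function normalizeTaskArgs(args: RawTaskArgs): NormalizedTaskArgs {
  const title = args.title.trim();
  if (title.length === 0) {
    throw new Error("Task title must not be empty");
  }
  return {
    title,
    status: pickAllowed(STATUSES, args.status, "todo"),
    priority: pickAllowed(PRIORITIES, args.priority, "medium"),
  };
}
```

Calling a helper like this at the top of the handler keeps an unexpected value such as "urgent" from being written to the database as a priority; falling back to a default rather than rejecting the call is a design choice you may want to reverse for stricter apps.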
This architecture works best in two scenarios:
- Simple applications with limited actions benefit from the straightforward hook pattern. For example, an e-commerce portal where users search for products and add items to the cart maps cleanly to a few well-defined hooks.
- UI-intensive applications like Lucidchart, where users manipulate many objects to build diagrams, also benefit from the chat interface, which simplifies complex interactions. The cost is hook complexity that scales with feature count.
Hook complexity is the trade-off for CopilotKit’s tight React integration. It’s much tighter than ChatKit, which moves capability definitions to the OpenAI platform, and Gram, which moves the definitions to Model Context Protocol (MCP) servers. These other two integrations keep your component tree leaner as feature count grows, but at the cost of losing native state visibility in your UI layer.
Chat performance
CopilotKit handles requests directly in the application state. UI updates are nearly instantaneous because the agent executes actions in your client code without network round trips to external services. By default, CopilotKit runs commands without requesting user approval.
Adding user approval for mutation actions requires defining an approval card component and passing it to renderAndWaitForResponse on the corresponding useCopilotAction. The app then displays the card before the action runs, letting users approve or deny the operation.

Running without approval by default is the most aggressive stance of the three frameworks. ChatKit and Gram take the safer default for most production apps: both require user confirmation before executing mutations. With CopilotKit, you have to opt in to that protection explicitly.
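The opt-in approval flow boils down to pausing a mutation until the user answers. The framework-agnostic sketch below illustrates that idea only; withApproval and the askUser callback are hypothetical names and are not CopilotKit APIs. In CopilotKit, the approval card you pass to renderAndWaitForResponse plays a role similar to askUser.

```typescript
type Handler<A, R> = (args: A) => Promise<R>;
type AskUser<A> = (actionName: string, args: A) => Promise<boolean>;

// Hypothetical wrapper: runs `handler` only if `askUser` resolves to true.
function withApproval<A, R>(
  actionName: string,
  handler: Handler<A, R>,
  askUser: AskUser<A>
): Handler<A, R | string> {
  return async (args: A) => {
    const approved = await askUser(actionName, args);
    if (!approved) {
      // Returned to the agent so it can tell the user that nothing changed.
      return `Action "${actionName}" was declined by the user`;
    }
    return handler(args);
  };
}
```

Returning a message instead of throwing when the user declines lets the agent explain the outcome in the chat rather than surfacing an error.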
Customization
CopilotKit exposes theming through CSS custom properties.
You can pass in a CopilotKitCSSProperties object to configure colors, backgrounds, spacing, and shadows:
const customStyles: CopilotKitCSSProperties = {
"--copilot-kit-primary-color": "#6366f1",
"--copilot-kit-contrast-color": "#ffffff",
"--copilot-kit-background-color": "#f9fafb",
"--copilot-kit-input-background-color": "#ffffff",
"--copilot-kit-secondary-color": "#e5e7eb",
"--copilot-kit-separator-color": "#d1d5db",
"--copilot-kit-muted-color": "#6b7280",
"--copilot-kit-shadow-md": "0 4px 6px -1px rgb(0 0 0 / 0.1)",
};
You can also customize the labels and text that it displays to the user depending on the context:
<CopilotChat
labels={{
title: "Travel Buddy",
placeholder: "Ask about travel plans...",
error: "Oops! Something went wrong.",
copyToClipboard: "Copy",
}}
/>
For deeper customization, you can replace entire components with your own while keeping the core functionality. This gives you complete control over the UI, but you’re responsible for maintaining it.
There’s no vendor lock-in on design. You’re not stuck with CopilotKit’s aesthetic. But you need to know what you’re doing with CSS and React.
Of the three frameworks, CopilotKit gives you the most control over appearance. ChatKit’s theming is limited to colors, fonts, and spacing through an options object. Gram’s ThemeConfig covers density, color scheme, and modal positioning. Neither can match CopilotKit if pixel-perfect design alignment is a hard requirement.
State management
CopilotKit is built for agent-native apps where the UI and agent share state bidirectionally. The application state is visible to the agent, and the agent can trigger updates directly.
We expose the app state to the agent through useCopilotReadable:
import { useCopilotReadable } from "@copilotkit/react-core";
useCopilotReadable({
description: "The current list of tasks on the Kanban board",
value: tasks,
});
useCopilotReadable({
description: "The current list of projects",
value: projects,
});
The agent sees the actual board state. When it reasons about which tasks exist or which columns are available, it uses the real data. When the agent calls an action, the action runs in your client code. Your app updates its state (via API and revalidation), and the board immediately reflects the change. The agent sees the updated state on the next message.
CopilotKit is the only framework of the three that natively exposes app state to the agent. ChatKit and Gram both treat state as a separate concern, requiring the agent to operate without direct knowledge of what’s currently on screen, and putting the burden of context entirely on your MCP tools or function definitions.
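Conceptually, each readable is a description paired with a serializable value that ends up in the agent's context. The sketch below illustrates that idea only. It is not CopilotKit's internal implementation, and ReadableEntry and buildAgentContext are names invented for this example.

```typescript
// Conceptual model of a readable: a human-readable label plus app data.
interface ReadableEntry {
  description: string;
  value: unknown;
}

// Renders the readables as a text block that could be added to a prompt.
function buildAgentContext(readables: ReadableEntry[]): string {
  return readables
    .map((r) => `${r.description}:\n${JSON.stringify(r.value)}`)
    .join("\n\n");
}

const context = buildAgentContext([
  { description: "The current list of tasks on the Kanban board", value: [{ id: "t1", title: "Write docs" }] },
  { description: "The current list of projects", value: [] },
]);
console.log(context);
```

With ChatKit and Gram, you would need to produce equivalent context yourself, typically by giving the agent a tool that lists tasks.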
Pricing
CopilotKit’s core framework is completely free and open source under the MIT license. You can self-host the entire platform, manage your own infrastructure, and pay nothing. This covers the framework itself (which consists of the React hooks, the runtime, and the chat UI).
CopilotKit Cloud provides hosted features such as analytics, A/B testing, persistence, and reinforcement learning. Pricing is not publicly listed. You contact the CopilotKit team for a quote.

CopilotKit is the only framework in this comparison where you can run a full production deployment with zero platform costs. ChatKit’s token-based model has no ceiling, and Gram charges a monthly fee once you exceed free-tier limits. The trade-off is that, with no published rates, you can only access CopilotKit Cloud’s enterprise feature pricing after a sales conversation.
Observability
CopilotKit is open source by default. The self-hosted version does not provide a monitoring UI. The CopilotKit Cloud dashboard includes a Logs page that only displays errors.

The lack of a monitoring UI makes sense because you can access the logs you need elsewhere:
- Most of the integration happens in React (you have logs there)
- The agent runtime runs on CopilotKit (you can see logs there)
- Depending on your LLM provider, you should also be able to see logs on their platform.
Although you benefit from this separation of concerns, you have to look into three different logging sources when there is an issue.
Compared to the other two frameworks, CopilotKit has the weakest observability story. ChatKit gives you thread-level logs that let you reconstruct exactly what the agent did and why. Gram goes further with session scoring, which tells you whether users are actually satisfied, not just whether the system ran without errors. CopilotKit’s approach works for developers comfortable piecing together distributed logs, but it won’t scale for a support team trying to respond to user complaints or a business trying to measure success.
OpenAI ChatKit
OpenAI ChatKit targets teams already using the OpenAI ecosystem.
Setup
Unlike CopilotKit and Gram Elements, ChatKit requires you to create an agentic workflow in the OpenAI platform before you can integrate the task board UI component.

The Agent Builder workflow defines the agent’s capabilities. You configure the Agent node with the tools the agent can access:
- MCP servers
- OpenAI tools
- Functions
- Code interpreter
- Web browser
- Image generation for generative AI features.

For this comparison, we created an MCP server with Gram and added it to the workflow, so the agent could handle task board operations.
After configuring the workflow, integrate ChatKit by mounting the ChatKit component in your React application:
import { useMemo, useState } from "react";
import { ChatKit as ChatKitComponent, useChatKit } from '@openai/chatkit-react';
function ChatKitPanelInner() {
const [sessionError, setSessionError] = useState<string | null>(null);
const getClientSecret = useMemo(() => {
return async (currentSecret: string | null) => {
if (currentSecret) {
return currentSecret;
}
const response = await fetch("/api/chatkit", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ workflow: { id: WORKFLOW_ID } }),
});
const payload = await response.json();
if (!response.ok) {
const msg = payload.error ?? "Failed to create session";
setSessionError(msg);
throw new Error(msg);
}
setSessionError(null);
if (!payload.client_secret) {
throw new Error("Missing client secret in response");
}
return payload.client_secret;
};
}, []);
const chatkit = useChatKit({
api: { getClientSecret },
});
return (
<ChatKitComponent
control={chatkit.control}
className="h-full min-h-[300px] w-full"
/>
);
}
The React integration code is minimal compared to CopilotKit’s hook definitions. But you’ve already paid the complexity tax by building the workflow in OpenAI’s Agent Builder. MCP servers and OpenAI tools define capabilities on the platform side, rather than relying solely on functions (or client functions) to define capabilities in React. This results in fewer React components and more platform configuration.
The upfront investment in Agent Builder is real. Compared to CopilotKit’s provider wrapper or Gram’s config object, ChatKit has the most pre-integration work before you write a line of React. The payoff is lean component code, but only if you keep capability definitions on the platform side rather than falling back to client functions.
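The component above posts to /api/chatkit and expects a client_secret in the response, so you also need a small backend route that creates the session with your OpenAI API key. The sketch below builds the request such a route could send. The endpoint URL, the OpenAI-Beta header value, and the body shape reflect our understanding of the ChatKit sessions API and should be treated as assumptions to verify against the current OpenAI documentation; buildSessionRequest is a hypothetical helper name.

```typescript
interface SessionRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the request a backend route could send to create a ChatKit session.
// Endpoint, beta header, and body shape are assumptions; check the OpenAI docs.
function buildSessionRequest(apiKey: string, workflowId: string, userId: string): SessionRequest {
  return {
    url: "https://api.openai.com/v1/chatkit/sessions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
        "OpenAI-Beta": "chatkit_beta=v1",
      },
      // `user` identifies the end user the session belongs to.
      body: JSON.stringify({ workflow: { id: workflowId }, user: userId }),
    },
  };
}
```

A route handler would pass url and init to fetch and return the client_secret field of the JSON response to the browser, which keeps the API key on the server.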
Chat performance
ChatKit’s chat experience is responsive once the workflow configuration is complete. The interface provides access to conversation history, letting users reference and continue past discussions.
By default, ChatKit requires user approval for all actions. This adds a confirmation step between the user’s request and the agent’s execution, giving users control over mutations before they run.
This is the appropriate default for most production applications. CopilotKit takes the opposite approach, as commands run immediately unless you explicitly wire up an approval card. Gram also requires approval by default, making ChatKit and Gram the safer choices out-of-the-box for apps where accidental mutations carry real consequences.
Customization
ChatKit ships with a prebuilt UI. You can customize the appearance through the options object or the ChatKit Playground, which lets you experiment with settings visually and generates the configuration code. Customization is limited to the following aspects of the UI:
- Color scheme: Light or dark mode and accent colors
- Typography: Font size, type, and density
- Spacing: Corner radii and padding
- Message bubble styles: How messages are displayed
- Start screen: The greeting and prompt suggestions
- Header actions: The buttons in the header
The configuration code looks like this:
const options: Partial<ChatKitOptions> = {
theme: {
colorScheme: "light",
accentColor: "#0066cc",
cornerRadius: "8px",
},
startScreen: {
greeting: "What can I help you with?",
prompts: [
{ name: "Check status", prompt: "...", icon: "search" }
],
},
header: {
customButtonLeft: { icon: "settings-cog", onClick: () => {} },
customButtonRight: { icon: "home", onClick: () => {} },
},
};
You can customize colors and the basic layout, but you can’t make major changes to the component structure like you can with CopilotKit. You’re working within ChatKit’s design boundaries. For example, you can’t customize:
- The shape and border styling of the chat bubble container (you’re stuck with ChatKit’s default rounded rectangle)
- The header layout structure (you can add buttons, but you can’t rearrange or restructure the header itself)
- Animation transitions and timing (how messages appear, and how the chat window opens and closes)
- The internal spacing and padding of message containers
- The exact positioning of input elements relative to other UI components
If you need pixel-perfect control over these visual elements, your only option is to fork ChatKit and maintain your own version, which defeats the purpose of using a third-party library.
ChatKit sits in the middle of the customization spectrum. CopilotKit gives you full component replacement and control over CSS variables. Gram’s ThemeConfig offers a comparable range of customization to ChatKit’s options, with different knobs but a similar ceiling. For most SaaS products with standard design systems, ChatKit’s options are sufficient. The constraint only becomes painful when your design team has strong opinions about component structure.
State management
ChatKit does not expose your app state to the agent. The agent runs on the OpenAI backend, and the app runs on your backend. They are separate concerns.
ChatKit provides a mechanism for reloading your app state after the agent makes changes: ClientEffectEvent. When a server-side tool completes, the server streams a ClientEffectEvent back to the client. Your app receives it via the onEffect callback and reacts, typically by refetching data.
Unlike CopilotKit, the ChatKit agent has no native visibility into your application state; it can’t reason about which tasks currently exist on the board without being explicitly given that context through tool definitions. The ClientEffectEvent mechanism solves the reverse problem (pushing updates back to the UI after a tool runs), but the agent is still operating without the real-time awareness that CopilotKit’s useCopilotReadable provides. Gram has the same limitation, with no equivalent callback mechanism.
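On the client, reacting to an effect usually means mapping an effect name to a refetch. The sketch below assumes the effect arrives as an object with a name and optional data, and that your server emits an effect called tasks_changed. Both are assumptions for illustration, so check the ChatKit documentation for the exact event shape. createEffectRouter is a hypothetical helper, not a ChatKit API.

```typescript
// Assumed shape of the payload delivered to the onEffect callback.
interface ClientEffect {
  name: string;
  data?: Record<string, unknown>;
}

type EffectListener = (data: Record<string, unknown>) => void;

// Builds a callback that routes each effect to the listener registered for its name.
// The callback returns true if a listener handled the effect, false otherwise.
function createEffectRouter(
  listeners: Record<string, EffectListener>
): (effect: ClientEffect) => boolean {
  return (effect) => {
    const listener = Object.prototype.hasOwnProperty.call(listeners, effect.name)
      ? listeners[effect.name]
      : undefined;
    if (!listener) {
      return false; // Unknown effect: ignore it rather than throw.
    }
    listener(effect.data ?? {});
    return true;
  };
}

// Usage: refetch the board whenever the server reports a task change.
let refetchCount = 0;
const onEffect = createEffectRouter({
  tasks_changed: () => {
    refetchCount += 1; // In a real app, call your data-fetching function here.
  },
});
onEffect({ name: "tasks_changed" });
```

Routing by name keeps the callback small as you add more effects, and ignoring unknown names means a newer server build cannot break an older client.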
Pricing
ChatKit itself is free. OpenAI doesn’t charge you for the UI framework, but every message, tool call, and step in the agent’s reasoning costs money because you pay OpenAI for the models running the agent.
You pay standard OpenAI API rates per 1 million tokens (text in, text out). GPT-4o input tokens cost around $2.50 per 1M tokens. Output tokens cost around $10 per 1M tokens.
Pay-as-you-go sounds simple, but costs are unpredictable. If your agent is verbose, or if tool calls trigger long agentic loops, your bill spikes fast. You’re charged for every thought step, every retrieval, every model invocation.
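To see how quickly this adds up, here is a back-of-the-envelope estimate using the GPT-4o rates quoted above. The traffic figures in the example are invented for illustration, and the function ignores cached-token discounts and any other pricing details.

```typescript
// Rates quoted in this article, in USD per 1 million tokens.
const INPUT_USD_PER_MILLION = 2.5;
const OUTPUT_USD_PER_MILLION = 10;

// Estimates the model cost for a given number of input and output tokens.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// Hypothetical workload: 10,000 conversations a month, each consuming
// 8,000 input tokens (history, tool schemas, tool results) and 1,000 output tokens.
const monthlyCost = estimateCostUsd(10_000 * 8_000, 10_000 * 1_000);
console.log(monthlyCost.toFixed(2)); // prints "300.00"
```

Input tokens dominate the bill in this example because every step of an agentic loop resends the conversation history and tool definitions, which is why long loops are the main cost risk.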
ChatKit has the most unpredictable cost model of the three frameworks. CopilotKit self-hosted has no platform fees, so you only pay your LLM provider directly. Gram’s tiered pricing caps your exposure at a fixed monthly rate. With ChatKit, a verbose agent workflow or an unexpected traffic spike can produce a bill you haven’t planned for.
Observability
OpenAI provides ChatKit logs and observability directly on the OpenAI platform.
- You can check usage metrics directly.
- You can also check logs for response calls and the tools the agent calls to perform tasks.
- More importantly, you can check the ChatKit threads. Each thread displays message information, the sender, and approval requests.
ChatKit’s thread logs help you understand why a response was sent and what the execution flow was. They also provide data for model fine-tuning.
ChatKit’s observability is the best of the three frameworks for engineers optimizing model behavior. ChatKit logs are more structured than CopilotKit’s error-only dashboard, but they don’t measure user satisfaction the way Gram does. You can confirm the agent ran correctly without knowing whether the user actually got what they needed. For debugging execution, ChatKit is strong. For understanding user outcomes, it falls short.
Gram Elements
Gram Elements is a component from Speakeasy’s Gram platform for embedding chat experiences into applications. Gram creates, curates, and hosts MCP servers from OpenAPI specifications or custom TypeScript functions.
Setup
Gram’s architecture centers on MCP servers. The chat experience exposes the capabilities your MCP server defines. An MCP server with task creation and board management tools gives users conversational access to those exact operations. Setup requires a Gram account and API key.
The React integration mounts the Gram provider with your MCP server configuration:
import { GramElementsProvider, Chat, type ElementsConfig } from '@gram-ai/elements'
function GramChatInner({ getSession }: { getSession: () => Promise<string> }) {
const config = {
projectSlug: PROJECT_SLUG,
mcp: MCP_URL,
variant: "standalone",
// Required: session fetcher so auth loads and "Session is loading" doesn't stick
api: {
sessionFn: async (_init: { projectSlug: string }) => getSession(),
},
welcome: {
title: "Gram Assistant",
subtitle:
"I can help you manage tasks. Try asking me to create, update, or list tasks.",
},
composer: {
placeholder: "Ask me to manage your tasks...",
},
tools: {
toolsRequiringApproval: [],
},
};
return (
<GramElementsProvider config={config}>
<Chat />
</GramElementsProvider>
);
}
If you already have an MCP server, setting up Gram Elements is faster than CopilotKit or ChatKit. For this comparison, we generated an OpenAPI specification, uploaded it to Gram to create the MCP server, and integrated the chat component. The entire flow, from API spec to working chat, takes only minutes.
That speed advantage is strong but conditional. If you start from scratch without a defined API layer, it takes time to build the MCP server before you can start integrating the chat component into your app. CopilotKit’s hook-based pattern sidesteps the need to create an MCP server by letting you define capabilities directly in React from day one.
Chat performance
Gram Elements displays tool call inputs and outputs directly in the chat interface. Users see exactly which data the agent sends to each tool and what the tool returns. This transparency helps users understand how the agent interprets their requests and what operations it executes.
Gram requests approval for tool execution by default.
To disable the default approval request, set toolsRequiringApproval to an empty array in the configuration:
const config = {
projectSlug: PROJECT_SLUG,
mcp: MCP_URL,
variant: "standalone",
// Required: session fetcher so auth loads and "Session is loading" doesn't stick
api: {
sessionFn: async (_init: { projectSlug: string }) => getSession(),
},
...
tools: {
toolsRequiringApproval: [],
},
};
If necessary, you can re-enable or customize approval requirements by listing specific action names in the toolsRequiringApproval array:
const config = {
...
tools: {
toolsRequiringApproval: [
"createTask",
"updateTask",
"createProject",
"createRelationship",
],
},
};
Neither CopilotKit nor ChatKit displays tool call inputs and outputs inline. By displaying them inline, Gram Elements makes the agent’s reasoning visible to users, which builds trust in complex workflows. However, in simpler apps where users just want results, inline tool call information can feel like unnecessary technical detail.
Customization
Gram Elements provides theming customization through the ThemeConfig interface:
const config = {
theme: {
colorScheme: 'dark',
density: 'compact',
radius: 'round',
},
}
In this interface:
- Use colorScheme to select light or dark mode, or to detect and reflect system preferences automatically.
- Use density to control the overall spacing. Choose compact to reduce padding and margins for a dense layout, spacious to increase them for an airy layout, or normal to retain the default layout.
- Use radius to control the border roundness. Choose round for large border radii, soft for moderate radii, and sharp for minimal radii.
You can also use the ModalConfig to control the chat window itself:
const config = {
modal: {
title: 'My Assistant',
position: 'bottom-right', // where the chat trigger appears
defaultOpen: false, // whether chat opens by default
expandable: true, // can the window expand
defaultExpanded: false, // expanded by default
dimensions: { // custom window size
width: 400,
height: 600,
},
icon: (state) => <CustomIcon state={state} />, // custom trigger icon
},
}
Gram Elements gives you control over density, spacing, color scheme, and modal positioning. It’s less extensive than CopilotKit’s use of CSS variables, and similar in scope to ChatKit’s theming options (although approached differently).
If you need to stay within the framework’s visual boundaries, both Gram and ChatKit provide enough customization to match a standard design system. However, neither can compete with CopilotKit if your design requires restructuring the component itself rather than reskinning it.
State management
Gram Elements is a chat UI component. It doesn’t manage or share application state with the agent. The chat and session state remain in Gram. Your app state is separate, and the agent gets context via the MCP server tools.
Gram doesn’t have a built-in callback equivalent to ChatKit’s ClientEffectEvent for when your agent executes an MCP tool or updates data (for example, creating a task in your database).
The three types of state remain separate in this integration:
- The chat and messages are in the Gram Elements UI layer and your session API.
- The app state is in your React state and API backend.
- The agent context is conveyed via your MCP server tools (completely separate from the Gram Elements UI layer).
Of the three frameworks, Gram has the least built-in support for keeping your UI in sync after agent actions. CopilotKit handles this natively with CopilotActions and its readable system, and ChatKit provides ClientEffectEvent as a trigger for refetching data, but with Gram, you need to implement your own polling, WebSocket, or invalidation strategy to reflect changes after tool calls.
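One way to close that gap is to poll your own API and refresh the board only when something has changed. The sketch below shows the change-detection half of that approach. BoardWatcher, fingerprint, and the updatedAt field are assumptions for illustration and are not part of Gram Elements.

```typescript
interface TaskSnapshot {
  id: string;
  updatedAt: string; // ISO timestamp from your API (assumed field)
}

// Reduces a task list to a string that changes whenever a task is added,
// removed, or updated, regardless of the order the API returns them in.
function fingerprint(tasks: TaskSnapshot[]): string {
  return tasks
    .map((t) => `${t.id}@${t.updatedAt}`)
    .sort()
    .join("|");
}

// Hypothetical helper: call `check` with each polled result. `onChange`
// fires only when the fingerprint differs from the previous poll.
class BoardWatcher {
  private last: string | null = null;
  private readonly onChange: (tasks: TaskSnapshot[]) => void;

  constructor(onChange: (tasks: TaskSnapshot[]) => void) {
    this.onChange = onChange;
  }

  check(tasks: TaskSnapshot[]): boolean {
    const next = fingerprint(tasks);
    // The first poll only records a baseline.
    const changed = this.last !== null && this.last !== next;
    this.last = next;
    if (changed) {
      this.onChange(tasks);
    }
    return changed;
  }
}
```

In a React app, you could call check with the result of each fetch from a timer, or after each assistant message, and let onChange update your state. Polling is the simplest option; a WebSocket or cache invalidation would avoid the delay at the cost of more infrastructure.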
Pricing
Gram has a clear, published pricing structure:
- The Free plan gives you access to community support, one MCP server, $5 in chat credits, and 1,000 tool calls per month.
- The Pro plan costs $29 a month and gives you access to email support, three MCP servers, $25 in chat credits, and 5,000 tool calls per month.
- Custom Enterprise plans include SSO, audit logs, self-hosted dataplane, dedicated Slack channel, and concierge onboarding.

Gram’s tiered pricing is the most predictable of the three frameworks. Unlike ChatKit’s pay-per-token model, you know your ceiling at signup. The free tier is genuinely usable for small projects, and the $29 Pro tier covers most early-stage production use. The only comparable cost predictability comes from CopilotKit self-hosted, but in using CopilotKit, you trade Gram’s operational convenience for greater infrastructure responsibility.
Observability
Gram provides three features for observability and logging: Logs, Chat Sessions, and Insights.

Logs is a collection of recent MCP tool calls made to your Gram MCP server. This might seem out of scope, but because Gram is built around MCP servers, watching this page helps catch failures and issues.
Insights is a collection of charts showing chat activity, tool performance, and overall system health.

Under Insights, the Tools by Failure Rate section shows which tools are working and which are failing.

Chat Sessions is the most useful observability feature.

Each session has a score showing overall user satisfaction based on language quality, tool error rates, and other factors. Clicking on a session gives you:
- An Overview that includes the discussion’s thread history for auditing
- Telemetry Logs showing requests made to the MCP server, the API, and the model, as well as the token usage
- The Tool Calls executed during the chat session
No other framework in this comparison comes close to this level of observability. CopilotKit shows you only errors, while ChatKit shows you only the execution flow. Gram displays both and adds session-level satisfaction scoring.
Satisfaction scores tell you whether your users are actually getting what they need, not just whether the system ran without faults. For a team running a production support workflow, that’s the difference between reactive debugging and proactive improvement.
Wrapping up
This comparison tests three fundamentally different approaches to building agentic chat interfaces. The chat builder kit you choose affects how you define agent capabilities, whether your UI shares state with the agent, how you customize the look and feel of the chat interface, and how you debug issues in production.
Here’s a practical guide for choosing the right framework.
- For content-heavy SaaS with moderate agent usage: Self-hosting CopilotKit gives you the greatest degree of control over state sync, costs, and visual customization. It is the only option that gives the agent direct knowledge of the app state and enables you to fully customize the UI. At the cost of complex hook definitions and limited monitoring, you can enjoy the maximum control CopilotKit provides and avoid the token-billing surprises ChatKit delivers.
- If you’re already in the OpenAI ecosystem and need to ship this week: As long as you have an Agent Builder workflow, you can add ChatKit to your app with minimal React integration constraints. Like Gram Elements, it has limited theming options, but offers enough customization for most use cases. ChatKit’s greatest drawback is its unpredictable token-based billing, but the risk may be worth it for the operational simplicity and ability to skip building infrastructure and ship faster.
- If you’re scaling to thousands of users and need production observability: Gram is the only platform with observability built for production support teams. The lack of session context severely limits the usefulness of CopilotKit’s error-only logs for debugging user complaints. While ChatKit’s thread view helps engineers understand execution flow, it doesn’t indicate whether users are actually satisfied. Gram’s session scores tell you if your chat is working from the user’s perspective, not just the technical perspective.