Steadier assistants, hardened hooks, and resilient functions
This release smooths a lot of rough edges. Assistants pick up newly connected integrations mid-conversation and stop churning their runtimes, hooks fail safely and retry transient errors instead of dropping events, Gram Functions ride out load spikes, and token and cost counts come out right for coding sessions.
Features
- Assistants pick up MCP changes mid-conversation #3013 - When you attach or remove an MCP server, the assistant reconciles its live connections on the next turn instead of staying blind to the change until its runtime restarts — so a newly connected integration like GitHub MCP is usable right away. (Author: @danielkov )
- Organization-level observability mode for hooks #3565 - A new mode in the organization logging settings makes generated hook plugins fully non-blocking: they only observe and report, and can never deny or delay a tool call. It is off by default, preserving existing behavior. (Author: @danielkov )
- Gram Functions ride out saturation #3627 - Tool-call and resource-read requests now retry on a saturated runner’s 429 and Fly’s 503 with jittered backoff instead of surfacing transient saturation as a hard failure, and recovered attempts log as warnings rather than errors. (Author: @bflad )
- Function concurrency sized to real capacity #3591 - Function concurrency limits now track actual execution capacity, so a memory bump no longer inflates the request cap, and a saturated runner returns a retryable response instead of dropping the connection. (Author: @bflad )
- Operator memory and scale overrides persist #3543 - Function deployments now prefer operator-set memory and scale overrides and carry them forward across redeploys, so a later customer deploy no longer resets them. (Author: @bflad )
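
To illustrate the retry behavior described in #3627, here is a minimal sketch of retrying on 429 and 503 with jittered exponential backoff. It is not Gram's actual implementation: the names `call_with_retry`, `backoff_delay`, and `RETRYABLE_STATUSES`, the attempt limit, and the delay values are all hypothetical, and the choice of "full jitter" is an assumption, since the release note only says the backoff is jittered.

```python
import random
import time
from typing import Callable, Tuple

# Assumption: the two saturation signals named in the release note.
RETRYABLE_STATUSES = {429, 503}


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 2.0) -> float:
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2**attempt)]."""
    return random.random() * min(cap, base * (2 ** attempt))


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical callable returning (status_code, body).
    """
    status, body = send()
    for attempt in range(max_attempts - 1):
        if status not in RETRYABLE_STATUSES:
            break
        # Transient saturation: wait a jittered delay, then try again.
        sleep(backoff_delay(attempt))
        status, body = send()
    return status, body


if __name__ == "__main__":
    # Simulated runner that is saturated twice and then recovers.
    responses = iter([(429, "busy"), (503, "unavailable"), (200, "ok")])
    print(call_with_retry(lambda: next(responses)))
```

Randomizing the delay spreads retries from many callers over time, so a briefly saturated runner is not hit by a synchronized wave of retried requests.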
Bug fixes
- No more assistant runtime teardown loops #3551 - A single bad assistant turn no longer tears down and recreates its runtime forever: errors from a live runtime are treated as terminal and capped, and a hard ceiling fails a stuck event instead of churning machines indefinitely. (Author: @danielkov )
- Idle assistant VMs are reaped #2821 - Stopped per-thread runtime VMs are now reaped after 14 days idle instead of waiting for the whole assistant to fall silent, so busy projects stop accumulating orphaned machines while a dormant thread still cold-launches into the same app. (Author: @danielkov )
- Truncated tool calls no longer wedge threads #3549 - A model response cut off mid-tool-call is dropped at capture, with the preceding messages kept, so the thread stays usable instead of failing and retrying forever. (Author: @danielkov )
- Chat deletion guards for active assistants #3592 - Deleting a chat that backs an active assistant is now blocked and returns a conflict, instead of soft-deleting the conversation out from under a running assistant. (Author: @danielkov )
- Assistants recover from a deleted backing chat #3593 - A chat that backs an active assistant clears its soft-deleted state automatically on the next message, so an assistant whose chat was deleted out from under it recovers; chats with no active assistant stay deleted. (Author: @danielkov )
- Accurate token totals for coding sessions #3620 - Per-session and per-user totals now derive from input plus output tokens when a provider omits an explicit total, fixing “0 tokens” readings for Claude Code and similar AI-coding sessions. (Author: @simplesagar )
- Hardened hook ingest #3560 - Hook senders retry a dropped request with backoff instead of blocking the tool call or losing the event, and the server de-duplicates redelivered events so each is recorded exactly once. (Author: @danielkov )
- Hooks fail closed when Gram is unreachable #3522 - The Cursor hook now emits a deny with a readable reason when Gram is unreachable or returns an error, instead of silently allowing the call and bypassing blocking policies, and sends its MCP inventory on stdin so large inventories no longer risk dropping telemetry. (Author: @bradcypert )
- Clearer hook verification message #3556 - When an MCP tool call can’t be verified, the deny reason now tells you to restart Claude or run /reload-plugins and includes an error code, instead of implying the session is still initializing. (Author: @danielkov )
- Per-session identity isolation in batched logs #3525 - When a collector or gateway re-batches multiple sessions into one OTLP logs export, each Claude Code session keeps its own identity, so a session is never cached or authorized with another session’s email or organization. (Author: @bflad )
- Trusted trace boundaries on public routes #3663 - Public MCP and OAuth routes start a fresh server-side trace per request and record the inbound trace context as a link, so third parties can’t merge unrelated requests into one trace or steer your trace IDs. (Author: @bflad )
- Cleaner challenge list #3628 - Challenges raised by users outside the organization, such as a staff member impersonating a customer org, are filtered out so they no longer clutter the list or skew its counts. (Author: @adaam2 )
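
As a rough illustration of the token-total fix in #3620, the sketch below falls back to input plus output tokens when a usage record carries no explicit total. The `Usage` record and both function names are hypothetical and only mirror the behavior the note describes; Gram's real data model may differ, and treating a missing total as `None` is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Usage:
    """Hypothetical per-request usage record."""
    input_tokens: int
    output_tokens: int
    total_tokens: Optional[int] = None  # None when the provider omits a total


def effective_total(usage: Usage) -> int:
    """Use the explicit total if present, otherwise input + output."""
    if usage.total_tokens is None:
        return usage.input_tokens + usage.output_tokens
    return usage.total_tokens


def session_total(records: Iterable[Usage]) -> int:
    """Sum effective totals across all records in one session."""
    return sum(effective_total(r) for r in records)


if __name__ == "__main__":
    records = [
        Usage(input_tokens=120, output_tokens=30),                    # no explicit total
        Usage(input_tokens=200, output_tokens=50, total_tokens=250),  # explicit total
    ]
    print(session_total(records))  # 400
```

Summing only the explicit totals would count the first record as zero, which is the kind of "0 tokens" reading the fix addresses.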