Product

Tune risk detection sensitivity, one policy at a time

Dennis Babyak

June 29, 2026 - 3 min read

A detection rule that flags everything is a detection rule nobody trusts. Production traffic is full of near misses: strings that look like a credit card number, internal IDs shaped like IBANs, values that score just high enough to trip a global threshold tuned for some other policy. When a PII scanner flags all of these data points that aren’t actually sensitive, the real findings get buried and people stop trusting the alerts.

Until now, Speakeasy’s confidence threshold was global. Every risk policy shared one minimum confidence, so tightening one policy meant tightening all of them. Today that changes. Each risk policy carries its own detection sensitivity, and you set it where you build the policy.

What shipped

Risk policies now expose a per-policy minimum match confidence — the score a detection has to clear before it becomes a finding. In the policy wizard there’s a new Sensitivity step with a slider: drag it up to demand higher-confidence matches and cut noise, drag it down to catch more and accept some false positives. The value is saved with the policy and loaded back when you edit it.
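As a rough illustration of what a per-policy minimum match confidence means, the sketch below filters the same detections against two policies with different thresholds. The `Detection` and `RiskPolicy` types, the `findings_for` helper, the rule names, and the scores are all hypothetical and made up for this example; they are not Speakeasy's API. Treating the comparison as inclusive (`>=`) is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    rule: str          # hypothetical rule name, e.g. "credit_card"
    value: str         # the matched text
    confidence: float  # match score, assumed to lie in [0.0, 1.0]


@dataclass(frozen=True)
class RiskPolicy:
    name: str
    min_confidence: float  # the per-policy minimum match confidence


def findings_for(policy: RiskPolicy, detections: list[Detection]) -> list[Detection]:
    """Keep only the detections that clear this policy's own threshold."""
    return [d for d in detections if d.confidence >= policy.min_confidence]


# Made-up detections: one strong match, one borderline, one weak.
detections = [
    Detection("credit_card", "4111 1111 1111 1111", 0.92),
    Detection("iban", "ORD-2291-7734-0001", 0.55),
    Detection("phone", "555-0100", 0.40),
]

strict = RiskPolicy("tool-responses", min_confidence=0.8)
loose = RiskPolicy("user-prompts", min_confidence=0.5)

strict_rules = [d.rule for d in findings_for(strict, detections)]
loose_rules = [d.rule for d in findings_for(loose, detections)]
print(strict_rules)
print(loose_rules)
```

The same three detections produce one finding under the stricter policy and two under the looser one, which is the behavior a single global threshold could not express.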

The global default is unchanged: every policy still starts at a minimum confidence of 0.5. We tested moving it higher, but against a 100k-message production sample a 0.75 floor started dropping real findings, so 0.5 stays the right baseline for most rules. What this release adds is the headroom to go higher per policy where it helps. Analyzers that match structured values like IBANs or phone numbers key off tight, well-shaped patterns, so raising the minimum confidence on those policies trims noise without costing you true positives. The change is additive and backward compatible, so nothing re-tunes itself under you.
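One way to picture the backward-compatible default: a policy saved before this release carries no sensitivity value, so it resolves to 0.5, while a tuned policy uses its own number. The dictionary shape, the `min_confidence` key, and the assumed valid range of 0 to 1 are illustrative assumptions only; the post does not describe how policies are stored.

```python
DEFAULT_MIN_CONFIDENCE = 0.5  # the global default stated in the post


def effective_min_confidence(policy: dict) -> float:
    """Return the policy's own threshold, or the global default if unset."""
    value = policy.get("min_confidence")  # hypothetical field name
    if value is None:
        return DEFAULT_MIN_CONFIDENCE
    # Assumption: confidence scores and thresholds live in [0, 1].
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"min_confidence must be within [0, 1], got {value}")
    return float(value)


legacy_policy = {"name": "pii-baseline"}                       # no value saved
tuned_policy = {"name": "iban-strict", "min_confidence": 0.8}  # tuned per policy

legacy_threshold = effective_min_confidence(legacy_policy)
tuned_threshold = effective_min_confidence(tuned_policy)
print(legacy_threshold, tuned_threshold)
```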

Why per-policy

The same detector is the right amount of strict in one place and far too loud in another. A policy watching user prompts for leaked secrets wants to err toward catching things. A policy scanning a tool response that legitimately returns customer records wants to stay quiet unless it’s sure. One global number can’t satisfy both; it just forces a compromise that’s wrong for every policy at once.

Sensitivity is the precision dial for that problem. It sits alongside the scoping and exemption controls we wrote about earlier: scope decides where a policy runs, exemptions decide what to skip, and sensitivity decides how sure a match has to be before it counts. Together they let you author a deliberately broad policy and then tune it down to signal without reaching for a blunter detector.
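The three controls can be read as three successive checks on a candidate match. The sketch below is a simplified mental model, not Speakeasy's evaluation logic: the `Policy` fields, the source labels, and modeling exemptions as an exact-match set of values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    scopes: set[str]                                      # where the policy runs
    exempt_values: set[str] = field(default_factory=set)  # what to skip
    min_confidence: float = 0.5                           # how sure a match must be


def is_finding(policy: Policy, source: str, value: str, confidence: float) -> bool:
    """Apply scope, then exemptions, then sensitivity."""
    if source not in policy.scopes:
        return False  # out of scope: the policy does not run here
    if value in policy.exempt_values:
        return False  # explicitly exempted
    return confidence >= policy.min_confidence  # sensitivity check


# Hypothetical policy: watches tool responses, skips a known test value.
policy = Policy(
    scopes={"tool_response"},
    exempt_values={"test@example.com"},
    min_confidence=0.7,
)

in_scope_strong = is_finding(policy, "tool_response", "jane@corp.example", 0.9)
in_scope_weak = is_finding(policy, "tool_response", "jane@corp.example", 0.6)
exempted = is_finding(policy, "tool_response", "test@example.com", 0.99)
out_of_scope = is_finding(policy, "user_prompt", "jane@corp.example", 0.9)
print(in_scope_strong, in_scope_weak, exempted, out_of_scope)
```

Only the first call yields a finding; the other three are filtered by sensitivity, an exemption, and scope respectively.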

Tuning against real traffic

Sensitivity is most useful when you set it against evidence rather than instinct. Built-in detection rules can be run over a sample of text or a set of existing sessions before a policy goes live, so you can watch how a threshold change moves the line between caught and ignored, then commit the number that keeps the policy quiet without making it weak.
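To make "watch how a threshold change moves the line" concrete, here is a minimal sketch of sweeping candidate thresholds over a small hand-labeled sample. The scores and labels are invented for the example, and `sweep` is a hypothetical helper rather than a product feature; in practice the sample would come from running a rule over real text or sessions.

```python
# (confidence, is_truly_sensitive) pairs; made-up values for illustration only.
sample = [
    (0.95, True), (0.88, True), (0.72, True), (0.61, False),
    (0.58, True), (0.55, False), (0.52, False), (0.35, False),
]


def sweep(sample, thresholds):
    """For each threshold, count true positives, false positives, and misses."""
    rows = []
    for t in thresholds:
        caught = [label for score, label in sample if score >= t]
        true_pos = sum(caught)                 # caught and actually sensitive
        false_pos = len(caught) - true_pos     # caught but not sensitive
        missed = sum(label for score, label in sample if score < t)
        rows.append((t, true_pos, false_pos, missed))
    return rows


results = sweep(sample, [0.5, 0.6, 0.75])
for t, tp, fp, missed in results:
    print(f"threshold={t:.2f} true_pos={tp} false_pos={fp} missed={missed}")
```

On this toy sample, each step up removes false positives but also drops a real finding, which is the trade-off to weigh before committing a number.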

Get started

Open the policy wizard, build or edit a risk policy, and step to Sensitivity. Set the slider where the policy’s job warrants — higher for precision, lower for recall — and save. Existing policies keep their current behavior until you choose to tune them.

Tuning risk policies across a large agent footprint? Book time with our team and we’ll walk through it with you.

Last updated on June 30, 2026