VOICE → API · SAFETY-FIRST
A spoken word never executes.
Point Voxy at an API spec and get a working voice agent in an afternoon — with the confirmation gate, no-guessing parser, and audit trail already built. The safety is the product: a misheard command never fires a destructive call.
Spoken commands are read back · A deliberate click confirms · Destructive actions are flagged
Wiring voice to an API means STT, intent parsing, slot resolution, confirmation UX, error recovery, and an audit trail — and then getting the safety right so a mis-parse never charges a card or deletes a deployment. That safety work is the part that quietly sinks these projects. Voxy already solved it. You bring the spec.
THE WORKFLOW
Drop in an OpenAPI 3.x spec. Voxy normalizes it into one API model — operations, typed params, auth scheme, error shapes.
It generates a Voice Layer Definition and surfaces every write operation for you to review — destructive flags, read-backs, slots. Human-in-the-loop where it matters.
Speak a command. Hear it read back. Confirm with a click. It executes through the gated runtime, and every step lands in the audit log.
$ voxy generate resend.json --out resend.vld.json

draft VLD — Resend (3 write ops to review)
┌──────────────┬────────┬───────────┬─────────────┬───────────────────┐
│ operation    │ method │ path      │ destructive │ reason            │
├──────────────┼────────┼───────────┼─────────────┼───────────────────┤
│ sendEmail    │ POST   │ /emails   │ YES         │ verb-heuristic    │
│ createDomain │ POST   │ /domains  │ no          │ write — review    │
│ deleteApiKey │ DELETE │ /api-keys │ YES         │ method:DELETE     │
└──────────────┴────────┴───────────┴─────────────┴───────────────────┘
Every write op surfaced for review. Confirm flags before use.
✓ wrote draft VLD → resend.vld.json
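To make the review table above concrete, here is a minimal Python sketch of how write operations in an OpenAPI 3.x document could be listed and flagged, erring toward flagging. It is an illustration under stated assumptions, not Voxy's implementation: `review_write_ops`, the `x-destructive` extension field, the verb list, and the toy spec fragment are all hypothetical.

```python
# Hypothetical sketch: none of these names come from Voxy itself.
WRITE_METHODS = {"post", "put", "patch", "delete"}
# Assumed verb prefixes that suggest an irreversible or costly action.
COSTLY_VERBS = ("send", "delete", "remove", "cancel", "charge", "pay", "transfer")


def review_write_ops(spec: dict) -> list:
    """Return one review row per write operation found under spec['paths']."""
    rows = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method.lower() not in WRITE_METHODS:
                continue  # reads and non-operation keys are not surfaced
            op_id = op.get("operationId", f"{method.upper()} {path}")
            if op.get("x-destructive") is True:  # assumed self-declared hint
                destructive, reason = True, "self-declared"
            elif method.lower() == "delete":
                destructive, reason = True, "method:DELETE"
            elif op_id.lower().startswith(COSTLY_VERBS):
                destructive, reason = True, "verb-heuristic"
            else:
                destructive, reason = False, "write (review)"
            rows.append({
                "operation": op_id,
                "method": method.upper(),
                "path": path,
                "destructive": destructive,
                "reason": reason,
            })
    return rows


# Toy fragment shaped like the table above; not the real Resend spec.
spec = {
    "openapi": "3.0.3",
    "paths": {
        "/emails": {"post": {"operationId": "sendEmail"}},
        "/domains": {
            "get": {"operationId": "listDomains"},
            "post": {"operationId": "createDomain"},
        },
        "/api-keys": {"delete": {"operationId": "deleteApiKey"}},
    },
}

for row in review_write_ops(spec):
    print(row)
```

The checks run from the strongest signal to the weakest, and an operation that matches none of them is still surfaced as a write for a human to review rather than silently treated as safe.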
THE CORE
Four invariants, proven on the hardest case (a live trading agent), then lifted out unchanged so they hold for any API.
Execution requires a token that can only be minted by a deliberate gesture — a click, a keystroke. A spoken "confirm" is only a request to confirm. Structurally unforgeable.
Strict, schema-conformant tool use. Missing a required slot elicits a question — it never invents a value. Deliberation and venting are caught before any action path.
Self-declared hints plus verb/method heuristics flag every irreversible or costly action. Missing one is the catastrophic failure — so detection errs toward flagging.
Every step (heard, parsed, read back, confirmed, executed) is one immutable row. Full session reconstruction, plus an alarm on any execution that lacks a matching confirmation.
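The gate and the audit alarm described above can be sketched together in a few lines of Python. This is a hedged illustration, not Voxy's runtime: `GatedRuntime`, `AuditRow`, and the HMAC-based token scheme are assumptions chosen to show the idea that only a gesture handler can mint what `execute` will accept.

```python
import hashlib
import hmac
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a row cannot be edited after it is written
class AuditRow:
    ts: float
    step: str        # heard | parsed | read_back | confirmed | executed
    action_id: str


class GatedRuntime:
    """Hypothetical gate: execute() accepts only tokens from confirm_gesture()."""

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)  # secret stays inside the runtime
        self._spent = set()                  # tokens are single-use
        self._log = []                       # append-only in this sketch

    def record(self, step: str, action_id: str) -> None:
        self._log.append(AuditRow(time.time(), step, action_id))

    def _mint(self, action_id: str) -> str:
        return hmac.new(self._key, action_id.encode(), hashlib.sha256).hexdigest()

    def confirm_gesture(self, action_id: str) -> str:
        """Wire this to a click or keystroke handler only, never the voice path."""
        self.record("confirmed", action_id)
        return self._mint(action_id)

    def execute(self, action_id: str, token: str, call):
        expected = self._mint(action_id).encode()
        if token in self._spent or not hmac.compare_digest(expected, token.encode()):
            raise PermissionError("no valid confirmation token for this action")
        self._spent.add(token)
        self.record("executed", action_id)
        return call()

    def unconfirmed_executions(self) -> list:
        """Alarm list: executed rows with no earlier confirmed row."""
        confirmed, alarms = set(), []
        for row in self._log:
            if row.step == "confirmed":
                confirmed.add(row.action_id)
            elif row.step == "executed" and row.action_id not in confirmed:
                alarms.append(row.action_id)
        return alarms


gate = GatedRuntime()
gate.record("heard", "a1")
try:
    gate.execute("a1", "confirm", lambda: "done")  # a spoken word is not a token
except PermissionError as err:
    print("blocked:", err)

token = gate.confirm_gesture("a1")                 # minted by the click handler
print(gate.execute("a1", token, lambda: "done"))
print(gate.unconfirmed_executions())
```

In this sketch the token is bound to one action and can be spent once, so a replayed or guessed string is rejected. A production audit log would also need append-only storage; the frozen rows here only illustrate the shape.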
Battle-tested on a live trading agent — the hardest safety case we could find. See the trading demo →
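The no-guessing rule among the invariants above can be illustrated the same way. The Python sketch below assumes a JSON-Schema-like parameter description with `properties` and `required` keys; `resolve`, `Ask`, and `Ready` are hypothetical names, and the example does not attempt the deliberation or venting detection the text mentions.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Ask:
    """The agent must ask the user; nothing is ready to read back."""
    question: str


@dataclass
class Ready:
    """All required slots were actually heard; safe to read back."""
    operation: str
    args: dict


def resolve(operation: str, schema: dict, heard: dict) -> Union[Ask, Ready]:
    """Validate heard slots against the schema; never fill a gap with a guess."""
    props = schema.get("properties", {})
    for name in heard:
        if name not in props:
            return Ask(f"I don't recognise {name!r} for {operation}. Can you rephrase?")
    for name in schema.get("required", []):
        if heard.get(name) in (None, ""):
            return Ask(f"What should {name!r} be for {operation}?")
    return Ready(operation, dict(heard))


# Assumed schema for a sendEmail-style operation.
send_schema = {
    "properties": {"to": {"type": "string"}, "subject": {"type": "string"}},
    "required": ["to", "subject"],
}

print(resolve("sendEmail", send_schema, {"to": "ana@example.com"}))
print(resolve("sendEmail", send_schema, {"to": "ana@example.com", "subject": "Hi"}))
```

The design choice is that a missing slot produces a different return type from a resolved command, so downstream code cannot pass an incomplete command to the read-back or execution step by accident.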
TWO WAYS TO SHIP
You operate your own account. Auth is a single token, regulatory exposure is low, the value is personal speed. This is where you start.
Your users operate your API by voice — and never see Voxy. The gate, the secrets, and execution stay server-side. Per-tenant isolation; no credential ever reaches the browser.
ONE ENGINE, EVERY DOMAIN
Early access rolls out across the Tier-1 APIs first. Tell us the API you'd point it at — we'll send back a working draft VLD to try.
No spam. Just your draft agent and a note when your tier opens.