Getting started

Core concepts

A quick mental model before you dive in. Zelyx has five core ideas — once these click, everything else makes sense.

The proxy

Zelyx sits between your application and every AI provider. Your app talks to Zelyx using an OpenAI-compatible API; Zelyx decrypts your provider key and forwards the request. The provider sees a normal call — your app never holds the real key.

Provider key

Your real OpenAI / Anthropic / Google key. Stored encrypted in Zelyx's vault (AES-256-GCM). Only workspace admins can add or reveal it. Developers never receive it — they use a Zelyx key instead.

Zelyx key (nk_…)

The key developers put in their code. Format: nk_<id>_<secret>. Shown once on creation, then hashed — if you lose it, generate a new one. Each key belongs to one user in one workspace, enabling per-person cost attribution.

The gate

Every call is evaluated before it reaches the provider. The gate runs up to ten checks in order — budget limits, model policies, payment-risk patterns — and returns one of three decisions: Approve (call proceeds), Block (HTTP 402/403, no provider cost), or Review required (call proceeds, flagged for human review).

Budget layers

Budgets stack: company → team → project → per-key → per-model → per-run → per-session. All enforced atomically — parallel agents cannot race past the same cap. Set any layer you need; layers you skip are unlimited.

Events

Every call produces an llm_cost event: model, provider, input tokens, output tokens, cost, latency, TTFB, team, project, environment, run ID. Gate decisions produce gate_decision events. All of this drives the dashboard — no separate instrumentation needed.

How a request flows

From your app's perspective, calling Zelyx is identical to calling the provider directly. Internally, the proxy does this on each request:

  1. Authenticate the Zelyx key → resolve workspace and user
  2. Decrypt provider key from vault (LRU-cached, re-fetched after TTL)
  3. Parse request: extract model, estimated cost, tool names, message count
  4. Run gate: enforce budgets, check model policy, detect payment risk
  5. If approved: forward request to provider, always streaming upstream for full observability
  6. Adapt response to what your client requested (streaming or buffered)
  7. Record cost event in background — non-blocking, never slows down the call
TipThe proxy always streams from the provider even when your client requests a non-streaming response. This lets Zelyx measure TTFB and token counts accurately. Your client gets the format it asked for — the internal streaming is transparent.

Key distinctions

Provider key vs Zelyx key

A provider key is a secret. It grants unlimited access to an AI provider and must never appear in developer code, environment files, or version control. Zelyx stores it encrypted.

A Zelyx key (nk_…) is a credential scoped to one workspace. It can only reach the providers your admin has connected, within the budgets and model policies you configure. Losing it is low-risk — revoke and regenerate.

Block vs Review required

Block — the call is rejected before reaching the provider. Your app receives HTTP 402 (budget) or 403 (policy). No tokens consumed, no provider cost.

Review required — the call proceeds and you get a normal response, but the response includes the header X-Zelyx-Review-Required: true. The call is also flagged in the Gate dashboard for a human to review. Budget is reserved; the call did incur cost.

Daily reset

All daily budgets (company, team, project, per-key, per-model) reset at midnight UTC. Run budgets and session limits do not reset — they are consumed until exhausted or the run ends.