Question 1

How is the cost of each request computed?

Accepted Answer

Each request is billed at the model's posted pay-as-you-go rate: input and output tokens multiplied by the per-1M-token price listed on the models page. FX and payment-channel fees are shown at checkout and kept separate from the token line.
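
As a rough illustration of the token-line arithmetic, here is a minimal sketch. The function name, the split into separate input and output prices, and the example prices are assumptions for illustration, not the gateway's actual rates or API.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1m: Decimal,
                 output_price_per_1m: Decimal) -> Decimal:
    """Token-line cost of one request (FX and payment-channel fees excluded)."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * input_price_per_1m
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * output_price_per_1m
    return input_cost + output_cost


# Hypothetical prices: 3.00 per 1M input tokens, 15.00 per 1M output tokens.
cost = request_cost(1_200, 350, Decimal("3.00"), Decimal("15.00"))
print(cost)  # numerically equal to 0.00885 under these assumed prices
```

Decimal is used rather than float so that many small per-request amounts add up exactly.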

Question 2

Can I cap spend per API key?

Accepted Answer

Yes. Every key can carry its own maxSpend cap, rate limit, and per-minute token ceiling (tokenLimitTpm). When a key hits its cap, its requests stop while the rest of the account keeps working.
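
To make the per-key behaviour concrete, here is a small in-memory model of it. This is only a sketch: the `KeyBudget` class, the `admit` method, and the use of cents are hypothetical, and the answer does not say whether the gateway checks the cap before or after a request runs. Only the names `maxSpend` and `tokenLimitTpm` come from the answer above.

```python
from dataclasses import dataclass


@dataclass
class KeyBudget:
    """Hypothetical local model of one API key's limits."""
    max_spend_cents: int         # stands in for the key's maxSpend cap
    token_limit_tpm: int         # stands in for tokenLimitTpm
    spent_cents: int = 0
    tokens_this_minute: int = 0  # resetting the one-minute window is not modelled

    def admit(self, cost_cents: int, tokens: int) -> bool:
        """Record the request and return True only if it fits under both limits."""
        if self.spent_cents + cost_cents > self.max_spend_cents:
            return False
        if self.tokens_this_minute + tokens > self.token_limit_tpm:
            return False
        self.spent_cents += cost_cents
        self.tokens_this_minute += tokens
        return True


# Two keys on the same account, each with its own budget.
ci_key = KeyBudget(max_spend_cents=500, token_limit_tpm=10_000)
app_key = KeyBudget(max_spend_cents=2_000, token_limit_tpm=10_000)

print(ci_key.admit(300, 1_000))   # fits under the 500-cent cap
print(ci_key.admit(300, 1_000))   # would exceed the cap, so this key stops
print(app_key.admit(300, 1_000))  # the other key is unaffected
```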

Question 3

Where do I see usage and costs?

Accepted Answer

Dashboard -> Logs shows per-request traces; Dashboard -> Usage shows requests, tokens, and spend aggregated by model and API key over time. Traces appear in real time as requests complete.
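
If you keep your own copy of the traces, the same kind of roll-up can be reproduced locally. The sketch below assumes each trace is a dictionary with `model`, `api_key`, `tokens`, and `cost` fields; those field names and the sample values are hypothetical, not the dashboard's export format.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical per-request traces (field names are assumptions).
traces = [
    {"model": "model-a", "api_key": "key-1", "tokens": 1200, "cost": Decimal("0.004")},
    {"model": "model-a", "api_key": "key-1", "tokens": 800, "cost": Decimal("0.006")},
    {"model": "model-b", "api_key": "key-2", "tokens": 500, "cost": Decimal("0.001")},
]


def aggregate(traces):
    """Sum requests, tokens, and spend for each (model, api_key) pair."""
    totals = defaultdict(lambda: {"requests": 0, "tokens": 0, "cost": Decimal("0")})
    for trace in traces:
        bucket = totals[(trace["model"], trace["api_key"])]
        bucket["requests"] += 1
        bucket["tokens"] += trace["tokens"]
        bucket["cost"] += trace["cost"]
    return dict(totals)


usage = aggregate(traces)
for (model, api_key), row in sorted(usage.items()):
    print(model, api_key, row["requests"], row["tokens"], row["cost"])
```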

Question 4

What happens when my balance runs out?

Accepted Answer

API calls return HTTP 402 until the wallet is topped up or plan coverage applies. There is no silent overage: spend can never exceed what you have prepaid plus what your plan covers.
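
On the client side, it helps to treat 402 as a distinct condition rather than a generic failure, since retrying will not succeed until the wallet is topped up. A minimal sketch, assuming you have the response status code in hand; `WalletEmptyError` and `raise_for_balance` are hypothetical names, not part of any SDK.

```python
from http import HTTPStatus


class WalletEmptyError(RuntimeError):
    """Raised when the gateway answers 402 (balance exhausted)."""


def raise_for_balance(status_code: int) -> None:
    """Raise WalletEmptyError on HTTP 402; do nothing for any other status."""
    if status_code == HTTPStatus.PAYMENT_REQUIRED:  # 402
        raise WalletEmptyError(
            "Balance exhausted: top up the wallet before retrying."
        )


# Usage: check the status of a response before processing it.
try:
    raise_for_balance(402)
except WalletEmptyError as err:
    print(f"Stopping requests: {err}")
```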

Question 5

Do you store my prompts to compute costs?

Accepted Answer

No. Cost metering uses token counts and metadata only. Prompt and completion bodies are not retained — the trace records model, tokens, cost, latency, and status.

Question 6

Does this work for Claude Code and Codex CLI?

Accepted Answer

Yes. CLI agents route through the same gateway, so every coding-agent request gets the same per-request cost trace and counts against the same per-key budgets as your application traffic.

Track LLM costs per request, model, and key

What gets tracked

A cost trace per request

Per-model breakdowns

Per-key attribution

Per-key spend caps

A prepaid wallet ledger

Hard stop at zero

A ledger, not a guess

Point your stack at the gateway

FAQ