Ask a team what their LLM API spend was last month and most can answer. Ask which app, agent, or teammate spent it — and on which models — and the answer is usually a shrug followed by a spreadsheet project. This post describes a setup that makes cost attribution a property of your infrastructure instead of a monthly forensic exercise.
The core idea is simple: route every model call through one gateway, give every workload its own API key, and let the gateway's ledger do the bookkeeping.
Why provider dashboards aren't enough
If you call two or three providers directly, your spend lives in two or three dashboards, each with its own update cadence, currency handling, and definition of a "request". Worse, attribution stops at the account level: the dashboard can tell you what the whole account spent, but not which of your services made the calls, unless you maintain separate accounts per service — which multiplies billing overhead instead of reducing it.
The usual workaround is application-side logging: wrap every SDK call, estimate token counts, multiply by a price table you maintain by hand, and hope nobody calls a model you forgot to add. It works until a price changes or someone adds a provider; then it silently breaks.
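To make the failure mode concrete, here is a minimal sketch of that workaround. The model name, the prices, and the four-characters-per-token heuristic are all illustrative assumptions, not real rates or a real tokenizer.

```python
# Hand-maintained price table in USD per 1M tokens.
# The model name and prices are made up for illustration.
PRICE_TABLE = {
    "model-a": {"input": 1.00, "output": 2.00},
}


def estimate_tokens(text: str) -> int:
    # Crude heuristic: about 4 characters per token. Real tokenizers differ.
    return max(1, len(text) // 4)


def logged_cost(model: str, prompt: str, completion: str) -> float:
    """Estimate the cost of one call from the local price table."""
    price = PRICE_TABLE.get(model)
    if price is None:
        # The silent failure: a model missing from the table is logged as free.
        return 0.0
    return (
        estimate_tokens(prompt) * price["input"]
        + estimate_tokens(completion) * price["output"]
    ) / 1_000_000
```

Nothing here raises an error when someone starts calling a model the table does not know about; the spend simply disappears from the log.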
Step 1: One key per workload
A gateway inverts the problem. Router One sits between your apps and 25+ supported models, so every call already passes through one place that knows the model, the token counts, and the posted rate. Attribution then comes down to one practice: create one API key per app, agent, or environment.
- sk-rk-...prod-chatbot for the production assistant
- sk-rk-...batch-pipeline for the nightly enrichment job
- sk-rk-...claude-code-alice for a teammate's coding agent
- sk-rk-...staging for everything pre-production
Keys are free to create, so the granularity is yours to choose. Once each workload has its own key, the usage dashboard gives you a spend line per workload with no instrumentation in your code at all.
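On the application side, one way to keep this honest is to have each workload load only its own key and fail loudly rather than fall back to a shared one. The sketch below assumes keys are supplied through environment variables; the variable names and the key_for helper are hypothetical, not part of any Router One SDK.

```python
import os

# Hypothetical mapping from workload name to the environment variable
# that holds its key. Neither the names nor the layout come from Router One.
KEY_ENV_VARS = {
    "prod-chatbot": "ROUTER_KEY_PROD_CHATBOT",
    "batch-pipeline": "ROUTER_KEY_BATCH_PIPELINE",
    "staging": "ROUTER_KEY_STAGING",
}


def key_for(workload: str, env=os.environ) -> str:
    """Return the API key for a workload, refusing to fall back to another."""
    var = KEY_ENV_VARS.get(workload)
    if var is None:
        raise KeyError(f"no key registered for workload {workload!r}")
    key = env.get(var)
    if not key:
        # A missing key is an error: borrowing another workload's key
        # would blur the attribution this setup exists to provide.
        raise RuntimeError(f"environment variable {var} is not set")
    return key
```

The deliberate choice is the absence of a default: a workload without a key of its own cannot make calls that would land on someone else's spend line.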
Step 2: Read the per-request traces
Aggregates tell you that spend moved; traces tell you why. Every request through the gateway records its model, input and output tokens, cost at the posted rate, latency, status, and the route that served it. When the batch pipeline's spend doubles, the traces show whether it sent more requests, longer prompts, or quietly switched to a pricier model.
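That diagnosis can be sketched as a comparison of two batches of trace records. The record shape (dicts with "model" and "input_tokens" fields) is an assumption made for illustration; the gateway's actual export format may differ.

```python
from collections import Counter


def summarize(traces):
    """Reduce a list of trace dicts to the figures the diagnosis needs."""
    count = len(traces)
    total_input = sum(t["input_tokens"] for t in traces)
    return {
        "requests": count,
        "avg_input_tokens": total_input / count if count else 0.0,
        "requests_by_model": Counter(t["model"] for t in traces),
    }


def explain_change(before, after):
    """List plausible reasons why spend rose between two periods."""
    b, a = summarize(before), summarize(after)
    findings = []
    if a["requests"] > b["requests"]:
        findings.append("more requests")
    if a["avg_input_tokens"] > b["avg_input_tokens"]:
        findings.append("longer prompts")
    new_models = set(a["requests_by_model"]) - set(b["requests_by_model"])
    if new_models:
        findings.append("new models: " + ", ".join(sorted(new_models)))
    return findings
```

Given last week's and this week's traces for one key, explain_change returns whichever of the three explanations apply, which is usually enough to know where to look next.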
Cost is computed from the pay-as-you-go token line of each model's posted rate, meaning the per-1M-token prices published on the models page. FX and payment-channel fees are shown separately at checkout. There is no price table for you to maintain, and no estimation: the trace records what the request actually cost. Prompt and completion bodies are not retained; metering uses token counts and metadata only.
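The arithmetic behind a per-1M-token rate is worth seeing once. The sketch below assumes separate input and output prices, which is common but not stated above, and the prices in the usage note are invented for the example.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # posted rates are per 1M tokens


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost of one request given per-1M-token prices (passed as strings).

    Decimal avoids binary floating-point rounding in money arithmetic.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_price) / TOKENS_PER_UNIT
    output_cost = Decimal(output_tokens) * Decimal(output_price) / TOKENS_PER_UNIT
    return input_cost + output_cost
```

For a request with 12,000 input tokens and 800 output tokens at hypothetical prices of 3 and 15 per 1M tokens, request_cost(12000, 800, "3", "15") comes to 0.036 + 0.012 = 0.048.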
Step 3: Cap before it hurts
Attribution without enforcement still leaves you reading about the incident in the invoice. Each key can carry three limits:
- maxSpend — a hard ceiling on what the key may spend
- rateLimit — requests per unit time
- tokenLimitTpm — tokens per minute
A retry loop in the staging environment hits the staging key's cap and stops; production keeps running. A leaked key is bounded by its own ceiling instead of the whole wallet. And when usable credit reaches zero, the gateway returns HTTP 402 rather than accruing surprise debt — the pricing methodology page documents the billing semantics in detail.
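On the client side, the practical consequence is that a 402 should end a retry loop rather than feed it. The following is a minimal sketch: only the 402 status comes from the behavior described above, while retrying on 5xx responses, the OutOfCredit exception, and the call_with_retries helper are assumptions for illustration. A real client would also add backoff between attempts.

```python
class OutOfCredit(Exception):
    """Raised when the gateway reports that usable credit is exhausted."""


def call_with_retries(send, max_attempts=3):
    """Call send() until it succeeds, retrying only on 5xx responses.

    send is any zero-argument callable returning (status_code, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for _ in range(max_attempts):
        status, body = send()
        if status == 402:
            # Retrying cannot help: the balance will not change by itself.
            raise OutOfCredit(body)
        if status < 500:
            # Success or a client error; either way, stop retrying.
            return status, body
    # Every attempt hit a server error; hand back the last response.
    return status, body
```

Taking the transport as a callable keeps the sketch independent of any particular HTTP library.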
Step 4: Review weekly, not monthly
With per-key attribution in place, a useful cadence is a five-minute weekly review: sort keys by spend, scan per-model distribution for surprises, and check whether any key is approaching its cap. Teams that do this catch model drift and prompt bloat while they are still cheap; the patterns and the dashboard views are described on the LLM cost tracking and LLM observability pages.
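If you export usage data, the three checks in that review fit in a few lines. This sketch assumes rows shaped as dicts with "key", "model", and "cost" fields, and assumes the rows cover the same period each key's cap applies to; the field names, the weekly_review helper, and the 80% warning threshold are all illustrative choices.

```python
from collections import defaultdict


def weekly_review(rows, max_spend, warn_at=0.8):
    """Rank keys by spend, compute model shares, and flag keys near their cap.

    rows: iterable of dicts with "key", "model", and "cost".
    max_spend: dict mapping key name to its spending ceiling.
    """
    spend_by_key = defaultdict(float)
    spend_by_model = defaultdict(float)
    for row in rows:
        spend_by_key[row["key"]] += row["cost"]
        spend_by_model[row["model"]] += row["cost"]

    # 1. Keys sorted by spend, largest first.
    ranked = sorted(spend_by_key.items(), key=lambda kv: kv[1], reverse=True)

    # 2. Each model's share of total spend.
    total = sum(spend_by_model.values())
    model_share = {m: c / total for m, c in spend_by_model.items()} if total else {}

    # 3. Keys at or above the warning fraction of their ceiling.
    near_cap = [
        key for key, spent in ranked
        if key in max_spend and spent >= warn_at * max_spend[key]
    ]
    return ranked, model_share, near_cap
```

A model whose share jumps from one week to the next is the usual first sign of drift, and anything in the near-cap list is worth a look before the cap turns it into an outage.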
The takeaway
Cost attribution is not a reporting feature you bolt on later — it falls out of routing your calls through one ledger and naming your workloads with keys. Set that up once and "who spent what, on which models, and why" becomes a dashboard view instead of a quarterly mystery.
Start with one key per workload at router.one, and see the cost tracking overview for what the ledger records.