Token Tracking

Counting LLM input and output tokens per request for cost and billing visibility.

Definition

LLM providers charge per token, and a token is roughly 4 characters of English text. Token tracking is the practice of recording the input token count, output token count, and total token count for every LLM API call. These counts are then used for cost attribution (which API key spent how much), user billing (Stripe metering), and budget alerts.
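To make cost attribution and budget alerts concrete, here is a minimal sketch of how per-call token counts might roll up into a cost per API key. The `UsageRecord` class, the `cost_by_key` and `over_budget` helpers, and the prices are all hypothetical and chosen for illustration; real prices vary by provider and model.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical prices in USD per 1M tokens; substitute your provider's real rates.
PRICE_PER_MILLION = {"input": 3.00, "output": 15.00}


@dataclass(frozen=True)
class UsageRecord:
    """One LLM API call, attributed to the API key that made it."""
    api_key: str
    input_tokens: int
    output_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens

    @property
    def cost_usd(self) -> float:
        # Input and output tokens are usually priced differently.
        return (
            self.input_tokens * PRICE_PER_MILLION["input"]
            + self.output_tokens * PRICE_PER_MILLION["output"]
        ) / 1_000_000


def cost_by_key(records):
    """Cost attribution: sum the cost of all calls per API key."""
    totals = defaultdict(float)
    for record in records:
        totals[record.api_key] += record.cost_usd
    return dict(totals)


def over_budget(records, budget_usd):
    """Budget alert: return the API keys whose total cost exceeds the budget."""
    return sorted(k for k, cost in cost_by_key(records).items() if cost > budget_usd)


records = [
    UsageRecord("key_a", input_tokens=1000, output_tokens=500),
    UsageRecord("key_b", input_tokens=200, output_tokens=100),
    UsageRecord("key_a", input_tokens=2000, output_tokens=1000),
]
print(cost_by_key(records))
print(over_budget(records, budget_usd=0.01))
```

The same per-key totals can feed a billing meter or a spending limit; only the threshold and the action taken differ.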

Why it matters for AI APIs

Without token tracking, your LLM costs are a black box. You can't tell which customers drive the most cost, you can't accurately bill usage-based customers, and you can't set per-key spending limits. Token tracking is infrastructure, not an afterthought.

In FastAPI AI Kit

Every `llm.chat()` and `llm.stream()` call accepts `track_tokens=True`. The kit extracts usage from each provider's response format, normalizes it to a standard `TokenUsage(input, output, total)` object, persists that object to Postgres, and calls `meter.record()` for Stripe billing.
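The normalization step can be sketched as follows. This is not the kit's actual implementation: the `normalize_usage` function and the provider labels `"openai"` and `"anthropic"` are assumptions made for illustration, and the `TokenUsage` class here only mirrors the `(input, output, total)` shape described above. The field names follow the `usage` objects that the OpenAI Chat Completions and Anthropic Messages APIs return.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class TokenUsage:
    """Provider-independent token counts for one LLM call (illustrative)."""
    input: int
    output: int
    total: int


def normalize_usage(provider: str, usage: Mapping[str, Any]) -> TokenUsage:
    """Map a provider's raw `usage` payload onto TokenUsage.

    `provider` is a hypothetical label chosen for this sketch.
    """
    if provider == "openai":
        # Chat Completions reports prompt_tokens / completion_tokens / total_tokens.
        inp, out = usage["prompt_tokens"], usage["completion_tokens"]
    elif provider == "anthropic":
        # The Messages API reports input_tokens / output_tokens and no total.
        inp, out = usage["input_tokens"], usage["output_tokens"]
    else:
        raise ValueError(f"unsupported provider: {provider!r}")
    # Use the provider's total when present, otherwise derive it.
    return TokenUsage(input=inp, output=out, total=usage.get("total_tokens", inp + out))


openai_usage = {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}
anthropic_usage = {"input_tokens": 12, "output_tokens": 30}

print(normalize_usage("openai", openai_usage))
print(normalize_usage("anthropic", anthropic_usage))
```

Normalizing at the edge means the persistence and billing steps only ever handle one shape, regardless of which provider served the request.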
