Token Tracking
Counting LLM input and output tokens per request for cost and billing visibility.
Definition
LLM providers charge per token, and one token is roughly 4 characters of English text. Token tracking is the practice of recording the input token count, output token count, and total tokens for every LLM API call. These counts are then used for cost attribution (which API key spent how much), user billing (Stripe metering), and budget alerts.
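As a rough illustration of cost attribution, the sketch below turns recorded token counts into a per-key cost. The `CallRecord` type, the function names, and the prices are all hypothetical; real per-token prices depend on the provider and model.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Iterable

# Hypothetical prices in USD per 1M tokens (assumption, not real pricing).
INPUT_PRICE_PER_MILLION = Decimal("3.00")
OUTPUT_PRICE_PER_MILLION = Decimal("15.00")


@dataclass(frozen=True)
class CallRecord:
    """One recorded LLM call (hypothetical shape for this example)."""
    api_key_id: str
    input_tokens: int
    output_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


def call_cost(record: CallRecord) -> Decimal:
    """Cost of a single call; input and output tokens are priced separately."""
    return (
        Decimal(record.input_tokens) * INPUT_PRICE_PER_MILLION
        + Decimal(record.output_tokens) * OUTPUT_PRICE_PER_MILLION
    ) / Decimal(1_000_000)


def cost_by_key(records: Iterable[CallRecord]) -> Dict[str, Decimal]:
    """Sum call costs per API key to answer 'which key spent how much'."""
    totals: Dict[str, Decimal] = {}
    for record in records:
        totals[record.api_key_id] = (
            totals.get(record.api_key_id, Decimal("0")) + call_cost(record)
        )
    return totals
```

`Decimal` is used instead of `float` so that small per-call amounts add up without rounding drift, which matters once the totals feed into billing.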
Why it matters for AI APIs
Without token tracking, your LLM costs are a black box. You can't tell which customers drive the most cost, you can't accurately bill usage-based customers, and you can't set per-key spending limits. Token tracking is infrastructure, not an afterthought.
In FastAPI AI Kit
Every `llm.chat()` and `llm.stream()` call accepts `track_tokens=True`. The kit extracts usage from each provider's response format, normalizes to a standard `TokenUsage(input, output, total)` object, persists it to Postgres, and calls `meter.record()` for Stripe billing.
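The normalization step can be pictured with the minimal sketch below. It is not the kit's actual code: the `normalize_usage` function and the provider strings are assumptions for illustration, and only the `TokenUsage(input, output, total)` shape comes from the description above. The field names reflect the usage objects returned by OpenAI's Chat Completions API (`prompt_tokens`, `completion_tokens`, `total_tokens`) and Anthropic's Messages API (`input_tokens`, `output_tokens`).

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class TokenUsage:
    """Provider-independent token counts for one LLM call."""
    input: int
    output: int
    total: int


def normalize_usage(provider: str, usage: Mapping[str, Any]) -> TokenUsage:
    """Map a provider's raw usage payload onto TokenUsage (hypothetical helper)."""
    if provider == "openai":
        input_tokens = int(usage["prompt_tokens"])
        output_tokens = int(usage["completion_tokens"])
    elif provider == "anthropic":
        input_tokens = int(usage["input_tokens"])
        output_tokens = int(usage["output_tokens"])
    else:
        raise ValueError(f"unsupported provider: {provider!r}")

    # Use the provider's own total when it reports one; otherwise derive it.
    total = int(usage.get("total_tokens", input_tokens + output_tokens))
    return TokenUsage(input=input_tokens, output=output_tokens, total=total)
```

Once every provider maps to the same object, the persistence and metering steps can be written once rather than per provider.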
