Build a multi-provider LLM gateway with auth and billing.
Expose a unified `/v1/chat` endpoint that routes to OpenAI, Anthropic, or local models — with per-key auth, rate limiting, token tracking, and usage metering for internal teams or customers.
FastAPI · OpenAI · Anthropic · SSE · PostgreSQL · Redis
The usual pain points
- ✕ Building provider-agnostic LLM routing
- ✕ Tracking which API key consumed how many tokens
- ✕ Exposing LLM access to internal teams with usage visibility
- ✕ Handling streaming responses correctly across providers
How the kit solves them
- Unified LLM abstraction over OpenAI, Anthropic, and OpenAI-compatible APIs
- Per-API-key token tracking: usage broken down by key, date, and model
- SSE streaming built in — consistent behavior across all providers
- Admin dashboard hooks for usage reporting by team or key
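To make the first two points concrete, here is a minimal, self-contained sketch of prefix-based provider routing combined with usage totals keyed by API key, date, and model. It is an illustration under stated assumptions, not the kit's actual code: `fake_openai`, `fake_anthropic`, `ROUTES`, `resolve_provider`, `UsageMeter`, and `chat` are hypothetical names, the providers are stubs that return fixed token counts, and totals live in memory where a real deployment would more likely persist them in PostgreSQL or Redis.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical provider signature: (model, messages) -> (reply_text, tokens_used).
# Real implementations would wrap the provider SDKs; these are stand-ins.
Provider = Callable[[str, List[dict]], Tuple[str, int]]


def fake_openai(model: str, messages: List[dict]) -> Tuple[str, int]:
    return f"[openai:{model}] ok", 12


def fake_anthropic(model: str, messages: List[dict]) -> Tuple[str, int]:
    return f"[anthropic:{model}] ok", 20


# Assumed routing rule: choose a provider by model-name prefix.
ROUTES: Dict[str, Provider] = {
    "gpt-": fake_openai,
    "claude-": fake_anthropic,
}


def resolve_provider(model: str) -> Provider:
    """Return the first provider whose prefix matches the model name."""
    for prefix, provider in ROUTES.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"no provider configured for model {model!r}")


@dataclass
class UsageMeter:
    """In-memory token totals keyed by (api_key_id, day, model)."""

    totals: Dict[Tuple[str, date, str], int] = field(
        default_factory=lambda: defaultdict(int)
    )

    def record(
        self, key_id: str, model: str, tokens: int, day: Optional[date] = None
    ) -> None:
        self.totals[(key_id, day or date.today(), model)] += tokens

    def by_key(self, key_id: str) -> int:
        """Total tokens consumed by one API key across all days and models."""
        return sum(t for (k, _, _), t in self.totals.items() if k == key_id)


def chat(meter: UsageMeter, key_id: str, model: str, messages: List[dict]) -> str:
    """Route the request, then attribute the tokens to the calling key."""
    provider = resolve_provider(model)
    reply, tokens = provider(model, messages)
    meter.record(key_id, model, tokens)
    return reply
```

Because the meter is keyed on all three dimensions, a report by team, by day, or by model is a filter over the same totals, which is the shape an admin dashboard hook would likely consume.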
Example implementation
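# Gateway endpoint (provider-agnostic).
# ChatRequest, APIKey, get_api_key, llm, meter, and settings are assumed
# to be provided by the kit.
from fastapi import APIRouter, Depends
from fastapi.responses import StreamingResponse

router = APIRouter()


@router.post("/v1/chat")
async def gateway_chat(
    body: ChatRequest,
    key: APIKey = Depends(get_api_key),
):
    # The provider behind `llm` is selected via env config
    async def stream():
        async for chunk in llm.stream(
            messages=body.messages,
            model=body.model or settings.DEFAULT_MODEL,
        ):
            # Attribute this chunk's tokens to the calling API key
            await meter.record(key.id, chunk.tokens)
            # Emit one SSE event per chunk
            yield f"data: {chunk.json()}\n\n"

    return StreamingResponse(stream(), media_type="text/event-stream")

Ready to build your LLM API gateway?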
FastAPI AI Kit ships with everything shown above, pre-configured and production-ready. Clone the repo and start building in minutes.
Ready to ship your AI backend this weekend?
Join developers who skipped weeks of boilerplate and went straight to building.
No subscriptions · One-time payment · Lifetime updates
