Build a multi-provider LLM gateway with auth and billing.

Expose a unified `/v1/chat` endpoint that routes to OpenAI, Anthropic, or local models — with per-key auth, rate limiting, token tracking, and usage metering for internal teams or customers.

FastAPI · OpenAI · Anthropic · SSE · PostgreSQL · Redis

The usual pain points

  • Building provider-agnostic LLM routing
  • Tracking which API key consumed how many tokens
  • Exposing LLM access to internal teams with usage visibility
  • Handling streaming responses correctly across providers

How the kit solves them

  • Unified LLM abstraction over OpenAI, Anthropic, and OpenAI-compatible APIs
  • Per-API-key token tracking: usage broken down by key, date, and model
  • SSE streaming built in — consistent behavior across all providers
  • Admin dashboard hooks for usage reporting by team or key
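
As a rough illustration of what "consistent behavior across all providers" can involve, the sketch below maps simplified OpenAI-style and Anthropic-style streaming chunks onto one text delta and frames the result as an SSE message. The names `extract_text` and `to_sse` are hypothetical and not part of the kit, and the dictionaries are trimmed-down stand-ins for each provider's streaming payloads, so treat this as a sketch under those assumptions rather than the kit's implementation.

```python
import json
from typing import Optional


def extract_text(provider: str, chunk: dict) -> Optional[str]:
    """Return the text delta carried by one streaming chunk, or None.

    Hypothetical helper: chunk shapes are simplified versions of the
    providers' streaming payloads, passed in as plain dicts.
    """
    if provider == "openai":
        # OpenAI-style chunk: text lives at choices[0].delta.content
        choices = chunk.get("choices") or []
        if not choices:
            return None
        return choices[0].get("delta", {}).get("content")
    if provider == "anthropic":
        # Anthropic-style event: only content_block_delta events carry text
        if chunk.get("type") != "content_block_delta":
            return None
        return chunk.get("delta", {}).get("text")
    raise ValueError(f"unknown provider: {provider}")


def to_sse(payload: dict) -> str:
    """Frame a JSON payload as a single Server-Sent Events message."""
    return f"data: {json.dumps(payload)}\n\n"
```

With both providers reduced to the same text delta, the gateway can emit one event format, for example `to_sse({"text": extract_text("openai", chunk)})`, regardless of which backend produced the chunk.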

Example implementation

main.py
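# Gateway endpoint (provider-agnostic)
from fastapi import APIRouter, Depends
from fastapi.responses import StreamingResponse

# ChatRequest, APIKey, get_api_key, llm, meter, and settings come from
# the kit's own modules (their imports are omitted in this excerpt).
router = APIRouter()

@router.post("/v1/chat")
async def gateway_chat(
    body: ChatRequest,
    key: APIKey = Depends(get_api_key),  # resolves the caller's API key
):
    # Route to any provider via env config
    async def stream():
        async for chunk in llm.stream(
            messages=body.messages,
            model=body.model or settings.DEFAULT_MODEL,
        ):
            # Attribute this chunk's tokens to the calling key
            await meter.record(key.id, chunk.tokens)
            # Emit one SSE event per chunk
            yield f"data: {chunk.json()}\n\n"
    return StreamingResponse(stream(), media_type="text/event-stream")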
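
The `meter.record` call above is where usage gets attributed to a key. A minimal in-memory sketch of that kind of metering, broken down by key, date, and model, might look like the following. `UsageMeter` and its methods are hypothetical names, not the kit's API (the extra `model` argument in particular is an assumption), and a real gateway would presumably persist counts in PostgreSQL or Redis rather than in a dict.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Optional, Tuple


class UsageMeter:
    """Hypothetical in-memory token meter keyed by (key_id, day, model)."""

    def __init__(self) -> None:
        self._totals: Dict[Tuple[str, date, str], int] = defaultdict(int)

    def record(
        self, key_id: str, model: str, tokens: int, day: Optional[date] = None
    ) -> None:
        """Add `tokens` to the running total for this key, day, and model."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        self._totals[(key_id, day or date.today(), model)] += tokens

    def total_for_key(self, key_id: str) -> int:
        """Total tokens consumed by one API key across all days and models."""
        return sum(n for (k, _, _), n in self._totals.items() if k == key_id)

    def breakdown(self, key_id: str) -> Dict[Tuple[date, str], int]:
        """Per-(day, model) totals for one API key."""
        return {
            (d, m): n for (k, d, m), n in self._totals.items() if k == key_id
        }
```

Calling `record` once per streamed chunk keeps the totals current even if the client disconnects mid-stream, at the cost of one write per chunk; batching the writes at the end of the stream is the obvious alternative.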

Ready to build your LLM API gateway?

FastAPI AI Kit ships with everything shown above, pre-configured and production-ready. Clone the repo and start building in minutes.

Ready to ship your AI backend this weekend?

Join developers who skipped weeks of boilerplate and went straight to building.

No subscriptions · One-time payment · Lifetime updates