Production chatbot infrastructure — sessions, history, billing.
Multi-turn conversation management, session persistence, streaming replies, per-user rate limiting, and token-based billing — the complete backend for a customer-facing AI chatbot.
FastAPI · SSE · PostgreSQL · Redis · Stripe · OpenAI
The usual pain points
- ✕ Persisting conversation history across sessions
- ✕ Streaming LLM responses in real time to the browser
- ✕ Preventing per-user abuse with rate limits
- ✕ Billing users based on actual token consumption
How the kit solves them
- Session persistence model with full conversation history in Postgres
- SSE streaming endpoint built in — no custom EventSource wiring
- Per-key rate limiting configurable per user tier
- Per-session token metering, fed automatically into Stripe billing
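
The example implementation on this page covers streaming, history, and metering, but not rate limiting. As a rough illustration of per-key limits that vary by tier, here is a minimal in-memory sketch. It is not the kit's actual limiter: the `FixedWindowLimiter` class, the `TIER_LIMITS` values, and the tier names are all assumptions made up for this example.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical tier limits: requests allowed per window (not the kit's real values).
TIER_LIMITS: Dict[str, int] = {"free": 20, "pro": 120}
WINDOW_SECONDS = 60


class FixedWindowLimiter:
    """In-memory fixed-window request counter, keyed by API key ID."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key_id -> (window index, requests counted in that window)
        self._windows: Dict[str, Tuple[int, int]] = {}

    def allow(self, key_id: str, tier: str) -> bool:
        """Return True if this request fits in the key's current window."""
        limit = TIER_LIMITS[tier]
        window = int(self._clock() // WINDOW_SECONDS)
        start, count = self._windows.get(key_id, (window, 0))
        if start != window:
            # A new window has begun, so the counter resets.
            start, count = window, 0
        if count >= limit:
            return False
        self._windows[key_id] = (start, count + 1)
        return True
```

In a FastAPI app, a dependency could call `allow(key.id, tier)` and raise an HTTP 429 when it returns `False`. With more than one worker process, the dictionary would likely need to move to a shared store such as Redis so that all workers see the same counts.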
Example implementation
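from fastapi import APIRouter, Depends
from fastapi.responses import StreamingResponse

# ChatRequest, APIKey, get_api_key, session_store, llm, and meter come from the kit.
router = APIRouter()

@router.post("/v1/chat/stream")
async def chat_stream(
    body: ChatRequest,
    key: APIKey = Depends(get_api_key),
):
    # Load prior turns so the model sees the whole conversation.
    history = await session_store.get(body.session_id)

    async def event_stream():
        total_tokens = 0
        async for chunk in llm.stream(
            messages=[*history, body.message],
        ):
            total_tokens += chunk.tokens
            # One SSE event per chunk; the blank line terminates the event.
            yield f"data: {chunk.delta}\n\n"
        # Once the stream ends, record usage for billing and persist the new message.
        await meter.record(key.id, total_tokens)
        await session_store.append(body.session_id, body.message)

    return StreamingResponse(event_stream(), media_type="text/event-stream")

Ready to build your AI chatbot backend?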
FastAPI AI Kit ships with everything shown above, pre-configured and production-ready. Clone the repo and start building in minutes.
Ready to ship your AI backend this weekend?
Join developers who skipped weeks of boilerplate and went straight to building.
No subscriptions · One-time payment · Lifetime updates
