FastAPI + AI Boilerplate

Ship your AI backend this weekend, not next quarter.

The production-ready FastAPI boilerplate with everything wired up — auth, LLM layer, RAG, billing hooks, background jobs, and deploy config. Skip 60+ hours of setup.

  • JWT auth + API key issuance out of the box
  • LLM integration — OpenAI, Anthropic, local models
  • Streaming responses (SSE) + token tracking
  • RAG pipeline with pgvector or Qdrant
  • Billing-ready hooks (Stripe usage metering)
  • Docker, Celery, Alembic, pytest — all wired up

Save 60+ hours of boilerplate work · One-time license · Lifetime updates included

~/my-ai-api — zsh
$ git clone https://github.com/fastapikit/starter my-ai-api
$ cd my-ai-api && cp .env.example .env
$ docker-compose up -d
✓ Postgres started on port 5432
✓ Redis started on port 6379
✓ FastAPI AI Kit running on http://localhost:8000
$ curl localhost:8000/v1/chat -H "Content-Type: application/json" -d '{"message": "Hello!"}'
> {"reply":"Hello! How can I help?","tokens":12}

Response time

142 ms

Stop building the same boilerplate for every project.

Every AI backend needs the same 10 things. Here's how the kit changes that.

Building from scratch

  • Spend 2 days wiring up JWT auth and API key management
  • Figure out streaming SSE with a custom token counter
  • Manually integrate pgvector and build a chunking pipeline
  • Wire Celery, Redis, and Alembic migrations from scratch
  • Google 'FastAPI + Stripe usage metering' for the 10th time
  • Write Dockerfiles, docker-compose, and CI configs before writing a single feature

With FastAPI AI Kit

  • Auth is wired on day one — JWT, API keys, rate limiting, the works
  • Streaming SSE endpoint ready with token tracking per request
  • RAG pipeline pre-built: ingest docs, embed, query — plug in your data
  • Background jobs, migrations, and Redis configured from the start
  • Billing hooks built in — just connect your Stripe account
  • Docker, docker-compose, CI workflow, and deploy guides included

Everything included

The full backend stack, wired up and ready.

Every piece you'd spend days wiring together, production-hardened and ready to extend.

Async FastAPI Core

Clean, modular architecture — routers, services, repositories. Scales from side project to production load.

LLM Integration Layer

One interface for OpenAI, Anthropic, or any local model. Swap providers without touching your business logic.

Streaming Responses (SSE)

Server-sent events out of the box. Token usage tracking per request, ready to wire into billing.

RAG Pipeline

Vector store integration with pgvector or Qdrant. Document ingestion pipeline with chunking and embedding.

Auth & API Keys

JWT authentication, API key issuance, per-key rate limiting. Production-grade from day one.

Billing-Ready Hooks

Stripe-compatible usage metering. Track token consumption per user and per API key — bring your own keys.

Postgres + Alembic

SQLAlchemy 2.0 async ORM, Alembic migrations, connection pooling. Schema evolution without headaches.

Background Jobs

Celery + Redis or arq for async task processing. Long-running LLM chains, email, cron — handled.

Docker-First

docker-compose for local dev, production Dockerfile, deploy guides for Railway, Render, Fly, and VPS.

Tests & CI

pytest with async support, pre-commit hooks, typed end-to-end, GitHub Actions workflow included.
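
For readers new to server-sent events, the wire format behind the streaming card is small enough to show in full. The sketch below is not the kit's code: `sse_event` and `stream_with_usage` are hypothetical names, and the whitespace word count is only a stand-in for a real tokenizer.

```python
import json
from typing import Iterable, Iterator, Optional


def sse_event(data: dict, event: Optional[str] = None) -> str:
    """Format one SSE frame: optional event name, one data line, blank line."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def stream_with_usage(chunks: Iterable[str]) -> Iterator[str]:
    """Yield one frame per text chunk, then a final frame with a rough token count."""
    tokens = 0
    for chunk in chunks:
        tokens += len(chunk.split())  # assumption: crude stand-in for a tokenizer
        yield sse_event({"delta": chunk})
    yield sse_event({"tokens": tokens}, event="usage")
```

Sending the usage total as a separate named event keeps the text deltas clean and gives a billing layer a single frame to listen for.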

LLM Layer

One interface. Every LLM.

The kit ships a unified LLM abstraction that wraps OpenAI, Anthropic (Claude), and any OpenAI-compatible local model. Swap providers with a single env-var change — your business logic never changes. Streaming responses via SSE are built in, with per-request token counting ready to pipe into your billing layer.

example.py
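# One call, works with any provider.
# `yield` must live inside a function, so the call is wrapped in an async generator.
async def stream_reply(prompt: str):
    response = await llm.chat(
        messages=[{"role": "user", "content": prompt}],
        stream=True,          # SSE enabled
        track_tokens=True,    # usage metering
    )

    async for chunk in response:
        yield chunk.delta     # stream to client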
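
One way a single env var can select the provider is a small registry keyed by name. This is a sketch of the general pattern, not the kit's implementation: the `LLM_PROVIDER` variable, the `get_provider` helper, and both toy providers are assumptions, and the interface is synchronous to keep the example short.

```python
import os
from typing import Callable, Dict, List, Mapping, Optional, Protocol

Message = Dict[str, str]


class ChatProvider(Protocol):
    """The one interface business logic depends on."""

    def chat(self, messages: List[Message]) -> str: ...


class EchoProvider:
    """Toy provider standing in for a real SDK wrapper."""

    def chat(self, messages: List[Message]) -> str:
        return f"echo: {messages[-1]['content']}"


class ShoutProvider:
    """Second toy provider, to show that swapping changes only the config."""

    def chat(self, messages: List[Message]) -> str:
        return messages[-1]["content"].upper()


PROVIDERS: Dict[str, Callable[[], ChatProvider]] = {
    "echo": EchoProvider,
    "shout": ShoutProvider,
}


def get_provider(env: Optional[Mapping[str, str]] = None) -> ChatProvider:
    """Pick a provider from LLM_PROVIDER (hypothetical variable name)."""
    env = os.environ if env is None else env
    name = env.get("LLM_PROVIDER", "echo")
    if name not in PROVIDERS:
        raise ValueError(f"unknown LLM_PROVIDER {name!r}; expected one of {sorted(PROVIDERS)}")
    return PROVIDERS[name]()
```

Calling code only ever sees `ChatProvider`, so adding a provider means adding one class and one registry entry.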
RAG Pipeline

Retrieval-Augmented Generation, ready to extend.

Ingest PDFs, Markdown, or arbitrary text through a built-in chunking and embedding pipeline. Store vectors in Postgres (pgvector) or Qdrant — configured in one place. At query time, the kit automatically retrieves the most relevant chunks and injects them into your LLM prompt. Add your documents, not infrastructure.

example.py
# Ingest a document (async)
await rag.ingest(
    source="docs/handbook.pdf",
    collection="company-kb",
)

# Query with automatic context
answer = await rag.query(
    question="What's the refund policy?",
    collection="company-kb",
    top_k=5,
)
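
To make the chunk, embed, and retrieve steps concrete, here is a dependency-free sketch of the idea. It is not the kit's pipeline: the function names are hypothetical, and the word-count "embedding" is a toy replacement for a real embedding model and vector store.

```python
import math
from collections import Counter
from typing import List, Tuple


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    chunks = []
    for start in range(0, len(words), size - overlap):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption, not a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(question: str, chunks: List[str], k: int = 5) -> List[Tuple[float, str]]:
    """Score every chunk against the question and keep the k best."""
    q = embed(question)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def build_prompt(question: str, chunks: List[str], k: int = 5) -> str:
    """Inject the retrieved chunks into the prompt sent to the LLM."""
    context = "\n---\n".join(c for _, c in top_k(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

With pgvector or Qdrant, the scoring loop in `top_k` would be replaced by a nearest-neighbor query against stored vectors; the prompt assembly step stays the same.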
Auth & Rate Limiting

Auth the right way, from the first commit.

JWT-based user auth and API key issuance are wired up on day one. Every API key carries metadata: owner, tier, and per-minute and per-day request limits. Exceeding a limit returns a 429 Too Many Requests response with a Retry-After header. Add a new tier in config, with no code changes needed.

example.py
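# Protected route with rate limiting.
# require_api_key, rate_limit, get_api_key, llm, meter, APIKey, ChatRequest,
# and ChatResponse come from the kit (import paths omitted here).
from fastapi import APIRouter, Depends

router = APIRouter()

@router.post("/v1/chat")
@require_api_key(tier=["pro", "enterprise"])
@rate_limit(per_minute=60, per_day=5000)
async def chat(
    body: ChatRequest,
    key: APIKey = Depends(get_api_key),
):
    result = await llm.chat(body.messages)
    await meter.record(key.id, result.tokens)  # per-key usage metering
    return ChatResponse(reply=result.content)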
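
The per-minute and per-day limits can be pictured as two fixed-window counters per key. The sketch below is an in-memory illustration under that assumption, not the kit's limiter: `FixedWindowLimiter` is a hypothetical name, and a shared deployment would more likely keep the counters in Redis.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class FixedWindowLimiter:
    per_minute: int
    per_day: int
    # (key_id, window_seconds, window_index) -> requests seen in that window
    _counts: Dict[Tuple[str, int, int], int] = field(default_factory=dict)

    def check(self, key_id: str, now: Optional[float] = None) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds); retry_after is 0 when allowed."""
        now = time.time() if now is None else now
        limits = ((60, self.per_minute), (86_400, self.per_day))
        retry_after = 0
        for window, limit in limits:
            bucket = (key_id, window, int(now // window))
            if self._counts.get(bucket, 0) >= limit:
                # Seconds until this window rolls over; keep the longest wait.
                retry_after = max(retry_after, window - int(now % window))
        if retry_after:
            return False, retry_after
        # Only count the request once both windows have room.
        for window, _ in limits:
            bucket = (key_id, window, int(now // window))
            self._counts[bucket] = self._counts.get(bucket, 0) + 1
        return True, 0
```

In a FastAPI handler, a rejected check would typically become `HTTPException(status_code=429, headers={"Retry-After": str(retry_after)})`. This sketch never evicts old windows, so a long-running process would need a cleanup step or a store with key expiry.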
Deploy

From laptop to production in minutes.

The repo ships with a production Dockerfile, a local docker-compose stack (Postgres, Redis, FastAPI), and step-by-step deploy guides for Railway, Render, Fly.io, and bare VPS. Environment variable management, health-check endpoints, and Alembic migration commands are all documented. Push your code — not config.

example.py
# docker-compose up -d
# ─────────────────────────
# ✓ postgres   :5432
# ✓ redis      :6379
# ✓ api        :8000
#   GET /healthz → 200 OK

# Deploy to Railway in 3 commands:
$ railway link my-ai-api
$ railway env set DATABASE_URL=$DB
$ railway up

See it in action

From zero to live API in 10 minutes.

A real terminal walkthrough — no editing, no configuration battles, no surprises.

1. Clone & start
$ git clone https://github.com/fastapikit/starter .
$ docker-compose up -d
✓ API live at http://localhost:8000
2. Call the LLM endpoint
$ curl localhost:8000/v1/chat \
-H "X-API-Key: kit_live_abc123" \
-H "Content-Type: application/json" \
-d '{"message": "Summarize this PDF"}'
> {"reply":"Here is a summary...","tokens":47}
3. RAG — ingest & query
$ python scripts/ingest.py docs/handbook.pdf
✓ 142 chunks embedded → pgvector
$ curl localhost:8000/v1/rag/query \
-H "Content-Type: application/json" \
-d '{"q": "What is the return policy?"}'
> {"answer":"Returns within 30 days...","sources":["handbook.pdf:p3"]}

All commands run against the real codebase — no mocks, no demos, just the kit.

What's inside

Hand-picked, production-tested tools. Not bloated, not a toy: the stack serious backend developers reach for.

FastAPI · Framework
Python 3.11+ · Language
SQLAlchemy 2.0 · ORM
Alembic · Migrations
PostgreSQL · Database
pgvector · Vector Store
Qdrant · Vector Store
Redis · Cache / Queue
Celery · Jobs
arq · Jobs
OpenAI SDK · LLM
Anthropic SDK · LLM
LangChain · RAG (optional)
Pydantic v2 · Validation
Docker · Containers
pytest · Testing
GitHub Actions · CI
Stripe · Billing

From clone to deployed in 10 minutes.

No magic, no YAML hell. Just a well-structured codebase you control.

01

Clone & configure

Run `git clone` and copy `.env.example` to `.env`. Add your database credentials, LLM API keys, and secret. Takes under 2 minutes.

git clone https://github.com/fastapikit/starter my-api
cd my-api && cp .env.example .env
02

Add your keys

Drop your OpenAI or Anthropic API key into `.env`. Optionally configure Stripe for billing metering or Qdrant for the vector store.

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
STRIPE_SECRET=sk_live_...
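
A quick startup check can catch a missing key before the first request does. The sketch below assumes the variables above have already been loaded into the process environment (for example by docker-compose or a dotenv loader); `Settings` and `load_settings` are hypothetical names, not part of the kit.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openai_api_key: Optional[str]
    anthropic_api_key: Optional[str]
    stripe_secret: Optional[str]  # optional: only needed for billing metering


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read keys from the environment and fail fast if no LLM key is set."""
    env = os.environ if env is None else env
    settings = Settings(
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        anthropic_api_key=env.get("ANTHROPIC_API_KEY") or None,
        stripe_secret=env.get("STRIPE_SECRET") or None,
    )
    if not (settings.openai_api_key or settings.anthropic_api_key):
        raise RuntimeError("Set OPENAI_API_KEY or ANTHROPIC_API_KEY in .env")
    return settings
```

Empty strings are treated as unset, so a placeholder line left blank in `.env` still triggers the error.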
03

Deploy

Run `docker-compose up -d` to start the local dev stack. Then push to Railway, Render, Fly.io, or any VPS; deploy guides are included for all of them.

docker-compose up -d
# → API live at http://localhost:8000
# → Docs at http://localhost:8000/docs

Simple, one-time pricing.

No subscriptions. No usage fees. Buy once, own it forever.

FastAPI AI Kit

Full stack boilerplate · everything included

$69 USD

one-time · no subscription

Full source code, no obfuscation
LLM layer — OpenAI, Anthropic, local models
Streaming SSE + token tracking
RAG pipeline (pgvector / Qdrant)
JWT auth + API key management
Billing hooks for Stripe metering
Background jobs (Celery / arq)
Docker + deploy guides included
pytest + GitHub Actions CI
Lifetime updates
No subscriptions · Lifetime updates · Own the code

Ready to ship your AI backend this weekend?

Join developers who skipped weeks of boilerplate and went straight to building.
