Adding streaming LLM responses with Server-Sent Events in FastAPI
How to implement real-time streaming chat responses using SSE in FastAPI, with token counting and proper error handling.
Technical writing on FastAPI, LLM integration, RAG pipelines, and building production backends.
A practical guide to the router, service, and repository patterns that make FastAPI codebases easy to maintain at scale.
A complete walkthrough of building a retrieval-augmented generation pipeline: document ingestion, embedding, vector search, and LLM context injection — all in async FastAPI.
How to implement production-grade JWT authentication and API key issuance in FastAPI — with refresh tokens, per-key rate limiting, and secure storage.
How to implement token-based usage metering with Stripe's metered billing in a FastAPI backend — from per-request tracking to webhook handling.
Step-by-step production deployment of a FastAPI app with Postgres, Redis, and Celery workers on Railway and Render — including migration automation and health checks.
How to offload long-running LLM tasks to Celery workers in FastAPI — job queuing, status polling, result storage, and monitoring with Flower.
The right way to use SQLAlchemy 2.0 async sessions in FastAPI — dependency injection, transaction management, eager loading, and common pitfalls.