SSE vs WebSockets
Choosing between Server-Sent Events and WebSockets for real-time AI streams.
Definition
Server-Sent Events (SSE) and WebSockets both enable real-time communication, but with different trade-offs. SSE is unidirectional (server → client) and runs over an ordinary long-lived HTTP response, so it usually passes through proxies and load balancers with no special configuration, and the browser's EventSource API reconnects automatically. WebSockets are bidirectional, but they require a protocol upgrade from HTTP, a stateful connection, and proxies and load balancers that support both.
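To make the wire format concrete, here is a minimal sketch of how a single SSE event could be serialized. The helper name `format_sse` and its parameters are hypothetical, not part of any library. The field names `data`, `id`, and `retry` come from the SSE format itself: `id` lets a reconnecting client report the last event it saw, and `retry` suggests a reconnection delay in milliseconds.

```python
from typing import Optional


def format_sse(
    data: str,
    event_id: Optional[str] = None,
    retry_ms: Optional[int] = None,
) -> str:
    """Serialize one SSE event as text (hypothetical helper for illustration)."""
    lines = []
    if retry_ms is not None:
        # Suggested delay before the client tries to reconnect.
        lines.append(f"retry: {retry_ms}")
    if event_id is not None:
        # A reconnecting EventSource sends this back as Last-Event-ID.
        lines.append(f"id: {event_id}")
    # Each line of a multi-line payload needs its own "data:" prefix.
    for line in data.splitlines() or [""]:
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(repr(format_sse("hello", event_id="7", retry_ms=3000)))
```

Because an event is just lines of text in a regular HTTP response body, nothing between client and server has to understand a separate protocol, which is why intermediaries rarely need special handling.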
Why it matters for AI APIs
For LLM token streaming, SSE is almost always the right choice. Only one direction is needed (tokens flow from server to client), SSE works with Vercel, Railway, Cloudflare, and standard load balancers, and browsers need no client-side library to consume it. WebSockets add complexity without benefit for this use case.
In FastAPI AI Kit
The kit ships SSE via FastAPI's `StreamingResponse` with `media_type='text/event-stream'`. Tokens are yielded as `data: {...}\n\n` events. Client disconnection is detected via `asyncio.CancelledError` and handled gracefully. The stream works with the standard fetch and EventSource APIs.
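The pattern described above can be sketched with the standard library alone. This is an illustration under assumptions, not the kit's actual code: `fake_llm_tokens`, `sse_events`, and the `{"token": ...}` payload shape are hypothetical names chosen for the example.

```python
import asyncio
import json
import logging
from typing import AsyncIterator

logger = logging.getLogger(__name__)


async def fake_llm_tokens() -> AsyncIterator[str]:
    # Hypothetical stand-in for a model client that yields tokens.
    for token in ["Hello", ",", " world"]:
        await asyncio.sleep(0)  # simulate waiting on the model
        yield token


async def sse_events(tokens: AsyncIterator[str]) -> AsyncIterator[str]:
    """Wrap each token in a `data: {...}\\n\\n` event."""
    try:
        async for token in tokens:
            # Assumed payload shape; the real schema may differ.
            yield f"data: {json.dumps({'token': token})}\n\n"
    except asyncio.CancelledError:
        # Raised inside the generator when the consuming task is
        # cancelled, for example after the client disconnects.
        logger.info("stream cancelled; releasing resources")
        raise  # re-raise so the cancellation completes normally


async def main() -> None:
    async for event in sse_events(fake_llm_tokens()):
        print(event, end="")


if __name__ == "__main__":
    asyncio.run(main())
```

In a FastAPI route, a generator like `sse_events(...)` would be passed to `StreamingResponse` with `media_type='text/event-stream'`, as described above. Re-raising `CancelledError` after cleanup matters: swallowing it would hide the cancellation from the framework.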
