Ship a retrieval-augmented search API on your documents.
Ingest PDFs, Markdown, and text files into a vector store, then expose a semantic search endpoint powered by your LLM of choice — with pgvector or Qdrant pre-configured.
FastAPI · pgvector · Qdrant · OpenAI Embeddings · PostgreSQL · Alembic
The usual pain points
- ✕ Parsing and chunking documents for embedding
- ✕ Choosing and integrating a vector store
- ✕ Injecting retrieved context into LLM prompts
- ✕ Managing embedding costs at scale
How the kit solves them
- Built-in document ingestion pipeline with configurable chunk size
- Pre-wired pgvector and Qdrant — switch with a single env var
- Automatic context injection: the top-k retrieved chunks are inserted into the LLM prompt
- Token tracking per query for embedding + completion cost visibility
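To make the chunking and context-injection steps concrete, here is a minimal, dependency-free sketch of the general technique. It is not the kit's internal code: the function names (`chunk_text`, `top_k`, `build_prompt`) are hypothetical, chunk sizes are measured in characters here (the kit may well count tokens), and similarity is computed in plain Python rather than by pgvector or Qdrant.

```python
import math


def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into windows of chunk_size characters; neighbours share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 5) -> list[str]:
    """Return the k chunks whose vectors are most similar to the query vector."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Insert the retrieved chunks, numbered for source references, ahead of the question."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
```

In a real deployment the vectors would come from an embedding model and the nearest-neighbour search would run inside the vector store; the sketch only shows how chunk size, overlap, and top-k relate to the prompt that finally reaches the LLM.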
Example implementation
# Ingest a document
await rag.ingest(
    source="contracts/q4-2024.pdf",
    collection="legal-docs",
    chunk_size=512,
    overlap=64,
)
# Query with automatic context retrieval
result = await rag.query(
    question="What are the termination clauses?",
    collection="legal-docs",
    top_k=5,
    llm_model="gpt-4o",
)
# Returns answer + source references
Ready to build your RAG document search API?
FastAPI AI Kit ships with everything shown above, pre-configured and production-ready. Clone the repo and start building in minutes.
Ready to ship your AI backend this weekend?
Join developers who skipped weeks of boilerplate and went straight to building.
No subscriptions · One-time payment · Lifetime updates
