Overview
LiteLLM acts as a centralized LLM and embedding gateway for Knowledge Stack. All AI-powered features route through LiteLLM, which provides per-tenant API key management, spend tracking, and model aliasing. For an overview of what LiteLLM does and why it is used, see LLM Gateway (LiteLLM).Architecture
All LLM and embedding traffic from the API, Worker, and Agent services routes through the LiteLLM proxy:Prerequisites
LiteLLM requires its own PostgreSQL database:Deployment Steps
1. Configure Secrets
Create a secrets configuration with the following values:| Secret | Description |
|---|---|
master-key | LiteLLM admin master key (you generate this) |
db-host | PostgreSQL host |
db-username | Database username |
db-password | Database password |
LLM_PROVIDER_OPENAI_API_KEY | Your OpenAI API key (or other provider key) |
2. Configure Model Routing
Set up model aliases that map logical names to provider models:| Logical Model | Description |
|---|---|
general-purpose | General-purpose LLM tasks |
ingestion-chunk-enrichment | Document processing enrichment |
text-embedding-3-small | Vector embeddings |
agent-general-purpose | AI assistant conversations |
3. Deploy LiteLLM
Docker Compose
Add LiteLLM to your Docker Compose stack:Kubernetes (Helm)
For Kubernetes deployments, use the officiallitellm-helm chart. Configure model routing in your Helm values file, with API keys referenced from environment variables resolved at runtime.
4. Configure Knowledge Stack Services
Point your Knowledge Stack services to the LiteLLM proxy:| Service | Environment Variable | Value |
|---|---|---|
| All services | LITELLM_PROXY_URL | http://litellm:4000 |
| API | GP_LLM_API_URL | http://litellm:4000/v1 |
| API | GP_LLM_MODEL | openai/general-purpose |
| API | EMBEDDING_API_BASE_URL | http://litellm:4000/v1 |
| Worker | EMBEDDING_API_BASE_URL | http://litellm:4000/v1 |
| Worker | ENRICHMENT_LLM_API_BASE_URL | http://litellm:4000/v1 |
| Worker | ENRICHMENT_MODEL | openai/ingestion-chunk-enrichment |
| Agent | GP_LLM_API_URL | http://litellm:4000/v1 |
| Agent | GP_LLM_MODEL | openai/agent-general-purpose |
5. Verify the Deployment
Check that LiteLLM is running and healthy:Upgrading
To upgrade LiteLLM, update the image version and redeploy. The upgrade is idempotent — re-running the deployment with the same or new configuration is safe. If you update secrets, redeploy both LiteLLM and the dependent services.Important Notes
- The LiteLLM master key must match the
LITELLM_MASTER_KEYconfigured on the API service - LiteLLM listens on port 4000 and should only be accessible within your cluster (no public ingress)
- Only the API service provisions teams and keys; Worker and Agent services use per-tenant keys at runtime
