
Documentation Index

Fetch the complete documentation index at: https://docs.knowledgestack.ai/llms.txt

Use this file to discover all available pages before exploring further.

Overview

LiteLLM acts as a centralized LLM and embedding gateway for Knowledge Stack. All AI-powered features route through LiteLLM, which provides per-tenant API key management, spend tracking, and model aliasing. For an overview of what LiteLLM does and why it is used, see LLM Gateway (LiteLLM).

Architecture

All LLM and embedding traffic from the API, Worker, and Agent services routes through the LiteLLM proxy:
+----------+  +----------+  +----------+
|   API    |  |  Worker  |  |  Agent   |
+----+-----+  +----+-----+  +----+-----+
     |             |              |
     +-------------+--------------+
                   |
            +------v------+
            |   LiteLLM   | :4000
            |   Proxy     |
            +------+------+
                   |
          +--------+--------+
          |                 |
          v                 v
       OpenAI          PostgreSQL

Prerequisites

LiteLLM requires its own PostgreSQL database:
CREATE DATABASE litellm;
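For example, using a dedicated role that owns the database (the role name and password here are illustrative, not required by LiteLLM):

```sql
-- Illustrative: create a dedicated role and make it the owner
-- of the LiteLLM database. Replace the password before use.
CREATE USER litellm WITH PASSWORD 'change-me';
CREATE DATABASE litellm OWNER litellm;
```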

Deployment Steps

1. Configure Secrets

Create a secrets configuration with the following values:
Secret                        Description
master-key                    LiteLLM admin master key (you generate this)
db-host                       PostgreSQL host
db-username                   Database username
db-password                   Database password
LLM_PROVIDER_OPENAI_API_KEY   Your OpenAI API key (or other provider key)
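On Kubernetes, one way to hold these values is a single Secret; the manifest below is a sketch, and the Secret name and key names are illustrative rather than mandated by Knowledge Stack:

```yaml
# Illustrative Kubernetes Secret holding the LiteLLM deployment values.
# All names and values here are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: litellm-secrets
type: Opaque
stringData:
  master-key: "sk-replace-with-a-long-random-value"
  db-host: "postgres.internal"
  db-username: "litellm"
  db-password: "change-me"
  LLM_PROVIDER_OPENAI_API_KEY: "sk-your-openai-key"
```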

2. Configure Model Routing

Set up model aliases that map logical names to provider models:
Logical Model                 Description
general-purpose               General-purpose LLM tasks
ingestion-chunk-enrichment    Document-processing enrichment
text-embedding-3-small        Vector embeddings
agent-general-purpose         AI assistant conversations
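The aliases above are defined in LiteLLM's config.yaml. A sketch is shown below; the provider models on the right-hand side are placeholders to replace with your own choices, and the os.environ/ prefix is LiteLLM's convention for reading a key from an environment variable at runtime:

```yaml
# Illustrative LiteLLM model routing. The litellm_params.model values
# are example provider models, not a Knowledge Stack requirement.
model_list:
  - model_name: general-purpose
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/LLM_PROVIDER_OPENAI_API_KEY
  - model_name: ingestion-chunk-enrichment
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/LLM_PROVIDER_OPENAI_API_KEY
  - model_name: text-embedding-3-small
    litellm_params:
      model: openai/text-embedding-3-small
      api_key: os.environ/LLM_PROVIDER_OPENAI_API_KEY
  - model_name: agent-general-purpose
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/LLM_PROVIDER_OPENAI_API_KEY
```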

3. Deploy LiteLLM

Docker Compose

Add LiteLLM to your Docker Compose stack:
litellm:
  image: ghcr.io/berriai/litellm-database:latest
  ports:
    - "4000:4000"
  environment:
    - LITELLM_MASTER_KEY=your-master-key
    - DATABASE_URL=postgresql://user:pass@postgres:5432/litellm
  depends_on:
    - postgres

Kubernetes (Helm)

For Kubernetes deployments, use the official litellm-helm chart. Configure model routing in your Helm values file, with API keys referenced from environment variables resolved at runtime.
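A values-file sketch for the litellm-helm chart might look like the following. The proxy_config and environmentSecrets keys reflect the chart's convention of embedding the LiteLLM config and mounting secrets as environment variables; verify the exact key names against the chart version you deploy:

```yaml
# Sketch of Helm values for the litellm-helm chart (key names are
# assumptions -- check them against your chart version's values.yaml).
proxy_config:
  model_list:
    - model_name: general-purpose
      litellm_params:
        model: openai/gpt-4o
        api_key: os.environ/LLM_PROVIDER_OPENAI_API_KEY
environmentSecrets:
  - litellm-secrets   # Secret providing LLM_PROVIDER_OPENAI_API_KEY
```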

4. Configure Knowledge Stack Services

Point your Knowledge Stack services to the LiteLLM proxy:
Service        Environment Variable           Value
All services   LITELLM_PROXY_URL              http://litellm:4000
API            GP_LLM_API_URL                 http://litellm:4000/v1
API            GP_LLM_MODEL                   openai/general-purpose
API            EMBEDDING_API_BASE_URL         http://litellm:4000/v1
Worker         EMBEDDING_API_BASE_URL         http://litellm:4000/v1
Worker         ENRICHMENT_LLM_API_BASE_URL    http://litellm:4000/v1
Worker         ENRICHMENT_MODEL               openai/ingestion-chunk-enrichment
Agent          GP_LLM_API_URL                 http://litellm:4000/v1
Agent          GP_LLM_MODEL                   openai/agent-general-purpose
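In Docker Compose terms, the API service's wiring from the table above might look like the fragment below (the service block is illustrative and omits unrelated settings):

```yaml
# Illustrative Compose fragment for the Knowledge Stack API service.
api:
  environment:
    - LITELLM_PROXY_URL=http://litellm:4000
    - GP_LLM_API_URL=http://litellm:4000/v1
    - GP_LLM_MODEL=openai/general-purpose
    - EMBEDDING_API_BASE_URL=http://litellm:4000/v1
    - LITELLM_MASTER_KEY=your-master-key  # must match the proxy's master key
  depends_on:
    - litellm
```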

5. Verify the Deployment

Check that LiteLLM is running and healthy:
# Docker
curl http://localhost:4000/health/liveliness

# Kubernetes
kubectl exec -n your-namespace deploy/litellm -- \
  python -c "import urllib.request; print(urllib.request.urlopen('http://localhost:4000/health/liveliness').read())"
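Beyond liveness, a completion request through the proxy confirms model routing end to end. This sketch assumes the general-purpose alias configured above and authenticates with the master key:

```
# Send a minimal chat completion through the proxy using the master key.
curl -s http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "openai/general-purpose",
        "messages": [{"role": "user", "content": "ping"}]
      }'
```

A JSON response containing a choices array indicates the alias resolved and the upstream provider call succeeded.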

Upgrading

To upgrade LiteLLM, update the image version and redeploy. The upgrade is idempotent — re-running the deployment with the same or new configuration is safe. If you update secrets, redeploy both LiteLLM and the dependent services.

Important Notes

  • The LiteLLM master key must match the LITELLM_MASTER_KEY configured on the API service
  • LiteLLM listens on port 4000 and should only be accessible within your cluster (no public ingress)
  • Only the API service provisions teams and keys; Worker and Agent services use per-tenant keys at runtime