## Overview
Knowledge Stack routes all LLM and embedding requests through a LiteLLM proxy. This gateway provides three key capabilities:

- **Per-tenant cost tracking** — Every LLM request is attributed to the tenant that triggered it, giving you full visibility into AI usage costs
- **Budget enforcement** — You can set spending limits per tenant to control costs
- **Model routing** — Services reference logical model names (like `general-purpose` or `ingestion-chunk-enrichment`) that map to specific provider models in your configuration
## How It Works
All AI-powered features in Knowledge Stack — document ingestion, the AI assistant, embeddings, and general-purpose LLM calls — route through the LiteLLM proxy instead of calling LLM providers directly.

### Per-Tenant Virtual Keys
When a new tenant is created, Knowledge Stack automatically provisions:

- A LiteLLM team mapped to the tenant
- An ingestion key for document processing (no budget limit)
- An agent key for AI assistant usage (configurable budget, default $5)
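The provisioning flow above can be sketched against LiteLLM's admin API. The `/team/new` and `/key/generate` endpoints and the `max_budget` parameter are LiteLLM's; the `provisioning_calls` helper, the key aliases, and the tenant id are illustrative, not Knowledge Stack's actual internals:

```python
import json
import urllib.request

def provisioning_calls(tenant_id: str, agent_budget: float = 5.0):
    """The three admin calls made when a tenant is created (sketch)."""
    return [
        # One LiteLLM team per tenant, so spend rolls up by tenant.
        ("/team/new", {"team_alias": tenant_id}),
        # Ingestion key: no budget limit.
        ("/key/generate", {"team_id": tenant_id,
                           "key_alias": f"{tenant_id}-ingestion"}),
        # Agent key: budget-capped (default $5, configurable).
        ("/key/generate", {"team_id": tenant_id,
                           "key_alias": f"{tenant_id}-agent",
                           "max_budget": agent_budget}),
    ]

def provision_tenant(tenant_id: str, proxy_url: str, master_key: str) -> None:
    """POST each call to the proxy, authenticated with the master key."""
    for path, payload in provisioning_calls(tenant_id):
        req = urllib.request.Request(
            proxy_url + path,
            data=json.dumps(payload).encode(),
            headers={"Authorization": f"Bearer {master_key}",
                     "Content-Type": "application/json"},
        )
        urllib.request.urlopen(req).close()
```

In a real deployment `proxy_url` and `master_key` would come from the `LITELLM_PROXY_URL` and `LITELLM_MASTER_KEY` environment variables described below.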
### Model Name Mapping

Your services use logical model names that are mapped to actual provider models in the LiteLLM configuration:

| Logical Model Name | Typical Use |
|---|---|
| `general-purpose` | General LLM tasks in the API |
| `ingestion-chunk-enrichment` | Document processing enrichment |
| `text-embedding-3-small` | Vector embeddings |
| `agent-general-purpose` | AI assistant conversations |
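A LiteLLM `config.yaml` implementing this mapping might look like the sketch below. The `model_list`/`litellm_params` structure and the `os.environ/` secret syntax are LiteLLM's; the provider models on the right are placeholders, not recommendations:

```yaml
model_list:
  - model_name: general-purpose
    litellm_params:
      model: openai/gpt-4o            # placeholder provider model
      api_key: os.environ/OPENAI_API_KEY
  - model_name: ingestion-chunk-enrichment
    litellm_params:
      model: openai/gpt-4o-mini       # placeholder provider model
      api_key: os.environ/OPENAI_API_KEY
  - model_name: text-embedding-3-small
    litellm_params:
      model: openai/text-embedding-3-small
      api_key: os.environ/OPENAI_API_KEY
  - model_name: agent-general-purpose
    litellm_params:
      model: openai/gpt-4o            # placeholder provider model
      api_key: os.environ/OPENAI_API_KEY
```

Because services only reference the logical names on the left, you can swap providers in this file without touching service code.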
## Tenant Usage and Quotas
You can monitor tenant LLM usage through the tenant API.

## Self-Hosted Configuration
If you are self-hosting Knowledge Stack, you need to deploy a LiteLLM proxy instance alongside your other services.

### Environment Variables
Configure these on your Knowledge Stack services to point to LiteLLM:

| Variable | Value | Description |
|---|---|---|
| `LITELLM_PROXY_URL` | `http://litellm:4000` | LiteLLM proxy base URL |
| `LITELLM_MASTER_KEY` | Your master key | Admin key for team/key provisioning |
| `GP_LLM_API_URL` | `http://litellm:4000/v1` | LLM API endpoint |
| `EMBEDDING_API_BASE_URL` | `http://litellm:4000/v1` | Embedding API endpoint |
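One way to deploy the proxy alongside your services is as a sibling container. A minimal Docker Compose sketch, assuming the official LiteLLM image and a config file like the one above; the service name `litellm` is what makes the `http://litellm:4000` URLs in the table resolve:

```yaml
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest   # official LiteLLM proxy image
    command: ["--config", "/app/config.yaml", "--port", "4000"]
    volumes:
      - ./litellm-config.yaml:/app/config.yaml   # your model_list mapping
    environment:
      LITELLM_MASTER_KEY: ${LITELLM_MASTER_KEY}
      OPENAI_API_KEY: ${OPENAI_API_KEY}          # read via os.environ/ in config
    ports:
      - "4000:4000"
```

Your Knowledge Stack services would run in the same Compose network with the environment variables from the table pointing at this container.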
