Ingest a document

When you POST /v1/documents/ingest, Knowledge Stack starts a durable Temporal workflow that prepares, converts, chunks, enriches, and embeds the file. The endpoint returns immediately with a workflow_id you can poll. Files become searchable as soon as the workflow completes.

Looking for the request/response schema? See /v1/documents/ingest in the API Reference.
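The ingest-then-poll flow can be sketched as below. This is a minimal sketch, not the official client: the JSON field names (`workflow_id`, `status`) and terminal states are assumptions based on this page, and the HTTP helpers are injected so the flow is easy to test offline.

```python
import time
from typing import Callable

def ingest_and_wait(post: Callable[[str, dict], dict],
                    get: Callable[[str], dict],
                    payload: dict,
                    interval: float = 2.0,
                    max_polls: int = 100) -> dict:
    """Start an ingest and poll the workflow until it finishes.

    `post`/`get` are injected HTTP helpers (e.g. thin wrappers around an
    HTTP client) so the flow can be exercised without a network.
    """
    # POST returns immediately with a workflow_id to poll.
    workflow_id = post("/v1/documents/ingest", payload)["workflow_id"]
    for _ in range(max_polls):
        status = get(f"/v1/workflows/{workflow_id}")
        # COMPLETED is set once embeddings are upserted into Qdrant.
        if status["status"] in ("COMPLETED", "FAILED"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"workflow {workflow_id} still running")
```

In production code the injected helpers would wrap your HTTP client with auth headers; the structure stays the same.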
Pipeline stages
| Step | What happens | Deep dive |
|---|---|---|
| 1. Preparation | Validate format/size, persist source file to S3, create Document + DocumentVersion rows, dewatermark PDFs. | PDF watermark, S3 storage |
| 2. Conversion | Route to a converter based on MIME type (Docling for PDFs, Excel pipeline for XLSX, etc.) and extract text + visual assets. | Routing, Docling PDF, Excel |
| 3. Chunking | Walk the document hierarchy, emit Section and Chunk rows with bounding boxes, page numbers, and roles. | Chunk handling |
| 3.5. Enrichment | Generate captions for images and structured summaries for tables via an LLM. | Chunk handling |
| 4. Embedding | Embed each chunk and upsert vectors into Qdrant; mark the workflow COMPLETED. | Qdrant, Temporal workflow |
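The stage-2 routing step can be sketched as a MIME-type dispatch. The converter names below echo the table (Docling for PDFs, a dedicated Excel pipeline), but the exact mapping and identifiers are assumptions for illustration:

```python
# Hypothetical MIME-type routing for the conversion stage (stage 2).
CONVERTERS = {
    "application/pdf": "docling",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": "excel",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": "office",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation": "office",
    "text/markdown": "markdown",
    "text/plain": "plaintext",
}

def pick_converter(mime_type: str) -> str:
    """Return the converter responsible for a given MIME type."""
    try:
        return CONVERTERS[mime_type]
    except KeyError:
        raise ValueError(f"unsupported format: {mime_type}") from None
```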
Supported formats
PDF, DOCX, PPTX, XLSX, Markdown, plaintext. Hard limits: 100 MB per file, 150 pages per document.

Watch a workflow
Each ingest returns a workflow_id. Poll it via /v1/workflows:
| Method | Path | Description |
|---|---|---|
| GET | /v1/workflows | List recent workflows for your tenant |
| GET | /v1/workflows/{id} | Live status, per-activity timing, error traces |
| POST | /v1/workflows/{id}?action=cancel | Cancel (OWNER / ADMIN only) |
| POST | /v1/workflows/{id}?action=rerun | Rerun from scratch (OWNER / ADMIN only) |
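The two action endpoints can be wrapped in a small helper. A sketch, with the transport injected for testability; the client-side role check simply mirrors the OWNER / ADMIN restriction in the table (the API enforces it server-side):

```python
from typing import Callable

PRIVILEGED_ROLES = {"OWNER", "ADMIN"}

def workflow_action(post: Callable[[str], dict],
                    workflow_id: str,
                    action: str,
                    role: str) -> dict:
    """Cancel or rerun a workflow via POST /v1/workflows/{id}?action=..."""
    if action not in ("cancel", "rerun"):
        raise ValueError(f"unknown action: {action}")
    # Fail fast client-side; the server enforces this anyway.
    if role not in PRIVILEGED_ROLES:
        raise PermissionError(f"{role} may not {action} workflows")
    return post(f"/v1/workflows/{workflow_id}?action={action}")
```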
Reliability model
- Durable — each step runs as a Temporal activity with timeout + retry policy. The workflow survives worker crashes and infrastructure restarts.
- Idempotent — chunking clears prior content before re-creating; storage uploads overwrite at the same paths; identical content is deduplicated per-tenant.
- Observable — every step emits structured logs, metrics, and a span tree visible in the Temporal UI.
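The per-tenant dedup in the idempotency bullet could be keyed on (tenant, content hash), so identical bytes map back to the first document for that tenant only. A minimal sketch; the keying scheme and class are assumptions:

```python
import hashlib

class DedupIndex:
    """Deduplicate identical content per tenant, as in the idempotency model."""

    def __init__(self):
        self._seen: dict[tuple[str, str], str] = {}

    def register(self, tenant_id: str, content: bytes, document_id: str) -> str:
        """Return the canonical document id for this content.

        A second upload of identical bytes by the same tenant maps back to
        the first document; another tenant's identical upload does not.
        """
        key = (tenant_id, hashlib.sha256(content).hexdigest())
        return self._seen.setdefault(key, document_id)
```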
Retry classification
| Class | Examples | Behavior |
|---|---|---|
| Retryable | 429, 502, 503, network errors | Up to 3 retries with exponential backoff (5s → 60s) |
| Non-retryable | 400, 401, 403, 404, 500 | Fail immediately; surface error on the workflow |
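A client-side mirror of this policy might look like the sketch below: 3 retries with exponential delays starting at 5s and capped at 60s. The growth factor is an assumption; only the status-code split and the 5s/60s bounds come from the tables above.

```python
RETRYABLE = {429, 502, 503}

def is_retryable(status_code: int) -> bool:
    """Transient failures retry; 400/401/403/404/500 fail the workflow."""
    return status_code in RETRYABLE

def backoff_schedule(retries: int = 3, base: float = 5.0,
                     cap: float = 60.0, factor: float = 2.0) -> list[float]:
    """Exponential delays starting at `base`, never exceeding `cap`."""
    return [min(base * factor ** i, cap) for i in range(retries)]
```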
Per-step timeouts
| Step | Timeout |
|---|---|
| Preparation | 60s |
| Conversion | up to 2h |
| Chunking | 10m |
| Enrichment | 2m / chunk |
| Embedding | 2m / batch |
| Workflow total | 30m |
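These limits could be carried as structured activity options rather than hard-coded per call site. A sketch mirroring the table; note the enrichment and embedding limits apply per chunk/batch, not to the whole step:

```python
from datetime import timedelta

# Start-to-close timeouts per pipeline step, mirroring the table above.
STEP_TIMEOUTS = {
    "preparation": timedelta(seconds=60),
    "conversion": timedelta(hours=2),
    "chunking": timedelta(minutes=10),
    "enrichment": timedelta(minutes=2),   # per chunk
    "embedding": timedelta(minutes=2),    # per batch
}

def timeout_for(step: str) -> timedelta:
    """Look up the timeout for a step, failing loudly on unknown steps."""
    if step not in STEP_TIMEOUTS:
        raise KeyError(f"no timeout configured for step: {step}")
    return STEP_TIMEOUTS[step]
```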
Specialized task queues
Each stage is dispatched to its own queue so heavy work doesn’t block lighter work:

| Queue | Purpose |
|---|---|
| document-ingestion | Preparation, chunking, orchestration |
| document-conversion | DOCX / PPTX / XLSX / MD conversion |
| pdf-conversion | VLM-backed PDF conversion (GPU-heavy) |
| enrichment | Image captions + table summaries |
| embedding | Embedding generation + Qdrant upsert |
| vector-ops | Reindex, re-embed, vector maintenance |
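Queue selection can be sketched as a dispatch on pipeline stage, with conversion split on MIME type so PDFs land on the GPU-backed queue. The queue names come from the table; the function and stage names are assumptions:

```python
def queue_for(stage: str, mime_type: str = "") -> str:
    """Pick the task queue for a pipeline stage.

    Conversion splits on MIME type: PDFs go to the GPU-heavy queue,
    everything else to the general converter pool.
    """
    if stage == "conversion":
        return "pdf-conversion" if mime_type == "application/pdf" else "document-conversion"
    return {
        "preparation": "document-ingestion",
        "chunking": "document-ingestion",
        "enrichment": "enrichment",
        "embedding": "embedding",
        "reindex": "vector-ops",
    }[stage]
```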
Re-embedding a folder
Switching embedding models? Trigger a folder-wide re-embed.

Recipes
Bulk ingest from S3
Stream a whole bucket through /documents/ingest with backpressure.

CI ingest pipeline
Re-ingest changed docs on every PR using kscli.