Tenant Isolation
Each tenant has its own storage bucket, named by the tenant’s unique ID. This per-tenant isolation ensures:
- Clean data separation between tenants
- Simple bulk deletion when a tenant is removed
- Independent access control at the bucket level
Storage Layout
All storage paths use a flat structure based on the document and version IDs, deliberately independent of your folder hierarchy. This means moving a document between folders never requires changing storage paths.
What Gets Stored
Source files
Your original uploaded document is stored as-is under the source.* key, preserving the original file extension.
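As a minimal sketch of how a source key could be derived, the helper below keeps the uploaded file's extension and places the object under a flat prefix. The function name `source_key` and the `{document_id}/{version_id}/` prefix are assumptions for illustration only; this page states that paths are based on the document and version IDs but does not spell out the exact prefix layout.

```python
from pathlib import PurePosixPath


def source_key(document_id: str, version_id: str, filename: str) -> str:
    """Build a flat storage key for an uploaded source file.

    Assumption: keys are prefixed with "{document_id}/{version_id}/".
    The real prefix layout is not specified on this page.
    """
    # suffix includes the leading dot (".pdf"), or is "" when there is none.
    # Only the last extension is kept, and its case is left untouched.
    extension = PurePosixPath(filename).suffix
    return f"{document_id}/{version_id}/source{extension}"


# The folder part of the uploaded name plays no role in the key.
print(source_key("doc-1", "v-2", "Q3 reports/summary.final.docx"))
```

Because the key depends only on the IDs and the extension, moving the document to another folder would leave it unchanged.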
Cleaned PDFs
If your PDF contains watermarks, the preparation step produces a cleaned version at cleaned_source.pdf. The original source is preserved. See PDF Watermark Removal for details.
Conversion output
The document conversion step produces a structured JSON representation of your document’s content, stored as standard_pipeline.json. This intermediate format is used by the chunking step.
Visual assets
All visual assets are stored as WEBP images (quality 85, DPI 144):
| Asset Type | Path Pattern | Description |
|---|---|---|
| Page screenshots | page_screenshots/p{N}.webp | Full-page renders, 1-indexed |
| Images | images/{N}.webp | Extracted image crops from the document |
| Tables | tables/{N}.webp | Screenshots of detected tables |
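
The path patterns in the table can be expressed as a small helper. This is an illustrative sketch: the names `AssetType` and `asset_path` are hypothetical, and the returned paths are relative to whatever document-level prefix the platform uses. Only page screenshots are documented as 1-indexed, so the sketch enforces `n >= 1` for them and merely rejects negative numbers for the other asset types.

```python
from enum import Enum


class AssetType(Enum):
    # Values mirror the "Path Pattern" column of the table.
    PAGE_SCREENSHOT = "page_screenshots/p{n}.webp"
    IMAGE = "images/{n}.webp"
    TABLE = "tables/{n}.webp"


def asset_path(asset_type: AssetType, n: int) -> str:
    """Return the relative storage path for the n-th asset of a given type."""
    # Page screenshots are 1-indexed; indexing for images and tables is
    # not stated in the docs, so only negative values are rejected there.
    minimum = 1 if asset_type is AssetType.PAGE_SCREENSHOT else 0
    if n < minimum:
        raise ValueError(f"{asset_type.name} index must be >= {minimum}, got {n}")
    return asset_type.value.format(n=n)


print(asset_path(AssetType.PAGE_SCREENSHOT, 1))
print(asset_path(AssetType.TABLE, 3))
```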
Storage Operations
The platform supports these storage operations:
| Operation | Description |
|---|---|
| Upload | Store raw bytes at a given path |
| Download | Retrieve object content |
| List | List objects by path prefix |
| Delete | Batch delete objects |
| Presigned URLs | Generate time-limited download links |
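
To make the semantics of these operations concrete, here is a toy in-memory stand-in for a single bucket. It is not the platform's client: the class name `InMemoryStorage` and all method names are hypothetical, and presigned URLs are left out because they only make sense against a real storage backend.

```python
from typing import Dict, Iterable, List


class InMemoryStorage:
    """Toy stand-in for one bucket; hypothetical, for illustration only."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def upload(self, path: str, data: bytes) -> None:
        # Store raw bytes at a given path, overwriting any existing object.
        self._objects[path] = data

    def download(self, path: str) -> bytes:
        # Retrieve object content; raises KeyError if the path is unknown.
        return self._objects[path]

    def list_objects(self, prefix: str = "") -> List[str]:
        # List object paths that start with the given prefix, sorted.
        return sorted(p for p in self._objects if p.startswith(prefix))

    def delete(self, paths: Iterable[str]) -> None:
        # Batch delete; paths that do not exist are silently ignored.
        for path in paths:
            self._objects.pop(path, None)


store = InMemoryStorage()
store.upload("images/0.webp", b"first")
store.upload("images/1.webp", b"second")
store.upload("tables/0.webp", b"table")
store.delete(store.list_objects("images/"))  # remove everything under a prefix
print(store.list_objects())
```

Listing by prefix and then batch deleting is the pattern that makes the flat layout convenient for cleaning up all assets of one kind.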
URI Format
All internal storage references use the s3://{bucket}/{key} URI format, making it easy to locate any stored asset.
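
A short sketch of building and splitting such URIs follows. The helper names `format_s3_uri` and `parse_s3_uri` are hypothetical, and the bucket and key in the example are made-up values. The parsing is done with plain string operations rather than a URL parser so that keys containing characters such as `?` or `#` are kept intact.

```python
from typing import Tuple

_SCHEME = "s3://"


def format_s3_uri(bucket: str, key: str) -> str:
    """Join a bucket and key into an s3://{bucket}/{key} URI."""
    if not bucket or not key:
        raise ValueError("bucket and key must both be non-empty")
    return f"{_SCHEME}{bucket}/{key}"


def parse_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3://{bucket}/{key} URI into (bucket, key)."""
    if not uri.startswith(_SCHEME):
        raise ValueError(f"not an s3:// URI: {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the key.
    bucket, _, key = uri[len(_SCHEME):].partition("/")
    if not bucket or not key:
        raise ValueError(f"URI is missing a bucket or key: {uri!r}")
    return bucket, key


uri = format_s3_uri("example-tenant-id", "page_screenshots/p1.webp")
print(uri)
print(parse_s3_uri(uri))
```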