Use POST /v1/chunks/search to query chunks in your knowledge base. The API searches using either dense vector (semantic) similarity or BM25 keyword matching, combines results with path-based authorization, and hydrates the matches from the database before returning them.
POST /v1/chunks/search
Request body
Natural language search query. Must be at least 1 character.
The search algorithm to use. See SearchType for values. Defaults to dense_only.
Maximum number of results to return. Must be between 1 and 50. Defaults to 5.
Minimum similarity score a chunk must achieve to be included in results. Defaults to 0.3. Raise this to get higher-confidence matches only.
Array of path part UUIDs (non-CHUNK types) to restrict the search to. When omitted, the search defaults to your tenant’s /shared folder.
Filter results to chunks that have all of the specified tag IDs (AND logic). Pass an array of tag UUIDs.
Filter by chunk content type. Valid values: TEXT, TABLE, IMAGE, HTML, UNKNOWN. Only chunks matching at least one listed type are returned.
When true (default), only chunks from the active document version are returned. Set to false to search across all versions.
ISO 8601 datetime string. Only chunks ingested after this timestamp are returned.
When true, each result includes the ancestor document and document_version objects. Defaults to false.
Response
Returns an array of ScoredChunkResponse objects ordered by relevance score descending.
Cosine similarity score (1 - cosine_distance). Higher is more relevant. Range: 0–1 for dense_only; BM25 scores may exceed 1.
Chunk UUID.
The text content of the chunk.
Content type of the chunk: TEXT, TABLE, IMAGE, HTML, or UNKNOWN.
Chunk-level metadata object (type-specific fields vary by chunk_type).
UUID of the path part node this chunk belongs to.
UUID of the parent path part (typically the document version or section).
UUID of the preceding sibling chunk, or null if this is the first.
UUID of the following sibling chunk, or null if this is the last.
Full slash-delimited path from the root to this chunk.
Token count of the chunk content, if available.
Time-limited URLs for downloading visual assets (e.g. images) associated with the chunk. Populated for IMAGE, TABLE, and HTML chunk types.
Populated when with_document=true. Contains the ancestor document’s id, name, document_type, and document_origin.
Populated when with_document=true. Contains the ancestor version’s id, version number, and name.
ISO 8601 creation timestamp.
ISO 8601 last-updated timestamp.
Example
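A minimal sketch of assembling a request body that respects the constraints listed under Request body. The JSON field names query, search_type, limit, and min_score are assumptions made for illustration, since this section does not spell them out; only with_document, the SearchType values, and the documented limits and defaults come from the text. The helper build_search_body is hypothetical.

```python
import json

# Values taken from the SearchType table in this document.
SEARCH_TYPES = {"dense_only", "full_text"}


def build_search_body(query, search_type="dense_only", limit=5,
                      min_score=0.3, with_document=False):
    """Build a JSON-serializable body for POST /v1/chunks/search.

    NOTE: the keys "query", "search_type", "limit" and "min_score" are
    assumed names; verify them against the actual API schema.
    """
    if len(query) < 1:
        raise ValueError("query must be at least 1 character")
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"unknown search type: {search_type!r}")
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    return {
        "query": query,
        "search_type": search_type,
        "limit": limit,
        "min_score": min_score,
        "with_document": with_document,
    }


body = build_search_body("termination clause notice period", limit=10)
print(json.dumps(body, indent=2))
```

Validating on the client keeps obviously invalid requests (an empty query, a limit above 50) from reaching the API at all.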
SearchType
Controls the search algorithm used to find matching chunks.
| Value | Description |
|---|---|
| dense_only | Default. Dense vector (semantic) search using cosine similarity. Best for natural language questions and conceptual queries. |
| full_text | BM25 keyword search. Best for exact term matching, IDs, codes, and structured queries where token overlap matters. |
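
One way an application might choose between the two values in the table is a simple client-side heuristic. The sketch below is an assumption, not part of the API: the hypothetical pick_search_type treats any query containing a digit as an identifier-style lookup and routes it to full_text, and sends everything else to dense_only.

```python
def pick_search_type(query: str) -> str:
    """Return a SearchType value for the given query.

    Heuristic (an illustrative assumption): queries containing digits
    tend to be IDs or codes, where keyword matching works better.
    """
    if any(ch.isdigit() for ch in query):
        return "full_text"
    return "dense_only"


print(pick_search_type("contract 2024-00817"))
print(pick_search_type("how do refunds work"))
```

A real application would likely tune this rule to its own identifier formats, or let the user pick the mode explicitly.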
dense_only relies on embeddings and handles paraphrases and synonyms well. Use full_text when your users are searching for specific strings (e.g. contract numbers, product codes).
Getting context around a result
After a search, you often want the surrounding text to provide better context for an AI model or UI. Use the neighbors endpoint to fetch sibling chunks before and after a result.
GET /v1/chunks//neighbors
| Parameter | Type | Default | Description |
|---|---|---|---|
| prev | integer | 1 | Number of preceding siblings to include (0–20). |
| next | integer | 1 | Number of succeeding siblings to include (0–20). |
| chunks_only | boolean | false | When true, traversal stops at the first non-CHUNK sibling in each direction. |
anchor_index is the position of your original chunk in the items array. Items are returned in sibling order (preceding → anchor → succeeding).
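As a sketch of how anchor_index and items can be used together, the hypothetical build_context below splits a neighbors response into the text before, at, and after the original chunk. It assumes the request was made with chunks_only=true so that every item is a chunk, and it assumes each item exposes its text under a "content" key, which this section does not confirm.

```python
def build_context(neighbors: dict, separator: str = "\n\n") -> dict:
    """Split a neighbors response into before / anchor / after text.

    Assumes every item is a chunk carrying its text under "content"
    (an assumed key name).
    """
    items = neighbors["items"]
    anchor = neighbors["anchor_index"]
    texts = [item["content"] for item in items]
    return {
        "before": separator.join(texts[:anchor]),
        "anchor": texts[anchor],
        "after": separator.join(texts[anchor + 1:]),
        "full": separator.join(texts),
    }


# Hand-written sample shaped like the response described above.
sample = {
    "anchor_index": 1,
    "items": [
        {"content": "Section 4 covers termination."},
        {"content": "Either party may terminate with 30 days notice."},
        {"content": "Notice must be given in writing."},
    ],
}
context = build_context(sample)
print(context["full"])
```

Because items arrive in sibling order, joining them directly yields the passage in reading order, ready to pass to a model or render in a UI.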
Searching within a path subtree
To retrieve all chunks under a specific folder or document, use the subtree chunks endpoint. This is useful for building indexes or pre-fetching all content under a known path.
GET /v1/path-parts//subtree_chunks
Returns a SubtreeChunksResponse with a groups array. Each group represents a set of chunks that share the same path_part_ids and tag_ids, which is useful for batching downstream operations.
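A minimal sketch of batching by group, under stated assumptions: the groups, path_part_ids, and tag_ids names come from the description above, while the "chunks" key inside each group and the "id" key on each chunk are assumed for illustration. The helper iter_batches is hypothetical.

```python
def iter_batches(subtree: dict):
    """Yield (path_part_ids, tag_ids, chunk_ids) for each group.

    "chunks" and "id" are assumed key names; adjust to the real schema.
    """
    for group in subtree["groups"]:
        chunk_ids = [chunk["id"] for chunk in group["chunks"]]
        yield tuple(group["path_part_ids"]), tuple(group["tag_ids"]), chunk_ids


# Hand-written sample shaped like the response described above.
sample_subtree = {
    "groups": [
        {"path_part_ids": ["pp-1", "pp-2"], "tag_ids": ["tag-a"],
         "chunks": [{"id": "c-1"}, {"id": "c-2"}]},
        {"path_part_ids": ["pp-1", "pp-3"], "tag_ids": [],
         "chunks": [{"id": "c-3"}]},
    ]
}
batches = list(iter_batches(sample_subtree))
for path_part_ids, tag_ids, chunk_ids in batches:
    print(path_part_ids, tag_ids, chunk_ids)
```

Since every chunk in a group shares the same path and tags, a downstream indexer can attach that metadata once per batch instead of once per chunk.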