When you ask a question in a thread, the model doesn’t just write back prose — it returns a stream of inline citations. Each citation points to a specific chunk_id (with a character offset and length), so the UI can underline the cited span, link to the source document, and let users click through to the original PDF page.
Knowledge Stack chat showing numbered inline citations on every claim

Why this matters

| Without citations | With citations |
| --- | --- |
| Users can’t tell hallucination from truth | Every claim is traceable to a source chunk |
| Compliance / legal / medical use cases are blocked | Auditors can replay the evidence trail |
| Devs can’t debug bad answers | You can inspect exactly which chunks the model saw |

How they’re produced

POST /v1/threads/{thread_id}/stream returns Server-Sent Events. Two event types matter:
  • message_delta — incremental text the model is typing.
  • citation — a structured pointer: {chunk_id, start_char, length, quote}.
Citations stream in alongside the text. Render the prose, attach citations to the spans they cover.
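If you are not using the SDK, the two event types can be pulled off the wire with a small SSE parser. The sketch below assumes the standard SSE framing (an `event:` line, one or more `data:` lines carrying JSON, and a blank line terminating each event); the sample payloads are illustrative, not captured from the API.

```python
# Minimal SSE parser for the /stream endpoint's wire format (sketch).
import json

def parse_sse(lines):
    """Yield (event_type, payload) tuples from an iterable of SSE lines."""
    event_type, data_parts = None, []
    for line in lines:
        if line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "":  # a blank line ends one event
            if event_type and data_parts:
                yield event_type, json.loads("".join(data_parts))
            event_type, data_parts = None, []

sample = [
    'event: message_delta', 'data: {"text": "Revenue grew"}', '',
    'event: citation',
    'data: {"chunk_id": "01J7X", "start_char": 0, "length": 12, "quote": "Revenue grew"}', '',
]
for ev_type, payload in parse_sse(sample):
    print(ev_type, payload.get("quote") or payload.get("text"))
```

In production you would feed this from an HTTP client's line iterator rather than a list, but the framing logic is the same.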

Stream and render citations

# `ks` is an initialized Knowledge Stack client (see the quickstart for setup).
thread = ks.threads.create(title="Q4 deep-dive")

for ev in ks.threads.stream(thread_id=thread.id, message="What drove revenue?"):
    if ev.type == "message_delta":
        print(ev.text, end="", flush=True)
    elif ev.type == "citation":
        # ev.chunk_id, ev.start_char, ev.length, ev.quote
        print(f"\n  ↳ [{ev.chunk_id}] {ev.quote!r}")

The citation envelope

{
  "chunk_id": "01J7X...",
  "start_char": 0,
  "length": 21,
  "quote": "Revenue grew 18% in Q4"
}
| Field | What it means |
| --- | --- |
| chunk_id | The chunk that grounded this span. Resolve via GET /v1/chunks/{chunk_id}. |
| start_char / length | Character range inside the assistant message — use it to underline / superscript. |
| quote | The exact text from the source that supports the claim. |
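Once the message is complete, start_char / length are all you need to attach badges to the prose. A minimal sketch, assuming citation ranges do not overlap (the class name and markup are illustrative, not prescribed by the API):

```python
import html

def annotate(message, citations):
    """Wrap each cited character range in a <span> and append a numbered badge.
    Assumes citation ranges do not overlap."""
    out, pos = [], 0
    for n, c in enumerate(sorted(citations, key=lambda c: c["start_char"]), 1):
        start, end = c["start_char"], c["start_char"] + c["length"]
        out.append(html.escape(message[pos:start]))
        out.append(f'<span class="cited" data-chunk="{c["chunk_id"]}">'
                   f'{html.escape(message[start:end])}</span><sup>[{n}]</sup>')
        pos = end
    out.append(html.escape(message[pos:]))
    return "".join(out)

msg = "Revenue grew 18% in Q4 driven by enterprise."
cits = [{"chunk_id": "01J7X", "start_char": 0, "length": 22,
         "quote": "Revenue grew 18% in Q4"}]
print(annotate(msg, cits))
```

Storing the chunk_id in a data attribute lets the click handler fetch the full chunk for the side panel on demand.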

Resolve a citation to its source

Once you have a chunk_id, you can fetch the chunk plus its parent document and page:
chunk = ks.chunks.get(chunk_id="01J7X...")
print(chunk.document_id, chunk.page_number, chunk.bbox)
print(chunk.content)
The bbox (bounding box) lets you highlight the exact region on the rendered PDF page, so a user clicking a citation jumps not only to the right page but to the right paragraph.
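To draw that highlight you have to map the bbox from PDF space into the pixel space of your rendered page. The sketch below assumes the bbox is (x0, y0, x1, y1) in PDF points with a top-left origin; check the actual chunk schema for the real format, and flip the y-axis if your PDF stack uses a bottom-left origin.

```python
def bbox_to_pixels(bbox, page_width_pts, page_height_pts, rendered_width_px):
    """Scale a PDF-space bounding box (x0, y0, x1, y1, in points) to a pixel
    rectangle on a page rendered rendered_width_px wide."""
    scale = rendered_width_px / page_width_pts
    x0, y0, x1, y1 = bbox
    return {
        "left": round(x0 * scale),
        "top": round(y0 * scale),
        "width": round((x1 - x0) * scale),
        "height": round((y1 - y0) * scale),
    }

# A US-letter page (612x792 pt) rendered 1224 px wide => scale factor of 2.
print(bbox_to_pixels((72, 100, 300, 130), 612, 792, 1224))
```

Position an absolutely-placed overlay div with the returned rectangle and the citation lands on the right paragraph at any zoom level.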

What the end user sees

In the Knowledge Stack chat workspace, citations render as small numbered badges next to every factual claim. Click a badge → side panel opens to the source chunk → click the chunk → the PDF opens at the highlighted region.
Knowledge Stack workspace home with chat history, file tree, and search

Design tips

  • Don’t drop citations — if the model returns a claim with no citation, treat it as low-confidence. Surface it in the UI.
  • Render citations as you stream — don’t buffer the whole message. Users gain trust when they see the citation appear with the claim.
  • Keep the quote short — quote is meant for tooltips and underlines, not for the citation panel. For the panel, fetch the full chunk via /v1/chunks/{chunk_id}.
  • Respect authorization — citations only ever point to chunks the requesting user is authorized to read; no extra filtering needed on your side.
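The first tip — surfacing uncited claims — can be implemented by diffing the citation ranges against the message. A naive sketch (the sentence splitter and overlap test are illustrative simplifications, not part of the API):

```python
import re

def uncited_sentences(message, citations):
    """Return sentences that no citation range touches.
    Naive split on sentence-ending punctuation; real UIs would use
    a proper sentence segmenter."""
    covered = [(c["start_char"], c["start_char"] + c["length"]) for c in citations]
    flagged, pos = [], 0
    for sent in re.split(r"(?<=[.!?])\s+", message):
        start, end = pos, pos + len(sent)
        if not any(s < end and e > start for s, e in covered):
            flagged.append(sent)
        pos = end + 1  # account for the single space consumed by the split
    return flagged

msg = "Revenue grew 18% in Q4. Next year will be even better."
cits = [{"chunk_id": "01J7X", "start_char": 0, "length": 22,
         "quote": "Revenue grew 18% in Q4"}]
print(uncited_sentences(msg, cits))
```

Anything this returns is a candidate for the low-confidence styling the tip recommends.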

Recipes

RAG with citations

Full example: stream answers, render citations, link back to source PDFs.

Streaming chat UI

Next.js + assistant-ui front end consuming /threads/{id}/stream.