Workflows

Custom workflow definitions let you automate multi-step processing pipelines over your knowledge base. A workflow definition describes what to process (source documents), how to process it (instructions), where to write output, and which runner executes the work.

Concepts

Term	Description
Workflow definition	A reusable template that specifies sources, instructions, output locations, and runner configuration
Workflow run	A single execution of a definition, triggered on demand
Self-hosted runner	Your own HTTPS endpoint that Knowledge Stack calls with the run payload

1. Create a workflow definition

curl -X POST https://api-staging.knowledgestack.ai/v1/workflow-definitions \
  -H "Authorization: Bearer <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Summarise Policy Documents",
    "description": "Generates executive summaries for all documents in the policy folder.",
    "runner_type": "SELF_HOSTED",
    "runner_config": {
      "url": "https://runner.yourcompany.com/workflow",
      "webhook_secret": "<your-webhook-secret>"
    },
    "source_path_part_ids": ["<policy-folder-path-part-id>"],
    "instruction_path_part_ids": ["<instructions-doc-path-part-id>"],
    "output_path_part_ids": ["<summaries-folder-path-part-id>"],
    "max_run_duration_seconds": 600
  }'

Request fields

Field	Type	Required	Description
`name`	string	Yes	Definition name (max 255 characters)
`description`	string	No	Human-readable description of what this workflow does
`runner_type`	string	Yes	Must be `SELF_HOSTED` — the only supported runner type
`runner_config`	object	No	Configuration for the self-hosted runner (see below)
`source_path_part_ids`	UUID[]	Yes	Path parts whose content the workflow reads (1–20 IDs)
`instruction_path_part_ids`	UUID[]	Yes	Path parts containing processing instructions (1–20 IDs)
`output_path_part_ids`	UUID[]	Yes	Path parts where the runner writes results (1–20 IDs)
`template_path_part_id`	UUID	No	Optional template document to use during processing
`max_run_duration_seconds`	integer	No	Maximum allowed run time in seconds (60–86400, default 300)
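The limits in the table above can be checked on the client before the request is sent. The following is a minimal sketch of such a pre-flight check. `validate_definition` is a hypothetical helper, not part of the Knowledge Stack API, and it assumes the caller passes the number of IDs in each path part list rather than the IDs themselves.

```shell
# Hypothetical pre-flight check; the limits are taken from the request fields table.
# Usage: validate_definition NAME MAX_RUN_DURATION_SECONDS ID_COUNT...
# ID_COUNT values are integers: the number of IDs in each path part list.
validate_definition() {
  local name="$1"
  local duration="$2"
  shift 2

  # name is required and limited to 255 characters
  if [ -z "$name" ] || [ "${#name}" -gt 255 ]; then
    echo "name is required (max 255 characters)" >&2
    return 1
  fi

  # max_run_duration_seconds must be an integer between 60 and 86400
  case $duration in
    ''|*[!0-9]*)
      echo "max_run_duration_seconds must be an integer" >&2
      return 1 ;;
  esac
  if [ "$duration" -lt 60 ] || [ "$duration" -gt 86400 ]; then
    echo "max_run_duration_seconds must be between 60 and 86400" >&2
    return 1
  fi

  # each path part list must hold between 1 and 20 IDs
  local count
  for count in "$@"; do
    if [ "$count" -lt 1 ] || [ "$count" -gt 20 ]; then
      echo "each path part list needs 1 to 20 IDs" >&2
      return 1
    fi
  done
}
```

For example, `validate_definition "Summarise Policy Documents" 600 1 1 1` succeeds, while a duration of `30` or an empty ID list is rejected before any request is made.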

Self-hosted runner configuration

When `runner_type` is `SELF_HOSTED`, Knowledge Stack calls your endpoint with the run payload. The `runner_config` object controls how it connects:

Field	Description
`url`	HTTPS URL of your runner endpoint (must be a valid URI, max 2083 characters)
`webhook_secret`	Secret string used to verify the webhook signature
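A quick local check of the `url` value can catch obvious mistakes before the definition is created. This sketch only tests the scheme and the length limit from the table; it does not perform full URI validation, and `validate_runner_url` is a hypothetical helper name.

```shell
# Hypothetical helper: checks the https:// scheme and the 2083-character limit.
# It does NOT prove the value is a fully valid URI; the API remains the authority.
validate_runner_url() {
  local url="$1"
  case $url in
    https://?*) ;;   # scheme is present and followed by at least one character
    *)
      echo "runner url must start with https://" >&2
      return 1 ;;
  esac
  if [ "${#url}" -gt 2083 ]; then
    echo "runner url exceeds 2083 characters" >&2
    return 1
  fi
}
```

Calling `validate_runner_url "https://runner.yourcompany.com/workflow"` returns success; an `http://` URL returns a non-zero exit code with a message on stderr.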

The `webhook_secret` is write-only: it is never returned in API responses. Store it securely on your runner.
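This page does not specify how the webhook signature is computed or which header carries it. The sketch below therefore rests on an explicit assumption: that the signature is a hex-encoded HMAC-SHA256 of the raw request body, keyed with the `webhook_secret`. Confirm the actual scheme before relying on it. The function names are illustrative, and the `openssl` command-line tool is assumed to be installed on the runner.

```shell
# ASSUMPTION: signature = hex(HMAC-SHA256(webhook_secret, raw request body)).
# The real scheme and header name are not documented on this page.

# compute_signature SECRET BODY -> prints the hex digest
compute_signature() {
  # openssl prints "<label>= <hex>"; the label differs between versions,
  # so keep only the last whitespace-separated field.
  printf '%s' "$2" | openssl dgst -sha256 -hmac "$1" | awk '{print $NF}'
}

# verify_signature SECRET BODY RECEIVED_SIGNATURE -> exit 0 if they match
verify_signature() {
  local expected
  expected=$(compute_signature "$1" "$2")
  # Note: a plain string comparison is not constant-time.
  [ -n "$3" ] && [ "$expected" = "$3" ]
}
```

A runner would call `verify_signature "$WEBHOOK_SECRET" "$RAW_BODY" "$RECEIVED_SIGNATURE"` and reject the request when it returns non-zero. Always verify against the raw body bytes, not a re-serialised copy of the JSON.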

Response (201)

{
  "id": "wd1a2b3c-...",
  "name": "Summarise Policy Documents",
  "runner_type": "SELF_HOSTED",
  "max_run_duration_seconds": 600,
  "source_path_part_ids": ["..."],
  "instruction_path_part_ids": ["..."],
  "output_path_part_ids": ["..."],
  "created_at": "2024-10-15T09:00:00Z",
  "updated_at": "2024-10-15T09:00:00Z"
}

2. Invoke a workflow

Trigger a run of an existing definition. Knowledge Stack captures a snapshot of the current path configuration and dispatches the run to your self-hosted runner.

curl -X POST https://api-staging.knowledgestack.ai/v1/workflow-definitions/<definition-id>/invoke \
  -H "Authorization: Bearer <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "idempotency_key": "run-2024-10-15-001"
  }'

Request fields

Field	Type	Required	Description
`idempotency_key`	string	No	Prevents duplicate runs from retries: if a run with this key already exists, the existing run is returned instead of a new one being created (max 255 characters)
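An idempotency key only protects against duplicates if every retry sends the same value. One way to get that is to derive the key from stable inputs instead of generating a random one per attempt. The sketch below builds a key from a prefix, the definition ID, and a date; `make_idempotency_key` and the key layout are illustrative choices, not an API requirement.

```shell
# Hypothetical helper: builds a deterministic key of the form PREFIX-DEFINITION_ID-DATE.
# Usage: make_idempotency_key PREFIX DEFINITION_ID [YYYY-MM-DD]
make_idempotency_key() {
  # Default to today's date in UTC so all retries on the same day share a key
  local day="${3:-$(date -u +%Y-%m-%d)}"
  local key="$1-$2-$day"
  # The API limits idempotency_key to 255 characters
  if [ "${#key}" -gt 255 ]; then
    echo "idempotency key exceeds 255 characters" >&2
    return 1
  fi
  printf '%s\n' "$key"
}
```

With this layout, a nightly job that is retried several times on the same day reuses one key, so only one run is created for that day.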

Response (202)

{
  "id": "wr1a2b3c-...",
  "workflow_definition_id": "wd1a2b3c-...",
  "status": "PENDING",
  "runner_type": "SELF_HOSTED",
  "started_at": "2024-10-15T09:05:00Z",
  "completed_at": null,
  "run_snapshot": {
    "workflow_name": "Summarise Policy Documents",
    "sources": [...],
    "instructions": [...],
    "outputs": [...]
  },
  "error": null,
  "created_at": "2024-10-15T09:05:00Z"
}

Save the `id` from the response; you need it to monitor the run.

3. Monitor workflow runs

List runs for a definition

curl -X GET "https://api-staging.knowledgestack.ai/v1/workflow-definitions/<definition-id>/runs?limit=20&offset=0" \
  -H "Authorization: Bearer <your-api-key>"

Get a specific run

curl -X GET "https://api-staging.knowledgestack.ai/v1/workflow-runs/<run-id>" \
  -H "Authorization: Bearer <your-api-key>"

Run statuses

Status	Description
`PENDING`	Run is queued and waiting to be dispatched to the runner
`RUNNING`	Runner has received the payload and is actively processing
`COMPLETED`	Runner reported success via the callback endpoint
`FAILED`	Runner reported failure, or the run exceeded `max_run_duration_seconds`

Poll `GET /v1/workflow-runs/{run_id}` until `status` is `COMPLETED` or `FAILED`. If the run failed, the `error` field describes what went wrong.
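The polling described above can be wrapped in a small function with an upper bound on attempts, so a script does not wait forever if the API is unreachable. This is a sketch, not a prescribed client: `fetch_status` and `wait_for_run` are hypothetical names, the defaults (60 attempts, 10 seconds apart) are arbitrary, and `API_KEY` is assumed to be set in the environment.

```shell
# Fetch the current status of a run (requires curl and jq).
fetch_status() {
  curl -s "https://api-staging.knowledgestack.ai/v1/workflow-runs/$1" \
    -H "Authorization: Bearer $API_KEY" | jq -r '.status'
}

# wait_for_run RUN_ID [MAX_ATTEMPTS] [INTERVAL_SECONDS]
# Exit codes: 0 = COMPLETED, 1 = FAILED, 2 = no terminal status within MAX_ATTEMPTS
wait_for_run() {
  local run_id="$1"
  local max_attempts="${2:-60}"
  local interval="${3:-10}"
  local attempt=0
  local status
  while [ "$attempt" -lt "$max_attempts" ]; do
    attempt=$((attempt + 1))
    status=$(fetch_status "$run_id")
    case $status in
      COMPLETED) return 0 ;;
      FAILED) return 1 ;;
    esac
    # PENDING, RUNNING, or an unreadable response: wait and try again
    sleep "$interval"
  done
  return 2
}
```

Separating the three exit codes lets a calling script distinguish a failed run (inspect `error`) from a timeout on the client side (the run may still be in progress).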

4. Callback endpoint

Your self-hosted runner must call the callback endpoint when it finishes processing. This transitions the run from RUNNING to COMPLETED or FAILED.

# Called by your runner, not by API consumers directly
curl -X POST https://api-staging.knowledgestack.ai/v1/workflow-runs/<run-id>/callback \
  -H "Authorization: Bearer <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "status": "COMPLETED"
  }'

Request fields

Field	Type	Required	Description
`status`	string	Yes	Final status: `COMPLETED` or `FAILED`
`error`	string	No	Error description when `status` is `FAILED` (max 8192 characters)
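The request above shows only the success case. The sketch below builds the body for either outcome, including an `error` message for `FAILED` that is trimmed to the 8192-character limit. `build_callback_body` is a hypothetical helper. Its escaping is deliberately minimal (newlines, carriage returns, and tabs become spaces; backslashes and double quotes are escaped), so other control characters in the message are not handled, and `cut -c` may count bytes rather than characters on some systems.

```shell
# Hypothetical helper: prints the JSON body for the callback request.
# Usage: build_callback_body STATUS [ERROR_MESSAGE]
build_callback_body() {
  local status="$1"
  local error="${2:-}"
  case $status in
    COMPLETED)
      printf '{"status": "COMPLETED"}\n' ;;
    FAILED)
      # Flatten to one line, trim to 8192 characters, then escape \ and "
      error=$(printf '%s' "$error" \
        | tr '\n\r\t' '   ' \
        | cut -c1-8192 \
        | sed 's/\\/\\\\/g; s/"/\\"/g')
      printf '{"status": "FAILED", "error": "%s"}\n' "$error" ;;
    *)
      echo "status must be COMPLETED or FAILED" >&2
      return 1 ;;
  esac
}
```

The output can be passed straight to the request shown above, for example `-d "$(build_callback_body FAILED "source folder was empty")"`.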

5. Cancel a run

To stop an in-progress run, delete the underlying workflow by its workflow_id. The workflow_id is returned in the run response.

curl -X DELETE "https://api-staging.knowledgestack.ai/v1/workflows/<workflow-id>" \
  -H "Authorization: Bearer <your-api-key>"

Cancellation is best-effort. If your self-hosted runner has already begun processing, it may not stop immediately. Your runner should handle a late callback gracefully.
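One way for a runner to handle a late callback gracefully is to inspect the HTTP status code of the callback request and treat a rejection for a cancelled run as a normal outcome rather than a crash. This page does not document which code the API returns in that situation, so the mapping below is an assumption: it treats 404, 409, and 410 as "the run no longer accepts callbacks". Both function names are hypothetical, and `API_KEY` is assumed to be set.

```shell
# post_callback RUN_ID JSON_BODY -> prints only the HTTP status code
post_callback() {
  curl -s -o /dev/null -w '%{http_code}' -X POST \
    "https://api-staging.knowledgestack.ai/v1/workflow-runs/$1/callback" \
    -H "Authorization: Bearer $API_KEY" \
    -H "Content-Type: application/json" \
    -d "$2"
}

# classify_callback_code HTTP_CODE -> prints accepted | stale | retry | error
classify_callback_code() {
  case $1 in
    2??) echo accepted ;;
    # ASSUMPTION: these codes mean the run was cancelled or already finished
    404|409|410) echo stale ;;
    # Server errors, rate limiting, or no response at all (curl reports 000)
    5??|429|000) echo retry ;;
    *) echo error ;;
  esac
}
```

A runner would log and drop a `stale` result, retry with backoff on `retry`, and surface `error` (for example a rejected API key) to an operator.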

Document version workflows

In addition to custom workflow definitions, Knowledge Stack runs an ingestion workflow for every document version. You can monitor and rerun these directly.

List all document ingestion workflows

curl -X GET "https://api-staging.knowledgestack.ai/v1/workflows/document_versions?limit=20" \
  -H "Authorization: Bearer <your-api-key>"

Tenant admins see all workflows; members see only workflows for document versions they have read access to.

Get a specific document ingestion workflow

curl -X GET "https://api-staging.knowledgestack.ai/v1/workflows/document_versions/<workflow-id>" \
  -H "Authorization: Bearer <your-api-key>"

The response includes live pipeline execution status, along with persisted fields such as `status`, `error`, and `chunks_processed`.

Rerun a document ingestion workflow

If a document ingestion failed or you need to reprocess a version (for example after a configuration change), rerun it without re-uploading the file:

curl -X POST "https://api-staging.knowledgestack.ai/v1/workflows/document_versions/<workflow-id>" \
  -H "Authorization: Bearer <your-api-key>"

Knowledge Stack reuses the file that is already in storage.

End-to-end example

Create a workflow definition

# Assumes API_KEY, WEBHOOK_SECRET, SOURCE_FOLDER_ID, INSTRUCTIONS_DOC_ID, and
# OUTPUT_FOLDER_ID are set and contain no characters that need JSON escaping.
DEFINITION=$(curl -s -X POST https://api-staging.knowledgestack.ai/v1/workflow-definitions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Nightly Report Generator",
    "runner_type": "SELF_HOSTED",
    "runner_config": {
      "url": "https://runner.yourcompany.com/nightly",
      "webhook_secret": "'"$WEBHOOK_SECRET"'"
    },
    "source_path_part_ids": ["'"$SOURCE_FOLDER_ID"'"],
    "instruction_path_part_ids": ["'"$INSTRUCTIONS_DOC_ID"'"],
    "output_path_part_ids": ["'"$OUTPUT_FOLDER_ID"'"],
    "max_run_duration_seconds": 3600
  }')
# Quote the variable so the JSON reaches jq without word splitting or globbing
DEFINITION_ID=$(echo "$DEFINITION" | jq -r '.id')

Invoke the workflow

RUN=$(curl -sX POST "https://api-staging.knowledgestack.ai/v1/workflow-definitions/$DEFINITION_ID/invoke" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"idempotency_key": "nightly-2024-10-15"}')
RUN_ID=$(echo "$RUN" | jq -r '.id')

Poll until complete

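# Poll every 10 seconds; give up after 360 attempts (about one hour,
# matching the max_run_duration_seconds set on the definition).
ATTEMPTS=0
while true; do
  STATUS=$(curl -s "https://api-staging.knowledgestack.ai/v1/workflow-runs/$RUN_ID" \
    -H "Authorization: Bearer $API_KEY" | jq -r '.status')
  echo "Status: $STATUS"
  if [[ "$STATUS" == "COMPLETED" || "$STATUS" == "FAILED" ]]; then
    break
  fi
  ATTEMPTS=$((ATTEMPTS + 1))
  if (( ATTEMPTS >= 360 )); then
    echo "Timed out waiting for run $RUN_ID" >&2
    break
  fi
  sleep 10
done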
