What the Agent Does
The agent worker handles everything involved in generating a response to your message:- Authenticates — obtains credentials to act on your behalf, ensuring all knowledge base access respects your permissions
- Loads context — fetches your conversation history to understand follow-up questions
- Runs the AI agent — the LLM decides which tools to use, retrieves information, and generates a response
- Streams in real-time — as the agent thinks and writes, events are published to your browser immediately
- Saves the response — the complete message with citations is persisted to the database
- Closes the stream — sends completion signals so your browser knows the response is finished
Agent Capabilities
The AI agent is built on a framework that manages:- LLM communication — handles model requests, response parsing, and retries
- Streaming — provides real-time event iteration as the model generates output
- Conversation history — maintains context across messages in a thread
- Tool execution — the agentic loop where the LLM decides to call tools, receives results, and continues reasoning
- Structured output — validates agent responses
Request Processing
When a message arrives, the worker executes this flow:- Initialize — prepare the agent and generate a unique message ID
- Authenticate — obtain a token to access the API on behalf of the user
- Load context — fetch the user’s message and recent conversation history (up to 10 messages)
- Open the stream — publish a
message_startevent - Run the agent — the AI agent processes the request with streaming:
- Each text token is published as a
text_deltaevent - Each tool call and result is published as a
stepevent - The agent sends heartbeats every 10 seconds to prove it is still working
- Each text token is published as a
- Process citations — resolve inline chunk references into structured citation objects
- Save the message — persist the complete response with citations and tool steps
- Close the stream — publish
message_endanddoneevents; clear the streaming flag
Streaming Protocol
The agent publishes events that follow an industry-standard streaming protocol:Event Lifecycle
| Event | Description |
|---|---|
message_start | New assistant message beginning |
text_start | Opens a text content block |
text_delta | Incremental text fragment |
text_end | Closes the text block |
step | Tool call or result snapshot |
citations | Final citations list |
message_end | Response complete |
done | Stream finished |
Reconnection
If a client disconnects and reconnects:- If the message is still streaming, the client resumes from where it left off with missed events replayed
- If the message is no longer streaming, the client receives a
message_not_streamingevent and can fetch the complete message via REST
Conversation History
The agent loads your recent conversation history to understand context for follow-up questions.| Message Role | How It Is Used |
|---|---|
| Your messages | Provided as conversation context |
| Agent responses | Provided as conversation context |
| System messages | Handled through agent instructions (not included in history) |
API Endpoints
Send a Message
POST /v1/threads/{thread_id}/user_message
Sends your message and starts the agent. Returns 202 Accepted immediately with a workflow_id.
- Returns
409 Conflictif the agent is already processing a message on this thread
Stream the Response
GET /v1/threads/{thread_id}/stream
Opens an SSE connection to receive real-time agent output.
Query parameters:
| Param | Type | Description |
|---|---|---|
last_message_id | UUID (optional) | Message ID to resume from |
last_entry_id | string (optional) | Stream entry ID to resume from |
Security Model
The agent operates under your permissions. When the worker processes your message:- It obtains a token scoped to your user identity and tenant
- All knowledge base access goes through the API’s authorization layer
- The agent can only see and search content you have permission to access
