05. Streaming Output and the Event Model

An agent without streaming output feels sluggish. After the user submits a task, the model may think for several seconds, then request a tool, and the tool may run for several more seconds. If the interface only refreshes when the final result appears, the user cannot tell whether the system is working, stuck, or has already failed. Streaming events are not just a visual optimization; they are the foundation of an agent's observability.

The real difficulty of streaming

Ordinary chat streaming only needs to keep appending text. Agent streaming has to handle many more events:

Assistant text deltas.
Tool call started.
Tool call argument deltas.
Tool call arguments complete.
Tool execution started, progress, and finished.
Turn finished.
Abort, retry, compaction, and queue updates.

Tool call arguments deserve special attention. Many providers emit JSON arguments in fragments. The UI can show "the model is preparing to call read" early on, but the runtime must wait until the arguments are complete and validated before executing. Executing a half-finished JSON payload as arguments is a common bug in streaming agents.
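The buffering rule can be sketched as a small accumulator. This is a minimal illustration, not a reference implementation: ToolCallAssembler and its method names are hypothetical, and the only validation shown is "the arguments parse as a JSON object". A real runtime would likely also check the input against the tool's schema.

```typescript
// Hypothetical helper: buffers argument fragments per tool call id and
// only parses them once the provider says the arguments are complete.
type PendingCall = { name: string; buffer: string };

type AssembledCall =
  | { ok: true; name: string; input: unknown }
  | { ok: false; error: string };

class ToolCallAssembler {
  private pending = new Map<string, PendingCall>();

  // Corresponds to a tool_call_started event.
  start(id: string, name: string): void {
    this.pending.set(id, { name, buffer: "" });
  }

  // Corresponds to a tool_call_arguments_delta event. Never parses here:
  // a fragment such as {"path": is not valid JSON on its own.
  appendDelta(id: string, delta: string): void {
    const call = this.pending.get(id);
    if (!call) throw new Error(`Unknown tool call id: ${id}`);
    call.buffer += delta;
  }

  // Call only after the provider signals that the arguments are complete.
  finish(id: string): AssembledCall {
    const call = this.pending.get(id);
    if (!call) return { ok: false, error: `Unknown tool call id: ${id}` };
    this.pending.delete(id);
    try {
      const input: unknown = JSON.parse(call.buffer);
      if (typeof input !== "object" || input === null || Array.isArray(input)) {
        return { ok: false, error: "Tool arguments must be a JSON object" };
      }
      return { ok: true, name: call.name, input };
    } catch (err) {
      return { ok: false, error: `Invalid JSON arguments: ${(err as Error).message}` };
    }
  }
}
```

Only a successful finish result should lead to a tool_call_ready event and then to execution. A failed result can be reported back to the model as a tool error instead of being executed.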

The event union

Start by defining a stable set of events. They should not be tied to any UI framework:

type AgentEvent =
  | { type: "assistant_text_delta"; text: string }
  | { type: "tool_call_started"; id: string; name: string }
  | { type: "tool_call_arguments_delta"; id: string; delta: string }
  | { type: "tool_call_ready"; id: string; name: string; input: unknown }
  | { type: "tool_execution_started"; id: string; name: string }
  | { type: "tool_execution_update"; id: string; text: string }
  | { type: "tool_execution_finished"; id: string; isError: boolean }
  | { type: "turn_finished"; message: AssistantMessage };

The UI, logs, extensions, and tests can all subscribe to the same event stream. This prevents the split where "the terminal shows one state, the log records another, and the SDK reports a third."
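One way to give every consumer the same stream is a small fan-out bus. The sketch below is an assumption about how that could look: EventBus, subscribe, and emit are hypothetical names, and the bus is synchronous and in-process only.

```typescript
// Hypothetical fan-out: every subscriber sees every event, in emit order.
type Listener<T> = (event: T) => void;

class EventBus<T> {
  private listeners = new Set<Listener<T>>();

  // Returns an unsubscribe function so consumers can detach cleanly.
  subscribe(listener: Listener<T>): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  emit(event: T): void {
    this.listeners.forEach((listener) => listener(event));
  }
}
```

Because the terminal renderer, the logger, and a test recorder would all call subscribe on the same bus, none of them can observe a state that the others did not also receive.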

Dual views: an iterable of events and a final message

A streaming interface should ideally support two consumption styles at once:

The UI consumes events one by one.
The agent loop awaits the final assistant message.

A teaching project can express this idea with a small wrapper:

type EventStream<TEvent, TResult> = {
  events: AsyncIterable<TEvent>;
  result: Promise<TResult>;
};

The provider adapter emits deltas during streaming while accumulating the final assistant message. The agent loop can await result to decide the stop reason; the UI can iterate over events to render in real time. Both come from the same underlying stream, so there is no need to send the request twice.
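A minimal sketch of such a wrapper follows. The factory createEventStream and its push and finish methods are hypothetical names. The sketch assumes a single consumer of events and leaves out error propagation and backpressure, all of which a production version would need to address.

```typescript
type EventStream<TEvent, TResult> = {
  events: AsyncIterable<TEvent>;
  result: Promise<TResult>;
};

// Hypothetical factory: the producer calls push/finish, and consumers get
// both views (an iterable of events and a promise of the final result).
function createEventStream<TEvent, TResult>() {
  const queue: TEvent[] = [];
  let wake: (() => void) | null = null;
  let done = false;

  let resolveResult!: (value: TResult) => void;
  const result = new Promise<TResult>((resolve) => {
    resolveResult = resolve;
  });

  // Wakes the consumer if it is currently waiting for the next event.
  const notify = () => {
    if (wake) {
      const w = wake;
      wake = null;
      w();
    }
  };

  async function* iterate(): AsyncGenerator<TEvent> {
    while (true) {
      // Drain everything already buffered before checking for completion.
      while (queue.length > 0) yield queue.shift() as TEvent;
      if (done) return;
      await new Promise<void>((resolve) => {
        wake = resolve;
      });
    }
  }

  const stream: EventStream<TEvent, TResult> = { events: iterate(), result };

  return {
    stream,
    // Events pushed after finish are dropped in this sketch.
    push(event: TEvent): void {
      if (done) return;
      queue.push(event);
      notify();
    },
    finish(value: TResult): void {
      done = true;
      resolveResult(value);
      notify();
    },
  };
}
```

In this shape, a provider adapter would call push for each delta and finish once with the accumulated assistant message, while the UI runs for await over stream.events and the agent loop awaits stream.result.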

Abort is not a leaked exception

When the user presses stop, the AbortSignal passed to the underlying request fires. The runtime should not simply let the resulting AbortError bubble all the way to the top. A better approach is:

Cancel the provider request and any tools currently executing.
Emit an aborted event.
Form an assistant message with stop reason aborted.
Write it to the session log.

That way, when the user resumes the session, they can see where the task was interrupted. Extensions and the UI can also clean up resources based on an explicit state.
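The steps above can be sketched as follows. Everything here is illustrative: the AssistantMessage shape, the aborted event, fakeProvider, and runTurn are assumptions rather than a fixed API. Tool cancellation and the normal turn_finished path are omitted to keep the focus on turning an abort into recorded state.

```typescript
type StopReason = "stop" | "toolUse" | "aborted";
type AssistantMessage = { role: "assistant"; text: string; stopReason: StopReason };
type TurnEvent =
  | { type: "assistant_text_delta"; text: string }
  | { type: "aborted" };

// Hypothetical provider: yields text chunks and throws once the signal fires.
async function* fakeProvider(chunks: string[], signal: AbortSignal): AsyncGenerator<string> {
  for (const chunk of chunks) {
    if (signal.aborted) {
      const err = new Error("Aborted");
      err.name = "AbortError";
      throw err;
    }
    yield chunk;
  }
}

async function runTurn(
  chunks: AsyncIterable<string>,
  signal: AbortSignal,
  emit: (event: TurnEvent) => void,
  sessionLog: AssistantMessage[],
): Promise<AssistantMessage> {
  let text = "";
  let stopReason: StopReason = "stop";
  try {
    for await (const chunk of chunks) {
      text += chunk;
      emit({ type: "assistant_text_delta", text: chunk });
    }
  } catch (err) {
    // Only an abort the user asked for becomes state; other errors still propagate.
    if (!signal.aborted) throw err;
    stopReason = "aborted";
    emit({ type: "aborted" });
  }
  // The partial text is kept so a resumed session shows where the turn stopped.
  const message: AssistantMessage = { role: "assistant", text, stopReason };
  sessionLog.push(message);
  return message;
}
```

The caller would create an AbortController, pass controller.signal to both the provider and runTurn, and call controller.abort() when the user presses stop. The promise then resolves with an aborted message instead of rejecting.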

Observing a run

A streaming turn that reads a file might produce these events:

assistant_text_delta: Let me check the configuration file first.
tool_call_started: read
tool_call_arguments_delta: {"path":
tool_call_arguments_delta: "src/config.ts"}
tool_call_ready: read {"path":"src/config.ts"}
tool_execution_started: read
tool_execution_finished: read isError=false
turn_finished: stopReason=toolUse

Note that turn_finished still carries the stop reason toolUse, because once tool execution completes, another model request must follow. Streaming events tell the user what happened; the stop reason tells the runtime what to do next.

Production trade-offs

Events are a public contract. Once the UI, SDK, and extensions depend on them, you cannot change fields casually. When designing events, keep them stable, fine-grained, and composable. Do not name events after the action of some interface component, such as appendToChatBubble; name them after domain facts, such as assistant_text_delta.

Events also need enough correlation ids. A single assistant turn can request multiple tools in parallel, and multiple tools may emit progress at the same time. Without a tool call id, the UI cannot attribute progress to the right tool, and the log cannot be replayed.
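To make the role of the id concrete, here is a hedged sketch of a UI-side reducer that keeps one view per tool call. ToolView and applyToolEvent are hypothetical names; the event shapes are copied from the union defined earlier in this chapter.

```typescript
type ToolProgressEvent =
  | { type: "tool_execution_started"; id: string; name: string }
  | { type: "tool_execution_update"; id: string; text: string }
  | { type: "tool_execution_finished"; id: string; isError: boolean };

// Hypothetical per-tool UI state, keyed by tool call id.
type ToolView = { name: string; lines: string[]; status: "running" | "ok" | "error" };

function applyToolEvent(views: Map<string, ToolView>, event: ToolProgressEvent): void {
  if (event.type === "tool_execution_started") {
    views.set(event.id, { name: event.name, lines: [], status: "running" });
    return;
  }
  // Updates and finishes are routed purely by id, never by tool name:
  // two parallel calls to the same tool share a name but not an id.
  const view = views.get(event.id);
  if (!view) throw new Error(`Event for unknown tool call id: ${event.id}`);
  if (event.type === "tool_execution_update") {
    view.lines.push(event.text);
  } else {
    view.status = event.isError ? "error" : "ok";
  }
}
```

Replaying a recorded log through the same function should rebuild the same views, which is one reason to keep the id on every tool event.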

Exercises

Add an event stream to the loop from the previous chapter.

Acceptance criteria:

Text deltas and the final assistant message are consistent.
A tool is never executed before its tool call arguments are complete.
Every tool execution has started and finished events.
After a user abort, the session shows the aborted state.
Record an event sequence with the faux provider and write an assertion that guarantees a stable event order.