
§13 Agent Loop Extensions

The base agent loop (§9) handles the tool-call cycle. This section specifies six extensions that make the loop production-ready: events, cancellation, context window management, guardrails, steering, and parallel tool execution.

These extensions integrate with the existing §9.2 algorithm — specifically, they wrap and extend the tool dispatch loop and the FormatToolMessages hook (§9.4). Tool guardrail denials produce synthetic results that flow through FormatToolMessages like any other tool result.

All extensions are opt-in — a conforming implementation MUST support the base loop and MAY implement any combination of these extensions.


§13.1 Events

The agent loop MUST support an optional event callback that receives structured events during execution. This enables real-time UIs, logging, and coordination without coupling the loop to any particular output mechanism.

| Event Type | Payload | When Emitted |
| --- | --- | --- |
| token | { token: string } | Each content chunk during streaming |
| thinking | { token: string } | Each reasoning/chain-of-thought chunk |
| tool_call_start | { name: string, arguments: string } | When a tool call is detected |
| tool_result | { name: string, result: string } | After a tool call completes |
| status | { message: string } | Human-readable status updates |
| messages_updated | { messages: Message[] } | After the messages list is mutated |
| done | { response: string, messages: Message[] } | Final response produced |
| error | { message: string } | Non-fatal error (e.g., tool panic) |
| cancelled | {} | Loop was cancelled via cancellation token |

function invoke_agent(path_or_agent, inputs, tools=null,
                      on_event=null, ...) → result:
    // on_event is Callable[[event_type: string, data: dict], None]
    // Called synchronously — MUST NOT throw
  • Implementations MUST NOT skip done — it MUST be the last event emitted on successful completion.
  • messages_updated MUST be emitted after every mutation to the message list (tool result appended, steering message injected, context trimmed).
  • token events MUST be emitted when streaming is active and the final iteration produces content chunks.
  • tool_call_start MUST be emitted before tool execution begins.
  • tool_result MUST be emitted after tool execution completes.
  • error is for non-fatal conditions — fatal errors MUST raise exceptions.
  • Event callbacks MUST NOT block the loop. If an event callback raises, implementations SHOULD log the error and continue.
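A minimal conforming callback might simply collect events, as in the sketch below. The event names follow the table above; the collecting list, the simulated emissions, and the tool name `search` are illustrative, not part of the spec.

```python
# A minimal event callback that collects events for a test or a simple UI.
events = []

def on_event(event_type: str, data: dict) -> None:
    # Per the spec the callback MUST NOT throw, so guard any risky work.
    try:
        events.append((event_type, data))
        if event_type == "status":
            print(data["message"])
    except Exception:
        pass  # swallow: a callback failure must not break the loop

# Simulate what a conforming loop would emit for one tool-call iteration:
on_event("tool_call_start", {"name": "search", "arguments": '{"q": "docs"}'})
on_event("tool_result", {"name": "search", "result": "3 hits"})
on_event("done", {"response": "Found 3 hits.", "messages": []})
```

Because the callback is invoked synchronously, anything expensive (network, disk) should be queued for another thread rather than done inline.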

§13.2 Cancellation

The agent loop MUST support cooperative cancellation via a token checked at well-defined points during execution.

CancellationToken:
    cancelled: bool            // thread-safe, starts false
    cancel():
        self.cancelled = true
    is_cancelled → bool:
        return self.cancelled

The loop MUST check cancel.is_cancelled at these points:

  1. Top of each iteration — before any work
  2. Before each LLM call — after context trim, before HTTP request
  3. Before each tool execution — between tool calls within one iteration

When cancellation is detected:

  1. Emit cancelled event (if event callback is set)
  2. Raise CancelledError (or language equivalent)
  3. Do NOT execute any pending tool calls
  4. Do NOT make any further LLM calls
function invoke_agent(path_or_agent, inputs, tools=null,
                      cancel=null, ...) → result:

Implementations SHOULD use the language’s native cancellation mechanism where one exists (e.g., standard library cancellation tokens, abort signals). Where no native mechanism exists, a simple thread-safe boolean flag suffices.

| Language | Mechanism |
| --- | --- |
| Python | Custom CancellationToken (threading) |
| TypeScript | Native AbortSignal / AbortController |
| C# | Native System.Threading.CancellationToken |
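For the Python row above, a token backed by `threading.Event` is one idiomatic sketch. The `check_cancelled` helper and its `emit` parameter are illustrative names showing how a check point combines the event emission and the raise.

```python
import threading

class CancelledError(Exception):
    """Raised when the loop observes a cancelled token."""

class CancellationToken:
    """Thread-safe cooperative cancellation flag, starting false."""
    def __init__(self):
        self._event = threading.Event()

    def cancel(self) -> None:
        self._event.set()

    @property
    def is_cancelled(self) -> bool:
        return self._event.is_set()

def check_cancelled(token, emit=None):
    # Illustrative check point: emit the cancelled event, then raise.
    # The loop calls this at the three points listed above.
    if token is not None and token.is_cancelled:
        if emit:
            emit("cancelled", {})
        raise CancelledError()
```

`threading.Event` gives the memory-visibility guarantees a plain boolean would lack across threads, at no extra cost.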

§13.3 Context Window Management

Long-running agent loops accumulate messages that may exceed the model’s context window. Implementations MUST support automatic context trimming when a budget is specified.

function trim_to_context_window(messages, budget_chars) → Message[]:
    1. total = estimate_chars(messages)
    2. if total <= budget_chars:
           return messages                  // fits — no trimming
    3. Partition messages into:
       - system_messages: leading contiguous system-role messages
       - rest: everything after
    4. Reserve summary_budget = min(5000, budget_chars * 0.05)
    5. dropped = []
       while estimate_chars(system_messages + rest) > (budget_chars - summary_budget)
             AND len(rest) > 2:
           dropped.append(rest.pop(0))      // remove oldest non-system message
    6. summary = summarize_dropped(dropped)
    7. Insert summary as user message immediately after system_messages:
       summary_msg = Message(role: "user",
                             content: [TextPart(value: "[Context summary: " + summary + "]")])
    8. return system_messages + [summary_msg] + rest

function estimate_chars(messages) → int:
    total = 0
    for msg in messages:
        total += len(msg.role) + 4          // role + delimiters
        for part in msg.content:
            if part is TextPart:
                total += len(part.value)
            else:
                total += 200                // fixed estimate for non-text parts
        if msg.metadata has "tool_calls":
            total += json_length(msg.metadata.tool_calls)
    return total

function summarize_dropped(messages) → string:
    lines = []
    for msg in messages:
        if msg.role == "user":
            lines.append("User asked: " + truncate(text_of(msg), 200))
        elif msg.role == "assistant":
            text = text_of(msg)
            if text:
                lines.append("Assistant: " + truncate(text, 200))
            if msg has tool_calls:
                names = [tc.name for tc in msg.tool_calls]
                lines.append("  Called tools: " + join(names, ", "))
        // Skip tool-result messages (captured in assistant summary)
    return join(lines, "\n")                // cap at ~4000 chars

Implementations MAY support a compaction_provider — a secondary LLM used to produce a higher-quality summary of dropped messages. When provided:

  1. Build a summarizer prompt with the dropped messages
  2. Call the compaction provider (single-turn, no tools)
  3. If the LLM returns a non-empty response, use it as the summary
  4. If the call fails, fall back to summarize_dropped()
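The fallback logic in those four steps can be sketched as below. Here `compaction_provider` is assumed to be a plain callable taking a prompt string and returning a summary string (a stand-in for the single-turn LLM call), and the truncation fallback stands in for `summarize_dropped()`.

```python
def summarize_with_compaction(dropped_text: str, compaction_provider=None) -> str:
    # Try the higher-quality LLM summary first, if a provider was given.
    if compaction_provider is not None:
        try:
            summary = compaction_provider(
                "Summarize this dropped conversation history:\n" + dropped_text)
            if summary and summary.strip():
                return summary.strip()
        except Exception:
            pass  # provider failed: fall through to the heuristic
    # Fallback: heuristic summary (stand-in for summarize_dropped()).
    return dropped_text[:4000]
```

Note that an empty provider response also falls through, so the loop never inserts a blank context summary.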
function invoke_agent(path_or_agent, inputs, tools=null,
                      context_budget=null, ...) → result:
    // context_budget is int (character count) or null (no trimming)
  • System messages MUST never be dropped.
  • At least 2 non-system messages MUST be preserved (the most recent user message and the conversation’s anchor).
  • Trimming MUST happen before each LLM call, after steering messages are drained (§13.5).
  • When trimming occurs, a messages_updated event MUST be emitted.
  • Implementations MUST NOT trim during non-agent invoke() calls.
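The trimming algorithm and requirements above can be sketched together in Python. The message shape (dicts with a `role` and a list of string parts) and the inline join used as the summary are simplifying assumptions; a real implementation would use its Message type and `summarize_dropped()` or the compaction provider.

```python
def estimate_chars(messages):
    # Rough cost: role + delimiters, plus text lengths
    # (non-text parts would add a fixed 200-char estimate).
    total = 0
    for msg in messages:
        total += len(msg["role"]) + 4
        for part in msg["content"]:
            total += len(part) if isinstance(part, str) else 200
    return total

def trim_to_context_window(messages, budget_chars):
    if estimate_chars(messages) <= budget_chars:
        return messages  # fits: no trimming
    # Leading contiguous system messages are never dropped.
    i = 0
    while i < len(messages) and messages[i]["role"] == "system":
        i += 1
    system_messages, rest = messages[:i], list(messages[i:])
    summary_budget = min(5000, int(budget_chars * 0.05))
    dropped = []
    # Drop oldest non-system messages, always preserving at least 2.
    while (estimate_chars(system_messages + rest) > budget_chars - summary_budget
           and len(rest) > 2):
        dropped.append(rest.pop(0))
    # Stand-in summary: a real implementation calls summarize_dropped().
    summary = "; ".join(p for m in dropped for p in m["content"]
                        if isinstance(p, str))[:4000]
    summary_msg = {"role": "user",
                   "content": ["[Context summary: " + summary + "]"]}
    return system_messages + [summary_msg] + rest
```

The summary message lands immediately after the system block, so the model still sees instructions first, then the compressed history, then the live tail of the conversation.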

§13.4 Guardrails

Guardrails are validation hooks at three points in the agent loop: before the LLM call (input), after the LLM response (output), and before each tool execution (tool). Each hook returns allow or deny.

GuardrailResult:
    allowed: bool
    reason: string | null      // required when allowed=false

Guardrails:
    input: Callable[[Message[]], GuardrailResult] | null
    output: Callable[[Message], GuardrailResult] | null
    tool: Callable[[string, dict], GuardrailResult] | null
| Hook | Input | On Deny |
| --- | --- | --- |
| input | Full message list | Abort loop, raise GuardrailError |
| output | Assistant response message | Abort loop, raise GuardrailError |
| tool | Tool name + parsed args | Skip tool, inject synthetic result: "Tool denied: {reason}" |
loop:
    // 1. Check input guardrail (full message list)
    if guardrails.input is not null:
        result = guardrails.input(messages)
        if not result.allowed:
            emit event("error", {message: "Input guardrail denied: " + result.reason})
            raise GuardrailError(result.reason)

    // 2. Call LLM
    response = execute_llm(agent, messages)
    assistant_msg = process(agent, response)

    // 3. Check output guardrail (assistant message)
    if guardrails.output is not null:
        result = guardrails.output(assistant_msg)
        if not result.allowed:
            emit event("error", {message: "Output guardrail denied: " + result.reason})
            raise GuardrailError(result.reason)

    // 4. For each tool call, check tool guardrail
    for tool_call in tool_calls:
        if guardrails.tool is not null:
            result = guardrails.tool(tool_call.name, tool_call.arguments)
            if not result.allowed:
                tool_result = "Tool denied by guardrail: " + result.reason
                // Do NOT execute the tool — use synthetic result
                continue
        // Execute tool normally
        tool_result = execute_tool(tool_call)

    // 5. Format tool messages via executor (§9.4)
    // Denied tools produce synthetic results that flow through
    // FormatToolMessages like any other tool result.
    tool_messages = executor.FormatToolMessages(
        response, tool_calls, tool_results, text_content)
    append tool_messages to messages
  • Guardrail callbacks MUST be called synchronously with respect to the loop.
  • For async loops, guardrail callbacks MAY be async.
  • GuardrailError MUST include the deny reason.
  • Tool guardrail denials MUST NOT abort the entire loop — only the individual tool is skipped.
  • Input guardrail receives the full message list including any steering messages and after context trimming.

function invoke_agent(path_or_agent, inputs, tools=null,
                      guardrails=null, ...) → result:
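Step 4 of the loop, the non-aborting tool guardrail, is the subtle case, so here is a small sketch. The tool names (`delete_file`, `read_file`) and the `run_tool_calls` helper are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class GuardrailResult:
    allowed: bool
    reason: Optional[str] = None  # required when allowed is False

# Illustrative tool guardrail: deny one destructive tool, allow the rest.
def tool_guardrail(name: str, args: dict) -> GuardrailResult:
    if name == "delete_file":
        return GuardrailResult(False, "destructive tools are disabled")
    return GuardrailResult(True)

def run_tool_calls(tool_calls, execute_tool, guardrail=None):
    # Mirrors step 4 above: a denied tool gets a synthetic result
    # instead of aborting the whole loop.
    results = []
    for name, args in tool_calls:
        if guardrail is not None:
            verdict = guardrail(name, args)
            if not verdict.allowed:
                results.append("Tool denied by guardrail: " + verdict.reason)
                continue
        results.append(execute_tool(name, args))
    return results
```

The synthetic string lands in `results` in the tool call's original position, so FormatToolMessages sees a complete, ordered result set.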

§13.5 Steering

Steering enables external code to inject user messages into a running agent loop. This supports interactive scenarios where a user wants to redirect the agent mid-execution (e.g., “actually focus on error handling”).

Steering:
    queue: ThreadSafeQueue<string>
    send(message: string):
        // Enqueue a message to be injected at the next iteration
        queue.push(message)
    drain() → string[]:
        // Atomically remove and return all queued messages
        items = queue.take_all()
        return items
    has_pending → bool:
        return not queue.is_empty
loop:
    // Drain steering at the TOP of each iteration
    if steering is not null:
        pending = steering.drain()
        for msg_text in pending:
            user_msg = Message(role: "user",
                               content: [TextPart(value: msg_text)])
            append user_msg to messages
        if len(pending) > 0:
            emit event("messages_updated", {messages})
            emit event("status", {message: "Injected " + len(pending) + " steering message(s)"})
    // Then: context trim, guardrails, LLM call, etc.
  • Steering messages MUST be drained before context trimming (so they are visible to the input guardrail and may be trimmed if budget is tight).
  • Steering messages MUST be appended as role: "user" messages.
  • send() MUST be safe to call from any thread or async task.
  • drain() MUST be atomic — no message is lost or duplicated.
  • If no steering object is provided, the loop behaves as before.

Implementations SHOULD use the language’s idiomatic concurrent queue or equivalent.

function invoke_agent(path_or_agent, inputs, tools=null,
                      steering=null, ...) → result:
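In Python, `queue.SimpleQueue` is one idiomatic backing store for the Steering structure above; each `put` and `get_nowait` is thread-safe, so every sent message is returned by exactly one `drain()` call.

```python
import queue

class Steering:
    """Thread-safe steering channel: each sent message is drained
    exactly once, in send order."""
    def __init__(self):
        self._q = queue.SimpleQueue()

    def send(self, message: str) -> None:
        self._q.put(message)  # safe from any thread or async task

    def drain(self):
        # Remove and return all currently queued messages.
        items = []
        while True:
            try:
                items.append(self._q.get_nowait())
            except queue.Empty:
                return items

    @property
    def has_pending(self) -> bool:
        return not self._q.empty()
```

Messages enqueued while `drain()` is running are simply picked up by the next iteration's drain, which matches the "injected at the next iteration" semantics.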

§13.6 Parallel Tool Execution

When the LLM returns multiple tool calls in a single response, implementations MAY execute them concurrently instead of sequentially.

function invoke_agent(path_or_agent, inputs, tools=null,
                      parallel_tool_calls=false, ...) → result:
    if parallel_tool_calls AND len(tool_calls) > 1:
        // Execute all tools concurrently
        results = parallel_map(tool_calls, execute_tool)
        // Results are ordered to match tool_calls
    else:
        // Sequential execution (default)
        results = [execute_tool(tc) for tc in tool_calls]
  • Parallel execution MUST preserve result ordering — tool results MUST be appended to messages in the same order as the original tool calls.
  • Each parallel tool execution MUST have its own trace span.
  • If any tool raises an exception, other in-flight tools SHOULD be allowed to complete (do not cancel siblings).
  • Tool guardrails (§13.4) MUST still be checked for each tool — denied tools receive synthetic results while other tools execute normally.
  • tool_call_start and tool_result events MUST be emitted for each tool regardless of parallel or sequential execution.
  • Implementations SHOULD use the language’s idiomatic concurrency primitive for parallel execution (e.g., task groups, promise combinators, thread pools).
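In Python, `ThreadPoolExecutor.map` is one such idiomatic primitive: it yields results in input order regardless of completion order, which satisfies the ordering requirement directly. The `run_tools` helper and `execute_tool` stub below are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def run_tools(tool_calls, execute_tool, parallel=False):
    # pool.map preserves input order even when tools finish
    # out of order, so results line up with tool_calls.
    if parallel and len(tool_calls) > 1:
        with ThreadPoolExecutor() as pool:
            return list(pool.map(execute_tool, tool_calls))
    # Sequential execution (default)
    return [execute_tool(tc) for tc in tool_calls]

# Demo: the first tool is slowest, yet results stay in call order.
def execute_tool(tc):
    time.sleep(0.05 if tc == "slow" else 0)
    return tc + " done"

results = run_tools(["slow", "fast"], execute_tool, parallel=True)
```

With `pool.map`, a tool that raises surfaces its exception when its slot in the result iterator is reached, while the `with` block still waits for in-flight siblings to finish rather than cancelling them.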

The full invoke_agent signature with all extensions:

function invoke_agent(
    path_or_agent,                 // string path or loaded agent
    inputs = null,                 // input dictionary
    tools = null,                  // tool handlers
    *,                             // keyword-only below
    max_iterations = 10,           // iteration cap
    on_event = null,               // event callback
    cancel = null,                 // cancellation token
    context_budget = null,         // character budget for context window
    guardrails = null,             // validation hooks
    steering = null,               // mid-loop message injection
    parallel_tool_calls = false,   // concurrent tool execution
    raw = false,                   // return raw response (no processing)
) → result
With all extensions active, each iteration proceeds in this order:

1. Check cancellation
2. Drain steering messages
3. Trim context window (if budget set)
4. Check input guardrail
5. Call LLM (§9.2 step 5b)
6. Process response (§9.2 step 5c)
7. Check output guardrail
8. If tool calls:
   a. Check tool guardrails (per tool)
   b. Execute tools (parallel or sequential), applying bindings (§9.6)
   c. Format tool messages via executor.FormatToolMessages (§9.4)

The bind_tools function validates that @tool-decorated handler functions match the tool declarations in an agent’s frontmatter, and returns a handler dictionary suitable for passing to invoke_agent.

function bind_tools(agent, tools) → dict[str, callable]:
    agent: A loaded Prompty agent
    tools: A list of @tool-decorated functions (Python/TS) or an object
           instance with [Tool]-decorated methods (C#)
function bind_tools(agent, tools):
    // 1. Build a map of provided handler names → functions
    handlers = {}
    for fn in tools:
        name = fn.__tool__.name
        if name in handlers:
            raise ValueError("Duplicate tool handler: " + name)
        handlers[name] = fn

    // 2. Get declared function tool names from agent.tools
    declared = set()
    for tool_def in agent.tools:
        if tool_def.kind == "function":
            declared.add(tool_def.name)

    // 3. Validate: every handler must match a declaration
    for name in handlers:
        if name not in declared:
            raise ValueError(
                "Tool handler '" + name + "' has no matching declaration "
                + "in agent.tools. Declared function tools: " + join(declared))

    // 4. Warn: every function declaration should have a handler
    for name in declared:
        if name not in handlers:
            warn("Tool '" + name + "' is declared in agent.tools but "
                 + "no handler was provided to bind_tools()")

    // 5. Return the validated handler dict
    return handlers
  • bind_tools MUST only validate against kind: "function" tools. Tools with other kinds (mcp, openapi, custom) are resolved by kind handlers and do not require function handlers.
  • bind_tools MUST raise an error if a handler has no matching declaration. This catches typos and stale handlers early.
  • bind_tools SHOULD warn (not error) if a declared function tool has no handler, since the tool may be handled by the name registry or kind handler.
  • The returned dictionary MUST be suitable for passing as the tools parameter to invoke_agent.
  • bind_tools MUST NOT mutate agent.tools or the global registry. It is a pure validation and extraction step.
| Language | Signature | Notes |
| --- | --- | --- |
| Python | bind_tools(agent, [fn1, fn2, ...]) → dict | Functions have __tool__ attribute |
| TypeScript | bindTools(agent, [fn1, fn2, ...]) → Record | Functions have __tool__ property |
| C# | ToolAttribute.BindTools(agent, instance) → Dictionary | Reflects over [Tool] methods |
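The Python row can be sketched end to end as below. The toy `@tool` decorator and the dict-shaped agent stand in for the real library types; only the validation logic mirrors the pseudocode above.

```python
import types
import warnings

def tool(fn):
    # Toy stand-in for the real @tool decorator: record the tool name.
    fn.__tool__ = types.SimpleNamespace(name=fn.__name__)
    return fn

def bind_tools(agent, tools):
    handlers = {}
    for fn in tools:
        name = fn.__tool__.name
        if name in handlers:
            raise ValueError("Duplicate tool handler: " + name)
        handlers[name] = fn
    # Only kind: "function" tools require handlers.
    declared = {t["name"] for t in agent["tools"] if t["kind"] == "function"}
    for name in handlers:
        if name not in declared:
            raise ValueError(
                "Tool handler '%s' has no matching declaration in "
                "agent.tools. Declared function tools: %s"
                % (name, ", ".join(sorted(declared))))
    for name in declared:
        if name not in handlers:
            warnings.warn("Tool '%s' is declared in agent.tools but no "
                          "handler was provided to bind_tools()" % name)
    return handlers

@tool
def search(query: str) -> str:
    return "results for " + query

# Dict-shaped agent is an illustrative stand-in for a loaded agent.
agent = {"tools": [{"kind": "function", "name": "search"},
                   {"kind": "mcp", "name": "browser"}]}
handlers = bind_tools(agent, [search])
```

Note that the `mcp` tool is ignored entirely: it is resolved by a kind handler, so it neither requires a function handler nor triggers the missing-handler warning.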