
§13 Agent Loop Extensions

The base agent loop (§9) handles the tool-call cycle. This section specifies six extensions that make the loop production-ready: events, cancellation, context window management, guardrails, steering, and parallel tool execution.

These extensions integrate with the existing §9.2 algorithm — specifically, they wrap and extend the tool dispatch loop and the FormatToolMessages hook (§9.4). Tool guardrail denials produce synthetic results that flow through FormatToolMessages like any other tool result.

All extensions are opt-in — a conforming implementation MUST support the base loop and MAY implement any combination of these extensions.


§13.1 Events

The agent loop MUST support an optional event callback that receives structured events during execution. This enables real-time UIs, logging, and coordination without coupling the loop to any particular output mechanism.

| Event Type | Payload | When Emitted |
| --- | --- | --- |
| token | { token: string } | Each content chunk during streaming |
| thinking | { token: string } | Each reasoning/chain-of-thought chunk |
| tool_call_start | { name: string, arguments: string } | When a tool call is detected |
| tool_result | { name: string, result: string } | After a tool call completes |
| status | { message: string } | Human-readable status updates |
| messages_updated | { messages: Message[] } | After the messages list is mutated |
| done | { response: string, messages: Message[] } | Final response produced |
| error | { message: string } | Non-fatal error (e.g., tool panic) |
| cancelled | {} | Loop was cancelled via cancellation token |

function invoke_agent(path_or_agent, inputs, tools=null,
                      on_event=null, ...) → result:
    // on_event is Callable[[event_type: string, data: dict], None]
    // Called synchronously — MUST NOT throw
  • Implementations MUST NOT skip done — it MUST be the last event emitted on successful completion.
  • messages_updated MUST be emitted after every mutation to the message list (tool result appended, steering message injected, context trimmed).
  • token events MUST be emitted when streaming is active and the final iteration produces content chunks.
  • tool_call_start MUST be emitted before tool execution begins.
  • tool_result MUST be emitted after tool execution completes.
  • error is for non-fatal conditions — fatal errors MUST raise exceptions.
  • Event callbacks MUST NOT block the loop. If an event callback raises, implementations SHOULD log the error and continue.
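A minimal conforming callback might simply collect events, as in the sketch below. The event names follow the table above; the collecting list, the simulated emissions, and the tool name `search` are illustrative, not part of the spec.

```python
# A minimal event callback that collects events for a test or a simple UI.
events = []

def on_event(event_type: str, data: dict) -> None:
    # Per the spec the callback MUST NOT throw, so guard any risky work.
    try:
        events.append((event_type, data))
        if event_type == "status":
            print(data["message"])
    except Exception:
        pass  # swallow: a callback failure must not break the loop

# Simulate what a conforming loop would emit for one tool-call iteration:
on_event("tool_call_start", {"name": "search", "arguments": '{"q": "docs"}'})
on_event("tool_result", {"name": "search", "result": "3 hits"})
on_event("done", {"response": "Found 3 hits.", "messages": []})
```

Because the callback is invoked synchronously, anything expensive (network, disk) should be queued for another thread rather than done inline.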

§13.2 Cancellation

The agent loop MUST support cooperative cancellation via a token checked at well-defined points during execution.

CancellationToken:
    cancelled: bool            // thread-safe, starts false
    cancel():
        self.cancelled = true
    is_cancelled → bool:
        return self.cancelled

The loop MUST check cancel.is_cancelled at these points:

  1. Top of each iteration — before any work
  2. Before each LLM call — after context trim, before HTTP request
  3. Before each tool execution — between tool calls within one iteration

When cancellation is detected:

  1. Emit cancelled event (if event callback is set)
  2. Raise CancelledError (or language equivalent)
  3. Do NOT execute any pending tool calls
  4. Do NOT make any further LLM calls
function invoke_agent(path_or_agent, inputs, tools=null,
                      cancel=null, ...) → result:

Implementations SHOULD use the language’s native cancellation mechanism where one exists (e.g., standard library cancellation tokens, abort signals). Where no native mechanism exists, a simple thread-safe boolean flag suffices.

| Language | Mechanism |
| --- | --- |
| Python | Custom CancellationToken (threading) |
| TypeScript | Native AbortSignal / AbortController |
| C# | Native System.Threading.CancellationToken |
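For the Python row above, a token backed by `threading.Event` is one idiomatic sketch. The `check_cancelled` helper and its `emit` parameter are illustrative names showing how a check point combines the event emission and the raise.

```python
import threading

class CancelledError(Exception):
    """Raised when the loop observes a cancelled token."""

class CancellationToken:
    """Thread-safe cooperative cancellation flag, starting false."""
    def __init__(self):
        self._event = threading.Event()

    def cancel(self) -> None:
        self._event.set()

    @property
    def is_cancelled(self) -> bool:
        return self._event.is_set()

def check_cancelled(token, emit=None):
    # Illustrative check point: emit the cancelled event, then raise.
    # The loop calls this at the three points listed above.
    if token is not None and token.is_cancelled:
        if emit:
            emit("cancelled", {})
        raise CancelledError()
```

`threading.Event` gives the memory-visibility guarantees a plain boolean would lack across threads, at no extra cost.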

§13.3 Context Window Management

Long-running agent loops accumulate messages that may exceed the model’s context window. Implementations MUST support automatic context trimming when a budget is specified.

function trim_to_context_window(messages, budget_chars) → Message[]:
    1. total = estimate_chars(messages)
    2. if total <= budget_chars:
           return messages                  // fits — no trimming
    3. Partition messages into:
       - system_messages: leading contiguous system-role messages
       - rest: everything after
    4. Reserve summary_budget = min(5000, budget_chars * 0.05)
    5. dropped = []
       while estimate_chars(system_messages + rest) > (budget_chars - summary_budget)
             AND len(rest) > 2:
           dropped.append(rest.pop(0))      // remove oldest non-system message
    6. summary = summarize_dropped(dropped)
    7. Insert summary as user message immediately after system_messages:
       summary_msg = Message(role: "user",
                             content: [TextPart(value: "[Context summary: " + summary + "]")])
    8. return system_messages + [summary_msg] + rest

function estimate_chars(messages) → int:
    total = 0
    for msg in messages:
        total += len(msg.role) + 4          // role + delimiters
        for part in msg.content:
            if part is TextPart:
                total += len(part.value)
            else:
                total += 200                // fixed estimate for non-text parts
        if msg.metadata has "tool_calls":
            total += json_length(msg.metadata.tool_calls)
    return total

function summarize_dropped(messages) → string:
    lines = []
    for msg in messages:
        if msg.role == "user":
            lines.append("User asked: " + truncate(text_of(msg), 200))
        elif msg.role == "assistant":
            text = text_of(msg)
            if text:
                lines.append("Assistant: " + truncate(text, 200))
            if msg has tool_calls:
                names = [tc.name for tc in msg.tool_calls]
                lines.append("  Called tools: " + join(names, ", "))
        // Skip tool-result messages (captured in assistant summary)
    return join(lines, "\n")                // cap at ~4000 chars

Implementations MAY support a compaction_provider — a secondary LLM used to produce a higher-quality summary of dropped messages. When provided:

  1. Build a summarizer prompt with the dropped messages
  2. Call the compaction provider (single-turn, no tools)
  3. If the LLM returns a non-empty response, use it as the summary
  4. If the call fails, fall back to summarize_dropped()
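The fallback logic in those four steps can be sketched as below. Here `compaction_provider` is assumed to be a plain callable taking a prompt string and returning a summary string (a stand-in for the single-turn LLM call), and the truncation fallback stands in for `summarize_dropped()`.

```python
def summarize_with_compaction(dropped_text: str, compaction_provider=None) -> str:
    # Try the higher-quality LLM summary first, if a provider was given.
    if compaction_provider is not None:
        try:
            summary = compaction_provider(
                "Summarize this dropped conversation history:\n" + dropped_text)
            if summary and summary.strip():
                return summary.strip()
        except Exception:
            pass  # provider failed: fall through to the heuristic
    # Fallback: heuristic summary (stand-in for summarize_dropped()).
    return dropped_text[:4000]
```

Note that an empty provider response also falls through, so the loop never inserts a blank context summary.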
function invoke_agent(path_or_agent, inputs, tools=null,
                      context_budget=null, ...) → result:
    // context_budget is int (character count) or null (no trimming)
  • System messages MUST never be dropped.
  • At least 2 non-system messages MUST be preserved (the most recent user message and the conversation’s anchor).
  • Trimming MUST happen before each LLM call, after steering messages are drained (§13.5).
  • When trimming occurs, a messages_updated event MUST be emitted.
  • Implementations MUST NOT trim during non-agent invoke() calls.
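The trimming algorithm and requirements above can be sketched together in Python. The message shape (dicts with a `role` and a list of string parts) and the inline join used as the summary are simplifying assumptions; a real implementation would use its Message type and `summarize_dropped()` or the compaction provider.

```python
def estimate_chars(messages):
    # Rough cost: role + delimiters, plus text lengths
    # (non-text parts would add a fixed 200-char estimate).
    total = 0
    for msg in messages:
        total += len(msg["role"]) + 4
        for part in msg["content"]:
            total += len(part) if isinstance(part, str) else 200
    return total

def trim_to_context_window(messages, budget_chars):
    if estimate_chars(messages) <= budget_chars:
        return messages  # fits: no trimming
    # Leading contiguous system messages are never dropped.
    i = 0
    while i < len(messages) and messages[i]["role"] == "system":
        i += 1
    system_messages, rest = messages[:i], list(messages[i:])
    summary_budget = min(5000, int(budget_chars * 0.05))
    dropped = []
    # Drop oldest non-system messages, always preserving at least 2.
    while (estimate_chars(system_messages + rest) > budget_chars - summary_budget
           and len(rest) > 2):
        dropped.append(rest.pop(0))
    # Stand-in summary: a real implementation calls summarize_dropped().
    summary = "; ".join(p for m in dropped for p in m["content"]
                        if isinstance(p, str))[:4000]
    summary_msg = {"role": "user",
                   "content": ["[Context summary: " + summary + "]"]}
    return system_messages + [summary_msg] + rest
```

The summary message lands immediately after the system block, so the model still sees instructions first, then the compressed history, then the live tail of the conversation.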

§13.4 Guardrails

Guardrails are validation hooks at three points in the agent loop: before the LLM call (input), after the LLM response (output), and before each tool execution (tool). Each hook returns allow or deny.

GuardrailResult:
    allowed: bool
    reason: string | null      // required when allowed=false

Guardrails:
    input: Callable[[Message[]], GuardrailResult] | null
    output: Callable[[Message], GuardrailResult] | null
    tool: Callable[[string, dict], GuardrailResult] | null
| Hook | Input | On Deny |
| --- | --- | --- |
| input | Full message list | Abort loop, raise GuardrailError |
| output | Assistant response message | Abort loop, raise GuardrailError |
| tool | Tool name + parsed args | Skip tool, inject synthetic result: "Tool denied: {reason}" |
loop:
    // 1. Check input guardrail (full message list)
    if guardrails.input is not null:
        result = guardrails.input(messages)
        if not result.allowed:
            emit event("error", {message: "Input guardrail denied: " + result.reason})
            raise GuardrailError(result.reason)

    // 2. Call LLM
    response = execute_llm(agent, messages)
    assistant_msg = process(agent, response)

    // 3. Check output guardrail (assistant message)
    if guardrails.output is not null:
        result = guardrails.output(assistant_msg)
        if not result.allowed:
            emit event("error", {message: "Output guardrail denied: " + result.reason})
            raise GuardrailError(result.reason)

    // 4. For each tool call, check tool guardrail
    for tool_call in tool_calls:
        if guardrails.tool is not null:
            result = guardrails.tool(tool_call.name, tool_call.arguments)
            if not result.allowed:
                tool_result = "Tool denied by guardrail: " + result.reason
                // Do NOT execute the tool — use synthetic result
                continue
        // Execute tool normally
        tool_result = execute_tool(tool_call)

    // 5. Format tool messages via executor (§9.4)
    // Denied tools produce synthetic results that flow through
    // FormatToolMessages like any other tool result.
    tool_messages = executor.FormatToolMessages(
        response, tool_calls, tool_results, text_content)
    append tool_messages to messages
  • Guardrail callbacks MUST be called synchronously with respect to the loop.
  • For async loops, guardrail callbacks MAY be async.
  • GuardrailError MUST include the deny reason.
  • Tool guardrail denials MUST NOT abort the entire loop — only the individual tool is skipped.
  • Input guardrail receives the full message list including any steering messages and after context trimming.

function invoke_agent(path_or_agent, inputs, tools=null,
                      guardrails=null, ...) → result:
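Step 4 of the loop, the non-aborting tool guardrail, is the subtle case, so here is a small sketch. The tool names (`delete_file`, `read_file`) and the `run_tool_calls` helper are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class GuardrailResult:
    allowed: bool
    reason: Optional[str] = None  # required when allowed is False

# Illustrative tool guardrail: deny one destructive tool, allow the rest.
def tool_guardrail(name: str, args: dict) -> GuardrailResult:
    if name == "delete_file":
        return GuardrailResult(False, "destructive tools are disabled")
    return GuardrailResult(True)

def run_tool_calls(tool_calls, execute_tool, guardrail=None):
    # Mirrors step 4 above: a denied tool gets a synthetic result
    # instead of aborting the whole loop.
    results = []
    for name, args in tool_calls:
        if guardrail is not None:
            verdict = guardrail(name, args)
            if not verdict.allowed:
                results.append("Tool denied by guardrail: " + verdict.reason)
                continue
        results.append(execute_tool(name, args))
    return results
```

The synthetic string lands in `results` in the tool call's original position, so FormatToolMessages sees a complete, ordered result set.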

§13.5 Steering

Steering enables external code to inject user messages into a running agent loop. This supports interactive scenarios where a user wants to redirect the agent mid-execution (e.g., “actually focus on error handling”).

Steering:
    queue: ThreadSafeQueue<string>
    send(message: string):
        // Enqueue a message to be injected at the next iteration
        queue.push(message)
    drain() → string[]:
        // Atomically remove and return all queued messages
        items = queue.take_all()
        return items
    has_pending → bool:
        return not queue.is_empty
loop:
    // Drain steering at the TOP of each iteration
    if steering is not null:
        pending = steering.drain()
        for msg_text in pending:
            user_msg = Message(role: "user",
                               content: [TextPart(value: msg_text)])
            append user_msg to messages
        if len(pending) > 0:
            emit event("messages_updated", {messages})
            emit event("status", {message: "Injected " + len(pending) + " steering message(s)"})
    // Then: context trim, guardrails, LLM call, etc.
  • Steering messages MUST be drained before context trimming (so they are visible to the input guardrail and may be trimmed if budget is tight).
  • Steering messages MUST be appended as role: "user" messages.
  • send() MUST be safe to call from any thread or async task.
  • drain() MUST be atomic — no message is lost or duplicated.
  • If no steering object is provided, the loop behaves as before.

Implementations SHOULD use the language’s idiomatic concurrent queue or equivalent.

function invoke_agent(path_or_agent, inputs, tools=null,
                      steering=null, ...) → result:
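In Python, `queue.SimpleQueue` is one idiomatic backing store for the Steering structure above; each `put` and `get_nowait` is thread-safe, so every sent message is returned by exactly one `drain()` call.

```python
import queue

class Steering:
    """Thread-safe steering channel: each sent message is drained
    exactly once, in send order."""
    def __init__(self):
        self._q = queue.SimpleQueue()

    def send(self, message: str) -> None:
        self._q.put(message)  # safe from any thread or async task

    def drain(self):
        # Remove and return all currently queued messages.
        items = []
        while True:
            try:
                items.append(self._q.get_nowait())
            except queue.Empty:
                return items

    @property
    def has_pending(self) -> bool:
        return not self._q.empty()
```

Messages enqueued while `drain()` is running are simply picked up by the next iteration's drain, which matches the "injected at the next iteration" semantics.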

§13.6 Parallel Tool Execution

When the LLM returns multiple tool calls in a single response, implementations MAY execute them concurrently instead of sequentially.

function invoke_agent(path_or_agent, inputs, tools=null,
                      parallel_tool_calls=false, ...) → result:
    if parallel_tool_calls AND len(tool_calls) > 1:
        // Execute all tools concurrently
        results = parallel_map(tool_calls, execute_tool)
        // Results are ordered to match tool_calls
    else:
        // Sequential execution (default)
        results = [execute_tool(tc) for tc in tool_calls]
  • Parallel execution MUST preserve result ordering — tool results MUST be appended to messages in the same order as the original tool calls.
  • Each parallel tool execution MUST have its own trace span.
  • If any tool raises an exception, other in-flight tools SHOULD be allowed to complete (do not cancel siblings).
  • Tool guardrails (§13.4) MUST still be checked for each tool — denied tools receive synthetic results while other tools execute normally.
  • tool_call_start and tool_result events MUST be emitted for each tool regardless of parallel or sequential execution.
  • Implementations SHOULD use the language’s idiomatic concurrency primitive for parallel execution (e.g., task groups, promise combinators, thread pools).
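In Python, `ThreadPoolExecutor.map` is one such idiomatic primitive: it yields results in input order regardless of completion order, which satisfies the ordering requirement directly. The `run_tools` helper and `execute_tool` stub below are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def run_tools(tool_calls, execute_tool, parallel=False):
    # pool.map preserves input order even when tools finish
    # out of order, so results line up with tool_calls.
    if parallel and len(tool_calls) > 1:
        with ThreadPoolExecutor() as pool:
            return list(pool.map(execute_tool, tool_calls))
    # Sequential execution (default)
    return [execute_tool(tc) for tc in tool_calls]

# Demo: the first tool is slowest, yet results stay in call order.
def execute_tool(tc):
    time.sleep(0.05 if tc == "slow" else 0)
    return tc + " done"

results = run_tools(["slow", "fast"], execute_tool, parallel=True)
```

With `pool.map`, a tool that raises surfaces its exception when its slot in the result iterator is reached, while the `with` block still waits for in-flight siblings to finish rather than cancelling them.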

The full invoke_agent signature with all extensions:

function invoke_agent(
    path_or_agent,                 // string path or loaded agent
    inputs = null,                 // input dictionary
    tools = null,                  // tool handlers
    *,                             // keyword-only below
    max_iterations = 10,           // iteration cap
    on_event = null,               // event callback
    cancel = null,                 // cancellation token
    context_budget = null,         // character budget for context window
    guardrails = null,             // validation hooks
    steering = null,               // mid-loop message injection
    parallel_tool_calls = false,   // concurrent tool execution
    raw = false,                   // return raw response (no processing)
) → result
With all extensions active, each iteration proceeds in this order:

1. Check cancellation
2. Drain steering messages
3. Trim context window (if budget set)
4. Check input guardrail
5. Call LLM (§9.2 step 5b)
6. Process response (§9.2 step 5c)
7. Check output guardrail
8. If tool calls:
   a. Check tool guardrails (per tool)
   b. Execute tools (parallel or sequential), applying bindings (§9.6)
   c. Format tool messages via executor.FormatToolMessages (§9.4)

The bind_tools function validates that @tool-decorated handler functions match the tool declarations in an agent’s frontmatter, and returns a handler dictionary suitable for passing to invoke_agent.

function bind_tools(agent, tools) → dict[str, callable]:
    agent: A loaded Prompty agent
    tools: A list of @tool-decorated functions (Python/TS) or an object
           instance with [Tool]-decorated methods (C#)
function bind_tools(agent, tools):
    // 1. Build a map of provided handler names → functions
    handlers = {}
    for fn in tools:
        name = fn.__tool__.name
        if name in handlers:
            raise ValueError("Duplicate tool handler: " + name)
        handlers[name] = fn

    // 2. Get declared function tool names from agent.tools
    declared = set()
    for tool_def in agent.tools:
        if tool_def.kind == "function":
            declared.add(tool_def.name)

    // 3. Validate: every handler must match a declaration
    for name in handlers:
        if name not in declared:
            raise ValueError(
                "Tool handler '" + name + "' has no matching declaration "
                + "in agent.tools. Declared function tools: " + join(declared))

    // 4. Warn: every function declaration should have a handler
    for name in declared:
        if name not in handlers:
            warn("Tool '" + name + "' is declared in agent.tools but "
                 + "no handler was provided to bind_tools()")

    // 5. Return the validated handler dict
    return handlers
  • bind_tools MUST only validate against kind: "function" tools. Tools with other kinds (mcp, openapi, custom) are resolved by kind handlers and do not require function handlers.
  • bind_tools MUST raise an error if a handler has no matching declaration. This catches typos and stale handlers early.
  • bind_tools SHOULD warn (not error) if a declared function tool has no handler, since the tool may be handled by the name registry or kind handler.
  • The returned dictionary MUST be suitable for passing as the tools parameter to invoke_agent.
  • bind_tools MUST NOT mutate agent.tools or the global registry. It is a pure validation and extraction step.
| Language | Signature | Notes |
| --- | --- | --- |
| Python | bind_tools(agent, [fn1, fn2, ...]) → dict | Functions have __tool__ attribute |
| TypeScript | bindTools(agent, [fn1, fn2, ...]) → Record | Functions have __tool__ property |
| C# | ToolAttribute.BindTools(agent, instance) → Dictionary | Reflects over [Tool] methods |
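The Python row can be sketched end to end as below. The toy `@tool` decorator and the dict-shaped agent stand in for the real library types; only the validation logic mirrors the pseudocode above.

```python
import types
import warnings

def tool(fn):
    # Toy stand-in for the real @tool decorator: record the tool name.
    fn.__tool__ = types.SimpleNamespace(name=fn.__name__)
    return fn

def bind_tools(agent, tools):
    handlers = {}
    for fn in tools:
        name = fn.__tool__.name
        if name in handlers:
            raise ValueError("Duplicate tool handler: " + name)
        handlers[name] = fn
    # Only kind: "function" tools require handlers.
    declared = {t["name"] for t in agent["tools"] if t["kind"] == "function"}
    for name in handlers:
        if name not in declared:
            raise ValueError(
                "Tool handler '%s' has no matching declaration in "
                "agent.tools. Declared function tools: %s"
                % (name, ", ".join(sorted(declared))))
    for name in declared:
        if name not in handlers:
            warnings.warn("Tool '%s' is declared in agent.tools but no "
                          "handler was provided to bind_tools()" % name)
    return handlers

@tool
def search(query: str) -> str:
    return "results for " + query

# Dict-shaped agent is an illustrative stand-in for a loaded agent.
agent = {"tools": [{"kind": "function", "name": "search"},
                   {"kind": "mcp", "name": "browser"}]}
handlers = bind_tools(agent, [search])
```

Note that the `mcp` tool is ignored entirely: it is resolved by a kind handler, so it neither requires a function handler nor triggers the missing-handler warning.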