
Agent Extensions

The Prompty agent loop (see Agent Mode) supports six optional extensions that give you fine-grained control over every iteration. All extensions are opt-in — pass only the ones you need.

| Extension | Purpose | Parameter |
| --- | --- | --- |
| Events | Observe loop activity (tool calls, errors, etc.) | `on_event` / `onEvent` |
| Cancellation | Cooperatively abort a running loop | `cancel` / `signal` / `cancellationToken` |
| Context Window | Auto-trim messages to fit the model's context | `context_budget` / `contextBudget` |
| Guardrails | Validate input, output, and tool calls | `guardrails` |
| Steering | Inject user messages mid-loop | `steering` |
| Parallel Tools | Execute tool calls concurrently | `parallel_tool_calls` / `parallelToolCalls` |

Additionally, each runtime provides a typed tool decorator/attribute that turns a regular function into a registered tool.

Each iteration of the agent loop runs these steps in order:

1. Check cancellation
2. Drain steering messages
3. Trim context window
4. Input guardrail
5. Check cancellation (again, before LLM call)
6. Call LLM
7. Output guardrail
8. Tool guardrails (per tool)
9. Execute tools (serial or parallel)
10. Format tool results → append to messages
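The ten steps above can be sketched as a single loop iteration. This is an illustrative skeleton only; the step names and the trace mechanism are placeholders, not Prompty internals:

```python
def run_iteration(messages, steering_queue, trace):
    """Record the order of one agent-loop iteration (stubbed LLM/tools)."""
    trace.append("cancel-check")                  # 1. check cancellation
    messages.extend(steering_queue)               # 2. drain steering messages
    steering_queue.clear()
    trace.append("trim-context")                  # 3. trim context window
    trace.append("guardrail:input")               # 4. input guardrail
    trace.append("cancel-check")                  # 5. re-check before LLM call
    trace.append("llm-call")                      # 6. call LLM (stubbed)
    tool_calls = ["get_weather"]                  #    pretend the LLM asked for one tool
    trace.append("guardrail:output")              # 7. output guardrail
    for name in tool_calls:
        trace.append(f"guardrail:tool:{name}")    # 8. per-tool guardrail
        trace.append(f"exec:{name}")              # 9. execute tool
    trace.append("append-results")                # 10. format results -> messages
    return trace

messages = []
trace = run_iteration(messages, ["steer me"], [])
```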

Events

Subscribe to structured events emitted during the agent loop. Event callbacks should not block, and any exception they raise is silently swallowed so the loop keeps running.

```python
from prompty import invoke_agent, load
from prompty.core import AgentEvent, EventCallback

def my_callback(event: AgentEvent) -> None:
    print(f"[{event.event_type}] {event.data}")

agent = load("agent.prompty")
result = invoke_agent(
    agent,
    inputs={"question": "Hello"},
    tools={"get_weather": get_weather},
    on_event=my_callback,
)
```
| Event | When | Data |
| --- | --- | --- |
| `tool_call_start` | Before each tool executes | `name`, `arguments` |
| `tool_result` | After each tool executes | `name`, `result` |
| `status` | Informational (e.g., steering injected) | `message` |
| `messages_updated` | Messages array changed | `messages` |
| `done` | Loop completed normally | `response`, `messages` |
| `error` | A guardrail denied or an error occurred | `message` |
| `cancelled` | Loop was cancelled | `iteration` |
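As an illustration, here is a callback that keeps only the tool lifecycle events from the table above. The `AgentEvent` class below is a self-contained stand-in mirroring the `event_type`/`data` shape, not the real `prompty.core` type:

```python
from dataclasses import dataclass, field

@dataclass
class AgentEvent:  # stand-in for prompty.core.AgentEvent
    event_type: str
    data: dict = field(default_factory=dict)

tool_log = []

def tool_only(event):
    # Ignore everything except the two tool lifecycle events
    if event.event_type in ("tool_call_start", "tool_result"):
        tool_log.append((event.event_type, event.data.get("name")))

tool_only(AgentEvent("status", {"message": "steering injected"}))
tool_only(AgentEvent("tool_call_start", {"name": "get_weather", "arguments": {}}))
tool_only(AgentEvent("tool_result", {"name": "get_weather", "result": "72°F"}))
```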

Cancellation

Cooperatively cancel a running agent loop. The loop checks for cancellation at the top of each iteration and again just before the LLM call.

```python
import threading
from prompty import invoke_agent, load
# CancelledError is assumed to live alongside CancellationToken in prompty.core
from prompty.core import CancellationToken, CancelledError

token = CancellationToken()

# Cancel from another thread after 5 seconds
threading.Timer(5.0, token.cancel).start()

try:
    result = invoke_agent(agent, inputs, tools, cancel=token)
except CancelledError:
    print("Agent loop was cancelled")
```
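Under the hood, a cancellation token can be a thin wrapper over a thread-safe flag. A minimal sketch (not the actual `prompty.core` class):

```python
import threading

class SimpleCancellationToken:
    """Thread-safe cancellation flag; a sketch, not prompty.core.CancellationToken."""

    def __init__(self):
        self._event = threading.Event()

    def cancel(self):
        # Safe to call from any thread, any number of times
        self._event.set()

    @property
    def is_cancelled(self) -> bool:
        return self._event.is_set()

token = SimpleCancellationToken()
before = token.is_cancelled   # False: not yet cancelled
token.cancel()
after = token.is_cancelled    # True: loop would raise on its next check
```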

Context Window

Automatically trim messages to fit within a character budget. The trimmer preserves system messages and the most recent conversation turns, replacing dropped messages with a compact summary.

```python
result = invoke_agent(
    agent,
    inputs={"question": "Summarize our conversation"},
    tools=tools,
    context_budget=50_000,  # characters
)
```
Trimming works in five steps:

1. Estimate the character cost of all messages (role overhead + text + tool-call JSON)
2. Partition into leading system messages vs. the rest
3. Drop the oldest non-system messages until within budget (keeping at least 2)
4. Summarize dropped messages into a compact `[Context summary: ...]` block
5. Inject the summary as a user message after the system messages
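Those five steps can be sketched as follows, assuming messages are plain dicts with `role` and `text` keys (the real message type differs):

```python
def trim_context(messages, budget):
    """Sketch of the trimming algorithm; not the Prompty implementation."""
    def cost(m):
        # 1. Rough character estimate: role overhead + text
        return len(m["role"]) + len(m["text"])

    # 2. Partition leading system messages from the rest
    i = 0
    while i < len(messages) and messages[i]["role"] == "system":
        i += 1
    system, rest = messages[:i], messages[i:]

    # 3. Drop oldest non-system messages until within budget (keep >= 2)
    dropped = []
    while len(rest) > 2 and sum(map(cost, system + rest)) > budget:
        dropped.append(rest.pop(0))

    if not dropped:
        return system + rest

    # 4. Summarize dropped messages into a compact block
    summary = "[Context summary: " + "; ".join(m["text"] for m in dropped) + "]"
    # 5. Inject the summary as a user message after the system messages
    return system + [{"role": "user", "text": summary}] + rest

msgs = [
    {"role": "system", "text": "sys"},
    {"role": "user", "text": "a" * 50},
    {"role": "assistant", "text": "b" * 50},
    {"role": "user", "text": "latest question"},
    {"role": "assistant", "text": "latest answer"},
]
trimmed = trim_context(msgs, budget=60)
```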

Guardrails

Validate messages at three checkpoints in the loop: before the LLM call (input), after the LLM responds (output), and before each tool executes (tool). If a guardrail denies a check, a GuardrailError is raised.

```python
from prompty.core import Guardrails, GuardrailResult, GuardrailError

def check_input(messages):
    """Block prompt injection attempts."""
    for msg in messages:
        if "ignore previous instructions" in msg.text.lower():
            return GuardrailResult(allowed=False, reason="Prompt injection detected")
    return GuardrailResult(allowed=True)

def check_tool(name, args):
    """Only allow known-safe tools."""
    if name == "delete_all_data":
        return GuardrailResult(allowed=False, reason="Dangerous tool blocked")
    return GuardrailResult(allowed=True)

guardrails = Guardrails(input=check_input, tool=check_tool)

try:
    result = invoke_agent(agent, inputs, tools, guardrails=guardrails)
except GuardrailError as e:
    print(f"Blocked: {e.reason}")
```
| Hook | When | Receives | Typical Use |
| --- | --- | --- | --- |
| `input` | Before LLM call | Full message list | Prompt injection detection, content policy |
| `output` | After LLM response | Assistant message | Toxicity filtering, PII detection |
| `tool` | Before each tool call | Tool name + args | Block dangerous operations, rate limiting |
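The example above shows `input` and `tool` hooks; an `output` hook follows the same shape. Here is a hypothetical PII check on the assistant's text, with a stand-in `GuardrailResult` dataclass so the snippet is self-contained (the real class lives in `prompty.core`):

```python
import re
from dataclasses import dataclass

@dataclass
class GuardrailResult:  # stand-in for prompty.core.GuardrailResult
    allowed: bool
    reason: str = ""

# Crude email matcher as a stand-in for real PII detection
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w-]{2,}")

def check_output(text: str) -> GuardrailResult:
    """Deny assistant messages that leak an email address."""
    if EMAIL.search(text):
        return GuardrailResult(allowed=False, reason="PII detected")
    return GuardrailResult(allowed=True)

blocked = check_output("Contact me at alice@example.com")
clean = check_output("The weather in Tokyo is sunny.")
```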

Steering

Inject additional user messages into a running agent loop from outside. This is useful for human-in-the-loop scenarios where you want to redirect the agent mid-conversation.

```python
import threading
from prompty.core import Steering

steering = Steering()

def user_input_loop():
    while True:
        msg = input("You: ")
        steering.send(msg)

# Run input loop in background
threading.Thread(target=user_input_loop, daemon=True).start()

result = invoke_agent(agent, inputs, tools, steering=steering)
```

At the top of each iteration, the loop calls steering.drain() to collect all pending messages and appends them to the conversation. The steering queue is thread-safe in all runtimes.
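A thread-safe queue is enough to sketch the send/drain contract (this is an illustrative stand-in, not `prompty.core.Steering`):

```python
import queue

class SimpleSteering:
    """Thread-safe steering queue sketch: send() from any thread, drain() in the loop."""

    def __init__(self):
        self._q = queue.Queue()  # Queue handles the locking for us

    def send(self, text: str) -> None:
        self._q.put(text)

    def drain(self) -> list:
        """Collect all pending messages without blocking."""
        msgs = []
        while True:
            try:
                msgs.append(self._q.get_nowait())
            except queue.Empty:
                return msgs

s = SimpleSteering()
s.send("focus on budget hotels")
s.send("skip museums")
first_drain = s.drain()   # both pending messages, in send order
second_drain = s.drain()  # queue is now empty
```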


Parallel Tools

When the LLM requests multiple tool calls in a single response, you can execute them concurrently instead of sequentially.

```python
# Async mode uses asyncio.gather for true parallelism
result = await invoke_agent_async(
    agent, inputs, tools,
    parallel_tool_calls=True,
)
```
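In a synchronous runtime, the same effect could be approximated with a thread pool. This sketch assumes tool handlers are plain functions keyed by name; the dispatch shape is illustrative, not Prompty's actual code:

```python
from concurrent.futures import ThreadPoolExecutor

def run_tools_parallel(tool_calls, handlers):
    """Execute (name, args) tool calls concurrently; results keep call order."""
    with ThreadPoolExecutor() as pool:
        # Submit everything first so the calls overlap...
        futures = [pool.submit(handlers[name], **args) for name, args in tool_calls]
        # ...then collect results in the original request order
        return [f.result() for f in futures]

handlers = {
    "get_weather": lambda city: f"72°F in {city}",
    "get_time": lambda timezone: f"3:42 PM in {timezone}",
}
results = run_tools_parallel(
    [("get_weather", {"city": "Tokyo"}), ("get_time", {"timezone": "JST"})],
    handlers,
)
```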

Combining Extensions

All extensions compose naturally. Here’s a fully configured agent call:

```python
from prompty import invoke_agent_async, load_async
from prompty.core import (
    CancellationToken,
    Guardrails,
    GuardrailResult,
    Steering,
)

agent = await load_async("agent.prompty")
token = CancellationToken()
steering = Steering()

guardrails = Guardrails(
    input=lambda msgs: GuardrailResult(allowed=True),
    output=lambda msg: GuardrailResult(allowed=True),
    tool=lambda name, args: GuardrailResult(allowed=True),
)

result = await invoke_agent_async(
    agent,
    inputs={"question": "Plan my trip"},
    tools=tools,
    on_event=lambda e: print(f"[{e.event_type}]"),
    cancel=token,
    context_budget=50_000,
    guardrails=guardrails,
    steering=steering,
    parallel_tool_calls=True,
    max_iterations=20,
)
```

Tool Decorators

The .prompty file is the single source of truth — tools are declared in frontmatter, so the file is a complete, portable exchange format. The runtime needs a handler function for each declared tool. Each runtime provides a decorator or attribute that makes writing these handlers clean: you get typed parameters instead of raw JSON, and the boilerplate disappears.

The .prompty File Declares, Your Code Implements


Tools are always declared in the .prompty frontmatter — this is what gets sent to the LLM so it knows what tools are available:

```yaml
# agent.prompty (frontmatter excerpt)
tools:
  - name: get_weather
    kind: function
    description: Get the current weather for a city
    parameters:
      - name: city
        kind: string
        description: City name
        required: true
      - name: units
        kind: string
        default: celsius
```

Your code then provides the handler — the function that actually runs when the LLM calls get_weather. This is where @tool / tool() / [Tool] helps.

```python
# ❌ Without @tool — manual dict, raw JSON parsing
import json

def get_weather(args_json):
    args = json.loads(args_json)
    city = args["city"]
    units = args.get("units", "celsius")
    return f"72°F in {city}"

tools = {"get_weather": get_weather}
result = invoke_agent(agent, inputs, tools=tools)
```

```python
# ✅ With @tool + bind_tools — typed, validated, clean
from prompty import tool, bind_tools, invoke_agent

@tool
def get_weather(city: str, units: str = "celsius") -> str:
    """Get the current weather for a city."""
    return f"72°F in {city}"

# bind_tools validates names match the .prompty declarations
tools = bind_tools(agent, [get_weather])
result = invoke_agent(agent, inputs, tools=tools)
```
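The name-matching that bind_tools performs can be sketched like this (a hypothetical helper, not the real implementation):

```python
def bind_tools_sketch(declared_names, funcs):
    """Match handler function names against frontmatter declarations.

    Raises if a handler has no matching declaration; returns the
    name -> handler dict the agent loop dispatches on.
    """
    handlers = {}
    for fn in funcs:
        name = fn.__name__
        if name not in declared_names:
            raise ValueError(f"tool '{name}' is not declared in the .prompty file")
        handlers[name] = fn
    return handlers

def get_weather(city: str) -> str:
    return f"72°F in {city}"

# Declarations would come from the parsed .prompty frontmatter
tools = bind_tools_sketch({"get_weather", "get_time"}, [get_weather])
```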

A complete agent with the .prompty file and matching handlers:

```yaml
# agent.prompty
---
name: assistant
model:
  id: gpt-4o
  provider: openai
  apiType: chat
tools:
  - name: get_weather
    kind: function
    description: Get the current weather for a city
    parameters:
      - name: city
        kind: string
        required: true
  - name: get_time
    kind: function
    description: Get the current time in a timezone
    parameters:
      - name: timezone
        kind: string
        required: true
inputs:
  - name: question
    kind: string
---
system:
You are a helpful assistant with access to weather and time tools.

user:
{{question}}
```

```python
from prompty import load, invoke_agent, tool, bind_tools

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"72°F and sunny in {city}"

@tool
def get_time(timezone: str) -> str:
    """Get the current time in a timezone."""
    return f"3:42 PM in {timezone}"

agent = load("agent.prompty")

# bind_tools validates that each @tool name matches a declaration
# in agent.tools, then returns the handler dict
tools = bind_tools(agent, [get_weather, get_time])

result = invoke_agent(
    agent,
    inputs={"question": "What's the weather in Tokyo?"},
    tools=tools,
)
print(result)
```
The decorator can be used bare or with overrides:

```python
# Bare decorator — uses function name and docstring
@tool
def my_func(x: str) -> str:
    """This becomes the description."""
    return x

# With overrides
@tool(name="custom_name", description="Custom description")
def my_func_v2(x: str) -> str:
    return x

# Access the generated FunctionTool definition
print(my_func.__tool__.name)         # "my_func"
print(my_func.__tool__.description)  # "This becomes the description."
print(my_func.__tool__.parameters)   # [Property(name="x", kind="string", ...)]
```

The decorator/attribute maps language types to schema kinds automatically:

| Python | TypeScript | C# | Schema Kind |
| --- | --- | --- | --- |
| `str` | `"string"` | `string` | `string` |
| `int` | `"integer"` | `int`, `long` | `integer` |
| `float` | `"float"` | `float`, `double` | `float` |
| `bool` | `"boolean"` | `bool` | `boolean` |
| `list` | `"array"` | `List<T>`, arrays | `array` |
| `dict` | `"object"` | `Dictionary<,>` | `object` |
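In Python terms, the mapping can be derived from parameter annotations roughly like this (a sketch of what the decorator might do internally, not the real implementation):

```python
import inspect

# Language type -> schema kind, per the table above
KIND_BY_TYPE = {
    str: "string",
    int: "integer",
    float: "float",
    bool: "boolean",
    list: "array",
    dict: "object",
}

def parameter_kinds(fn):
    """Map a function's annotated parameters to schema kinds."""
    sig = inspect.signature(fn)
    return {
        name: KIND_BY_TYPE.get(param.annotation, "string")  # default assumption
        for name, param in sig.parameters.items()
    }

def get_weather(city: str, retries: int = 1, verbose: bool = False) -> str:
    return city

kinds = parameter_kinds(get_weather)
```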