# Pipeline Architecture

Prompty processes .prompty files through a four-stage pipeline. Each stage is defined by a protocol — concrete implementations are discovered at runtime via Python entry points (or TypeScript imports). This design means you can swap any stage without touching the others: use a different template engine, a custom parser, or your own LLM provider.

```mermaid
graph LR
  A[".prompty File"] -->|"template +\ninputs"| B["🔹 Renderer\n(Stage 1)"]
  B -->|"rendered\nstring"| C["🔹 Parser\n(Stage 2)"]
  C -->|"list\n[Message]"| D["🔷 Executor\n(Stage 3)"]
  D -->|"raw\nresponse"| E["🟢 Processor\n(Stage 4)"]
  E --> F["✅ Result"]
```
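The data flow above amounts to a simple chain. Here is a minimal sketch of that chaining; the per-stage method names (`render`, `parse`, `execute`, `process`) are illustrative assumptions, not Prompty's exact protocol signatures:

```python
def run_pipeline(agent, inputs, renderer, parser, executor, processor):
    """Chain the four stages end to end (illustrative; method names assumed)."""
    rendered = renderer.render(agent, inputs)     # Stage 1: str
    messages = parser.parse(agent, rendered)      # Stage 2: list[Message]
    response = executor.execute(agent, messages)  # Stage 3: raw SDK response
    return processor.process(agent, response)     # Stage 4: clean result
```

Because each stage only sees the previous stage's output, swapping one implementation never requires changes to the others.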

## Stage 1: Renderer

The Renderer takes a `PromptAgent` and a dictionary of inputs, then renders the template (the `instructions` field) with those values. The result is a single rendered string containing role markers and filled-in variables.

| Property | Value |
| --- | --- |
| Registration key | `agent.template.format.kind` |
| Built-in implementations | `Jinja2Renderer` (`"jinja2"`), `MustacheRenderer` (`"mustache"`) |
| Input | `PromptAgent` + dict of inputs |
| Output | `str` (rendered template) |

The renderer also handles thread markers: when an input has `kind: thread`, the renderer emits special nonce markers that the pipeline later expands into `Message` objects for conversation history.

```
system:
You are an AI assistant helping {{ firstName }}.
user:
{{ question }}
```
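For example, with inputs `{"firstName": "Jane", "question": "What is AI?"}`, Stage 1 produces a plain string with the variables filled in. The sketch below uses Python's `str.format` as a stand-in for Jinja2's `{{ }}` substitution, purely to show the shape of the output:

```python
# Stand-in for the Jinja2 template above ({} instead of {{ }})
template = (
    "system:\n"
    "You are an AI assistant helping {firstName}.\n"
    "user:\n"
    "{question}"
)

rendered = template.format(firstName="Jane", question="What is AI?")
# rendered:
# system:
# You are an AI assistant helping Jane.
# user:
# What is AI?
```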

## Stage 2: Parser

The Parser takes the rendered string and splits it into a structured list of messages using role markers: lines ending with a colon that indicate who is speaking.

| Property | Value |
| --- | --- |
| Registration key | `agent.template.parser.kind` |
| Built-in implementations | `PromptyChatParser` (`"prompty"`) |
| Input | `str` (rendered template) |
| Output | `list[Message]` (structured message objects) |

Recognized role markers:

- `system:` → `{ role: "system", content: "..." }`
- `user:` → `{ role: "user", content: "..." }`
- `assistant:` → `{ role: "assistant", content: "..." }`
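Conceptually, the parser scans for these marker lines and accumulates the text between them into one message per role. A stdlib-only sketch of that idea (not Prompty's actual `PromptyChatParser`):

```python
import re

# A role marker is a line consisting only of a known role and a colon
ROLE_RE = re.compile(r"^(system|user|assistant):\s*$")

def parse_roles(rendered: str) -> list[dict]:
    """Split a rendered string into messages at role-marker lines (sketch)."""
    messages: list[dict] = []
    role, buf = None, []
    for line in rendered.splitlines():
        m = ROLE_RE.match(line)
        if m:
            # Close out the previous message before starting a new one
            if role is not None:
                messages.append({"role": role, "content": "\n".join(buf).strip()})
            role, buf = m.group(1), []
        elif role is not None:
            buf.append(line)
    if role is not None:
        messages.append({"role": role, "content": "\n".join(buf).strip()})
    return messages
```

For example, `parse_roles("system:\nBe helpful.\nuser:\nWhat is AI?")` yields one system message and one user message.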

## Stage 3: Executor

The Executor takes the list of messages and calls the LLM provider. It handles API type dispatch, routing to the appropriate SDK method based on `agent.model.apiType`.

| Property | Value |
| --- | --- |
| Registration key | `agent.model.provider` |
| Built-in implementations | `OpenAIExecutor` (`"openai"`), `AzureExecutor` (`"azure"`) |
| Input | `list[Message]` + `PromptAgent` (for config) |
| Output | Raw SDK response object |

API type dispatch:

| `apiType` | SDK method | Use case |
| --- | --- | --- |
| `"chat"` (default) | `chat.completions.create()` | Conversational prompts |
| `"embedding"` | `embeddings.create()` | Text → vector embeddings |
| `"image"` | `images.generate()` | DALL-E image generation |
| `"responses"` | `responses.create()` | OpenAI Responses API (latest features) |
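The routing amounts to a small lookup table. A sketch assuming an OpenAI-style client object with the attribute paths from the table (this is illustrative, not the actual `OpenAIExecutor` source):

```python
def dispatch_api_call(client, api_type="chat", **kwargs):
    """Route a request to the SDK method matching apiType (illustrative)."""
    routes = {
        "chat": lambda: client.chat.completions.create(**kwargs),
        "embedding": lambda: client.embeddings.create(**kwargs),
        "image": lambda: client.images.generate(**kwargs),
        "responses": lambda: client.responses.create(**kwargs),
    }
    if api_type not in routes:
        raise ValueError(f"unknown apiType: {api_type!r}")
    return routes[api_type]()
```

The lambdas delay attribute access until a route is chosen, so a client that only implements one API type still works for that type.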

## Stage 4: Processor

The Processor takes the raw SDK response and extracts clean, usable content. What “clean” means depends on the response type.

| Property | Value |
| --- | --- |
| Registration key | `agent.model.provider` |
| Built-in implementations | `OpenAIProcessor` (`"openai"`), `AzureProcessor` (`"azure"`) |
| Input | Raw SDK response + `PromptAgent` |
| Output | Processed result (string, list, dict, parsed JSON, etc.) |

Processing by response type:

| Response type | Output |
| --- | --- |
| Chat completion | `str` (the message content) |
| Embedding | `list[float]` or `list[list[float]]` |
| Image | `str` (URL or base64 data) |
| Streaming | `PromptyStream` / `AsyncPromptyStream` iterator |
| Structured output | Parsed `dict` matching `outputSchema` |
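A sketch of that branching, using the OpenAI SDK's response attribute shapes as an assumption (this is not the actual `OpenAIProcessor`):

```python
def process_response(response):
    """Return clean content from a raw response object (illustrative)."""
    # Chat completion: the first choice's message content
    if hasattr(response, "choices"):
        return response.choices[0].message.content
    # Embedding: one vector, or a list of vectors for batched input
    if hasattr(response, "data") and response.data and hasattr(response.data[0], "embedding"):
        vectors = [item.embedding for item in response.data]
        return vectors[0] if len(vectors) == 1 else vectors
    # Anything unrecognized passes through unchanged
    return response
```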

## Convenience Functions

You don’t always need the full pipeline. Prompty provides convenience functions that map to specific stage groupings:

```mermaid
graph LR
  subgraph render ["render()"]
    R["Renderer"]
  end

  subgraph parse ["parse()"]
    P["Parser"]
  end

  subgraph prepare ["prepare() — render + parse + thread expansion"]
    R2["Renderer"] --> P2["Parser"]
  end

  subgraph run ["run() — execute + process"]
    EX["Executor"] --> PR["Processor"]
  end

  subgraph execute ["execute() — full pipeline: load → render → parse → execute → process → result"]
    L["Load"] --> R3["Renderer"] --> P3["Parser"] --> EX2["Executor"] --> PR2["Processor"] --> RES["Result"]
  end
```
```python
from prompty import load, prepare, execute
from prompty.core.pipeline import render, parse, run

agent = load("chat.prompty")
inputs = {"firstName": "Jane", "question": "What is AI?"}

# Stage 1 only — render the template
rendered = render(agent, inputs)

# Stage 2 only — parse rendered string into messages
messages = parse(agent, rendered)

# Stages 1 + 2 — render, parse, and expand threads
messages = prepare(agent, inputs)

# Stages 3 + 4 — execute LLM call and process response
result = run(agent, messages)

# Full pipeline — load + prepare + run
result = execute("chat.prompty", inputs=inputs)
```

## Entry Point Discovery

The Python runtime discovers stage implementations using Python entry points, the same mechanism that powers CLI tools and pytest plugins. Each implementation registers itself under a group name in `pyproject.toml`.

| Group | Resolved from | Example keys |
| --- | --- | --- |
| `prompty.renderers` | `agent.template.format.kind` | `jinja2`, `mustache` |
| `prompty.parsers` | `agent.template.parser.kind` | `prompty` |
| `prompty.executors` | `agent.model.provider` | `openai`, `azure` |
| `prompty.processors` | `agent.model.provider` | `openai`, `azure` |

These are registered in Prompty’s own pyproject.toml:

```toml
[project.entry-points."prompty.renderers"]
jinja2 = "prompty.renderers.jinja2:Jinja2Renderer"
mustache = "prompty.renderers.mustache:MustacheRenderer"

[project.entry-points."prompty.parsers"]
prompty = "prompty.parsers.prompty:PromptyChatParser"

[project.entry-points."prompty.executors"]
openai = "prompty.providers.openai.executor:OpenAIExecutor"
azure = "prompty.providers.azure.executor:AzureExecutor"

[project.entry-points."prompty.processors"]
openai = "prompty.providers.openai.processor:OpenAIProcessor"
azure = "prompty.providers.azure.processor:AzureProcessor"
```

The discovery module (`core/discovery.py`) caches lookups, so entry points are only resolved once per key. Call `clear_cache()` if you need to reset after dynamic registration.


## Writing a Custom Stage

You can write your own implementation for any stage by implementing the corresponding protocol and registering it as an entry point.

Each protocol defines sync and async methods. Here’s an example custom executor:

```python
from __future__ import annotations

from prompty.core.types import Message


class AnthropicExecutor:
    """Executor for the Anthropic Claude API."""

    def execute(self, agent, messages: list[Message]) -> object:
        import anthropic

        client = anthropic.Anthropic()
        return client.messages.create(
            model=agent.model.id,
            max_tokens=1024,  # required by the Anthropic Messages API
            messages=[{"role": m.role, "content": m.content} for m in messages],
        )

    async def execute_async(self, agent, messages: list[Message]) -> object:
        import anthropic

        client = anthropic.AsyncAnthropic()
        return await client.messages.create(
            model=agent.model.id,
            max_tokens=1024,  # required by the Anthropic Messages API
            messages=[{"role": m.role, "content": m.content} for m in messages],
        )
```

In your package’s pyproject.toml:

```toml
[project.entry-points."prompty.executors"]
anthropic = "my_package.executor:AnthropicExecutor"
```

After installing the package, any `.prompty` file with `model.provider: "anthropic"` will automatically route to your executor:

```yaml
model:
  id: claude-sonnet-4-20250514
  provider: anthropic
  connection:
    kind: key
    apiKey: ${env:ANTHROPIC_API_KEY}
```