
Use with OpenAI

pip install prompty[jinja2,openai]

You also need an OpenAI API key.


Create chat.prompty:

---
name: openai-chat
description: Simple chat completion with OpenAI
model:
  id: gpt-4o-mini
  provider: openai
  apiType: chat
  connection:
    kind: key
    apiKey: ${env:OPENAI_API_KEY}
  options:
    temperature: 0.7
    maxOutputTokens: 1024
inputSchema:
  properties:
    - name: question
      kind: string
      default: What is Prompty?
template:
  format:
    kind: jinja2
  parser:
    kind: prompty
---
system:
You are a helpful assistant. Answer concisely.
user:
{{question}}

Run it with a single call:

import prompty

# One-liner: load → render → call the model → return the result
result = prompty.execute("chat.prompty", inputs={"question": "What is Prompty?"})
print(result)

For more control, use the pipeline stages individually:

import prompty

# Step 1 — Load the .prompty file → PromptAgent
agent = prompty.load("chat.prompty")

# Step 2 — Render template + parse role markers → list[Message]
messages = prompty.prepare(agent, inputs={"question": "Explain async/await"})

# Step 3 — Call OpenAI + process response → string
result = prompty.run(agent, messages)
print(result)
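What the prepare stage does can be illustrated with a simplified sketch (an assumed behavior, not Prompty's actual implementation): render the jinja2 body with the inputs, then split the `system:`/`user:` role markers into a message list.

```python
# Simplified sketch of the prepare() stage (illustrative only, not
# Prompty's implementation): jinja2 rendering, then role-marker parsing.
from jinja2 import Template

PROMPT_BODY = """system:
You are a helpful assistant. Answer concisely.
user:
{{question}}"""

def prepare_sketch(body: str, inputs: dict) -> list[dict]:
    """Render the template, then split 'role:' marker lines into messages."""
    rendered = Template(body).render(**inputs)
    messages = []
    for line in rendered.splitlines():
        if line.strip() in ("system:", "user:", "assistant:"):
            messages.append({"role": line.strip()[:-1], "content": []})
        elif messages:
            messages[-1]["content"].append(line)
    return [
        {"role": m["role"], "content": "\n".join(m["content"]).strip()}
        for m in messages
    ]

messages = prepare_sketch(PROMPT_BODY, {"question": "What is Prompty?"})
print(messages)
# [{'role': 'system', 'content': 'You are a helpful assistant. Answer concisely.'},
#  {'role': 'user', 'content': 'What is Prompty?'}]
```

The real parser also handles multi-turn bodies and other role markers; this is only meant to show the shape of the transformation.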

Async variant:

import asyncio
import prompty

async def main():
    result = await prompty.execute_async(
        "chat.prompty",
        inputs={"question": "What is Prompty?"},
    )
    print(result)

asyncio.run(main())
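The async variant also makes it straightforward to fan several prompts out concurrently with asyncio.gather. A sketch of the pattern — `ask()` here is a stand-in for `prompty.execute_async("chat.prompty", inputs={...})` so the snippet runs offline:

```python
import asyncio

# Sketch: fanning out several questions concurrently with asyncio.gather.
# In real code, replace ask() with prompty.execute_async("chat.prompty", ...).
async def ask(question: str) -> str:
    await asyncio.sleep(0)  # stands in for the network round-trip
    return f"answer to: {question}"

async def main() -> list[str]:
    questions = ["What is Prompty?", "Explain async/await"]
    # gather() runs the coroutines concurrently and preserves input order
    return await asyncio.gather(*(ask(q) for q in questions))

results = asyncio.run(main())
print(results)
```

With the real `execute_async`, the requests overlap on the network, so total latency is roughly that of the slowest call rather than the sum.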

Change model.id in the frontmatter — no code changes needed:

model:
  id: gpt-4o          # GPT-4o (default, fast + capable)
  # id: gpt-4o-mini   # Cheaper, good for simple tasks
  # id: o1            # Reasoning model (higher latency)
  # id: gpt-4-turbo   # 128K context window

Create a .env file in the same directory as your script:

.env
OPENAI_API_KEY=sk-your-key-here

Prompty uses python-dotenv to load .env automatically. Make sure .env is in your .gitignore:

echo ".env" >> .gitignore
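The core of what python-dotenv does is small. A minimal stdlib sketch (a simplified illustration, not the library itself):

```python
# Minimal sketch of python-dotenv's core behavior: read KEY=VALUE lines
# from a .env file into os.environ without overwriting existing values.
import os
from pathlib import Path

def load_env_sketch(path: str = ".env") -> None:
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # setdefault: real environment variables win over .env values
        os.environ.setdefault(key.strip(), value.strip())
```

The real library additionally handles quoting, `export` prefixes, and variable interpolation, so use it rather than this sketch in practice.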

All options go under model.options: in the frontmatter:

model:
  id: gpt-4o-mini
  provider: openai
  connection:
    kind: key
    apiKey: ${env:OPENAI_API_KEY}
  options:
    temperature: 0.3       # Lower = more deterministic
    maxOutputTokens: 2048  # Max tokens in the response
    topP: 0.9              # Nucleus sampling
    frequencyPenalty: 0.2  # Reduce repetition
    presencePenalty: 0.1   # Encourage new topics
    seed: 42               # Reproducible outputs
    stopSequences:         # Stop generation at these strings
      - "END"
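These camelCase names are Prompty's own; when the request is sent, they are translated into the OpenAI SDK's snake_case parameters. A sketch of the assumed mapping (illustrative — the exact translation is internal to Prompty):

```python
# Assumed mapping from Prompty's camelCase option names to OpenAI
# Chat Completions parameters (illustrative; Prompty does this internally).
OPTION_MAP = {
    "temperature": "temperature",
    "maxOutputTokens": "max_tokens",
    "topP": "top_p",
    "frequencyPenalty": "frequency_penalty",
    "presencePenalty": "presence_penalty",
    "seed": "seed",
    "stopSequences": "stop",
}

def to_openai_kwargs(options: dict) -> dict:
    """Translate frontmatter options into OpenAI SDK keyword arguments."""
    return {OPTION_MAP[k]: v for k, v in options.items() if k in OPTION_MAP}

print(to_openai_kwargs({"temperature": 0.3, "maxOutputTokens": 2048, "stopSequences": ["END"]}))
# {'temperature': 0.3, 'max_tokens': 2048, 'stop': ['END']}
```

Keeping these values in the frontmatter rather than in code means tuning a prompt never requires touching Python.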