Embeddings

A .prompty file that generates text embeddings — dense float vectors that capture semantic meaning. Use them for similarity search, retrieval augmented generation (RAG), clustering, or classification.


Set apiType: embedding in the model block. The template body becomes the default text to embed, but you’ll typically pass text as an input:

embed.prompty
---
name: text-embedder
description: Generate embeddings for text input
model:
  id: text-embedding-3-small
  provider: openai
  apiType: embedding
  connection:
    kind: key
    apiKey: ${env:OPENAI_API_KEY}
inputSchema:
  properties:
    - name: text
      kind: string
      default: Hello, world!
template:
  format:
    kind: jinja2
  parser:
    kind: prompty
---
{{text}}

from prompty import execute

vector = execute("embed.prompty", inputs={"text": "Prompty is awesome"})
print(type(vector))  # <class 'list'>
print(len(vector))   # 1536 (for text-embedding-3-small)
print(vector[:5])    # [0.0123, -0.0456, 0.0789, ...]

The result is a list[float] — one embedding vector for the input text.

Async works the same way:

from prompty import execute_async
vector = await execute_async("embed.prompty", inputs={"text": "Prompty is awesome"})

To embed multiple texts at once, pass a list; the entire batch is sent to the API in a single request:

batch-embed.prompty
---
name: batch-embedder
description: Embed multiple texts in one call
model:
  id: text-embedding-3-small
  provider: openai
  apiType: embedding
  connection:
    kind: key
    apiKey: ${env:OPENAI_API_KEY}
inputSchema:
  properties:
    - name: texts
      kind: array
      description: List of texts to embed
template:
  format:
    kind: jinja2
  parser:
    kind: prompty
---
{{texts}}

from prompty import execute

texts = [
    "What is machine learning?",
    "How do neural networks work?",
    "Explain gradient descent",
]
vectors = execute("batch-embed.prompty", inputs={"texts": texts})
print(len(vectors))     # 3 (one vector per input text)
print(len(vectors[0]))  # 1536
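
Embedding APIs cap how many inputs one request may contain (OpenAI currently allows up to 2048 inputs per embeddings call). For larger corpora, split the list client-side before calling the batch prompt. A minimal sketch using only the standard library; the `chunked` helper and the batch size are illustrative, not part of Prompty:

```python
def chunked(items, size):
    """Yield successive fixed-size slices of a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Example: split 2500 texts into batches of 1000
texts = [f"doc {i}" for i in range(2500)]
batches = list(chunked(texts, 1000))
print([len(b) for b in batches])  # [1000, 1000, 500]
```

Each batch can then be passed as `inputs={"texts": batch}` and the per-batch results concatenated.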

Switch to Microsoft Foundry by changing the provider and connection:

embed-foundry.prompty
---
name: foundry-embedder
model:
  id: ${env:AZURE_OPENAI_EMBEDDING_DEPLOYMENT}
  provider: foundry
  apiType: embedding
  connection:
    kind: key
    endpoint: ${env:AZURE_AI_PROJECT_ENDPOINT}
    apiKey: ${env:AZURE_AI_PROJECT_KEY}
inputSchema:
  properties:
    - name: text
      kind: string
template:
  format:
    kind: jinja2
  parser:
    kind: prompty
---
{{text}}

The code is identical — only the .prompty file changes:

vector = execute("embed-foundry.prompty", inputs={"text": "Hello from Foundry"})

Generate embeddings for your document corpus, store them in a vector database, then embed the user’s query and find nearest neighbors:

from prompty import execute
import numpy as np

# Embed documents (do this once, store the vectors)
docs = ["Python is a programming language", "Cats are cute animals", "The weather is sunny"]
doc_vectors = [execute("embed.prompty", inputs={"text": d}) for d in docs]

# Embed a query
query_vector = execute("embed.prompty", inputs={"text": "coding languages"})

# Cosine similarity
def cosine_sim(a, b):
    a, b = np.array(a), np.array(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

similarities = [cosine_sim(query_vector, dv) for dv in doc_vectors]
best_match = docs[np.argmax(similarities)]
print(best_match)  # "Python is a programming language"
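
With more than a handful of documents, a per-document Python loop gets slow. Stacking the stored vectors into a NumPy matrix computes every similarity in one matrix-vector product. A sketch with small synthetic vectors standing in for real embeddings:

```python
import numpy as np

def cosine_sim_matrix(query_vec, doc_matrix):
    """Cosine similarity between one query vector and each row of doc_matrix."""
    q = np.asarray(query_vec, dtype=float)
    m = np.asarray(doc_matrix, dtype=float)
    return (m @ q) / (np.linalg.norm(m, axis=1) * np.linalg.norm(q))

# Synthetic 4-dimensional stand-ins for real embedding vectors
doc_matrix = np.array([[1.0, 0.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0, 0.0],
                       [0.7, 0.7, 0.0, 0.0]])
query = [1.0, 0.0, 0.0, 0.0]
sims = cosine_sim_matrix(query, doc_matrix)
print(int(np.argmax(sims)))  # 0 — the first row is closest to the query
```

The same function works unchanged on real 1536-dimensional embeddings; only the matrix gets wider.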

Combine embeddings with a chat prompt — retrieve relevant context, then pass it to a chat completion:

from prompty import execute

# 1. Embed the user's question
query = "How do I install Prompty?"
query_vector = execute("embed.prompty", inputs={"text": query})

# 2. Search your vector store for relevant docs
# (vector_store is your own vector-database client, not part of Prompty)
relevant_docs = vector_store.search(query_vector, top_k=3)

# 3. Pass retrieved context to a chat prompt
context = "\n".join(doc.text for doc in relevant_docs)
answer = execute("rag-chat.prompty", inputs={"question": query, "context": context})
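
`vector_store` above stands for whatever vector database you use (FAISS, pgvector, Chroma, and so on; Prompty does not prescribe one). For quick prototyping, a brute-force in-memory stand-in is enough. The `Doc` record and `InMemoryVectorStore` class below are hypothetical helpers, not part of Prompty:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    vector: list

class InMemoryVectorStore:
    """Tiny stand-in for a real vector database: brute-force cosine search."""

    def __init__(self):
        self.docs = []

    def add(self, text, vector):
        self.docs.append(Doc(text, vector))

    def search(self, query_vector, top_k=3):
        q = np.asarray(query_vector, dtype=float)

        def score(d):
            v = np.asarray(d.vector, dtype=float)
            return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))

        # Rank all stored docs by cosine similarity, highest first
        return sorted(self.docs, key=score, reverse=True)[:top_k]

# Toy 2-dimensional vectors in place of real embeddings
store = InMemoryVectorStore()
store.add("install with pip", [1.0, 0.0])
store.add("cats are cute", [0.0, 1.0])
hits = store.search([0.9, 0.1], top_k=1)
print(hits[0].text)  # install with pip
```

Swapping this for a real store only changes how `add` and `search` are implemented; the RAG pipeline code stays the same.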

Model                     Dimensions   Provider
text-embedding-3-small    1536         OpenAI
text-embedding-3-large    3072         OpenAI
text-embedding-ada-002    1536         OpenAI / Microsoft Foundry

Select the model with model.id in your .prompty file. For Azure, use your deployment name as the model.id.