# Embeddings

## What You'll Build

A `.prompty` file that generates text embeddings: dense float vectors
that capture semantic meaning. Use them for similarity search, retrieval
augmented generation (RAG), clustering, or classification.
## Step 1: Create an Embedding Prompt

Set `apiType: embedding` in the model block. The template body becomes the
default text to embed, but you'll typically pass text as an input:

```yaml
---
name: text-embedder
description: Generate embeddings for text input
model:
  id: text-embedding-3-small
  provider: openai
  apiType: embedding
  connection:
    kind: key
    apiKey: ${env:OPENAI_API_KEY}
inputSchema:
  properties:
    - name: text
      kind: string
      default: Hello, world!
template:
  format:
    kind: jinja2
  parser:
    kind: prompty
---
{{text}}
```

## Step 2: Generate an Embedding
```python
from prompty import execute

vector = execute("embed.prompty", inputs={"text": "Prompty is awesome"})

print(type(vector))  # <class 'list'>
print(len(vector))   # 1536 (for text-embedding-3-small)
print(vector[:5])    # [0.0123, -0.0456, 0.0789, ...]
```

The result is a `list[float]`: one embedding vector for the input text.
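Each call to `execute` is a billable network request, so repeated inputs are worth caching locally. A minimal sketch of a cache wrapper (`make_cached_embedder` is a hypothetical helper, not part of Prompty; it assumes the wrapped function takes a string and returns a vector):

```python
from functools import lru_cache

def make_cached_embedder(embed_fn, maxsize=1024):
    """Wrap a text -> vector function so repeated texts hit a local cache."""
    @lru_cache(maxsize=maxsize)
    def cached(text):
        # Return a tuple: immutable results are safe to share between callers.
        return tuple(embed_fn(text))
    return cached
```

You could wire it to Prompty with `embed = make_cached_embedder(lambda t: execute("embed.prompty", inputs={"text": t}))`.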
Async works the same way:
```python
from prompty import execute_async

vector = await execute_async("embed.prompty", inputs={"text": "Prompty is awesome"})
```

```typescript
import { execute } from "prompty";

const vector = await execute("embed.prompty", { text: "Prompty is awesome" });

console.log(Array.isArray(vector)); // true
console.log(vector.length);         // 1536
console.log(vector.slice(0, 5));    // [0.0123, -0.0456, 0.0789, ...]
```

## Batch Embeddings
To embed multiple texts at once, pass a list. The API handles batching efficiently in a single request:

```yaml
---
name: batch-embedder
description: Embed multiple texts in one call
model:
  id: text-embedding-3-small
  provider: openai
  apiType: embedding
  connection:
    kind: key
    apiKey: ${env:OPENAI_API_KEY}
inputSchema:
  properties:
    - name: texts
      kind: array
      description: List of texts to embed
template:
  format:
    kind: jinja2
  parser:
    kind: prompty
---
{{texts}}
```

```python
from prompty import execute

texts = [
    "What is machine learning?",
    "How do neural networks work?",
    "Explain gradient descent",
]

vectors = execute("batch-embed.prompty", inputs={"texts": texts})

print(len(vectors))     # 3 (one vector per input text)
print(len(vectors[0]))  # 1536
```

```typescript
import { execute } from "prompty";

const texts = [
  "What is machine learning?",
  "How do neural networks work?",
  "Explain gradient descent",
];

const vectors = await execute("batch-embed.prompty", { texts });

console.log(vectors.length);    // 3
console.log(vectors[0].length); // 1536
```

## Microsoft Foundry Embeddings
Switch to Microsoft Foundry by changing the provider and connection:

```yaml
---
name: foundry-embedder
model:
  id: ${env:AZURE_OPENAI_EMBEDDING_DEPLOYMENT}
  provider: foundry
  apiType: embedding
  connection:
    kind: key
    endpoint: ${env:AZURE_AI_PROJECT_ENDPOINT}
    apiKey: ${env:AZURE_AI_PROJECT_KEY}
inputSchema:
  properties:
    - name: text
      kind: string
template:
  format:
    kind: jinja2
  parser:
    kind: prompty
---
{{text}}
```

The code is identical; only the `.prompty` file changes:

```python
vector = execute("embed-foundry.prompty", inputs={"text": "Hello from Foundry"})
```

## Use Cases
### Semantic Search

Generate embeddings for your document corpus, store them in a vector database, then embed the user's query and find the nearest neighbors:
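At small scale you don't need a dedicated vector database; a tiny in-memory index is enough to follow along. This is a sketch (`InMemoryIndex` is a hypothetical helper, not part of Prompty) of an object that could also play the role of the `vector_store` in the RAG example below:

```python
import numpy as np

class InMemoryIndex:
    """Brute-force nearest-neighbor search over embedding vectors."""

    def __init__(self):
        self._vectors = []
        self._items = []

    def add(self, vector, item):
        v = np.asarray(vector, dtype=float)
        self._vectors.append(v / np.linalg.norm(v))  # normalize once at insert time
        self._items.append(item)

    def search(self, query_vector, top_k=3):
        q = np.asarray(query_vector, dtype=float)
        q = q / np.linalg.norm(q)
        # Dot product of unit vectors equals cosine similarity.
        sims = np.array([v @ q for v in self._vectors])
        order = np.argsort(sims)[::-1][:top_k]
        return [(self._items[i], float(sims[i])) for i in order]
```

For real corpora, swap this for a library like FAISS or a hosted vector database; the interface keeps the same shape.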
```python
import numpy as np
from prompty import execute

# Embed documents (do this once and store the vectors)
docs = [
    "Python is a programming language",
    "Cats are cute animals",
    "The weather is sunny",
]
doc_vectors = [execute("embed.prompty", inputs={"text": d}) for d in docs]

# Embed a query
query_vector = execute("embed.prompty", inputs={"text": "coding languages"})

# Cosine similarity
def cosine_sim(a, b):
    a, b = np.array(a), np.array(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

similarities = [cosine_sim(query_vector, dv) for dv in doc_vectors]
best_match = docs[np.argmax(similarities)]
print(best_match)  # "Python is a programming language"
```

### RAG (Retrieval Augmented Generation)
Combine embeddings with a chat prompt: retrieve relevant context, then pass it to a chat completion:

```python
from prompty import execute

# 1. Embed the user's question
query = "How do I install Prompty?"
query_vector = execute("embed.prompty", inputs={"text": query})

# 2. Search your vector store for relevant docs
# (vector_store is a placeholder for your vector database client)
relevant_docs = vector_store.search(query_vector, top_k=3)

# 3. Pass retrieved context to a chat prompt
context = "\n".join(doc.text for doc in relevant_docs)
answer = execute("rag-chat.prompty", inputs={"question": query, "context": context})
```

## Available Embedding Models
| Model | Dimensions | Provider |
| --- | --- | --- |
| text-embedding-3-small | 1536 | OpenAI |
| text-embedding-3-large | 3072 | OpenAI |
| text-embedding-ada-002 | 1536 | OpenAI / Microsoft Foundry |
Set the model in `model.id` in your `.prompty` file. For Azure, use your
deployment name as the `model.id`.
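Dimension mismatches tend to surface late (for example, a vector store built for 1536-dimension vectors rejecting 3072-dimension ones at query time), so it can pay to validate vectors as you generate them. A small sketch based on the table above (`check_dims` is a hypothetical helper, not part of Prompty):

```python
EXPECTED_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}

def check_dims(vector, model_id):
    """Raise early if a vector's length doesn't match the model's known dimension."""
    expected = EXPECTED_DIMS[model_id]
    if len(vector) != expected:
        raise ValueError(
            f"{model_id} should produce {expected}-dim vectors, got {len(vector)}"
        )
    return vector
```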
## Further Reading

- File format reference: full `.prompty` frontmatter syntax
- Connections: configuring OpenAI and Azure connections