# Embeddings

## What You'll Build

A `.prompty` file that generates text embeddings: dense float vectors
that capture semantic meaning. Use them for similarity search, retrieval
augmented generation (RAG), clustering, or classification.
## Step 1: Create an Embedding Prompt

Set `apiType: embedding` in the model block. The template body becomes the
default text to embed, but you'll typically pass text as an input:

```yaml
---
name: text-embedder
description: Generate embeddings for text input
model:
  id: text-embedding-3-small
  provider: openai
  apiType: embedding
  connection:
    kind: key
    apiKey: ${env:OPENAI_API_KEY}
inputSchema:
  properties:
    - name: text
      kind: string
      default: Hello, world!
template:
  format:
    kind: jinja2
  parser:
    kind: prompty
---
{{text}}
```

## Step 2: Generate an Embedding
```python
from prompty import execute

vector = execute("embed.prompty", inputs={"text": "Prompty is awesome"})

print(type(vector))  # <class 'list'>
print(len(vector))   # 1536 (for text-embedding-3-small)
print(vector[:5])    # [0.0123, -0.0456, 0.0789, ...]
```

The result is a `list[float]`: one embedding vector for the input text.
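Each call to `execute` is a billable network request, so repeated inputs are worth caching locally. A minimal sketch of a cache wrapper (`make_cached_embedder` is a hypothetical helper, not part of Prompty; it assumes the wrapped function takes a string and returns a vector):

```python
from functools import lru_cache

def make_cached_embedder(embed_fn, maxsize=1024):
    """Wrap a text -> vector function so repeated texts hit a local cache."""
    @lru_cache(maxsize=maxsize)
    def cached(text):
        # Return a tuple: immutable results are safe to share between callers.
        return tuple(embed_fn(text))
    return cached
```

You could wire it to Prompty with `embed = make_cached_embedder(lambda t: execute("embed.prompty", inputs={"text": t}))`.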
Async works the same way:
```python
from prompty import execute_async

vector = await execute_async("embed.prompty", inputs={"text": "Prompty is awesome"})
```

```typescript
import { execute } from "prompty";

const vector = await execute("embed.prompty", { text: "Prompty is awesome" });

console.log(Array.isArray(vector)); // true
console.log(vector.length);         // 1536
console.log(vector.slice(0, 5));    // [0.0123, -0.0456, 0.0789, ...]
```

## Batch Embeddings
To embed multiple texts at once, pass a list. The API handles batching efficiently in a single request:

```yaml
---
name: batch-embedder
description: Embed multiple texts in one call
model:
  id: text-embedding-3-small
  provider: openai
  apiType: embedding
  connection:
    kind: key
    apiKey: ${env:OPENAI_API_KEY}
inputSchema:
  properties:
    - name: texts
      kind: array
      description: List of texts to embed
template:
  format:
    kind: jinja2
  parser:
    kind: prompty
---
{{texts}}
```

```python
from prompty import execute

texts = [
    "What is machine learning?",
    "How do neural networks work?",
    "Explain gradient descent",
]

vectors = execute("batch-embed.prompty", inputs={"texts": texts})

print(len(vectors))     # 3 (one vector per input text)
print(len(vectors[0]))  # 1536
```

```typescript
import { execute } from "prompty";

const texts = [
  "What is machine learning?",
  "How do neural networks work?",
  "Explain gradient descent",
];

const vectors = await execute("batch-embed.prompty", { texts });

console.log(vectors.length);    // 3
console.log(vectors[0].length); // 1536
```

## Microsoft Foundry Embeddings
Switch to Microsoft Foundry by changing the provider and connection:

```yaml
---
name: foundry-embedder
model:
  id: ${env:AZURE_OPENAI_EMBEDDING_DEPLOYMENT}
  provider: foundry
  apiType: embedding
  connection:
    kind: key
    endpoint: ${env:AZURE_AI_PROJECT_ENDPOINT}
    apiKey: ${env:AZURE_AI_PROJECT_KEY}
inputSchema:
  properties:
    - name: text
      kind: string
template:
  format:
    kind: jinja2
  parser:
    kind: prompty
---
{{text}}
```

The code is identical; only the `.prompty` file changes:

```python
vector = execute("embed-foundry.prompty", inputs={"text": "Hello from Foundry"})
```

## Use Cases
### Semantic Search

Generate embeddings for your document corpus, store them in a vector database, then embed the user's query and find the nearest neighbors:
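At small scale you don't need a dedicated vector database; a tiny in-memory index is enough to follow along. This is a sketch (`InMemoryIndex` is a hypothetical helper, not part of Prompty) of an object that could also play the role of the `vector_store` in the RAG example below:

```python
import numpy as np

class InMemoryIndex:
    """Brute-force nearest-neighbor search over embedding vectors."""

    def __init__(self):
        self._vectors = []
        self._items = []

    def add(self, vector, item):
        v = np.asarray(vector, dtype=float)
        self._vectors.append(v / np.linalg.norm(v))  # normalize once at insert time
        self._items.append(item)

    def search(self, query_vector, top_k=3):
        q = np.asarray(query_vector, dtype=float)
        q = q / np.linalg.norm(q)
        # Dot product of unit vectors equals cosine similarity.
        sims = np.array([v @ q for v in self._vectors])
        order = np.argsort(sims)[::-1][:top_k]
        return [(self._items[i], float(sims[i])) for i in order]
```

For real corpora, swap this for a library like FAISS or a hosted vector database; the interface keeps the same shape.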
```python
import numpy as np
from prompty import execute

# Embed documents (do this once and store the vectors)
docs = [
    "Python is a programming language",
    "Cats are cute animals",
    "The weather is sunny",
]
doc_vectors = [execute("embed.prompty", inputs={"text": d}) for d in docs]

# Embed a query
query_vector = execute("embed.prompty", inputs={"text": "coding languages"})

# Cosine similarity
def cosine_sim(a, b):
    a, b = np.array(a), np.array(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

similarities = [cosine_sim(query_vector, dv) for dv in doc_vectors]
best_match = docs[np.argmax(similarities)]
print(best_match)  # "Python is a programming language"
```

### RAG (Retrieval Augmented Generation)
Combine embeddings with a chat prompt: retrieve relevant context, then pass it to a chat completion:

```python
from prompty import execute

# 1. Embed the user's question
query = "How do I install Prompty?"
query_vector = execute("embed.prompty", inputs={"text": query})

# 2. Search your vector store for relevant docs
# (vector_store is a placeholder for your vector database client)
relevant_docs = vector_store.search(query_vector, top_k=3)

# 3. Pass retrieved context to a chat prompt
context = "\n".join(doc.text for doc in relevant_docs)
answer = execute("rag-chat.prompty", inputs={"question": query, "context": context})
```

## Available Embedding Models
| Model | Dimensions | Provider |
| --- | --- | --- |
| text-embedding-3-small | 1536 | OpenAI |
| text-embedding-3-large | 3072 | OpenAI |
| text-embedding-ada-002 | 1536 | OpenAI / Microsoft Foundry |
Set the model in `model.id` in your `.prompty` file. For Azure, use your
deployment name as the `model.id`.
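Dimension mismatches tend to surface late (for example, a vector store built for 1536-dimension vectors rejecting 3072-dimension ones at query time), so it can pay to validate vectors as you generate them. A small sketch based on the table above (`check_dims` is a hypothetical helper, not part of Prompty):

```python
EXPECTED_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}

def check_dims(vector, model_id):
    """Raise early if a vector's length doesn't match the model's known dimension."""
    expected = EXPECTED_DIMS[model_id]
    if len(vector) != expected:
        raise ValueError(
            f"{model_id} should produce {expected}-dim vectors, got {len(vector)}"
        )
    return vector
```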
## Further Reading

- File format reference: full `.prompty` frontmatter syntax
- Connections: configuring OpenAI and Azure connections