VePrompts

Embedding Cost Calculator

Estimate embedding API costs for models from OpenAI, Cohere, Google, Voyage AI, Jina AI, Mistral, and Amazon (AWS Bedrock). Toggle batch discounts and include RAG query costs for a complete picture.

Last updated: 2026-06-13

Selected model: text-embedding-3-small (OpenAI). Best balance of cost and performance for most RAG and search use cases.

Estimated embedding cost

$0.01

for 665,000 tokens

Calculation

1,000 docs × 500 words
= 500,000 words
× 1.33 tokens/word
= 665,000 tokens
× $0.02/1M tokens
= $0.0133 (displayed as $0.01)
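The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation shown here, not the calculator's actual code; the function name and its parameters are made up for the example, and the 1.33 tokens-per-word ratio is the same rough average used above.

```python
def estimate_embedding_cost(num_docs, words_per_doc, price_per_million,
                            tokens_per_word=1.33):
    """Return (total_tokens, cost_in_dollars) for a one-time embedding job.

    tokens_per_word is a rough average for English text (an assumption);
    real token counts depend on the provider's tokenizer.
    """
    total_tokens = num_docs * words_per_doc * tokens_per_word
    cost = total_tokens / 1_000_000 * price_per_million
    return total_tokens, cost


# 1,000 docs of 500 words each at $0.02 per 1M tokens
tokens, cost = estimate_embedding_cost(1_000, 500, 0.02)
print(f"{tokens:,.0f} tokens -> ${cost:.4f}")  # 665,000 tokens -> $0.0133
```

For a production estimate, counting tokens with the provider's own tokenizer will be more accurate than a words-based ratio.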

Model comparison

| Model | Provider | Dimensions | Context (tokens) | Effective price | Cost |
|---|---|---|---|---|---|
| text-embedding-3-small (selected) | OpenAI | 1,536 | 8,191 | $0.02/1M | $0.01 |
| jina-embeddings-v3 | Jina AI | 1,024 | 8,192 | $0.02/1M | $0.01 |
| voyage-3-lite | Voyage AI | 512 | 32,000 | $0.03/1M | $0.02 |
| text-embedding-ada-002 | OpenAI | 1,536 | 8,191 | $0.10/1M | $0.07 |
| embed-english-v3 | Cohere | 1,024 | 512 | $0.10/1M | $0.07 |
| embed-multilingual-v3 | Cohere | 1,024 | 512 | $0.10/1M | $0.07 |
| text-embedding-004 | Google | 768 | 2,048 | $0.10/1M | $0.07 |
| text-multilingual-embedding-002 | Google | 768 | 2,048 | $0.10/1M | $0.07 |
| voyage-3 | Voyage AI | 1,024 | 32,000 | $0.10/1M | $0.07 |
| mistral-embed | Mistral | 1,024 | 8,092 | $0.10/1M | $0.07 |
| Titan Embeddings V2 | Amazon (AWS Bedrock) | 1,024 | 8,192 | $0.10/1M | $0.07 |
| text-embedding-3-large | OpenAI | 3,072 | 8,191 | $0.13/1M | $0.09 |

How embedding pricing works

Pay per token, not per document

Embedding APIs charge by the number of tokens you send, not the number of documents. At the 1.33 tokens-per-word ratio used above, a 500-word document is roughly 665 tokens. If you embed 10,000 such documents, you send about 6.65 million tokens to the API.

Batch API discounts

OpenAI and several other providers offer a batch API that processes jobs asynchronously at a significant discount, often 50%. If your embedding workload does not need real-time results, toggle the batch discount to see how much you can save.
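As a rough sketch of what the batch toggle does to an estimate, the snippet below applies a flat discount to a standard-rate cost. The helper name is hypothetical and the 50% default is only the typical figure mentioned above; check each provider's pricing page for the real rate.

```python
def apply_batch_discount(cost, discount=0.5):
    """Return the cost after a fractional batch discount (0.5 means 50% off).

    The 0.5 default is an assumption based on the "often 50%" figure;
    actual discounts vary by provider.
    """
    if not 0 <= discount <= 1:
        raise ValueError("discount must be between 0 and 1")
    return cost * (1 - discount)


# 10,000 documents of about 665 tokens each, at $0.02 per 1M tokens
standard = 6_650_000 / 1_000_000 * 0.02
batched = apply_batch_discount(standard)
print(f"standard: ${standard:.4f}, batch: ${batched:.4f}")
```

The saving scales linearly with volume, so it matters most for large one-time indexing jobs rather than for live query traffic.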

RAG adds query costs

Retrieval-augmented generation has two cost components: the one-time embedding of your knowledge base, and the recurring cost of embedding each user query plus the LLM generation call. This calculator lets you estimate both.
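One way to picture the two components is to keep the one-time indexing cost separate from the recurring monthly costs. The sketch below is an assumption-heavy illustration: the `RagWorkload` fields, the query volume, the tokens per query, and the flat per-query LLM cost are all placeholder values, not figures from this calculator or any provider.

```python
from dataclasses import dataclass


@dataclass
class RagWorkload:
    corpus_tokens: int              # tokens in the knowledge base
    queries_per_month: int          # assumed traffic
    tokens_per_query: int           # assumed average query length
    embed_price_per_million: float  # embedding price in $ per 1M tokens
    llm_cost_per_query: float       # assumed flat $ cost per generation call


def rag_costs(w: RagWorkload) -> dict:
    """Split RAG spend into one-time indexing and recurring monthly costs."""
    one_time = w.corpus_tokens / 1_000_000 * w.embed_price_per_million
    query_embedding = (w.queries_per_month * w.tokens_per_query
                       / 1_000_000 * w.embed_price_per_million)
    generation = w.queries_per_month * w.llm_cost_per_query
    return {
        "one_time_indexing": one_time,
        "monthly_query_embedding": query_embedding,
        "monthly_generation": generation,
    }


# Placeholder workload: 665,000-token corpus, 10,000 queries per month
workload = RagWorkload(665_000, 10_000, 20, 0.02, 0.001)
for name, dollars in rag_costs(workload).items():
    print(f"{name}: ${dollars:.4f}")
```

With these placeholder numbers the generation calls dominate the bill, which is a common pattern: query embeddings are short, so the LLM call is usually the larger recurring cost.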

Compare before you commit

Prices and context windows vary. A cheaper model may truncate long documents, while a more expensive model may deliver better retrieval accuracy. Use the comparison table to balance cost, context length, and output dimensions for your use case.
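To make the truncation trade-off concrete, the sketch below filters a handful of rows from the comparison table by context window and sorts the rest by price. The `MODELS` list copies only a few table entries for illustration, and `models_that_fit` is a hypothetical helper; it says nothing about retrieval quality, which you would need to evaluate separately.

```python
# (model, context window in tokens, price in $ per 1M tokens), from the table
MODELS = [
    ("text-embedding-3-small", 8_191, 0.02),
    ("voyage-3-lite", 32_000, 0.03),
    ("embed-english-v3", 512, 0.10),
    ("text-embedding-004", 2_048, 0.10),
    ("text-embedding-3-large", 8_191, 0.13),
]


def models_that_fit(doc_tokens, models=MODELS):
    """Return models whose context window holds the document, cheapest first."""
    fitting = [m for m in models if m[1] >= doc_tokens]
    return sorted(fitting, key=lambda m: m[2])


# A 500-word document is roughly 665 tokens, which exceeds a 512-token window
for name, context, price in models_that_fit(665):
    print(f"{name}: context {context:,}, ${price:.2f}/1M")
```

In practice, long documents are often split into chunks before embedding, so a short context window is a constraint on chunk size rather than a hard blocker.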