Embedding Models for RAG

Embedding models convert text into vector embeddings, which can be used for tasks such as similarity search, clustering, and classification. In retrieval-augmented generation (RAG), an embedding model converts the input text into an embedding that is compared against the embeddings of stored documents, so that the most relevant (similar) documents can be retrieved from the document store.
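
The retrieval step described above can be sketched as a top-k similarity search. This is a minimal illustration, not any vendor's API: the three-dimensional vectors are hand-written stand-ins for real embeddings, and the names cosine_similarity, retrieve, and store are hypothetical. In practice the vectors would come from one of the models in the table below and the search would run inside the document store.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, store, k=2):
    """Return the k documents whose embeddings are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), doc) for doc, vec in store.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

# Toy "embeddings" (assumption: real ones have hundreds or thousands of dimensions).
store = {
    "How to reset a password": [0.9, 0.1, 0.0],
    "Billing and invoices": [0.1, 0.9, 0.1],
    "Changing your login email": [0.8, 0.2, 0.1],
}

# Stand-in for the embedding of a user question about account access.
query_vec = [1.0, 0.0, 0.0]

top_docs = retrieve(query_vec, store, k=2)
print(top_docs)
```

The retrieved documents would then be passed to the language model as context for generating the answer.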

Vendor(s)	Model	dimensions	max tokens	cost	MTEB avg score	similarity metric
OpenAI	text-embedding-3-small	1536 (scales down)	8191	$0.02 / 1M tokens	62.3	cosine, dot product, L2
OpenAI	text-embedding-3-large	3072 (scales down)	8191	$0.13 / 1M tokens	64.6	cosine, dot product, L2
Google	text-embedding-preview-0409 / text-embedding-004	768 (scales down)	2048	$0.025 / 1M tokens in Vertex, free in Gemini	66.31	cosine, L2
Fireworks	thenlper/gte-large	1024	512	$0.016 / 1M tokens	63.23	cosine
Fireworks	nomic-ai/nomic-embed-text-v1.5	768 (scales down)	8192	$0.008 / 1M tokens	62.28	cosine
DeepInfra	gte-large	1024	512	$0.010 / 1M tokens	63.23	cosine
Cohere	embed-english-v3.0	1024	512	$0.10 / 1M tokens	64.5	cosine
Voyage	voyage-large-2-instruct	1024	16000	$0.12 / 1M tokens	68.28	cosine, dot product, L2
Voyage	voyage-2	1024	4000	$0.10 / 1M tokens		cosine, dot product, L2
Voyage	voyage-code-2	1536	16000	$0.12 / 1M tokens		cosine, dot product, L2
Voyage	voyage-law-2	1024	16000	$0.12 / 1M tokens		cosine, dot product, L2

Explanation of columns

Vendor(s): The vendor(s) that provide the model as a service.
Model: The name of the model.
dimensions: The number of dimensions in the vector embeddings that the model generates. "Scales down" means the model can also produce embeddings with fewer dimensions.
max tokens: The maximum number of tokens that can be passed to the model in a single request.
cost: The cost of using the model (based on the vendor's pricing page, where available).
MTEB avg score: The Massive Text Embedding Benchmark (MTEB) average score. MTEB is a benchmark for evaluating the quality of embeddings across a range of tasks. The higher the score, the better the embeddings.
similarity metric: The similarity metric that the model authors recommend for comparing the embeddings. Only metrics supported by pgvector are listed; some of the models may support additional metrics.
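
To make the three metrics in the last column concrete, the sketch below computes cosine distance, dot product, and L2 distance for a few toy vectors. The vectors and helper names are illustrative assumptions, not output from any listed model. It also shows a general property: when all vectors are normalized to unit length, the three metrics rank candidates in the same order, because for unit vectors the squared L2 distance equals 2 - 2 * (dot product).

```python
import math

def dot(a, b):
    """Dot (inner) product; higher means more similar."""
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):
    """Euclidean distance; lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity; lower means more similar."""
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

# Toy 2-dimensional vectors (assumption: stand-ins for real embeddings).
query = normalize([3.0, 4.0])
candidates = {
    "a": normalize([4.0, 3.0]),
    "b": normalize([0.0, 1.0]),
    "c": normalize([1.0, 0.0]),
}

# Rank candidates from most to least similar under each metric.
by_cosine = sorted(candidates, key=lambda k: cosine_distance(query, candidates[k]))
by_dot = sorted(candidates, key=lambda k: -dot(query, candidates[k]))
by_l2 = sorted(candidates, key=lambda k: l2_distance(query, candidates[k]))

print(by_cosine, by_dot, by_l2)
```

In pgvector these metrics correspond to the <=> (cosine distance), <#> (negative inner product), and <-> (L2 distance) operators. If a model's embeddings are not unit-length, the rankings can differ, which is one reason to follow the metric recommended for that model.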