Google Embedding Models
Note that Google has to AI API products:
- Google Cloud AI, also known as Vertex AI.
- Gemini API, also known as Generative Language API.
The NodeJS SDK for Gemini API is significantly nicer, so the example below will use it. However, Vertex AI has more embedding models - multi-lingual, multi-modal, images and video.
Availble models
The example below is based on the new text-embedding-preview-0409
model, also called text-embedding-0004
in Gemini APIs.
It is the best text embedding model available from Google and ranks well in the MTEB benchmark.
Model | dimensions | max tokens | cost | MTEB avg score | similarity metric |
---|---|---|---|---|---|
text-embedding-preview-0409 / text-embedding-0004 | 768 (scales down) | 2048 | $0.025/1M tokens in Vertex, free in Gemini | 66.31 | cosine, L2 |
Usage
To use Google's embedding models, you need a Google Cloud project. The example below uses Gemini, so you will need to have Gemini Generative Language APIs enabled. and you will also need an API key with permissions to access the Generative Language API. You can get one by going to APIs & Services -> Credentials in your Google Cloud Console. (You can also use Google's AI Studio to get an API key).
Vertex has separate API to enable, separate key permissions, separate pricing and a different SDK (which we don't document here).
Installing dependencies
npm install @niledatabase/server @google/generative-ai
Generating embeddings with Google
const { GoogleGenerativeAI } = require("@google/generative-ai");
const GOOGLE_API_KEY = "my-google-api-key";
const model = "models/text-embedding-004"; // or your favorite Google model
const genAI = new GoogleGenerativeAI(GOOGLE_API_KEY);
const embed = genAI.getGenerativeModel({
model: model,
outputDimensionality: 768, // optional, default 768 but can be scaled down
});
const input_text = `The future belongs to those who believe in the beauty of their
dreams.`;
// Note, if you don't want to specify taskType and title, you can simply use:
// const doc_vec_resp = await model.embedContent(input_text);
const doc_vec_resp = await embed.embedContent({
content: { parts: [{ text: input_text }] },
taskType: "RETRIEVAL_DOCUMENT", // for embeddings of documents stored in pg_vector
// title of the document. can be used with RETRIEVAL_DOCUMENT and
// possibly improve the quality of the embeddings
// title: ""
});
const question = "Who does the future belong to?";
const question_vec_resp = await embed.embedContent({
content: { parts: [{ text: question }] },
taskType: "RETRIEVAL_QUERY", //embeddings of questions used to search documents
});
// Google returns a response object with an embedding field that
// contains the embeddings in the value field
const doc_vec = doc_vec_resp.embedding.values;
const question_vec = question_vec_resp.embedding.values;
Storing and retrieving the embeddings
// set up Nile as the vector store
const { Nile } = await import("@niledatabase/server");
const NILEDB_USER = "you need a nile user with access to a nile database";
const NILEDB_PASSWORD = "and a password for that user";
const nile = await Nile({
user: NILEDB_USER,
password: NILEDB_PASSWORD,
});
// create table to store vectors - vector size must match the model dimensions
await nile.db.query(
"CREATE TABLE IF NOT EXISTS embeddings (embedding vector(768))"
);
// store vector in a table
await nile.db.query("INSERT INTO embeddings (embedding) values ($1)", [
JSON.stringify(doc_vec.map((v) => Number(v))),
]);
// search for similar vectors
let db_resp = await nile.db.query(
"select embedding from embeddings order by embedding<=>$1 limit 1",
[JSON.stringify(question_vec.map((v) => Number(v)))]
);
// Postgres returns an object, with array of rows each column is a
// property of the row the vector is represented as a string
let similar_str = db_resp.rows[0].embedding;
let similar = similar_str
.substring(1, similar_str.length - 1)
.split(",")
.map((v) => parseFloat(v));
// check that we got the same vector back
let same = true;
for (let i = 0; i < similar.length; i++) {
if (Math.abs(similar[i] - doc_vec[i]) > 0.000001) {
same = false;
}
}
console.log("got same vector? " + same);
Additional notes
Scale down
Google's text-embedding-0004
model has 768 dimensions, but you can scale it down to lower dimensions.
The older model, text-embedding-0001
does not support scaling down.
Task types
Google's documentation about taskTypes is a bit confusing. Some documents say that taskType
is only supported
by text-embedding-0001
model, and other say that it works with 0004
as well. My experiments showed that
taskType
works with 0004
, so I have included it in the example above. I assume there are typos in the docs.
Distance metrics
Google documentation doesn't mention the distance metric used for similarity search and doesn't mention anything about normalization either. However, the MTEB benchmark records for Google's model show use of Cosine and L2 distance.