Postgres pgvector: Semantic Search with Embeddings (2026)

Postgres’s pgvector extension does more than store vectors: it lets you query by semantic similarity in ordinary SQL, right next to the rest of your relational data.

Imagine you have a table of product descriptions. Instead of searching for exact keywords like "blue widget," you want to find products that are semantically similar to "a comfortable, lightweight running shoe." This is where pgvector shines. It lets you store numerical representations (embeddings) of your text, generated by AI models, and then run similarity searches on those embeddings, which stay fast on large tables once an index is in place.

Let’s see it in action. First, enable the extension in your database (the pgvector package must already be installed on the Postgres server).

CREATE EXTENSION IF NOT EXISTS vector;

Now, create a table to hold your product data, including a vector column to store the embeddings.

CREATE TABLE products (
    id serial PRIMARY KEY,
    name text,
    description text,
    embedding vector(1536) -- Assuming embeddings from a model like OpenAI's text-embedding-ada-002
);

The vector(1536) type declares that every value in the column has exactly 1536 dimensions, which matches the output of text-embedding-ada-002. Use whatever dimension your embedding model produces.
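Postgres rejects a vector whose length differs from the declared dimension, so it can help to check the length in application code before inserting. The following is a minimal sketch, assuming a hypothetical helper named to_vector_literal (it is not part of pgvector or psycopg2). It also renders the list in the bracketed text format that pgvector accepts as input.

```python
import math

EXPECTED_DIM = 1536  # must match the vector(1536) column definition


def to_vector_literal(embedding, expected_dim=EXPECTED_DIM):
    """Validate an embedding and render it in pgvector's text format.

    Hypothetical helper for illustration; not part of any library.
    """
    if len(embedding) != expected_dim:
        raise ValueError(
            f"expected {expected_dim} dimensions, got {len(embedding)}"
        )
    values = [float(x) for x in embedding]
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinite values")
    # pgvector parses text such as '[0.1,0.2,0.3]'
    return "[" + ",".join(repr(v) for v in values) + "]"


# Tiny 3-dimensional example so the output is easy to read
print(to_vector_literal([0.5, -1.0, 2.0], expected_dim=3))
```

Failing early in the application gives a clearer error message than a failed INSERT in the middle of a batch.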

To populate this table, you’d use an AI model to generate embeddings for your product descriptions. For example, using Python with the openai library (version 1.0 or later) and psycopg2:

from openai import OpenAI
import psycopg2

# openai>=1.0 uses a client object; the older openai.Embedding.create call was removed in 1.0
client = OpenAI(api_key="YOUR_OPENAI_API_KEY")
conn = psycopg2.connect(database="yourdb", user="youruser", password="yourpassword", host="yourhost", port="yourport")
cur = conn.cursor()

def get_embedding(text):
    # Returns the embedding for one piece of text as a list of floats
    response = client.embeddings.create(
        input=text,
        model="text-embedding-ada-002"
    )
    return response.data[0].embedding

# Example product data
products_data = [
    {"name": "Nike Air Zoom", "description": "A lightweight and responsive running shoe designed for speed."},
    {"name": "Adidas Ultraboost", "description": "Experience ultimate comfort and energy return with these versatile running shoes."},
    {"name": "Brooks Ghost", "description": "A neutral running shoe known for its smooth ride and reliable cushioning."},
    {"name": "New Balance Fresh Foam", "description": "Soft cushioning meets a smooth transition for a comfortable run."}
]

for product in products_data:
    embedding = get_embedding(product["description"])
    # psycopg2 sends the Python list as a Postgres array; the explicit cast converts it to vector
    cur.execute(
        "INSERT INTO products (name, description, embedding) VALUES (%s, %s, %s::vector)",
        (product["name"], product["description"], embedding)
    )

conn.commit()
cur.close()
conn.close()

Once your data is loaded with embeddings, you can perform semantic searches. The core of this is the distance operator. pgvector offers several, but the most common is <=> (cosine distance). Cosine distance is 1 minus cosine similarity, so a smaller value indicates higher similarity.
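To make the numbers concrete, here is a small pure-Python sketch of cosine distance. It is an illustration of the formula, not pgvector’s implementation, and raising an error for zero vectors is a choice made for this example.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - (a . b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 0.0]
print(cosine_distance(query, [2.0, 0.0]))   # same direction -> 0.0
print(cosine_distance(query, [0.0, 3.0]))   # orthogonal     -> 1.0
print(cosine_distance(query, [-1.0, 0.0]))  # opposite       -> 2.0
```

Only the direction of the vectors matters, not their length, which is why [1, 0] and [2, 0] are at distance 0.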

To find products similar to "a shoe for jogging with soft soles," you’d first get the embedding for this query and then search:

-- First, get the embedding for the query (this would typically be done in your application code)
-- Let's assume the embedding for "a shoe for jogging with soft soles" is:
-- [0.012, -0.034, ..., 0.056]

-- Then, perform the similarity search in PostgreSQL
SELECT
    id,
    name,
    description,
    embedding <=> '[0.012, -0.034, ..., 0.056]' AS distance
FROM
    products
ORDER BY
    distance
LIMIT 5;

This query returns the top 5 most semantically similar products, ordered by their cosine distance to the query embedding.
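In application code, the query embedding is usually passed as a bound parameter rather than pasted into the SQL text. The sketch below assumes psycopg2-style %s placeholders and a hypothetical helper named build_similarity_query; it only builds the statement and its parameters, so it runs without a database.

```python
def build_similarity_query(query_embedding, limit=5):
    """Return (sql, params) for a cosine-distance search on products.

    Hypothetical helper for illustration. The embedding is sent as text in
    pgvector's '[x,y,z]' format and cast to vector on the server.
    """
    vector_text = "[" + ",".join(str(float(x)) for x in query_embedding) + "]"
    sql = (
        "SELECT id, name, description, "
        "embedding <=> %s::vector AS distance "
        "FROM products "
        "ORDER BY distance "
        "LIMIT %s"
    )
    return sql, (vector_text, limit)


sql, params = build_similarity_query([0.25, -0.5], limit=3)
print(sql)
print(params)
```

With an open cursor, this could be used as cur.execute(*build_similarity_query(get_embedding("a shoe for jogging with soft soles"))) followed by cur.fetchall().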

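The real magic of pgvector lies in its indexing capabilities. Without an index, pgvector performs an exact search: a brute-force, O(N) scan that computes the distance to every row. That gives perfect recall, but it becomes too slow as the table grows. pgvector provides two approximate index types, ivfflat and hnsw, which trade a small amount of recall for much faster queries.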

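ivfflat (Inverted File Flat) is a good starting point. It partitions your vector space into a configurable number of lists, each represented by a centroid. When you search, it only looks within the lists whose centroids are closest to the query, significantly reducing the number of comparisons. Because the centroids are computed from existing rows, create the index after the table has data.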
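The partitioning idea can be sketched in a few lines of Python. This is a toy illustration under simplifying assumptions, not pgvector’s code: the centroids are hard-coded (pgvector derives them with k-means when the index is built), the data is two-dimensional, and Euclidean distance is used throughout. The probes argument mirrors the idea of scanning more than one list per query.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_lists(vectors, centroids):
    """Assign each vector to the list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for vec in vectors:
        nearest = min(range(len(centroids)), key=lambda i: l2(vec, centroids[i]))
        lists[nearest].append(vec)
    return lists


def search(query, centroids, lists, probes=1, k=1):
    """Scan only the `probes` lists whose centroids are closest to the query."""
    ranked = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))
    candidates = [vec for i in ranked[:probes] for vec in lists[i]]
    return sorted(candidates, key=lambda vec: l2(query, vec))[:k]


centroids = [(0.0, 0.0), (10.0, 10.0)]  # hard-coded for the example
vectors = [(1.0, 1.0), (0.0, 2.0), (9.0, 9.0), (4.9, 4.9)]
lists = build_lists(vectors, centroids)

query = (5.2, 5.2)
print(search(query, centroids, lists, probes=1))  # [(9.0, 9.0)]: misses the true neighbor
print(search(query, centroids, lists, probes=2))  # [(4.9, 4.9)]: scans both lists
```

The query sits just on the far side of the boundary from its true nearest neighbor, so probing a single list returns the wrong answer. This is the recall cost of approximate search, and probing more lists recovers it at the price of more comparisons.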

To create an ivfflat index:

CREATE INDEX ON products USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

Here, lists = 100 means the index will partition the vector space into 100 lists. The vector_cosine_ops operator class tells the index to use cosine distance, so it must match the operator (<=>) used in your queries.
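How many lists to use depends on table size. The pgvector README suggests, as starting points only, rows / 1000 for up to 1M rows and sqrt(rows) beyond that, with sqrt(lists) as an initial number of lists to probe per query. A small sketch of that rule of thumb, using a hypothetical helper named suggest_ivfflat_params:

```python
import math


def suggest_ivfflat_params(row_count):
    """Starting-point values for ivfflat, based on the pgvector README's
    rule of thumb. Hypothetical helper; tune against your own recall tests."""
    if row_count <= 1_000_000:
        lists = max(1, row_count // 1000)
    else:
        lists = max(1, round(math.sqrt(row_count)))
    probes = max(1, round(math.sqrt(lists)))
    return {"lists": lists, "probes": probes}


print(suggest_ivfflat_params(100_000))    # {'lists': 100, 'probes': 10}
print(suggest_ivfflat_params(4_000_000))  # {'lists': 2000, 'probes': 45}
```

Treat these as initial guesses: measure recall and latency on your own data before settling on values.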

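hnsw (Hierarchical Navigable Small World) generally offers a better speed-recall trade-off than ivfflat, at the cost of slower index builds and higher memory use. It builds a multi-layered graph where nodes are vectors and edges connect nearby vectors. Searching involves traversing this graph from the sparse upper layers down to the dense bottom layer. Unlike ivfflat, it has no training step, so the index can be created on an empty table.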

To create an hnsw index:

CREATE INDEX ON products USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);

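m is the maximum number of connections per node in each layer of the graph, and ef_construction is the size of the candidate list used while building the graph: higher values improve index quality but slow down the build. The values shown (16 and 64) are the defaults. For searching, you can tune the hnsw.ef_search setting at query time (40 by default); higher values improve recall at the cost of speed.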
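Query-time tuning is done with ordinary SET statements: hnsw.ef_search for hnsw indexes, and the analogous ivfflat.probes for ivfflat indexes. The sketch below uses a hypothetical helper named search_tuning_statement that only builds the statement text; in practice you would execute the result on the same connection before running the search.

```python
def search_tuning_statement(index_type, value, local=False):
    """Build the SET statement that tunes approximate search for a session.

    Hypothetical helper for illustration. With local=True it emits SET LOCAL,
    which only lasts until the end of the current transaction.
    """
    settings = {"hnsw": "hnsw.ef_search", "ivfflat": "ivfflat.probes"}
    if index_type not in settings:
        raise ValueError(f"unknown index type: {index_type!r}")
    value = int(value)
    if value < 1:
        raise ValueError("value must be a positive integer")
    scope = "SET LOCAL" if local else "SET"
    return f"{scope} {settings[index_type]} = {value}"


print(search_tuning_statement("hnsw", 100))                # SET hnsw.ef_search = 100
print(search_tuning_statement("ivfflat", 10, local=True))  # SET LOCAL ivfflat.probes = 10
```

Using SET LOCAL inside a transaction keeps the change scoped to a single query, which is convenient when only some queries need higher recall.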

The mental model to build is that you’re no longer searching for exact string matches. You’re querying a space of meaning. Your text is converted into points in a high-dimensional space, and pgvector allows you to find the points closest to your query point. The indexes (ivfflat, hnsw) are clever ways to avoid checking every single point in that space, making the search efficient. The core operations are distance calculations (cosine, Euclidean, inner product) between vectors, and the indexes reduce how many of those calculations each query needs.

The choice between ivfflat and hnsw isn’t just about speed; the two work in different ways. ivfflat is a partitioning scheme: it finds approximate nearest neighbors by scanning only the partitions nearest the query, so it builds quickly and uses less memory, but recall drops when a true neighbor sits in a partition that was not scanned. hnsw navigates a multi-layered graph, which usually yields better recall at a given query speed on very large datasets, in exchange for a slower build and a larger index.

The next step is exploring distance metrics beyond cosine, such as Euclidean distance (<->) and inner product (<#>, which returns the negative inner product so that smaller values still mean closer matches), and understanding when each one suits your embeddings.
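As a starting point for that exploration, here is a pure-Python sketch of the other two metrics. Note that pgvector’s <#> operator returns the negative inner product, so that ordering ascending still puts the best match first. The functions below illustrate the formulas and are not pgvector’s implementation. The final check shows a useful identity: for unit-length vectors, squared Euclidean distance equals 2 plus twice the negative inner product, so both metrics rank neighbors in the same order.

```python
import math


def l2_distance(a, b):
    """Euclidean distance, the quantity computed by <->."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Negative inner product, the quantity computed by <#>."""
    return -sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])

squared_l2 = l2_distance(a, b) ** 2
from_inner_product = 2 + 2 * negative_inner_product(a, b)

# For unit vectors: |a - b|^2 = 2 - 2 * (a . b)
print(math.isclose(squared_l2, from_inner_product))  # True
```

If your embeddings are already normalized to length 1, the choice between these metrics changes the distance values but not the ranking.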