Pinecone’s Multi-Vector feature allows you to associate multiple distinct vector representations with a single document, fundamentally changing how you can index and query your data.

Let’s see it in action. Imagine you have a document about "quantum entanglement." You might want to represent it with a dense vector capturing its core semantic meaning, but also a sparse vector highlighting specific keywords like "superposition," "non-locality," and "Bell’s theorem."

Here’s how you’d upsert this with Pinecone’s Python client:

import os

from pinecone import Pinecone

# Initialize the client (recent Pinecone SDKs need only the API key)
api_key = os.environ.get("PINECONE_API_KEY")
pc = Pinecone(api_key=api_key)

# Connect to your index
index_name = "my-multivector-index"
index = pc.Index(index_name)

# Define your document ID and data
doc_id = "quantum_entanglement_doc_1"
doc_metadata = {"title": "A Primer on Quantum Entanglement"}

# Dense vector representation
dense_vector = [0.1, 0.2, 0.3, ..., 0.9] # 1536 dimensions for example

# Sparse vector representation (keys are dimension indices, values are weights)
sparse_vector = {"indices": [10, 100, 500, 1500], "values": [0.5, 0.8, 0.3, 0.1]}

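# Upsert one record carrying both the dense and the sparse representation.
# Note: storing sparse values alongside dense ones requires an index
# that uses the dotproduct metric.
index.upsert(
    vectors=[
        {
            "id": doc_id,
            "values": dense_vector,          # dense embedding
            "sparse_values": sparse_vector,  # sparse keyword weights
            "metadata": doc_metadata,
        }
    ]
)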
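
The sparse vector above is written out by hand. As a minimal sketch of where such indices and weights could come from, the snippet below uses the hashing trick to turn keywords into the `{"indices": ..., "values": ...}` shape. The helper name `keywords_to_sparse` and the `num_buckets` parameter are hypothetical, chosen for illustration; a real pipeline would more likely use a trained sparse encoder such as BM25 or SPLADE.

```python
import zlib
from collections import Counter


def keywords_to_sparse(tokens, num_buckets=2**20):
    """Map tokens to a sparse dict via the hashing trick (illustrative only)."""
    counts = Counter()
    for token in tokens:
        # crc32 is stable across runs, unlike Python's built-in hash()
        bucket = zlib.crc32(token.lower().encode("utf-8")) % num_buckets
        counts[bucket] += 1
    total = sum(counts.values())
    indices = sorted(counts)  # unique, ascending bucket ids
    values = [counts[i] / total for i in indices]  # normalized term frequency
    return {"indices": indices, "values": values}


example_sparse = keywords_to_sparse(
    ["superposition", "non-locality", "Bell's theorem", "superposition"]
)
```

Distinct tokens can collide in the same bucket; a larger `num_buckets` makes that less likely at no storage cost, since only non-zero entries are kept.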

When you query, you pass a dense query vector and, optionally, a sparse one to be scored alongside it. For example, to find documents that are semantically similar to a query and also share its keywords, you might construct a query like this:

query_dense = [0.15, 0.25, 0.35, ..., 0.85]
query_sparse = {"indices": [10, 500], "values": [0.6, 0.4]}

# Query with both dense and sparse vectors in a single request
query_response = index.query(
    vector=query_dense,          # dense query embedding
    sparse_vector=query_sparse,  # sparse keyword weights for the same query
    top_k=5,
    include_metadata=True,
)
# The query call itself has no weighting parameter; to favor one
# representation, scale the dense and sparse values before sending them.
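
The query above takes the two vectors as given. One common way to shift the balance between semantic and keyword matching is to scale both query vectors by a convex weight before sending them. The sketch below assumes dot-product scoring; the function name `weight_hybrid_query` and the parameter `alpha` are illustrative, not part of the Pinecone client.

```python
def weight_hybrid_query(dense, sparse, alpha):
    """Scale a dense/sparse query pair by a convex weight alpha in [0, 1].

    alpha = 1.0 keeps only the dense signal; alpha = 0.0 keeps only the sparse one.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    weighted_dense = [v * alpha for v in dense]
    weighted_sparse = {
        "indices": list(sparse["indices"]),
        "values": [v * (1.0 - alpha) for v in sparse["values"]],
    }
    return weighted_dense, weighted_sparse


# Lean toward semantic similarity (75% dense, 25% sparse)
dense_q, sparse_q = weight_hybrid_query(
    [0.2, 0.4], {"indices": [10, 500], "values": [0.6, 0.4]}, alpha=0.75
)
```

The weighted pair would then be passed as `vector=dense_q, sparse_vector=sparse_q`. Because the weights are applied to the query only, the stored vectors do not need to be re-indexed when `alpha` changes.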

This addresses the limitations of relying on a single vector. A dense vector captures the overall meaning of a document but can miss crucial, specific terms. A sparse vector excels at keyword matching but lacks semantic nuance. Combining the two plays to the strengths of both and tends to yield more precise, relevant search results. Internally, Pinecone indexes these different vector types separately but associates them with the same document ID. When querying, it can retrieve and rank based on one or a combination of these vector types.

The main levers you control are the dense and sparse representations you supply at upsert time and how you formulate your queries: whether you add a sparse vector to the dense query, and how heavily you weight each. The two representations also differ in shape: the dense vector must match the dimension configured for the index, while a sparse vector lists only its non-zero indices and values.

What most people don’t realize is that when you perform a hybrid query, Pinecone doesn’t necessarily perform two separate searches and then merge the results. For certain configurations and vector types, it can perform a fused search where the scoring logic is integrated, allowing for more efficient and potentially more accurate retrieval than a simple post-filtering or score-combining approach. The specific implementation details can vary based on the vector types and the underlying search algorithms employed.
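
To see why a fused search is possible at all, consider dot-product scoring: the dot product over a combined dense-plus-sparse vector equals the sum of the dense and sparse dot products, so one pass can produce the hybrid score directly. The toy functions below illustrate that arithmetic only; they are an assumption-laden sketch, not a description of Pinecone's internal implementation.

```python
def dot_dense(a, b):
    """Dot product of two equal-length dense vectors."""
    return sum(x * y for x, y in zip(a, b))


def dot_sparse(a, b):
    """Dot product of two sparse vectors given as indices/values dicts."""
    b_lookup = dict(zip(b["indices"], b["values"]))
    # Only indices present in both vectors contribute to the sum
    return sum(v * b_lookup.get(i, 0.0) for i, v in zip(a["indices"], a["values"]))


def fused_score(q_dense, q_sparse, d_dense, d_sparse):
    """Hybrid score: dense similarity plus sparse keyword overlap."""
    return dot_dense(q_dense, d_dense) + dot_sparse(q_sparse, d_sparse)


score = fused_score(
    [1.0, 2.0], {"indices": [1, 3], "values": [0.5, 2.0]},
    [3.0, 4.0], {"indices": [3, 7], "values": [4.0, 1.0]},
)
```

Here only sparse index 3 is shared between query and document, so the sparse part contributes a single term on top of the dense similarity.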

The next step is exploring how to dynamically tune the weights for combining different vector types in your hybrid search queries.
