When building RAG pipelines across multiple projects, I've used all four of these vector databases in production. Each has left me with opinions about where it shines and where it hurts. This is that comparison - not benchmarks from vendor marketing sites, but tradeoffs from actual use.
What a Vector Database Actually Does
A vector database stores high-dimensional embeddings and supports approximate nearest neighbor (ANN) search. Beyond raw ANN search, modern vector databases support metadata filtering, hybrid search (dense + sparse/BM25), multi-vector search, and namespace/tenant separation.
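The core operation is easy to see in plain Python. This is a brute-force cosine-similarity search - the exact computation that ANN indexes like HNSW approximate to avoid scanning every vector (a toy sketch, not how any of these databases are implemented):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k=5):
    """Exact nearest neighbors by cosine - O(n*d) per query, fine for small n.
    ANN indexes trade a little recall for sublinear query time."""
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in vectors.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

store = {"doc_1": [1.0, 0.0], "doc_2": [0.7, 0.7], "doc_3": [0.0, 1.0]}
print(top_k([1.0, 0.1], store, k=2))  # doc_1 closest, then doc_2
```

Everything else these databases add - filtering, hybrid search, tenancy - is layered on top of this one primitive.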
Pinecone
What it is: Fully managed, serverless vector database. Zero infrastructure to operate.
Strengths: Easiest operational experience in the category. Good managed scaling. Serverless tier (free up to 2GB) is genuinely useful for early-stage products.
Weaknesses: Most expensive at scale - at high query volumes, costs can be 5-10x self-hosted alternatives. Hybrid search requires you to generate sparse vectors client-side; there is no server-side BM25. Data leaves your infrastructure (a compliance issue for healthcare/finance).
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-api-key")
pc.create_index(
    name="documents", dimension=1536, metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)
index = pc.Index("documents")
# embedding_vector / query_embedding come from your embedding model
index.upsert(vectors=[
    {"id": "doc_1", "values": embedding_vector,
     "metadata": {"source": "clinical_guidelines", "year": 2023}}
])
results = index.query(
    vector=query_embedding, top_k=5,
    filter={"source": {"$eq": "clinical_guidelines"}, "year": {"$gte": 2022}},
    include_metadata=True
)
Best for: Prototypes that need to ship quickly, teams without infrastructure capacity.
Weaviate
What it is: Open-source vector database with optional managed cloud. Schema-first. First-class hybrid search.
Strengths: Best hybrid search in the category - BM25 + vector search with configurable alpha parameter. Module ecosystem with built-in integrations for OpenAI, Cohere, HuggingFace. GraphQL and REST APIs.
Weaknesses: Schema complexity adds friction. Memory-heavy - HNSW loads entire index into RAM. Self-hosted operational complexity is real.
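The RAM cost is worth estimating before you commit. A back-of-envelope calculation, assuming float32 vectors and a ballpark HNSW graph overhead of a few dozen links per node (actual overhead varies by implementation and parameters):

```python
def hnsw_ram_estimate_gb(num_vectors, dim, m_links=32, bytes_per_link=4):
    """Rough RAM estimate for an in-memory HNSW index over float32 vectors.
    m_links approximates average graph edges per node - an assumption, tune it."""
    vector_bytes = num_vectors * dim * 4                   # raw embeddings
    graph_bytes = num_vectors * m_links * bytes_per_link   # adjacency lists
    return (vector_bytes + graph_bytes) / 1024**3

# 10M OpenAI-sized embeddings (1536 dims) is ~58 GB before anything else runs
print(f"{hnsw_ram_estimate_gb(10_000_000, 1536):.1f} GB")
```

The raw vectors dominate; at 1536 dimensions, every million vectors costs roughly 6 GB of RAM, which is why the memory-heavy complaint bites at scale.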
import weaviate
from weaviate.classes.config import Configure, Property, DataType

client = weaviate.connect_to_local()
client.collections.create(
    name="Document",
    vectorizer_config=Configure.Vectorizer.none(),  # we supply our own vectors
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="year", data_type=DataType.INT),
    ]
)
collection = client.collections.get("Document")
# query_embedding comes from your embedding model
results = collection.query.hybrid(
    query="elevated liver enzymes",
    vector=query_embedding,
    alpha=0.5,  # 0 = pure BM25, 1 = pure vector
    limit=5
)
client.close()
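What alpha does is conceptually a weighted blend of the two normalized score lists. A sketch of relative-score fusion - illustrative only, Weaviate's actual fusion algorithms differ in their details:

```python
def fuse(bm25_scores, vector_scores, alpha=0.5):
    """Blend per-document BM25 and vector scores after min-max normalization.
    alpha=0 -> pure BM25, alpha=1 -> pure vector (matching Weaviate's convention)."""
    def normalize(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {d: (s - lo) / span for d, s in scores.items()}
    b, v = normalize(bm25_scores), normalize(vector_scores)
    docs = set(b) | set(v)
    return sorted(
        ((d, (1 - alpha) * b.get(d, 0.0) + alpha * v.get(d, 0.0)) for d in docs),
        key=lambda s: s[1], reverse=True,
    )

bm25 = {"doc_1": 12.0, "doc_2": 3.0}      # exact keyword hits
vec = {"doc_2": 0.91, "doc_3": 0.88}      # semantic neighbors
print(fuse(bm25, vec, alpha=0.75))        # doc_2 wins: ranked by both signals
```

The practical takeaway: documents that appear in both result lists get boosted, which is exactly why hybrid search helps with exact terms like drug names that embeddings blur together.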
Best for: Production RAG where hybrid search matters, teams comfortable with schema design.
Chroma
What it is: Open-source embedding database designed for developer experience.
Strengths: Fastest time to working prototype - five lines of Python and you have a functioning vector store. Native LangChain and LlamaIndex integrations. No schema required.
Weaknesses: Not battle-tested above roughly 500K vectors. Limited query capabilities compared to the others. Its managed cloud offering is new and far less proven than the competition's.
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient(path="./chroma_db")
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key="your-api-key", model_name="text-embedding-3-small"
)
collection = client.get_or_create_collection(
    name="documents", embedding_function=openai_ef
)
collection.add(
    documents=["Full text of document..."],
    metadatas=[{"source": "clinical_guidelines", "year": 2023}],
    ids=["doc_1"]
)
results = collection.query(
    query_texts=["elevated liver enzymes"],
    n_results=5,
    where={"year": {"$gte": 2022}}
)
Best for: Prototypes, local development, learning RAG, datasets under a few hundred thousand vectors.
Qdrant
What it is: Open-source vector database written in Rust. High performance, rich filtering, managed cloud available.
Strengths: Best performance per dollar for self-hosted deployments. Payload filtering is incredibly flexible - nested JSON, geographic, full-text, numeric ranges, all combinable. Sparse vectors supported natively. HNSW parameters are tunable per collection.
Weaknesses: Less mature ecosystem. Fewer native integrations than Weaviate or Chroma.
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, PointStruct, Filter, FieldCondition, Range
)

client = QdrantClient(url="http://localhost:6333")
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)
# embedding_vector / query_embedding come from your embedding model
client.upsert(
    collection_name="documents",
    points=[PointStruct(
        id=1, vector=embedding_vector,
        payload={"source": "clinical_guidelines", "year": 2023}
    )]
)
results = client.search(
    collection_name="documents",
    query_vector=query_embedding,
    limit=5,
    query_filter=Filter(must=[FieldCondition(key="year", range=Range(gte=2022))])
)
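The filter clauses compose like boolean logic: must is AND, should is OR, must_not is NOT. A plain-Python sketch of how a combined filter evaluates against one payload - illustrative only, the real matching happens inside Qdrant's payload index:

```python
def matches(payload, must=(), should=(), must_not=()):
    """Evaluate predicate lists against a payload dict:
    all of `must`, at least one of `should` (if any given), none of `must_not`."""
    if not all(pred(payload) for pred in must):
        return False
    if should and not any(pred(payload) for pred in should):
        return False
    return not any(pred(payload) for pred in must_not)

doc = {"source": "clinical_guidelines", "year": 2023, "region": "us"}
ok = matches(
    doc,
    must=[lambda p: p["year"] >= 2022],
    should=[lambda p: p["source"] == "clinical_guidelines",
            lambda p: p["source"] == "drug_labels"],
    must_not=[lambda p: p.get("region") == "eu"],
)
print(ok)  # True
```

Because the conditions are evaluated against an index rather than post-filtering ANN results, combining many of them stays cheap - this is the "fine-grained filtering control" that sets Qdrant apart.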
Best for: Production at scale, cost-sensitive deployments, fine-grained filtering control.
The Decision Framework
- Scale and cost: Under 100K vectors? Any option works. Above 10M with high query volume? Qdrant or Weaviate self-hosted will be significantly cheaper than Pinecone.
- Operational capacity: No DevOps? Pinecone. Have infrastructure capacity? Qdrant or Weaviate.
- Hybrid search requirement: If exact keyword matching matters alongside semantic search, use Weaviate or Qdrant.
- Data compliance: Healthcare or finance with data residency requirements? Self-hosted only.
- Speed to prototype: Building a demo? Chroma. Twenty minutes to a working RAG pipeline. Migrate later.
The migration cost between vector databases is manageable - an evening of re-embedding and re-indexing for most datasets under 1M vectors. Don't over-engineer the initial choice.
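If you do migrate, the work is a batched export-and-reinsert loop: scroll records out of the source, re-embed if the model changed, upsert into the target. A minimal batching helper (the scroll and upsert calls themselves are whatever your source and target clients provide):

```python
from itertools import islice

def batched(iterable, size=100):
    """Yield lists of up to `size` items - every vector DB's upsert
    API is happiest with batches rather than one record at a time."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch

# sketch: for batch in batched(source_records): target_client.upsert(batch)
for batch in batched(range(250), size=100):
    print(len(batch))  # 100, 100, 50
```

The embedding calls dominate the wall-clock time, not the database writes, so batch those too.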