Vector databases in 2026: pgvector, Pinecone, Weaviate, Qdrant
Most teams should start with pgvector and not move until they hit a real wall. An honest comparison of when each one wins, with the operational tradeoffs.
Picking a vector database in 2026 should be a 10-minute decision for most teams, not a six-week procurement exercise. The honest answer for 80% of cases is "use pgvector, you already have Postgres". The other 20% have specific reasons, and this post is about those reasons.
We've shipped production RAG on three of these four. What follows is our working comparison.
Start here: pgvector
pgvector is a Postgres extension that adds a vector type, indexes, and similarity operators. If you're already running Postgres, you've already got a vector database.
What you get out of the box:
- Vector type and two built-in index kinds (IVFFlat and HNSW). Disk-based indexes such as DiskANN come from separate extensions like pgvectorscale.
- Hybrid queries that combine vector similarity with regular SQL filters.
- All your existing Postgres operational knowledge applies. Backups, replication, monitoring, tuning.
- Joins. The ability to retrieve and reason about the related rows in the same query.
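A minimal sketch of what a filtered similarity query can look like, assuming a hypothetical `chunks` table with `customer_id` and `category` columns (none of these names come from a real schema). The `vector` type, the `hnsw` index method, the `vector_cosine_ops` operator class, and the `<=>` cosine-distance operator are pgvector's; everything else is illustrative. The helper only builds the SQL and parameters, so you can pass them to whichever Postgres driver you already use.

```python
# Hypothetical schema: table and column names are illustrative only.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE chunks (
    id          bigserial PRIMARY KEY,
    customer_id bigint NOT NULL,
    category    text NOT NULL,
    body        text NOT NULL,
    embedding   vector(3)  -- 3 dimensions only to keep the example readable
);

-- HNSW index using cosine distance.
CREATE INDEX ON chunks USING hnsw (embedding vector_cosine_ops);
"""


def to_vector_literal(values):
    """Format a list of numbers in pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def build_filtered_search(query_embedding, customer_id, category, limit=5):
    """Return (sql, params) for a similarity search narrowed by ordinary SQL filters.

    <=> is pgvector's cosine distance operator: smaller means more similar.
    Placeholders use the %s style that psycopg accepts.
    """
    sql = (
        "SELECT id, body, embedding <=> %s::vector AS distance "
        "FROM chunks "
        "WHERE customer_id = %s AND category = %s "
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    vector_text = to_vector_literal(query_embedding)
    # The vector is passed twice: once for the SELECT list, once for ORDER BY.
    params = (vector_text, customer_id, category, vector_text, limit)
    return sql, params


if __name__ == "__main__":
    sql, params = build_filtered_search([0.1, 0.2, 0.3], customer_id=42, category="billing")
    print(sql)
    print(params)
```

Because this is plain SQL, adding a join to a customer or document table is one more clause rather than a second round trip to another service.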
When pgvector wins:
- Under, say, 100 million vectors. Possibly much further with good hardware and HNSW tuning.
- When your retrieval almost always needs metadata filtering ("similar to this, also from this customer, also in this category").
- When you don't want to operate another service.
- When data sovereignty or compliance means the data stays in your Postgres anyway.
When pgvector stops being enough:
- Hundreds of millions to billions of vectors with strict latency requirements.
- Multi-tenant SaaS at scale where namespaces and isolation become a real design problem.
- When you need vector-native sharding patterns that Postgres can't replicate cheaply.
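To get a feel for where those limits start to bite, a rough back-of-envelope calculation helps. The sketch below assumes 1536-dimensional float32 embeddings (an assumption, pick your own model's dimension) and counts only the raw vector data, not the index or per-row overhead, so real footprints will be higher.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_component=4):
    """Raw storage for the vectors alone: count x dimensions x bytes per float."""
    return num_vectors * dimensions * bytes_per_component


def to_gib(num_bytes):
    """Convert bytes to gibibytes (1 GiB = 1024**3 bytes)."""
    return num_bytes / (1024 ** 3)


# Assumed embedding size; adjust to match your model.
DIMENSIONS = 1536

for count in (10_000_000, 100_000_000, 1_000_000_000):
    size = raw_vector_bytes(count, DIMENSIONS)
    print(f"{count:>13,} vectors -> {to_gib(size):,.0f} GiB of raw float32 data")
```

Under these assumptions that is about 57 GiB at 10 million vectors and about 572 GiB at 100 million, which is roughly where keeping the working set in memory on a single Postgres host stops being casual.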
Honest take: most teams will never hit these limits. We've yet to migrate a client off pgvector for performance reasons. We have moved teams to pgvector because their previous vector database was expensive overkill.
Pinecone: managed, fast, expensive
Pinecone is the managed-service incumbent. You don't operate it. You don't tune it. You pay them, you get a low-latency vector search with good defaults.
When Pinecone wins:
- You don't want to run infrastructure.
- You're at a scale where pgvector becomes operationally non-trivial and you'd rather pay than tune.
- Multi-tenancy and namespacing are core to your product.
- You can absorb the per-month bill.
The downsides are predictable. Pricing adds up, whether you're on the older pod-based plans billed per month or the usage-based serverless tier. Your data lives in their cloud. You can't easily run a "what if I joined this with my customer table" query. If your retrieval doesn't need vector-only superpowers, you're paying for capabilities you don't use.
We use Pinecone occasionally for clients who explicitly want managed and have the budget. We rarely recommend it as a first choice anymore.
Qdrant: the open-source, self-hostable workhorse
Qdrant is the one we increasingly reach for when a client has outgrown pgvector and wants self-hosted. Written in Rust, fast, with a good Python client and decent docs.
What we like:
- Self-hostable on the same VPS that's running the rest of the stack.
- Filtering on payload fields is fast and clean.
- Hybrid search (vector + keyword) is first-class.
- Supports quantisation for cost savings at scale.
- Open-source, with a managed cloud option (Qdrant Cloud) if you want managed later.
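To make the quantisation point concrete, here is a generic scalar quantisation sketch: each float32 component is mapped to a single byte, which cuts raw vector storage by about 4x at the cost of a small rounding error. This illustrates the idea only; it is not Qdrant's implementation, and the function names are hypothetical.

```python
def quantise(vector):
    """Map each float onto an integer code in 0..255 using the vector's own range."""
    lo, hi = min(vector), max(vector)
    # Step size between adjacent codes; fall back to 1.0 for constant vectors.
    scale = (hi - lo) / 255 or 1.0
    codes = bytes(min(255, max(0, round((v - lo) / scale))) for v in vector)
    return codes, lo, scale


def dequantise(codes, lo, scale):
    """Approximate the original floats from the one-byte codes."""
    return [lo + code * scale for code in codes]


if __name__ == "__main__":
    original = [-1.0, -0.5, 0.0, 0.25, 1.0]
    codes, lo, scale = quantise(original)
    restored = dequantise(codes, lo, scale)
    worst = max(abs(a - b) for a, b in zip(original, restored))
    print(f"float32 bytes: {len(original) * 4}, quantised bytes: {len(codes)}")
    print(f"worst-case reconstruction error: {worst:.5f}")
```

The reconstruction error is bounded by half a step, which is why quantised search is typically paired with a rescoring pass over the original vectors for the top candidates.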
What's annoying:
- You're running a service now. Backups, upgrades, monitoring.
- The query model is its own DSL. Some teams find that a source of friction; we find it fine.
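For a sense of what that DSL looks like, here is a sketch of a Qdrant-style payload filter built as a plain Python dict. Qdrant filters are JSON objects with `must`, `should`, and `must_not` clauses; the payload field names (`customer_id`, `category`) and helper functions below are hypothetical, and the dict would go in the `filter` field of a search request.

```python
import json


def match(key, value):
    """One exact-match condition on a payload field."""
    return {"key": key, "match": {"value": value}}


def build_filter(customer_id, category):
    """Rough equivalent of: WHERE customer_id = ... AND category = ..."""
    return {"must": [match("customer_id", customer_id), match("category", category)]}


if __name__ == "__main__":
    print(json.dumps(build_filter(42, "billing"), indent=2))
```

It is more verbose than a SQL `WHERE` clause, but it is ordinary JSON, so it composes easily in code.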
We use Qdrant for clients at the "tens of millions of vectors" scale who need self-hosted and are uncomfortable putting that load directly on their primary Postgres.
Weaviate: opinionated, batteries-included
Weaviate sits in a slightly different position. It bundles vector search, hybrid retrieval, modular embedders, and a query interface (GraphQL or REST) into one product. You can deploy it self-hosted or use their managed cloud.
What we like:
- Built-in hybrid search with rankers.
- Schemas and types, which keeps your data clean.
- Multi-modal support for image and text embeddings in one store.
What's annoying:
- More opinionated than Qdrant. You're buying into their data model.
- Resource-heavy. Don't try to run it on a tiny VPS alongside everything else.
We've shipped one engagement on Weaviate where the multi-modal story was a real fit. Beyond that, Qdrant has done what we need.
A short comparison table
Honest, with the caveats above.
| Constraint | First choice | Second choice |
|---|---|---|
| Already on Postgres, under 10M vectors | pgvector | Qdrant |
| Self-hosted, 10M+ vectors | Qdrant | pgvector with tuning |
| Don't want to operate anything | Pinecone | Managed Qdrant |
| Multi-modal (text + image) at scale | Weaviate | Qdrant |
| Compliance / data sovereignty | Self-hosted Qdrant | pgvector |
| Smallest possible footprint | pgvector | Qdrant |
For the multi-tenant SaaS work we do on the Krypto Forge Platform, we default to pgvector because it lives next to the tenant data, the joins are useful, and the operational story is one service instead of two.
What actually matters more than the database
A point we'd rather make than skip. The choice of vector database almost never determines RAG quality. The retrieval pipeline does. Specifically:
- Chunking strategy.
- Embedding model choice.
- Hybrid retrieval with keyword search alongside vectors.
- Reranking the top-N before passing to the LLM.
- Citation discipline.
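As one example of the hybrid retrieval item, the two ranked lists (vector hits and keyword hits) have to be merged somehow. A common, database-agnostic way to do that is reciprocal rank fusion; the sketch below assumes that technique and the conventional constant `k=60`, and is not a description of any particular product's ranker.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1. Higher total score ranks first; ties are
    broken by document id so the output is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]   # ids ordered by vector similarity
    keyword_hits = ["b", "d", "a"]  # ids ordered by keyword score
    print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
    # -> ['b', 'a', 'd', 'c']
```

Documents that appear in both lists rise to the top, which is the behaviour you want from hybrid retrieval regardless of which store produced each list.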
We've seen teams agonise over Pinecone-vs-Qdrant and ship a worse product than a team that picked pgvector and spent that week on reranking. The reranker is more important than the database.
If your RAG is bad, the database is rarely the reason. Look at chunking and reranking first. The vector database is usually doing its job.
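The reranking step itself is small. A minimal sketch, assuming you retrieve a generous top-N from the database and then rescore it before building the prompt. The token-overlap scorer here is a deliberately crude stand-in; in practice you would plug in a cross-encoder or a hosted reranking model as `score_fn`. All names are hypothetical.

```python
def token_overlap_score(query, passage):
    """Stand-in scorer: fraction of query tokens that also appear in the passage."""
    query_tokens = set(query.lower().split())
    passage_tokens = set(passage.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & passage_tokens) / len(query_tokens)


def rerank(query, candidates, score_fn=token_overlap_score, keep=3):
    """Rescore the retrieved candidates and keep only the best few for the LLM."""
    ordered = sorted(candidates, key=lambda passage: score_fn(query, passage), reverse=True)
    return ordered[:keep]


if __name__ == "__main__":
    retrieved = [
        "postgres backup and replication guide",
        "how to rotate api keys",
        "rotate api keys for a tenant safely",
    ]
    for passage in rerank("rotate tenant api keys", retrieved, keep=2):
        print(passage)
```

Note that nothing in this step depends on which vector database produced the candidates.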
The migration path that works
If you start on pgvector and outgrow it, the migration is a known shape:
- Keep the embedding model the same. Re-embedding is the expensive part you're trying to avoid.
- Export vectors and payloads in batches. Bulk-load into Qdrant or wherever you're going.
- Run both stores in parallel for a week. Shadow queries from production and compare results.
- Cut over when results match within tolerance.
This is a 2-week project, not a quarter. We've done it twice. It's boring infrastructure.
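The batching and shadow-comparison steps above can be sketched in a few lines. This assumes "match within tolerance" means the two stores return mostly the same top-k ids for the same query; the overlap metric, the 0.9 threshold, and the function names are illustrative assumptions, not a fixed recipe.

```python
from itertools import islice


def batched(rows, size):
    """Yield lists of up to `size` rows, for export and bulk-load in batches."""
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


def overlap_at_k(old_ids, new_ids, k=10):
    """Fraction of the old store's top-k ids that the new store also returns."""
    old_top, new_top = set(old_ids[:k]), set(new_ids[:k])
    if not old_top:
        return 1.0 if not new_top else 0.0
    return len(old_top & new_top) / len(old_top)


def ready_to_cut_over(shadow_results, k=10, tolerance=0.9):
    """True when mean top-k overlap across shadowed queries meets the tolerance.

    shadow_results is a list of (old_store_ids, new_store_ids) pairs, one per query.
    """
    scores = [overlap_at_k(old, new, k) for old, new in shadow_results]
    return bool(scores) and sum(scores) / len(scores) >= tolerance


if __name__ == "__main__":
    print(list(batched(range(5), 2)))
    shadow = [([1, 2, 3, 4], [2, 1, 4, 9]), ([5, 6, 7, 8], [5, 6, 7, 8])]
    print(ready_to_cut_over(shadow, k=4, tolerance=0.85))
```

Approximate indexes rarely agree exactly, so expect overlap below 1.0 even when both stores are healthy; the useful signal is whether it stays stable across the shadow period.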
The summary
pgvector first. Qdrant when you outgrow it and want to stay self-hosted. Pinecone when you want managed and have the budget. Weaviate when multi-modal is core. Don't optimise the database before you've optimised the retrieval pipeline.
The boring choice is usually the right one.
The vector database is plumbing. The retrieval pipeline is the product.