If you’ve been anywhere near the tech stack of a modern application lately, you’ve definitely felt the shift. It’s subtle, but in a way, it’s seismic. Everyone seems to be building Retrieval-Augmented Generation (RAG) apps, chatbots, or recommendation engines. And suddenly, the database layer—which used to be the "boring," reliable part of the stack—is the hottest topic in the room.
The big question keeps coming up: PostgreSQL vs Vector DBs. Specifically, can the trusty workhorse we've used for decades really compete with shiny new specialized tools like Pinecone, Weaviate, or Qdrant?
Here's the thing—I've built systems with both. I've felt the rush of using a specialized tool designed for one specific job, and I've felt the profound relief of realizing my existing Postgres instance could handle the workload without adding a new line item to my cloud bill.
In this deep dive, we’re going to explore PostgreSQL's role in hybrid search using pgvector. We aren't just looking at feature lists; we're looking at architecture, operational sanity, and the complex reality of merging "old school" keyword search with "new school" semantic understanding.
1. Introduction: The Rise of Vector Search
Let's rewind for a second. Why are we even talking about vectors?
For years, search was purely lexical. You typed "running shoes," and the database looked for the string "running" and the string "shoes." If you searched for "jogging footwear," a strict keyword search might return zero results, even though the intent is identical.
Then came the explosion of Large Language Models (LLMs). We realized we could turn text (or images, or audio) into embeddings—long arrays of floating-point numbers that represent meaning in a high-dimensional space. "Running shoes" and "jogging footwear" end up close to each other in that mathematical space.
This capability—finding things based on meaning rather than spelling—is what we call semantic search.
But here's the catch: traditional B-Tree indexes, the kind SQL databases have used since the 70s, are terrible at finding "nearest neighbors" in multi-dimensional space. They are designed for exact matches or range queries, not calculating the cosine similarity between two 1536-dimensional vectors.
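To make that concrete, here is a toy sketch of semantic closeness in plain Python. The three-dimensional vectors are made up purely for illustration; real embedding models emit hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|) -- near 1.0 means "pointing the same way"
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional "embeddings", for illustration only
running_shoes    = [0.90, 0.80, 0.10]
jogging_footwear = [0.85, 0.75, 0.20]
tax_software     = [0.05, 0.10, 0.95]

# Similar meanings land close together; unrelated concepts do not
print(cosine_similarity(running_shoes, jogging_footwear))  # high, near 1.0
print(cosine_similarity(running_shoes, tax_software))      # much lower
```

This is exactly the computation a B-Tree cannot help with: there is no sort order over 1536 dimensions that puts "nearest by angle" items adjacent on disk.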
This gap in the market is essentially what birthed the "Vector Database."
2. What are Vector Databases? (And Why They're Popular)
Dedicated vector databases are specialized infrastructure components built from the ground up to store, index, and query vector embeddings. They use algorithms like HNSW (Hierarchical Navigable Small World) to perform Approximate Nearest Neighbor (ANN) searches incredibly fast.
Why did they explode in popularity?
- Performance out of the box: They are optimized for one thing. You throw a million vectors at them, and they return the top 10 matches in milliseconds.
- Specialized Features: Many offer native metadata filtering, hybrid search algorithms, and cloud-native scaling that feels "serverless."
- The AI Hype Cycle: Let’s be honest. When a paradigm shifts, we developers love new toys. There was a moment when it felt like you couldn't do AI without a vector DB.
But as the dust settled, architects started noticing a problem. We were fragmenting our data. Your user data lived in Postgres. Your product catalog lived in Postgres. But your vectors lived in a separate, specialized database. Keeping them in sync became a distributed systems nightmare.
3. PostgreSQL: More Than Just Relational
This is where I get a bit defensive of PostgreSQL. Calling Postgres just a "relational database" feels like calling a smartphone a "calculator." It’s technically true, but it misses the point entirely.
PostgreSQL is a platform. It is an object-relational database management system that is famously extensible. It has an extension API that allows developers to hook into the core query planner, index access methods, and data storage layers.
PostGIS turned Postgres into the world's best geospatial database. TimescaleDB turned it into a time-series powerhouse. And JSONB support essentially turned it into a better document store than Mongo (don't @ me).
So, when the vector search craze hit, the Postgres community didn't panic. They just built an extension.
4. Introducing pgvector: Vector Capabilities for PostgreSQL
Enter pgvector.
pgvector is an open-source extension for PostgreSQL that allows you to store vector embeddings and perform similarity searches directly in your database. It brings the capabilities of a vector database inside the relational environment you already know and trust.
The Technical Guts
When you install pgvector, you get a new data type: vector.
```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
  id bigserial PRIMARY KEY,
  content text,
  embedding vector(1536) -- 1536 dimensions for OpenAI's text-embedding-3-small
);
```
But the magic isn't just storing the array; it's indexing it. pgvector supports:
- IVFFlat (Inverted File with Flat Compression): This divides vectors into lists (clusters) and searches a subset of those lists. It's fast to build but requires a "training" step and has some recall trade-offs.
- HNSW (Hierarchical Navigable Small World): Added in later versions, this is the industry standard for vector indexing. It's graph-based, offers high recall, and is incredibly fast for querying, though it consumes more memory and takes longer to build.
With pgvector, you can execute queries using distance operators like <=> (cosine distance), <-> (Euclidean distance), or <#> (negative inner product; it's negated so that an ascending ORDER BY still puts the best match first).
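For intuition, here is what those three operators compute, sketched in plain Python (pgvector implements them in C; the inner product is negated so that sorting ascending by "distance" works for all three):

```python
import math

def l2_distance(a, b):
    # Mirrors pgvector's <-> operator: straight-line (Euclidean) distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # Mirrors pgvector's <=> operator: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

def neg_inner_product(a, b):
    # Mirrors pgvector's <#> operator: negative inner product (smaller = closer)
    return -sum(x * y for x, y in zip(a, b))

a, b = [1.0, 0.0], [0.6, 0.8]
print(l2_distance(a, b), cosine_distance(a, b), neg_inner_product(a, b))
```

For unit-length embeddings (which most models produce), cosine distance and negative inner product rank results identically, so the choice mostly matters for non-normalized vectors.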
5. PostgreSQL with pgvector vs. Dedicated Vector DBs: A Comparison
If you're an architect trying to decide between the two, you need to look past the marketing fluff. Here is the breakdown based on real-world production constraints.
The "One Database" Argument
The strongest argument for pgvector is operational simplicity.
- Dedicated Vector DB: You have to manage a new piece of infrastructure. You need to handle ETL pipelines to sync data. If you update a product description in Postgres, you must immediately regenerate the embedding and push it to the Vector DB. If that sync fails, your search is broken.
- Postgres + pgvector: It's an ACID transaction. You insert the row and the vector in the same transaction. Commit. Done. Data consistency is guaranteed.
Performance and Scale
This is where the dedicated DBs usually win—but the gap is narrowing.
- Dedicated Vector DB: Built for massive scale (hundreds of millions to billions of vectors). They often handle memory management better for purely vector workloads.
- Postgres + pgvector: Historically slower, but with the introduction of HNSW indexing in pgvector and optimizations like pgvectorscale (from Timescale), Postgres is now competitive for datasets up to tens of millions of vectors. For 99% of applications, Postgres is fast enough.
Query Complexity
- Dedicated Vector DB: Filtering usually happens via "pre-filtering" metadata. It's good, but it's not SQL. You can't join your vector results against a complex financial transaction table easily.
- Postgres + pgvector: You have the full power of SQL. You can JOIN, perform complex WHERE clauses, use CTEs, and window functions alongside your vector search.
6. When to use pgvector
You should lean towards pgvector if:
- You already use Postgres: Why add complexity if you don't have to?
- You need strong consistency: Your vectors and your metadata/source data must be perfectly in sync.
- Your dataset is "mid-sized": We're talking fewer than 100 million vectors. (Trust me, that is a lot of data for most companies).
- You need complex filtering: You want to find "users similar to this user, who also bought a red shirt in the last 30 days, and have a lifetime value > $500." SQL eats this query for breakfast.
7. When to use a dedicated vector database
You should look at Pinecone, Weaviate, or Qdrant if:
- Extreme Scale: You are indexing the entire internet or have billions of vectors.
- Separation of Concerns: Your team structure requires the AI/ML team to manage their own data store independent of the core transactional DB.
- Specialized Modalities: Some DBs have specific features for multi-modal search (video/audio) that might be clunky to retrofit into Postgres.
- Serverless/Managed Preference: You don't want to manage any instances (though managed Postgres is ubiquitous).
8. Implementing Hybrid Search with pgvector
Now, let's get to the fun part. The code.
Hybrid Search is the holy grail of modern retrieval. Why? Because Semantic Search (Vectors) is great at "concepts," but Keyword Search is better at "specifics."
If I search for "Error code 503", a vector search might return general articles about "server unavailability" or "HTTP errors." A keyword search will find the exact document containing "503". You need both.
To implement this in Postgres, we combine tsvector (built-in Full-Text Search) with vector (pgvector).
9. Keyword Search (Full-Text Search)
Postgres has had world-class full-text search for years. It uses:
- tsvector: A sorted list of distinct lexemes (words normalized to their root form).
- tsquery: The representation of the search query.
- ts_rank: A function to rank matching documents.
```sql
SELECT id, content, ts_rank(to_tsvector('english', content), query) AS rank
FROM items, to_tsquery('english', 'database & performance') query
WHERE to_tsvector('english', content) @@ query
ORDER BY rank DESC
LIMIT 10;
```
This is fast, deterministic, and precise. But it fails if the user searches for "fast storage" and your document only says "high-speed database."
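That failure mode is easy to demonstrate with a deliberately naive lexical matcher in Python. (Postgres's real FTS adds stemming and stop-word handling, but the vocabulary gap is the same in kind.)

```python
def keyword_match(query, document):
    # Naive lexical match: do any query tokens appear verbatim in the document?
    query_tokens = set(query.lower().split())
    doc_tokens = set(document.lower().split())
    return len(query_tokens & doc_tokens) > 0

# Zero token overlap, even though the meaning is nearly identical
print(keyword_match("fast storage", "high-speed database benchmarks"))  # False

# Exact tokens present, so lexical search nails it
print(keyword_match("fast storage", "fast NVMe storage tips"))  # True
```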
10. Semantic Search (Vector Search)
This is the pgvector side. We generate an embedding for the user's query and compare it to our stored embeddings.
```sql
SELECT id, content, 1 - (embedding <=> '[0.01, 0.02, ...]') AS similarity
FROM items
ORDER BY embedding <=> '[0.01, 0.02, ...]'
LIMIT 10;
```
Note: The <=> operator computes cosine distance. Since distance is "how far apart" things are, we often treat similarity as 1 - distance.
11. Combining Both: Hybrid Search Strategies
How do we merge these two lists? You have one list ranked by keyword relevance (BM25 in most search engines; ts_rank in Postgres) and another ranked by cosine similarity. The scores are on totally different scales. ts_rank might return 0.8, while cosine distance returns 0.15. You can't just add them up.
Reciprocal Rank Fusion (RRF)
This is the industry-standard algorithm for hybrid search. It ignores the raw scores and looks at the rank.
The formula is simple: $$ Score = \sum \frac{1}{k + rank} $$
Where $k$ is a constant (usually 60) and $rank$ is the position in the result list (1st, 2nd, 3rd...).
By using RRF, we normalize the importance of being "first" in keyword search vs. "first" in vector search.
Here is the robust SQL implementation using Common Table Expressions (CTEs):
```sql
WITH semantic_search AS (
  SELECT id, RANK() OVER (ORDER BY embedding <=> $1) AS rank
  FROM items
  ORDER BY embedding <=> $1
  LIMIT 20
),
keyword_search AS (
  SELECT id, RANK() OVER (ORDER BY ts_rank_cd(to_tsvector('english', content), plainto_tsquery('english', $2)) DESC) AS rank
  FROM items
  WHERE to_tsvector('english', content) @@ plainto_tsquery('english', $2)
  ORDER BY ts_rank_cd(to_tsvector('english', content), plainto_tsquery('english', $2)) DESC
  LIMIT 20
)
SELECT
  COALESCE(s.id, k.id) AS id,
  COALESCE(1.0 / (60 + s.rank), 0.0) +
  COALESCE(1.0 / (60 + k.rank), 0.0) AS rrf_score
FROM semantic_search s
FULL OUTER JOIN keyword_search k ON s.id = k.id
ORDER BY rrf_score DESC
LIMIT 10;
```
Let's break this down:
- semantic_search CTE: Gets top 20 matches by vector distance.
- keyword_search CTE: Gets top 20 matches by text relevance.
- FULL OUTER JOIN: We want documents that appear in either list.
- RRF Calculation: We apply the formula. If a document appears in both lists, its score is boosted significantly. If it's only in one, it still has a chance to rank if it was very high up.
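The fusion logic above, stripped of SQL, fits in a few lines of Python. This sketch assumes each input list is already ordered best-first; the doc IDs are hypothetical.

```python
def rrf_fuse(ranked_lists, k=60):
    # Reciprocal Rank Fusion: score(doc) = sum of 1 / (k + rank) across lists
    scores = {}
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

semantic = ["doc_a", "doc_b", "doc_c"]  # ranked by vector distance
keyword  = ["doc_b", "doc_d"]           # ranked by text relevance
fused = rrf_fuse([semantic, keyword])
print(fused)  # doc_b ranks first: it appears in both lists
```

Note how doc_b, second in each individual list, beats doc_a, which was first in only one. Appearing in both lists is worth more than winning one of them.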
12. Use Cases and Examples
The E-Commerce "Red Dress" Problem
I love this example because it's so relatable.
- User query: "Red dress for a gala."
- Vector Search: Finds "elegant evening gowns" and "formal attire." It understands the gala context perfectly but might return a blue gown because it matches the "elegance" vector.
- Keyword Search: Finds "red shirt," "red pants," "red socks." It nails the color but misses the occasion completely.
- Hybrid Search: The intersection wins. It finds documents that are semantically close to "gala" AND contain the keyword "red."
Technical Documentation (RAG)
- User query: "How to config connection pool in library X?"
- Vector Search: Finds general configuration guides.
- Keyword Search: Finds the specific parameter names max_connections or pool_size if they appear in the query.
- Hybrid: Returns the specific configuration snippet you need.
13. Performance Considerations and Scalability
You can't just slap this into production and hope for the best. Vectors are heavy, and there are realities to face.
1. Indexing Strategy: HNSW is King
For production, use HNSW indexes.
```sql
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
```
- m: The max number of connections per node. Higher = better recall, higher memory usage.
- ef_construction: The size of the dynamic candidate list during index build. Higher = better index quality, slower build time.
2. The Maintenance Cost
Vectors in Postgres require maintenance, just like everything else.
- Vacuuming: Just like standard tables, heavy updates create bloat.
- Index Warm-up: HNSW indexes perform best when they fit in RAM. If your index is 50GB and you have 16GB of RAM, you are going to thrash your disk I/O.
- Toast Tables: Vectors are large. Postgres stores them in TOAST tables (The Oversized-Attribute Storage Technique). This adds a slight overhead to retrieval compared to standard integers.
3. Partial Indexing
If your app is multi-tenant (SaaS), don't index the whole table together! Use partial indexes to partition your vector space by tenant_id.
```sql
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops)
WHERE tenant_id = 123;
```
Note: This approach doesn't scale to thousands of tenants easily. For that, you need table partitioning.
14. Future of Hybrid Search and pgvector
The Postgres ecosystem isn't standing still.
We are seeing the rise of pgvectorscale, an extension-on-top-of-an-extension that adds DiskANN-inspired indexing. This promises to let us store indexes on SSDs (NVMe) rather than RAM, drastically lowering the cost of hosting massive vector datasets.
Furthermore, quantization (compressing 4-byte floats into 1-byte integers or even binary) is becoming native. This reduces storage and memory requirements by 4x to 32x with minimal loss in accuracy.
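Here is a toy illustration of the idea behind scalar quantization in Python. The actual schemes in pgvector and pgvectorscale are more sophisticated, but the compression/accuracy trade-off works the same way in spirit.

```python
def quantize(vec):
    # Map each float onto a 0..255 integer grid: one byte instead of four
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # avoid div-by-zero for constant vectors
    codes = [round((v - lo) / scale) for v in vec]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    # Reconstruct approximate floats from the one-byte codes
    return [lo + c * scale for c in codes]

vec = [0.12, -0.53, 0.91, 0.07, -0.33]
codes, lo, scale = quantize(vec)
restored = dequantize(codes, lo, scale)
max_err = max(abs(a - b) for a, b in zip(vec, restored))
print(codes)              # small integers, one byte each
print(round(max_err, 4))  # error bounded by half a quantization step
```

A 4x storage saving for an error no larger than half a grid step is usually a great trade when you re-rank the top candidates with full-precision vectors afterward.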
The future of hybrid search in Postgres is tighter integration. Imagine a single function call that handles the RRF logic internally, optimized by the C-level query planner. It's coming.
15. Conclusion: The Power of Choice
The battle between PostgreSQL vs Vector DBs isn't a zero-sum game. It's about tradeoffs.
If you are building the next Google or a dedicated AI search engine with billions of records, go get a dedicated Vector DB. It’s the right tool for that job.
But for the rest of us? For the developers building RAG features into existing SaaS platforms, e-commerce sites, or internal enterprise tools? PostgreSQL with pgvector is the pragmatic, powerful, and sane choice.
It allows you to keep your architecture boring (a good thing!) while delivering cutting-edge AI features. It lets you perform Hybrid Search using the robust, ACID-compliant data engine you already know how to back up, scale, and secure.
So, before you spin up a new infrastructure silo, try CREATE EXTENSION vector. You might find that the best database for your AI needs is the one you’re already using.
Frequently Asked Questions
Can pgvector scale to millions of rows? Yes. With HNSW indexing, pgvector can comfortably handle tens of millions of vectors with sub-second query latency, provided you have adequate RAM to cache the index.
What is the difference between IVFFlat and HNSW in pgvector? IVFFlat is faster to build and uses less memory but requires a training step and generally has lower recall (accuracy). HNSW is slower to build and uses more memory but offers superior performance and accuracy without a training step.
Does pgvector work with Amazon RDS or Azure Database for PostgreSQL? Yes, both AWS RDS and Azure (and GCP Cloud SQL) support the pgvector extension. You typically just need to enable it via the command line or parameter groups.
How do I choose the dimensions for my vector column? The dimensions must match the output of your embedding model. For example, OpenAI's text-embedding-3-small outputs 1536 dimensions, so you define your column as vector(1536).
Is Hybrid Search always better than Vector Search? Not always, but usually. Pure vector search can struggle with exact keyword matches (like part numbers or specific acronyms), while hybrid search covers both semantic intent and specific keyword requirements, providing a more robust user experience.