February 27, 2026

When Vector Search Meets SQL: What Actually Happens

Vector search and SQL are built on fundamentally different assumptions. SQL is set-based and deterministic: given the same query over the same data, you get the same set of rows. Vector search, as it is usually deployed, is approximate: an approximate nearest neighbor (ANN) index returns k vectors that are probably, but not provably, the closest to the query vector, and “nearest” itself means something different depending on your distance metric, your embedding model, and how your index was built.
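To make the point about distance metrics concrete, here is a minimal Python sketch. The two stored vectors and the query are made-up toy values, and the helper names (l2_distance, cosine_distance, nearest) are illustrative rather than taken from any library. The same query gets a different nearest neighbor depending on the metric.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity; ignores vector length, compares direction only."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest(query, vectors, distance):
    """Name of the stored vector with the smallest distance to the query."""
    return min(vectors, key=lambda name: distance(query, vectors[name]))

# Toy data (assumed for illustration only).
vectors = {
    "far_same_direction": [5.0, 0.0],     # points the same way as the query, but far away
    "close_other_direction": [1.0, 1.0],  # close in space, but rotated 45 degrees
}
query = [1.0, 0.0]

nearest_by_l2 = nearest(query, vectors, l2_distance)
nearest_by_cosine = nearest(query, vectors, cosine_distance)

print(nearest_by_l2)      # close_other_direction
print(nearest_by_cosine)  # far_same_direction
```

Neither answer is wrong. They answer different questions, which is why the metric has to match the one your embedding model was trained for.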

The join problem

When you combine them, running a vector similarity search and then joining the results against a relational table, the two result sets have different cardinality semantics. A SQL WHERE clause filters by exact match or range and returns every row that qualifies, however many that is. A vector search returns a fixed-size, approximate top-k. Joining them forces a choice: do you run the vector search first and then filter, or filter first and then search?
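The two orders are not equivalent, and a small Python sketch shows why. The rows, the category column, and the top_k helper below are all invented for illustration, and the search here is exact rather than index-backed. The ordering problem appears even without approximation.

```python
def top_k(query, rows, k):
    """Exact k nearest rows by squared Euclidean distance (no index involved)."""
    def squared_distance(row):
        return sum((q - v) ** 2 for q, v in zip(query, row["embedding"]))
    return sorted(rows, key=squared_distance)[:k]

def is_docs(row):
    """Stand-in for a relational predicate such as WHERE category = 'docs'."""
    return row["category"] == "docs"

# Toy table (assumed data): the rows closest to the query all fail the filter.
rows = [
    {"id": 1, "category": "news", "embedding": [0.0, 0.1]},
    {"id": 2, "category": "news", "embedding": [0.1, 0.0]},
    {"id": 3, "category": "docs", "embedding": [0.9, 0.9]},
    {"id": 4, "category": "news", "embedding": [0.2, 0.2]},
    {"id": 5, "category": "docs", "embedding": [1.5, 1.5]},
]
query = [0.0, 0.0]
k = 2

# Search first, then filter: the top-2 are both "news", so nothing survives.
search_then_filter = [r["id"] for r in top_k(query, rows, k) if is_docs(r)]

# Filter first, then search: the top-2 are taken among "docs" rows only.
filter_then_search = [r["id"] for r in top_k(query, [r for r in rows if is_docs(r)], k)]

print(search_then_filter)  # []
print(filter_then_search)  # [3, 5]
```

Searching first can return fewer than k rows, or none at all, even though matching rows exist. Filtering first always returns k rows when at least k rows match, at the cost of searching without help from an index built over the whole table.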

The answer depends on selectivity. If your relational filter is very selective (say it keeps 0.1% of rows), filter first and then search that subset, which is often small enough to search exactly. If it is not selective (say it keeps 80% of rows), search first and filter the results, since most of the nearest neighbors will survive the filter anyway. Most systems don't reason about this automatically, which means you either write the order explicitly or accept suboptimal query plans.
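If you end up writing that decision yourself, it can be as small as the sketch below. The function name choose_order and the 5% cutoff are assumptions made for illustration. The right threshold depends on your table size, index parameters, and k, so treat it as a starting point to measure against, not a recommendation.

```python
def choose_order(matching_rows, total_rows, threshold=0.05):
    """Pick an execution order from the estimated selectivity of the filter.

    matching_rows: estimated number of rows that pass the relational filter
    total_rows:    number of rows in the table
    threshold:     assumed cutoff (5%); tune it against real measurements
    """
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    selectivity = matching_rows / total_rows
    if selectivity <= threshold:
        # Few rows survive the filter: search only that subset.
        return "filter_then_search"
    # Most rows survive the filter: let the vector index lead.
    return "search_then_filter"

selective = choose_order(matching_rows=1_000, total_rows=1_000_000)      # 0.1% of rows
unselective = choose_order(matching_rows=800_000, total_rows=1_000_000)  # 80% of rows

print(selective)    # filter_then_search
print(unselective)  # search_then_filter
```

In practice the row estimate could come from table statistics or a cheap COUNT on the filter alone. How you obtain it is outside this sketch.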

Index compatibility

HNSW and IVFFlat vector indexes do not compose with B-tree indexes. You can't ask Postgres to intersect an HNSW index on an embedding column with a B-tree index on a timestamp column in the same query node. The planner has to choose one index to drive the scan and then apply the other condition to the rows that scan returns.

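This is why pgvector queries with additional filters are often slower than expected. If the vector index drives the scan, the filter is checked row by row against the candidates it returns, so a selective filter throws most of that work away and can leave you with fewer than k rows. If the planner skips the vector index instead, it computes distances for every row that passes the filter and sorts them. On large tables, either path gives up much of the benefit the vector index was supposed to provide.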
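To get a feel for how quickly a filter erodes the benefit of a vector index, the Python sketch below counts how many candidates have to be visited, in distance order, before k of them pass a filter. It is a simplified model, not pgvector's actual scan logic: the candidate stream is just a range of integers standing in for row ids already sorted by distance, and the two filters are invented.

```python
def scan_until_k(candidates, passes_filter, k):
    """Walk candidates in distance order until k of them pass the filter.

    Returns (matches, visited), where visited counts every candidate
    examined, including the ones the filter rejected.
    """
    matches = []
    visited = 0
    for row in candidates:
        visited += 1
        if passes_filter(row):
            matches.append(row)
            if len(matches) == k:
                break
    return matches, visited

# Stand-in for row ids in ascending distance from the query (assumed data).
candidates = range(10_000)

# A filter that keeps half of the rows versus one that keeps 1 in 1,000.
_, visited_half = scan_until_k(candidates, lambda i: i % 2 == 0, k=5)
_, visited_rare = scan_until_k(candidates, lambda i: i % 1000 == 0, k=5)

print(visited_half)  # 9
print(visited_rare)  # 4001
```

The number of candidates visited grows roughly as k divided by the filter's selectivity. A real index that stops after a fixed number of candidates would instead return fewer than k rows in the second case.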

What actually works

Two approaches perform well in production: pre-filtering at the application layer before the vector search, or using a purpose-built vector database that can push predicates down into the index structure. Dreambase handles this at the query rewrite layer: it detects queries that mix vector similarity and relational filters, profiles the selectivity of each predicate, and recommends or automatically applies the correct execution order.