SQL + Vector guide

This guide covers hybrid query syntax, how the cost-based planner chooses between scalar-first and ANN-first execution paths, when to override the planner with hints, schema design patterns for high-selectivity pre-filtering, and benchmark latency data at 100K, 1M, and 10M rows using 1536-dimensional vectors.

Hybrid query syntax

The NEAR operator takes a float32 vector parameter and ranks rows by approximate cosine distance (ANN, not exact). It can appear in ORDER BY to rank results, or as a scored expression in SELECT to return the similarity value alongside columns. All other SQL clauses — WHERE, JOIN, GROUP BY, LIMIT — work as expected.

-- Basic: return top-10 by vector similarity
SELECT id, content
FROM documents
ORDER BY embedding NEAR $1
LIMIT 10;

-- Hybrid: SQL filter + vector ranking (most common)
SELECT id, content, user_id
FROM documents
WHERE user_id = $1                    -- scalar predicate
  AND category IN ('rag', 'memory')   -- second scalar predicate
ORDER BY embedding NEAR $2            -- ANN ranking
LIMIT 5;

-- With score column
SELECT id, content,
       embedding NEAR $1 AS similarity
FROM documents
WHERE user_id = $1
ORDER BY similarity
LIMIT 10;

Cost-based optimizer hints

The planner chooses execution order automatically. In most cases you should not need hints. However, if you know your workload has unusual selectivity characteristics, you can guide the planner:

-- Force scalar-first (use when WHERE is highly selective)
SELECT /*+ SCALAR_FIRST */ id, content
FROM documents
WHERE tenant_id = $1            -- very selective in your data
ORDER BY embedding NEAR $2
LIMIT 5;

-- Force ANN-first (use when WHERE is not selective)
SELECT /*+ ANN_FIRST(k=100) */ id, content
FROM documents
WHERE category = $1             -- low selectivity in your data
ORDER BY embedding NEAR $2
LIMIT 5;

Use EXPLAIN HYBRID first to understand the planner's choice before adding hints.

Filtering before or after ANN

The key trade-off in hybrid query planning is between pre-filtering (scalar predicates reduce the ANN candidate set) and post-filtering (ANN produces candidates, then predicates filter). Dreambase handles this automatically, but understanding the trade-off helps you write better schemas:

  • Pre-filter wins when: your WHERE clause eliminates most rows (high selectivity). Example: WHERE user_id = 'u_441' on a table with millions of users.
  • Post-filter wins when: your WHERE clause matches most rows (low selectivity). Example: WHERE active = true where 90% of rows are active.
  • Schema advice: partition tables by high-selectivity dimensions (user_id, tenant_id) to ensure pre-filtering is always available.

Benchmark data at scale

Measured on Dreambase Cloud Production tier, single US-East region, 1536-dimensional float32 vectors (OpenAI text-embedding-3-small dimensionality), HNSW default parameters (M=16, ef_construction=200), warm cache. Cold start adds approximately 2–5ms to p99 for the first query after idle periods.

RowsQuery typep50 latencyp99 latency
100KPure ANN (NEAR only)4ms9ms
100KHybrid (user_id filter + NEAR)3ms7ms
1MPure ANN18ms38ms
1MHybrid (user_id filter + NEAR)11ms28ms
10MHybrid (user_id filter + NEAR)22ms47ms

Hybrid queries are often faster than pure ANN at the same row count because scalar pre-filtering reduces the ANN search space. The improvement grows with predicate selectivity.