Quickstart

Get Dreambase running in under 10 minutes. This guide walks through installing the Python SDK, connecting with a PostgreSQL DSN, creating a hybrid table with a VECTOR(dims) column, inserting rows with float32 embeddings, and running your first hybrid query with the NEAR operator. All examples use Python 3.9+.

Step 1 — Install the SDK

pip install dreambase

# Verify installation
python -c "import dreambase; print(dreambase.__version__)"

Requires Python 3.9+. The SDK depends on numpy (for float32 array handling) and asyncpg. Both are installed automatically.

Step 2 — Connect to Dreambase

Your connection string is a standard PostgreSQL DSN. Find it in the Dreambase console under Connection > Connection string.

import dreambase

db = dreambase.connect(
    "postgresql://user:PASSWORD@HOST/myapp"  # substitute your own password and host
)

# Async version
import asyncio
import dreambase

async def main():
    db = await dreambase.async_connect(
        "postgresql://user:PASSWORD@HOST/myapp"  # substitute your own password and host
    )
    return db
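
Because the DSN embeds a password, you may prefer to keep it out of source code. The sketch below reads it from an environment variable and checks its basic shape with the standard library before you pass it to the SDK. The variable name DREAMBASE_DSN and the helper load_dsn are illustrative assumptions, not part of the SDK.

```python
import os
from urllib.parse import urlsplit


def load_dsn(env_var: str = "DREAMBASE_DSN") -> str:
    """Read a PostgreSQL DSN from the environment and sanity-check its shape.

    The variable name is a hypothetical convention, not something the SDK defines.
    """
    dsn = os.environ.get(env_var)
    if not dsn:
        raise RuntimeError(f"Set {env_var} to your connection string")

    parts = urlsplit(dsn)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"Expected a postgresql:// DSN, got scheme {parts.scheme!r}")
    if not parts.hostname or not parts.path.strip("/"):
        raise ValueError("DSN must include a host and a database name")
    return dsn
```

The returned string can then be passed to dreambase.connect or dreambase.async_connect in place of the literal shown above.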

Step 3 — Create a hybrid table

Declare a VECTOR(dims) column alongside standard SQL columns. The dims value must exactly match your embedding model's output dimensionality: 1536 for text-embedding-3-small, 3072 for text-embedding-3-large, 1024 for Cohere's embed-english-v3.0, or whatever your model outputs. Dreambase creates the HNSW index automatically when the first row is inserted, so you do not run a separate CREATE INDEX statement.

db.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        user_id    TEXT NOT NULL,
        content    TEXT,
        category   TEXT,
        created_at TIMESTAMPTZ DEFAULT NOW(),
        embedding  VECTOR(1536)
    )
""")

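If you switch embedding models between environments, it can help to derive the column definition from the model name rather than hard-coding the number. This is a minimal sketch: the EMBEDDING_DIMS table and the vector_column helper are hypothetical names, and the sizes are the ones quoted in this step.

```python
# Output sizes quoted in this guide; add an entry for your own model.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "embed-english-v3.0": 1024,
}


def vector_column(model_name: str, column: str = "embedding") -> str:
    """Return a column definition such as 'embedding VECTOR(1536)'."""
    try:
        dims = EMBEDDING_DIMS[model_name]
    except KeyError:
        raise ValueError(
            f"unknown model {model_name!r}; add its output size to EMBEDDING_DIMS"
        ) from None
    return f"{column} VECTOR({dims})"
```

The resulting string can be interpolated into the CREATE TABLE statement in place of the literal embedding line.
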
Step 4 — Insert rows with vectors

Generate embeddings in your application with whichever model you already use, then write them in the same INSERT statement as the other columns. The embedding column accepts a Python list or NumPy array of float32 values, which the SDK serializes to the wire format automatically. Dreambase does not call any embedding API; you own the embedding step.

import numpy as np

# Generate the embedding with any model; your_embed_model is a placeholder for your own client
embedding = your_embed_model.encode("My document text")
embedding = np.array(embedding, dtype=np.float32)

db.execute(
    """INSERT INTO documents (user_id, content, category, embedding)
       VALUES (%s, %s, %s, %s)""",
    ["u_441", "My document text", "knowledge-base", embedding]
)
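
Since the column is declared with a fixed dimensionality, checking each vector in your application before the INSERT can surface a model mismatch early. The following is a sketch only: to_embedding and EXPECTED_DIMS are hypothetical names, and the NaN/infinity check is a local precaution (this guide does not say how Dreambase treats such values).

```python
import numpy as np

EXPECTED_DIMS = 1536  # must match VECTOR(1536) in the table definition


def to_embedding(values, dims: int = EXPECTED_DIMS) -> np.ndarray:
    """Convert a list or array to a 1-D float32 vector of the expected length."""
    vec = np.asarray(values, dtype=np.float32)
    if vec.ndim != 1 or vec.shape[0] != dims:
        raise ValueError(
            f"expected a 1-D vector of length {dims}, got shape {vec.shape}"
        )
    if not np.isfinite(vec).all():
        raise ValueError("embedding contains NaN or infinite values")
    return vec
```

A float32 vector of 1536 values occupies 1536 × 4 = 6144 bytes, which is a useful figure when estimating payload size per row.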

Step 5 — Run your first hybrid query

The NEAR operator appears in ORDER BY and, optionally, in SELECT as a scored expression. It takes a bound parameter (%s or $N) holding a float32 array with the same dimensionality as the stored column. The query planner decides whether to apply the WHERE predicates or the ANN scan first; you can inspect its decision with EXPLAIN HYBRID.

query_text = "What are the deployment options?"
query_vec = np.array(your_embed_model.encode(query_text), dtype=np.float32)

results = db.query(
    """SELECT id, content, category, created_at
       FROM documents
       WHERE user_id = %s
         AND category = %s
       ORDER BY embedding NEAR %s
       LIMIT 5""",
    ["u_441", "knowledge-base", query_vec]
)

for row in results:
    print(row.content)
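
To build intuition for what the hybrid query returns, here is an exact, in-memory reference of the same filter-then-rank logic in NumPy. It is not how Dreambase executes the query: the engine uses an approximate HNSW index and chooses its own plan. The distance metric is not stated in this guide, so cosine distance is assumed here, and hybrid_search with its dict-based rows is purely illustrative.

```python
import numpy as np


def hybrid_search(rows, query_vec, user_id, category, limit=5):
    """Exact filter-then-rank reference for the query above.

    rows is a list of dicts with 'user_id', 'category', and 'embedding' keys.
    Cosine distance is an assumption; the engine's metric may differ.
    """
    query = np.asarray(query_vec, dtype=np.float32)

    # Equivalent of the WHERE clause: keep only matching rows.
    candidates = [
        r for r in rows
        if r["user_id"] == user_id and r["category"] == category
    ]
    if not candidates:
        return []

    matrix = np.stack(
        [np.asarray(r["embedding"], dtype=np.float32) for r in candidates]
    )

    # Cosine distance = 1 - cosine similarity; guard against zero-length vectors.
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(query)
    distances = 1.0 - (matrix @ query) / np.maximum(norms, 1e-12)

    # Equivalent of ORDER BY ... LIMIT: nearest rows first.
    order = np.argsort(distances)[:limit]
    return [candidates[i] for i in order]
```

Comparing this exact ranking against the rows Dreambase returns on a small sample is one way to gauge how much recall the approximate index trades for speed, provided the assumed metric matches the one your deployment uses.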

What's next