# Quick Start Guide

## Prerequisites
💡 **New to NeuronDB?** For the fastest setup, use the Docker Quick Start, which sets up the complete ecosystem (NeuronDB + NeuronAgent + NeuronMCP + NeuronDesktop) in about 5 minutes.
Before you begin, make sure you have:
- ✅ **NeuronDB installed** - See the Installation Guide for setup instructions
- ✅ **PostgreSQL client** - `psql` (or any SQL client)
- ✅ **5-10 minutes** - For the complete quickstart
### 🔍 Verify Prerequisites

Check prerequisites:

```bash
# Check if psql is installed
psql --version

# Check if Docker is installed (if using Docker)
docker --version
docker compose version
```

## Step 1: Install NeuronDB
If you haven't installed NeuronDB yet, choose your method:
### Option A: Docker Compose (Recommended for Quick Start) 🐳
Fastest way to get started:
Start NeuronDB with Docker:

```bash
# From repository root
docker compose up -d neurondb

# Wait for the service to be healthy (30-60 seconds)
docker compose ps neurondb
```

Expected output:

```text
NAME          STATUS
neurondb-cpu  healthy
```
📝 **Note:** Docker starts a PostgreSQL container with NeuronDB pre-installed. The first run takes 2 to 5 minutes to download images.
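If you want to see roughly what the compose service involves, a minimal definition looks like the sketch below. The image name, host port, and credentials here are placeholders inferred from this guide (the connection string later uses port 5433 and user `neurondb`); check the repository's own `docker-compose.yml` for the real values.

```yaml
# Hypothetical minimal docker-compose.yml sketch.
# Image tag and credentials are placeholders, not the project's actual values.
services:
  neurondb:
    image: neurondb/neurondb:latest   # assumed image name
    ports:
      - "5433:5432"                   # host port 5433, as used later in this guide
    environment:
      POSTGRES_USER: neurondb
      POSTGRES_PASSWORD: neurondb
      POSTGRES_DB: neurondb
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U neurondb"]
      interval: 10s
      retries: 6
```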
### Option B: Native Installation 🐧
For production or custom setups: Follow the detailed Installation Guide for native PostgreSQL installation.
### ✅ Verify Installation
Test the NeuronDB installation:

```bash
# With Docker Compose
docker compose exec neurondb psql -U neurondb -d neurondb -c "CREATE EXTENSION IF NOT EXISTS neurondb;"

# Or with native PostgreSQL
psql -d your_database -c "CREATE EXTENSION IF NOT EXISTS neurondb;"

# Check the version
docker compose exec neurondb psql -U neurondb -d neurondb -c "SELECT neurondb.version();"
# Or: psql -d your_database -c "SELECT neurondb.version();"
```

Expected output:

```text
 version
---------
 3.0
(1 row)
```
✅ **Success!** If you see version 3.0 or similar, NeuronDB is installed and working correctly.
## Step 2: Load Quickstart Data Pack
The quickstart data pack provides ~500 sample documents with pre-generated embeddings, ready for immediate use.
### 📦 What's in the Data Pack?
- ~500 documents - Sample text documents
- Pre-generated embeddings - Vector representations (384 dimensions)
- HNSW index - Pre-built index for fast search
- Ready to query - No setup required
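Based on the column list shown later in this guide, the data pack's table roughly corresponds to the following schema. This is a sketch inferred from the guide, not the loader's exact DDL:

```sql
-- Approximate shape of the quickstart_documents table
-- (inferred from this guide; the loader's actual DDL may differ)
CREATE TABLE quickstart_documents (
    id        SERIAL PRIMARY KEY,
    title     TEXT,
    content   TEXT,
    embedding vector(384)   -- 384-dimensional embedding
);

-- Pre-built HNSW index for fast cosine-distance search
CREATE INDEX ON quickstart_documents USING hnsw (embedding vector_cosine_ops);
```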
### Option 1: Using the CLI (Recommended) 🚀
Easiest method - handles everything automatically:
Load quickstart data with the CLI:

```bash
# From repository root
./scripts/neurondb-cli.sh quickstart
```

What it does:

- Creates the `quickstart_documents` table
- Loads ~500 sample documents
- Creates the HNSW index
- Verifies the data is loaded
### Option 2: Using the Loader Script 📜
Manual control over the process:
Load with the script:

```bash
# From repository root
./examples/quickstart/load_quickstart.sh
```

### Option 3: Using psql Directly 💻
For maximum control:
Load with psql:

```bash
# With Docker Compose
docker compose exec neurondb psql -U neurondb -d neurondb -f examples/quickstart/quickstart_data.sql

# Or with native PostgreSQL
psql -d your_database -f examples/quickstart/quickstart_data.sql
```

### ✅ Verify Data Loaded
Check that data was loaded successfully:
Verify the data:

```bash
# Count documents
psql "postgresql://neurondb:neurondb@localhost:5433/neurondb" -c "SELECT COUNT(*) FROM quickstart_documents;"
```

Expected output:

```text
 count
-------
   500
(1 row)
```
Check the table structure:

```sql
\d quickstart_documents
```

Expected columns:

- `id` - Document ID
- `title` - Document title
- `content` - Document content
- `embedding` - Vector embedding (384 dimensions)
✅ **Perfect!** Your data is loaded and ready to query.
## Step 3: Try SQL Recipes
The SQL recipe library provides ready-to-run queries for common operations.
### Example 1: Basic Similarity Search 🎯
Find documents similar to a specific document:
Similarity search:

```sql
-- Find documents similar to document #1
SELECT
    id,
    title,
    embedding <=> (SELECT embedding FROM quickstart_documents WHERE id = 1) AS distance
FROM quickstart_documents
WHERE id != 1
ORDER BY embedding <=> (SELECT embedding FROM quickstart_documents WHERE id = 1)
LIMIT 10;
```

What this does:
- Gets embedding of document #1
- Calculates cosine distance to all other documents
- Returns top 10 most similar documents
📏 **Understanding distance:** Lower distance = more similar. Cosine distance ranges from 0 (identical) to 2 (opposite).
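A quick way to build intuition for these values is to compare a document's embedding against itself, which should give a cosine distance of (approximately) zero:

```sql
-- Sanity check: a vector's cosine distance to itself is ~0
SELECT
    id,
    title,
    embedding <=> embedding AS self_distance   -- expected: 0 (or very close)
FROM quickstart_documents
WHERE id = 1;
```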
### Example 2: Query with Text Embedding 🤖
Search using a text query:
Text embedding search:

```sql
-- Generate an embedding for the query text
WITH query AS (
    SELECT embed_text('machine learning algorithms', 'all-MiniLM-L6-v2') AS q_vec
)
-- Find similar documents
SELECT
    id,
    title,
    embedding <=> q.q_vec AS distance
FROM quickstart_documents, query q
ORDER BY embedding <=> q.q_vec
LIMIT 10;
```

What this does:
- Generates embedding for "machine learning algorithms"
- Searches for documents with similar embeddings
- Returns top 10 results
💡 **Embedding models:** The `all-MiniLM-L6-v2` model is fast and works well for general text. See the embedding models documentation for more options.
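If you switch models, make sure the new model's output dimension matches the table's `vector(384)` column. Assuming NeuronDB exposes pgvector-style helpers (an assumption; `vector_dims` is pgvector's function and may differ in NeuronDB), you can check the dimension of a generated embedding like this:

```sql
-- Check the dimensionality of an embedding before inserting it
-- (vector_dims is pgvector's helper; assumed available here)
SELECT vector_dims(embed_text('hello world', 'all-MiniLM-L6-v2')) AS dims;
-- all-MiniLM-L6-v2 produces 384-dimensional vectors
```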
### Example 3: Hybrid Search (Vector + Full-Text) 🔍
Combine vector similarity with PostgreSQL full-text search:
Hybrid search:

```sql
-- Hybrid search: vector + full-text
WITH query AS (
    SELECT
        embed_text('machine learning', 'all-MiniLM-L6-v2') AS q_vec,
        to_tsquery('english', 'machine & learning') AS q_tsquery
)
SELECT
    id,
    title,
    content,
    -- Combined score: 70% vector, 30% full-text.
    -- (1 - cosine distance) converts the distance into a similarity,
    -- so that a higher combined_score means a better match.
    (1 - (embedding <=> q.q_vec)) * 0.7 +
    (ts_rank(to_tsvector('english', content), q.q_tsquery) * 0.3) AS combined_score
FROM quickstart_documents, query q
WHERE to_tsvector('english', content) @@ q.q_tsquery
ORDER BY combined_score DESC
LIMIT 10;
```

What this does:
- Generates vector embedding for query
- Creates full-text search query
- Combines both scores (70% vector, 30% text)
- Returns top 10 results
📊 **Why hybrid search?** Vector search finds semantically similar content, while full-text search finds exact keyword matches. Combining both gives better results.
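A weighted sum is one way to combine the two signals. Another common approach is reciprocal rank fusion (RRF), which ranks each result list separately and merges the ranks, avoiding the need to normalize distances against `ts_rank` scores. A sketch (the constant 60 is the conventional RRF smoothing term):

```sql
-- Reciprocal rank fusion (RRF) sketch: merge vector and full-text rankings
WITH query AS (
    SELECT
        embed_text('machine learning', 'all-MiniLM-L6-v2') AS q_vec,
        to_tsquery('english', 'machine & learning') AS q_tsquery
),
vec AS (
    -- Rank all documents by vector distance
    SELECT id, ROW_NUMBER() OVER (ORDER BY embedding <=> q.q_vec) AS rnk
    FROM quickstart_documents, query q
),
txt AS (
    -- Rank keyword matches by full-text relevance
    SELECT id, ROW_NUMBER() OVER (
        ORDER BY ts_rank(to_tsvector('english', content), q.q_tsquery) DESC) AS rnk
    FROM quickstart_documents, query q
    WHERE to_tsvector('english', content) @@ q.q_tsquery
)
SELECT
    COALESCE(vec.id, txt.id) AS id,
    COALESCE(1.0 / (60 + vec.rnk), 0) + COALESCE(1.0 / (60 + txt.rnk), 0) AS rrf_score
FROM vec FULL OUTER JOIN txt ON vec.id = txt.id
ORDER BY rrf_score DESC
LIMIT 10;
```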
### Example 4: Filtered Search 🏷️
Add metadata filters to vector search:
Filtered search:

```sql
-- Search with filters
WITH query AS (
    SELECT embed_text('technology', 'all-MiniLM-L6-v2') AS q_vec
)
SELECT
    id,
    title,
    embedding <=> q.q_vec AS distance
FROM quickstart_documents, query q
WHERE id > 100   -- Example filter
  AND id < 200   -- Example filter
ORDER BY embedding <=> q.q_vec
LIMIT 10;
```

What this does:
- Generates query embedding
- Applies metadata filters (e.g., date range, category)
- Searches only within filtered subset
- Returns top 10 results
💡 **Filtering tips:** Apply filters BEFORE the vector search for better performance; PostgreSQL can then use indexes on the filter columns.
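For filters you use often, a plain B-tree index on the filter column lets PostgreSQL narrow the candidate set cheaply before the vector comparison runs. The `category` column below is hypothetical, standing in for whatever metadata your own table carries (`quickstart_documents.id` already has an index via its primary key):

```sql
-- B-tree index to speed up metadata filters
-- (my_docs and its category column are hypothetical examples)
CREATE INDEX my_docs_category_idx ON my_docs (category);
```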
## Understanding the Results

### 📖 Key Concepts

#### What is an Embedding?
An embedding is a vector that represents the semantic meaning of a piece of text; similar texts have similar embeddings.
Example:

- "machine learning" → `[0.1, 0.2, 0.3, ...]` (384 numbers)
- "artificial intelligence" → `[0.12, 0.19, 0.31, ...]` (similar numbers)
- "banana" → `[0.9, 0.1, 0.2, ...]` (different numbers)
#### What is Distance?
Distance measures how similar two vectors are:
- Lower distance = more similar
- Higher distance = less similar
Distance metrics:
- `<=>` - Cosine distance (0 = identical, 2 = opposite)
- `<->` - L2/Euclidean distance (0 = identical, ∞ = completely different)
- `<#>` - Inner product (higher = more similar)
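To see how the metrics differ on the same pair of vectors, you can compute all three at once between two documents:

```sql
-- Compare documents #1 and #2 under all three distance operators.
-- Note: in pgvector, <#> actually returns the *negated* inner product;
-- check NeuronDB's convention before relying on the sign.
SELECT
    a.embedding <=> b.embedding AS cosine_distance,
    a.embedding <-> b.embedding AS l2_distance,
    a.embedding <#> b.embedding AS inner_product
FROM quickstart_documents a, quickstart_documents b
WHERE a.id = 1 AND b.id = 2;
```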
#### What is an HNSW Index?
HNSW (Hierarchical Navigable Small World) is an index structure that makes vector search fast:
- Without index: O(n) - checks every vector
- With HNSW: O(log n) - checks only a few vectors
Trade-off: Slightly less accurate but much faster.
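If NeuronDB follows pgvector's HNSW conventions (an assumption; check NeuronDB's own documentation), the index exposes build-time and query-time knobs that trade recall against speed:

```sql
-- Build-time parameters (pgvector-style; assumed to apply to NeuronDB):
--   m               = max connections per graph node (default 16)
--   ef_construction = candidate list size while building (default 64)
CREATE INDEX ON quickstart_documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Query-time recall knob: larger ef_search = better recall, slower queries
SET hnsw.ef_search = 100;
```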
## Next Steps
Continue your journey:
- 📚 Read the Architecture Guide to understand components
- 🧪 Try more SQL Recipes
- 📖 Explore the Complete Documentation
- 🔧 Check the Troubleshooting Guide if needed
- 🤖 Try NeuronAgent Examples for agent workflows
- 🔌 Explore NeuronMCP Integration for MCP tools
## 💡 Tips for Success

### Performance Tips
- **Use indexes** - HNSW indexes can make search orders of magnitude faster
- **Filter first** - Apply WHERE clauses before the vector search
- **Limit results** - Use LIMIT to avoid processing too many rows
- **Batch operations** - Use `embed_text_batch` for multiple embeddings
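The exact signature of `embed_text_batch` isn't shown in this guide. Assuming it takes an array of texts and returns an array of vectors in the same order (an assumption to verify against NeuronDB's documentation), a batched backfill might look like:

```sql
-- HYPOTHETICAL usage of embed_text_batch: the signature is assumed
-- (text[] in, vector[] out, same order); verify against NeuronDB's docs.
WITH batch AS (
    SELECT array_agg(id ORDER BY id) AS ids,
           embed_text_batch(array_agg(content ORDER BY id),
                            'all-MiniLM-L6-v2') AS vecs
    FROM my_docs
    WHERE embedding IS NULL
)
-- Match each returned vector back to its row by position
UPDATE my_docs d
SET embedding = b.vecs[u.ord]
FROM batch b,
     unnest(b.ids) WITH ORDINALITY AS u(doc_id, ord)
WHERE d.id = u.doc_id;
```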
### Development Tips

- **Start simple** - Get basic search working first
- **Add complexity gradually** - Try hybrid search after basic search works
- **Use examples** - Copy working examples from the recipes
- **Check logs** - Use `docker compose logs` to debug issues
### Learning Tips
- Read the docs - Comprehensive documentation available
- Try examples - Hands-on learning is best
- Experiment - Try different queries and see what happens
- Ask questions - Check troubleshooting or community
## ❓ Common Questions
Q: Why is my search slow?
A: Make sure you have an HNSW index:

```sql
CREATE INDEX ON quickstart_documents USING hnsw (embedding vector_cosine_ops);
```

Q: How do I change the embedding model?
A: Use a different model name in `embed_text()`:

```sql
SELECT embed_text('text', 'sentence-transformers/all-mpnet-base-v2');
```

Q: How do I use my own data?
A: Create your own table and load your data into it:

```sql
CREATE TABLE my_docs (id SERIAL, content TEXT, embedding vector(384));
```

Q: How do I generate embeddings for my data?
A: Use `embed_text()` or `embed_text_batch()`:

```sql
UPDATE my_docs SET embedding = embed_text(content, 'all-MiniLM-L6-v2');
```
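Putting the answers above together, a minimal end-to-end flow for your own data might look like this (a sketch; table contents and the query string are illustrative):

```sql
-- 1. Create a table with a 384-dimensional vector column
CREATE TABLE my_docs (
    id        SERIAL PRIMARY KEY,
    content   TEXT NOT NULL,
    embedding vector(384)
);

-- 2. Load your text
INSERT INTO my_docs (content) VALUES
    ('PostgreSQL is a powerful open-source database.'),
    ('Vector search finds semantically similar text.');

-- 3. Generate embeddings
UPDATE my_docs SET embedding = embed_text(content, 'all-MiniLM-L6-v2');

-- 4. Create the index (best done after the bulk load)
CREATE INDEX ON my_docs USING hnsw (embedding vector_cosine_ops);

-- 5. Query by semantic similarity
SELECT id, content,
       embedding <=> embed_text('database systems', 'all-MiniLM-L6-v2') AS distance
FROM my_docs
ORDER BY distance
LIMIT 5;
```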