NEURONDB/AI Database Extension
NeuronDB
PostgreSQL AI Extension for Vector Search, ML Inference and RAG Pipeline. AI applications in PostgreSQL with GPU acceleration, 52 ML algorithms, and hybrid search.
Quickstart (psql)
CREATE EXTENSION neurondb;
PostgreSQL 16 to 18 | 5 Vector Types | 52 ML Algorithms | 520+ SQL Functions
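Once the quickstart statement has run, the standard PostgreSQL catalogs can confirm that the extension is present. The following is a minimal sketch that relies only on the built-in `pg_extension` catalog and the extension name from the quickstart; the reported version depends on the installed build.

```sql
-- Install the extension if it is not already present in this database.
CREATE EXTENSION IF NOT EXISTS neurondb;

-- Confirm the installation through the built-in pg_extension catalog.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'neurondb';
```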

Architecture
Architecture with vector search, ML inference, and RAG pipeline
NeuronDB Architecture
PostgreSQL 16-18
ACID | MVCC | WAL | Replication
Vector Engine
- HNSW & IVF Indexing
- 10+ Distance Metrics
- Quantization
- SIMD Optimized
ML Engine
- 52 ML Algorithms
- ONNX Runtime
- Batch Processing
- Pure C Implementation
Embedding Engine
- Text Embeddings
- Multimodal Support
- Hugging Face Integration
- Caching & Batching
GPU Accelerator
- CUDA (NVIDIA)
- ROCm (AMD)
- Metal (Apple)
- Auto Detection
Advanced Features
- Hybrid Search: Vector + FTS
- Reranking: Cross-encoder, LLM
- RAG Pipeline: Complete In-DB
- Background Workers: 4 Workers
SQL API Layer
520+ SQL Functions | Operators | Types | Views
NeuronDB Capabilities
HNSW + IVF indexing
High-performance vector search with multiple distance metrics
Vector Search demo
-- Create vector table
CREATE TABLE embeddings (
    id SERIAL PRIMARY KEY,
    embedding vector(384),
    metadata JSONB
);
-- Create HNSW index
CREATE INDEX ON embeddings
    USING hnsw (embedding vector_cosine_ops);
-- Vector search (query_vec is a placeholder for a vector(384) query embedding)
SELECT id, 1 - (embedding <=> query_vec) AS similarity
FROM embeddings
ORDER BY embedding <=> query_vec
LIMIT 10;
Results: 5 rows
| id | similarity | text |
|---|---|---|
| 42 | 0.9523 | Vector search enables semantic similarity matching in high-dimensional spaces using embeddings to find related content… |
| 38 | 0.9234 | HNSW indexes provide fast approximate nearest neighbor search for vector databases with logarithmic query time… |
| 35 | 0.8945 | RAG combines retrieval with generation for accurate LLM responses by finding relevant context first… |
| 31 | 0.8656 | Embeddings convert text into numerical vectors for machine learning models to process semantic meaning… |
| 28 | 0.8367 | PostgreSQL extensions enable vector operations directly in the database without external services… |
Performance
- Query Time: 8.42 ms
- Latency (P95): 12.5 ms
- QPS: 8.2k
- Status: ready
Query Statistics
Execution
- Rows Returned: 5
- Cache Hit: 96%
- Plan: Optimized
Connection
- Database: neurondb
- Version: v1.0
- Status: active
Summary
- Total Queries: 1,247
- Success Rate: 99.8%
- Avg Latency: 8.2 ms
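The vector search demo above uses `query_vec` as a placeholder, so it will not run until a real query embedding is supplied. One way to try it without an embedding model is to reuse the stored embedding of an existing row as the query. The sketch below assumes the `embeddings` table from the demo and a row with `id = 42` (an illustrative value), and it assumes `<=>` behaves as a cosine distance operator, as the demo implies.

```sql
-- Find the 10 rows most similar to an existing row (id = 42 is illustrative).
-- The scalar subquery supplies the query vector in place of query_vec.
SELECT id,
       1 - (embedding <=> (SELECT embedding FROM embeddings WHERE id = 42)) AS similarity
FROM embeddings
WHERE id <> 42  -- exclude the query row itself
ORDER BY embedding <=> (SELECT embedding FROM embeddings WHERE id = 42)
LIMIT 10;
```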
AI Database Features
Why NeuronDB
Vector Search & Indexing
- 5 vector types: vector (float32, up to 16K dimensions), vectorp (Product Quantization), vecmap (sparse vectors), vgraph (graph-based), rtext (retrieval text)
- HNSW and IVF indexing with automatic tuning
- 10+ distance metrics: L2, Cosine, Inner Product, Manhattan, Hamming, Jaccard, and more
- Product Quantization (PQ) and Optimized PQ (OPQ) for 2x to 32x compression
- DiskANN support for billion-scale vectors on SSD
ML & Embeddings
- 52 ML algorithms in pure C: Random Forest, XGBoost, LightGBM, CatBoost, SVM, KNN, Neural Networks, and more
- Built-in embedding generation with intelligent caching
- ONNX runtime integration for model inference
- Batch processing with GPU acceleration
- Model catalog with versioning and A/B testing
Hybrid Search & Retrieval
- Combine vector similarity with full-text search (BM25)
- Configurable weighted scoring (e.g., 70% vector + 30% text)
- Multi-vector document support
- Faceted search with category filters
- Temporal decay for time-sensitive relevance ranking
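The 70% vector + 30% text weighting mentioned above can be sketched in plain SQL. This is an illustration of the idea rather than NeuronDB's own hybrid search API: the `documents` table and the `hybrid_search` statement name are hypothetical, and PostgreSQL's built-in `ts_rank` stands in for BM25 (it is a different ranking function). Normalization flag 32 rescales the text rank into the 0 to 1 range so the two terms are roughly comparable.

```sql
-- Hypothetical table for illustration; not part of NeuronDB itself.
CREATE TABLE documents (
    id        SERIAL PRIMARY KEY,
    content   TEXT NOT NULL,
    embedding vector(384)
);

-- $1: query embedding, $2: query text.
-- Score = 0.7 * cosine similarity + 0.3 * normalized full-text rank.
PREPARE hybrid_search(vector, text) AS
SELECT id,
       0.7 * (1 - (embedding <=> $1))
     + 0.3 * ts_rank(to_tsvector('english', content),
                     plainto_tsquery('english', $2),
                     32) AS score
FROM documents
ORDER BY score DESC
LIMIT 10;
```

Sorting by the combined score scans every row, so a production query would usually pre-select candidates from each index first and blend the scores only for that smaller set.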
Reranking
- Cross-encoder neural reranking for precision improvement
- LLM-powered scoring (GPT-4, Claude integration)
- ColBERT late interaction models
- MMR (Maximal Marginal Relevance) for diversity
- Ensemble strategies combining multiple rankers
- Sub-10ms latency for production workloads
RAG Pipeline
- Complete Retrieval Augmented Generation in PostgreSQL
- Intelligent document chunking and processing
- Semantic retrieval with automatic reranking
- LLM integration for answer generation
- Context management and guardrails for content safety
- RAG operations available directly in SQL
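To make the retrieval step of such a pipeline concrete, the sketch below assembles an LLM context string from the five chunks nearest to a question embedding. It uses only standard SQL plus the `<=>` operator from the demo; the `doc_chunks` table and the `build_context` statement name are hypothetical, and NeuronDB's own RAG functions are not shown because their names are not given here.

```sql
-- Hypothetical chunk table for illustration.
CREATE TABLE doc_chunks (
    id        BIGSERIAL PRIMARY KEY,
    doc_id    BIGINT NOT NULL,
    content   TEXT NOT NULL,
    embedding vector(384)
);

-- $1: embedding of the user question.
-- Returns one text value: the 5 nearest chunks, closest first,
-- separated by a divider line.
PREPARE build_context(vector) AS
SELECT string_agg(content, E'\n---\n' ORDER BY distance) AS context
FROM (
    SELECT content, embedding <=> $1 AS distance
    FROM doc_chunks
    ORDER BY embedding <=> $1
    LIMIT 5
) AS top_chunks;
```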
Background Workers
- neuranq: Async job queue executor with SKIP LOCKED, retries, and batch processing
- neuranmon: Live query auto-tuner for search params and cache optimization
- neurandefrag: Automatic index maintenance, compaction, and rebuild scheduling
- neuranllm: LLM job processing with crash recovery
- All workers are tenant-aware with QPS and cost budgets
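For readers unfamiliar with the SKIP LOCKED queue pattern mentioned for neuranq, here is how it typically looks in plain PostgreSQL. This is a generic sketch, not the worker's actual schema: the `job_queue` table, its columns, the batch size of 10, and the retry limit of 3 are all assumptions chosen for illustration.

```sql
-- Hypothetical queue table for illustration.
CREATE TABLE job_queue (
    id         BIGSERIAL PRIMARY KEY,
    payload    JSONB NOT NULL,
    status     TEXT NOT NULL DEFAULT 'pending',
    attempts   INT NOT NULL DEFAULT 0,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Claim a batch of up to 10 pending jobs that have not exhausted their retries.
-- SKIP LOCKED makes concurrent workers pass over rows another worker
-- has already locked instead of waiting on them.
WITH claimed AS (
    SELECT id
    FROM job_queue
    WHERE status = 'pending' AND attempts < 3
    ORDER BY created_at
    LIMIT 10
    FOR UPDATE SKIP LOCKED
)
UPDATE job_queue AS j
SET status = 'running',
    attempts = j.attempts + 1
FROM claimed
WHERE j.id = claimed.id
RETURNING j.id, j.payload;
```

A worker that fails a job can set its status back to 'pending', which lets the attempts counter act as the retry budget.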
ML Analytics Suite
- 19 clustering algorithms, including K-means, DBSCAN, GMM, and Hierarchical (all GPU-accelerated)
- Dimensionality reduction: PCA, PCA Whitening, OPQ
- Outlier detection: Z-score, Modified Z-score, IQR
- Quality metrics: Recall@K, Precision@K, F1@K, MRR, Silhouette Score
- Drift detection: Centroid drift, Distribution divergence, Temporal monitoring
- Analytics: Topic discovery, Similarity histograms, KNN graph building
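As an illustration of the Z-score method in the list above, the following sketch flags rows whose value lies more than three standard deviations from the mean, using only standard window functions. The `vector_stats` table and its `centroid_distance` column are hypothetical stand-ins for whatever per-row measure is being screened, and the threshold of 3 is a common convention rather than a NeuronDB default.

```sql
-- Hypothetical table: one precomputed distance-to-centroid per vector.
CREATE TABLE vector_stats (
    id                BIGINT PRIMARY KEY,
    centroid_distance DOUBLE PRECISION NOT NULL
);

-- Z-score = (value - mean) / sample standard deviation.
-- NULLIF guards against division by zero when all values are equal.
SELECT id, centroid_distance, z_score
FROM (
    SELECT id,
           centroid_distance,
           (centroid_distance - avg(centroid_distance) OVER ())
             / NULLIF(stddev_samp(centroid_distance) OVER (), 0) AS z_score
    FROM vector_stats
) AS scored
WHERE abs(z_score) > 3
ORDER BY abs(z_score) DESC;
```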
GPU Acceleration
- Multi-platform support: CUDA (NVIDIA), ROCm (AMD), Metal (Apple Silicon)
- GPU-accelerated ML algorithms: Random Forest, XGBoost, LightGBM, SVM, KNN, and more
- Batch distance computation with 100x speedup
- Automatic GPU detection with intelligent CPU fallback
- Multi-stream compute overlap for maximum throughput
- Efficient memory management and allocation
Performance & Optimization
- HNSW index building: 606 ms for 50K vectors (128-dim), 10.1x faster than pgvector
- SIMD-optimized distance calculations (AVX2, AVX-512, NEON)
- In-memory graph building using maintenance_work_mem for optimal index construction
- Efficient neighbor finding during insert (not after flush) for faster builds
- Squared distance optimization avoiding sqrt() overhead in comparisons
- Intelligent query planning with accurate cost estimates
- ANN buffer cache for hot centroids and frequent queries
- WAL compression with delta encoding
- Parallel kNN execution across multiple cores
- Predictive prefetching for reduced latency
- Sub-millisecond searches on millions of vectors
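Because the in-memory graph build draws on `maintenance_work_mem`, raising that standard PostgreSQL setting for the session before creating an index is a natural experiment. The sketch below assumes the `embeddings` table from the demo; the 2GB value and the index name `embeddings_hnsw_idx` are illustrative, and the right amount depends on data size and available RAM.

```sql
-- Give the index build more memory for this session only (2GB is illustrative).
SET maintenance_work_mem = '2GB';

-- Build the HNSW index with the same operator class as the demo.
CREATE INDEX embeddings_hnsw_idx ON embeddings
    USING hnsw (embedding vector_cosine_ops);

-- Return to the server default afterwards.
RESET maintenance_work_mem;
```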
Security
- Vector encryption using AES-GCM via OpenSSL
- Differential privacy for embedding protection
- Row-level security (RLS) integration
- Multi-tenant isolation with resource quotas
- HMAC-SHA256 signed results for integrity verification
- Audit logging with tamper detection
- GDPR-compliant data handling and governance
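Row-level security is a standard PostgreSQL feature, so tenant isolation on a vector table can be sketched without any extension-specific calls. The example assumes the `embeddings` table from the demo; the `tenant_id` column, the `tenant_isolation` policy name, and the `app.tenant_id` session setting are hypothetical names chosen for illustration.

```sql
-- Tag each row with its tenant (existing rows get the placeholder 'default').
ALTER TABLE embeddings
    ADD COLUMN tenant_id TEXT NOT NULL DEFAULT 'default';

-- Turn on row-level security for the table.
ALTER TABLE embeddings ENABLE ROW LEVEL SECURITY;

-- Rows are visible and writable only when they match the session's tenant.
-- The second argument (true) returns NULL instead of raising an error
-- when the setting is missing, which then matches no rows.
CREATE POLICY tenant_isolation ON embeddings
    USING (tenant_id = current_setting('app.tenant_id', true));

-- Each application session declares its tenant before querying.
SET app.tenant_id = 'acme';
```

Table owners and superusers bypass policies by default; `ALTER TABLE embeddings FORCE ROW LEVEL SECURITY` applies the policy to the owner as well.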
Monitoring & Observability
- pg_stat_neurondb view with real-time performance metrics
- Worker heartbeats and watchdog monitoring
- Query latency histograms and percentile tracking
- Cache hit rate tracking and optimization insights
- Recall@K monitoring for search quality
- Model cost accounting and usage analytics
- Prometheus exporter ready for integration
- Structured JSON logging with neurondb: prefix
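The columns of `pg_stat_neurondb` are not listed here, so a reasonable first step is to inspect the view before building dashboards on it. The sketch below assumes only that the view exists under that name once the extension is installed; the column listing uses the standard `information_schema`.

```sql
-- List the view's columns and types before relying on specific names.
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'pg_stat_neurondb'
ORDER BY ordinal_position;

-- Then look at the current metrics.
SELECT * FROM pg_stat_neurondb;
```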
PostgreSQL Native Architecture
- Pure C implementation that follows PostgreSQL coding standards
- 520+ SQL functions, types, and operators
- 7 new monitoring views for comprehensive observability
- Shared memory for efficient caching
- WAL integration for durability and crash recovery
- SPI for safe operations and transaction handling
- Background worker framework integration
- Standard extension with zero external dependencies
- SIMD-optimized (AVX2, AVX-512, NEON) with runtime CPU detection
Start
Add AI Capabilities to PostgreSQL
Install NeuronDB. Build semantic search, RAG applications, and ML features in your PostgreSQL infrastructure.