NeuronDB: AI Database Extension for PostgreSQL
Vector search, ML inference, GPU acceleration, and a RAG pipeline, all within PostgreSQL.
Architecture
AI database architecture with vector search, ML inference, and RAG pipeline
PostgreSQL 16-18
ACID | MVCC | WAL | Replication
Vector Engine
- HNSW & IVF Indexing
- 10+ Distance Metrics
- Quantization
- SIMD Optimized
ML Engine
- 52 ML Algorithms
- ONNX Runtime
- Batch Processing
- Pure C Implementation
Embedding Engine
- Text Embeddings
- Multimodal Support
- Hugging Face Integration
- Caching & Batching
GPU Accelerator
- CUDA (NVIDIA)
- ROCm (AMD)
- Metal (Apple)
- Auto Detection
Advanced Features
SQL API Layer
473 SQL Functions | Operators | Types | Views
Why NeuronDB
Vector Search & Indexing
Five vector types: vector (float32), vectorp (packed), vecmap (sparse map), vgraph (graph-based), rtext (retrieval text). HNSW and IVF indexing with automatic tuning. More than 10 distance metrics, including L2 (Euclidean), Cosine, Inner Product, Manhattan, Hamming, and Jaccard. Product Quantization (PQ) and Optimized PQ (OPQ) for 2x-32x compression.
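To make the 2x-32x compression range concrete, here is a minimal Python sketch of the storage arithmetic for a float32 vector reduced to FP16, INT8, and binary codes (the three formats listed in the comparison table). The function names are hypothetical and this is plain scalar and sign quantization, not NeuronDB's PQ/OPQ implementation.

```python
def quantize_int8(vec):
    """Symmetric scalar quantization: one shared scale, one signed byte per dimension."""
    max_abs = max(abs(x) for x in vec) or 1.0
    scale = max_abs / 127.0
    codes = [max(-127, min(127, round(x / scale))) for x in vec]
    return codes, scale


def dequantize_int8(codes, scale):
    """Approximate reconstruction of the original floats."""
    return [c * scale for c in codes]


def quantize_binary(vec):
    """Sign quantization: 1 bit per dimension (set when the value is positive), packed into bytes."""
    out = bytearray((len(vec) + 7) // 8)
    for i, x in enumerate(vec):
        if x > 0:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def storage_bytes(dim):
    """Bytes needed to store one vector of `dim` dimensions in each format."""
    return {
        "float32": 4 * dim,
        "fp16": 2 * dim,
        "int8": dim,
        "binary": (dim + 7) // 8,
    }


sizes = storage_bytes(768)
ratios = {name: sizes["float32"] / size for name, size in sizes.items()}
print(ratios)  # {'float32': 1.0, 'fp16': 2.0, 'int8': 4.0, 'binary': 32.0}
```

The trade-off is accuracy: the fewer bits kept per dimension, the coarser the distance estimates, which is why quantized search is often paired with a rescoring step on the full-precision vectors.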
ML & Embeddings
52 ML algorithms implemented in pure C: Random Forest, XGBoost, LightGBM, CatBoost, Linear/Logistic Regression, Ridge, Lasso, SVM, KNN, Naive Bayes, Decision Trees, Neural Networks, Deep Learning. Built-in embedding generation with caching. ONNX runtime integration. Batch processing with GPU acceleration. Model catalog and versioning.
Hybrid Search & Retrieval
Combine vector similarity with full-text search (BM25). Weighted scoring (70% vector + 30% text). Multi-vector documents. Faceted search with category filters. Temporal decay for time-sensitive relevance. Suited to workloads that need both keyword and semantic matching.
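The weighted scoring and temporal decay described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not NeuronDB's scoring code: `hybrid_rank`, the min-max normalization step, and the half-life form of the decay are all choices made for this example.

```python
def min_max(scores):
    """Rescale a {doc_id: score} map to [0, 1] so the two signals are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(vector_scores, text_scores, age_days=None,
                w_vector=0.7, w_text=0.3, half_life_days=None):
    """Blend vector similarity with a text score (e.g. BM25), optionally decayed by age.

    vector_scores / text_scores: {doc_id: raw score}, higher is better.
    age_days: {doc_id: age in days}; only used when half_life_days is set.
    """
    v = min_max(vector_scores)
    t = min_max(text_scores)
    ranked = {}
    for doc in set(v) | set(t):
        score = w_vector * v.get(doc, 0.0) + w_text * t.get(doc, 0.0)
        if half_life_days and age_days:
            # The score halves every `half_life_days` days of document age.
            score *= 0.5 ** (age_days.get(doc, 0.0) / half_life_days)
        ranked[doc] = score
    return sorted(ranked.items(), key=lambda kv: kv[1], reverse=True)


vector_scores = {"a": 0.9, "b": 0.5, "c": 0.1}  # cosine similarity
bm25_scores = {"a": 1.0, "b": 6.0, "c": 3.5}    # full-text relevance
print(hybrid_rank(vector_scores, bm25_scores))
print(hybrid_rank(vector_scores, bm25_scores,
                  age_days={"a": 30.0}, half_life_days=30.0))
```

Normalizing before blending matters because BM25 scores are unbounded while cosine similarity is not; without it the 70/30 weights would not mean what they appear to mean.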
Reranking
Cross-encoder neural reranking for precision improvement. LLM-powered scoring (GPT-4, Claude). ColBERT late interaction models. MMR (Maximal Marginal Relevance) for diversity. Ensemble strategies combining multiple rankers. Sub-10ms latency.
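As an illustration of the MMR step, the following Python sketch greedily picks results that are relevant to the query but not redundant with what has already been selected. The function names and the `lam` trade-off parameter are assumptions for this example; it shows the general technique, not NeuronDB's implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr(query, candidates, k, lam=0.7):
    """Maximal Marginal Relevance selection.

    candidates: {doc_id: vector}. lam=1.0 ranks by relevance only;
    lower values penalize similarity to already-selected documents more.
    Returns doc ids in selection order.
    """
    relevance = {doc: cosine(query, vec) for doc, vec in candidates.items()}
    selected = []
    remaining = set(candidates)

    def mmr_score(doc):
        redundancy = max(
            (cosine(candidates[doc], candidates[s]) for s in selected),
            default=0.0,
        )
        return lam * relevance[doc] - (1.0 - lam) * redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


docs = {
    "a": [1.0, 0.10],
    "b": [1.0, 0.12],   # near-duplicate of "a"
    "c": [0.6, -0.80],  # less relevant, but different
}
print(mmr([1.0, 0.0], docs, k=2, lam=1.0))  # relevance only
print(mmr([1.0, 0.0], docs, k=2, lam=0.5))  # balances relevance and diversity
```

With `lam=1.0` the near-duplicate is returned second; with `lam=0.5` it is skipped in favor of the more distinct document.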
RAG Pipeline
Retrieval Augmented Generation in PostgreSQL. Document chunking and processing. Semantic retrieval with reranking. LLM integration for answer generation. Context management. Guardrails for content safety. RAG in SQL.
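Two of the stages named above, document chunking and context management, can be sketched generically in Python. The helper names, the word-based chunk size, the overlap, and the prompt template are all hypothetical choices for this example and do not reflect NeuronDB's SQL interface.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into word-based chunks that overlap, so a sentence cut at a
    boundary still appears whole in one of the neighboring chunks."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks


def build_prompt(question, retrieved_chunks, max_chars=2000):
    """Context management: keep the top-ranked chunks that fit a character budget."""
    context, used = [], 0
    for chunk in retrieved_chunks:
        if used + len(chunk) > max_chars:
            break
        context.append(chunk)
        used += len(chunk)
    joined = "\n---\n".join(context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}"
    )


text = " ".join(f"w{i}" for i in range(10))
print(chunk_words(text, chunk_size=4, overlap=1))
# ['w0 w1 w2 w3', 'w3 w4 w5 w6', 'w6 w7 w8 w9']
```

In a full pipeline the chunks would be embedded and indexed, the retrieved chunks reranked, and the assembled prompt passed to the LLM.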
Background Workers
4 production workers: neuranq (async job queue executor with SKIP LOCKED, retries, poison handling, batch processing), neuranmon (live query auto-tuner for search params, cache rotation, recall@k tracking), neurandefrag (automatic index maintenance, compaction, tombstone pruning, rebuild scheduling), neuranllm (LLM job processing with crash recovery). All tenant-aware with QPS/cost budgets.
ML Analytics Suite
Analytics: K-means, Mini-batch K-means, DBSCAN, GMM, Hierarchical clustering (all GPU-accelerated). Dimensionality reduction: PCA, PCA Whitening, OPQ. Outlier detection: Z-score, Modified Z-score, IQR, Isolation Forest. Quality metrics: Davies-Bouldin Index, Recall@K, Precision@K, F1@K, MRR. Drift detection with temporal monitoring. Topic discovery and modeling.
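For readers unfamiliar with the outlier and quality metrics listed above, here is a small reference sketch in Python using only the standard library. The function names and default thresholds (3.0 for Z-score, 3.5 for modified Z-score, 1.5 x IQR) are conventional choices assumed for this example, not values taken from NeuronDB.

```python
import statistics


def zscore_outliers(xs, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sd = statistics.fmean(xs), statistics.pstdev(xs)
    if sd == 0:
        return []
    return [x for x in xs if abs(x - mu) / sd > threshold]


def modified_zscore_outliers(xs, threshold=3.5):
    """Same idea, but based on the median and MAD, which outliers distort less."""
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    if mad == 0:
        return []
    return [x for x in xs if 0.6745 * abs(x - med) / mad > threshold]


def iqr_outliers(xs, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(xs, n=4)
    iqr = q3 - q1
    return [x for x in xs if x < q1 - k * iqr or x > q3 + k * iqr]


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top k results that are relevant."""
    return len(set(retrieved[:k]) & set(relevant)) / k


def mrr(ranked_lists, relevant_sets):
    """Mean reciprocal rank of the first relevant hit across queries."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, doc in enumerate(ranked, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


latencies = [10, 11, 9, 10, 12, 10, 11, 9, 10, 10,
             11, 9, 10, 12, 10, 11, 9, 10, 10, 100]
print(zscore_outliers(latencies), modified_zscore_outliers(latencies), iqr_outliers(latencies))
print(recall_at_k(["a", "b", "c", "d"], {"a", "d", "z"}, k=3))
```

Recall@K against an exact (brute-force) result set is the usual way to check how much accuracy an approximate index such as HNSW or IVF gives up.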
GPU Acceleration
GPU support: CUDA (NVIDIA), ROCm (AMD), Metal (Apple Silicon). GPU-accelerated ML algorithms: Random Forest, XGBoost, LightGBM, Linear/Logistic Regression, SVM, KNN, Decision Trees, Naive Bayes, GMM, K-means. Batch distance computation (100x speedup). Automatic GPU detection with CPU fallback. Multi-stream compute overlap. Memory management.
Performance & Optimization
SIMD-optimized distance calculations (AVX2, AVX-512, NEON). Intelligent query planning with cost estimates. ANN buffer cache for hot centroids. WAL compression with delta encoding. Parallel kNN execution. Predictive prefetching. Sub-millisecond searches on millions of vectors.
Security
Vector encryption (AES-GCM via OpenSSL). Differential privacy for embeddings. Row-level security (RLS) integration. Multi-tenant isolation. HMAC-SHA256 signed results. Audit logging with tamper detection. Usage metering and governance policies. GDPR-compliant data handling.
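The idea behind HMAC-SHA256 signed results is that a client holding the shared key can detect whether a result set was altered in transit. A minimal Python sketch of that idea follows; the canonical JSON encoding and the function names are assumptions for illustration, and how NeuronDB serializes and signs results is not specified here.

```python
import hashlib
import hmac
import json


def sign_result(rows, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding of the rows."""
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_result(rows, key: bytes, signature: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_result(rows, key), signature)


key = b"demo-key-not-for-production"
rows = [{"id": 1, "distance": 0.12}, {"id": 7, "distance": 0.31}]
tag = sign_result(rows, key)

print(verify_result(rows, key, tag))  # True

tampered = [{"id": 1, "distance": 0.12}, {"id": 9, "distance": 0.31}]
print(verify_result(tampered, key, tag))  # False
```

Sorting keys and fixing separators keeps the byte encoding stable, so the same rows always produce the same tag regardless of dictionary ordering.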
Monitoring & Observability
pg_stat_neurondb view with real-time metrics. Worker heartbeats and watchdog. Query latency histograms. Cache hit rate tracking. Recall@K monitoring. Model cost accounting. Prometheus exporter ready. Structured JSON logging with neurondb: prefix.
PostgreSQL Native Architecture
Pure C implementation that follows PostgreSQL coding standards throughout. 144 source files + 64 headers, zero compiler warnings. PGXS build system. 473 SQL functions/types/operators. Shared memory for caching. WAL integration for durability. SPI for safe operations. Background worker framework. Standard extension with no required external dependencies (ONNX is optional) and no core modifications.
Capabilities
AI database features
| Capability | Description | Performance | Production Ready |
|---|---|---|---|
| Vector Search | HNSW indexing, multiple distance metrics, quantization | Sub-millisecond on millions | ✓ |
| ML Inference | ONNX runtime, batch processing, embedding generation | High-throughput batch ops | ✓ |
| Hybrid Search | Vector + FTS, multi-vector, faceted, temporal | Optimized query planning | ✓ |
| Reranking | Cross-encoder, LLM, ColBERT, ensemble | GPU-accelerated support | ✓ |
| Background Workers | Queue executor, auto-tuner, index maintenance | Non-blocking async ops | ✓ |
| RAG Pipeline | Complete in-database RAG with document processing | End-to-end optimization | ✓ |
| ML Analytics | Clustering (K-means, DBSCAN, GMM), PCA, outlier detection, quality metrics, drift detection | GPU-accelerated algorithms | ✓ |
| GPU Acceleration | CUDA (NVIDIA), ROCm (AMD), Metal (Apple), 100x speedup on batch ops | Auto-detection with CPU fallback | ✓ |
| Performance Optimization | SIMD (AVX2/AVX-512/NEON), intelligent query planning, ANN cache, WAL compression | Predictive prefetching | ✓ |
| Security | Vector encryption (AES-GCM), differential privacy, RLS integration, multi-tenant isolation | GDPR-compliant | ✓ |
| Monitoring & Observability | pg_stat_neurondb view, worker heartbeats, latency histograms, Prometheus exporter | Real-time metrics | ✓ |
| PostgreSQL Native | Pure C implementation, 473 SQL functions, zero external dependencies, WAL integration | Zero core modifications | ✓ |
NeuronDB vs. Alternatives
Comparison of NeuronDB with other PostgreSQL AI and vector extensions
| Feature | NeuronDB | pgvector | pgvectorscale | pgai | PostgresML |
|---|---|---|---|---|---|
| Vector Indexing | HNSW + IVF | HNSW + IVF | StreamingDiskANN | Uses pgvector | pgvector-based |
| ML Inference | ONNX (C++) | None | None | API calls | Python ML libs |
| Embedding Generation | In-database (ONNX) | External | External | External API | In-database (Transformers) |
| Hybrid Search | Native (Vector+FTS) | Manual | Manual | Manual | Manual |
| Reranking | Cross-encoder, LLM, ColBERT, MMR | None | None | None | None |
| ML Algorithms | 52 algorithms: RF, XGBoost, LightGBM, CatBoost, SVM, KNN, DT, NB, NN, K-means, DBSCAN, GMM, PCA, etc. | None | None | None | XGBoost, LightGBM, sklearn suite, Linear/Logistic |
| Background Workers | 4 workers: neuranq, neuranmon, neurandefrag, neuranllm | None | None | None | None |
| RAG Pipeline | Complete In-DB | None | None | Partial (API) | Partial (Python) |
| Quantization | FP16, INT8, Binary (2x-32x) | Binary only | Binary only | None | None |
| Implementation | Pure C | Pure C | Rust | Rust + SQL | Python + C |
| Training Models | Fine-tuning (roadmap) | None | None | None | Full training (sklearn, XGBoost, etc.) |
| Auto-Tuning | neuranmon worker | None | None | None | None |
| GPU Support | CUDA + ROCm + Metal (native C/C++) | None | None | None | CUDA (via Python) |
| PostgreSQL Versions | 16, 17, 18 | 12-18 | 15-18 | 16-18 | 14-16 |
| License | PostgreSQL | PostgreSQL | Timescale License | PostgreSQL | PostgreSQL |
| Vector Types | 5 types: vector, vectorp, vecmap, vgraph, rtext | 1 type: vector | 1 type: vector | Uses pgvector | Uses pgvector |
| Distance Metrics | 10+ metrics: L2, Cosine, Inner Product, Manhattan, Hamming, Jaccard, etc. | 3 metrics: L2, Cosine, Inner Product | 3 metrics: L2, Cosine, Inner Product | Uses pgvector | Uses pgvector |
| SQL Functions | 473 functions | ~20 functions | ~30 functions | ~15 functions | ~50 functions |
| Index Maintenance | Auto (neurandefrag worker) | Manual | Manual | Manual | Manual |
| Performance (QPS) | 100K+ (with GPU) | 10K-50K | 50K-100K | Limited (API overhead) | 5K-20K (Python overhead) |
| Memory Efficiency | Optimized (PQ/OPQ compression) | Standard | Disk-based (low memory) | Standard | High (Python models) |
| Multi-tenancy | Native (tenant-aware workers) | None | None | None | None |
| Security | Row-level security, encryption, audit logs | PostgreSQL RLS | PostgreSQL RLS | PostgreSQL RLS | PostgreSQL RLS |
| Monitoring | pg_stat_neurondb, Prometheus, Grafana | Basic | Basic | Basic | Limited |
| Documentation | 473 functions documented | Good | Moderate | Moderate | Good |
| Community Support | Active (NeuronDB) | Very Active | Moderate (Timescale) | Growing | Active |
| Readiness | Ready | Ready | Beta | Early stage | Ready |
| Dependencies | Zero (pure C, optional ONNX) | Zero (pure C) | pgvector | Rust runtime | Python + ML libraries |
| Batch Processing | Native (neuranq worker) | Manual | Manual | Limited | Native (Python) |
| Model Catalog | Built-in (versioning, A/B testing) | None | None | None | Basic |
| Cost Efficiency | High (in-DB, no API costs) | High (in-DB) | High (disk-based) | Low (API costs) | Moderate (Python overhead) |
Add AI Capabilities to PostgreSQL
Install NeuronDB. Build semantic search, RAG applications, and ML features in your PostgreSQL infrastructure.